CUDA

CUDA is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on GPUs.

GPU Programming Paradigm

GPU programming is usually a three-step process:

  1. Transfer input data from the CPU (host) to the GPU (device)
  2. Perform the computation on the GPU
  3. Transfer the results from the GPU (device) back to the CPU (host)
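
The three steps above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the kernel `scale`, the wrapper `scale_on_gpu`, and the `CUDA_CHECK` macro are hypothetical names introduced here, and the block size of 256 is an arbitrary assumption. Only standard CUDA runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`) are used.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel: multiply every element by a constant factor.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Scales `host` in place by running the three steps one after another.
void scale_on_gpu(std::vector<float>& host, float factor) {
    if (host.empty()) return;  // a launch with zero blocks is invalid
    const int n = static_cast<int>(host.size());
    const size_t bytes = host.size() * sizeof(float);

    float* dev = nullptr;
    CUDA_CHECK(cudaMalloc(&dev, bytes));

    // Step 1: transfer data from host to device.
    CUDA_CHECK(cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice));

    // Step 2: perform the computation on the GPU.
    const int threads = 256;  // assumed block size
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(dev, factor, n);
    CUDA_CHECK(cudaGetLastError());

    // Step 3: transfer the results back to the host.
    // cudaMemcpy waits for the kernel to finish before copying.
    CUDA_CHECK(cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(dev));
}

int main() {
    std::vector<float> v(1 << 20, 1.0f);
    scale_on_gpu(v, 2.0f);
    assert(v.front() == 2.0f && v.back() == 2.0f);
    printf("v[0] = %f\n", v[0]);
    return 0;
}
```

In this version nothing overlaps: the GPU is idle during both copies, and the copy engines are idle while the kernel runs.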

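If these steps run strictly one after another, the GPU sits idle during both transfers. This is why you want to overlap memory transfers with compute whenever possible. You can do this with prefetching, for example by moving the next chunk of data to the GPU while the current chunk is still being processed.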
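
One possible way to do this is sketched below. It assumes unified (managed) memory and the four-argument form `cudaMemPrefetchAsync(ptr, bytes, device, stream)` found in CUDA 8 through 12; newer toolkits may require the `cudaMemLocation`-based form instead. The names `scale`, `scale_with_prefetch`, `CUDA_CHECK`, and the choice of four streams are hypothetical. The array is split into chunks, each handled on its own stream, so the prefetch for one chunk can overlap with the kernel of another. Whether overlap actually happens depends on the hardware and driver.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel: multiply every element by a constant factor.
__global__ void scale(float* data, float factor, size_t n) {
    size_t i = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// `data` must come from cudaMallocManaged. Each chunk is prefetched to the
// GPU, processed, and prefetched back to the host on its own stream.
void scale_with_prefetch(float* data, size_t n, float factor, int device) {
    const int kStreams = 4;  // assumed number of chunks/streams
    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) CUDA_CHECK(cudaStreamCreate(&streams[s]));

    // Prefetching needs concurrent managed access (not available on every
    // platform); without it, fall back to on-demand page migration.
    int can_prefetch = 0;
    CUDA_CHECK(cudaDeviceGetAttribute(&can_prefetch,
                                      cudaDevAttrConcurrentManagedAccess, device));

    const size_t chunk = (n + kStreams - 1) / kStreams;
    for (int s = 0; s < kStreams; ++s) {
        const size_t offset = static_cast<size_t>(s) * chunk;
        if (offset >= n) break;
        const size_t count = (n - offset < chunk) ? (n - offset) : chunk;
        const size_t bytes = count * sizeof(float);
        float* p = data + offset;

        // Assumes the pre-CUDA-13 signature of cudaMemPrefetchAsync.
        if (can_prefetch)
            CUDA_CHECK(cudaMemPrefetchAsync(p, bytes, device, streams[s]));

        const unsigned int threads = 256;
        const unsigned int blocks =
            static_cast<unsigned int>((count + threads - 1) / threads);
        scale<<<blocks, threads, 0, streams[s]>>>(p, factor, count);
        CUDA_CHECK(cudaGetLastError());

        if (can_prefetch)
            CUDA_CHECK(cudaMemPrefetchAsync(p, bytes, cudaCpuDeviceId, streams[s]));
    }

    // Wait for all chunks before the host touches the data again.
    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaStreamSynchronize(streams[s]));
        CUDA_CHECK(cudaStreamDestroy(streams[s]));
    }
}

int main() {
    const size_t n = 1 << 22;
    float* data = nullptr;
    CUDA_CHECK(cudaMallocManaged(&data, n * sizeof(float)));
    for (size_t i = 0; i < n; ++i) data[i] = 1.0f;

    scale_with_prefetch(data, n, 2.0f, /*device=*/0);
    assert(data[0] == 2.0f && data[n - 1] == 2.0f);
    printf("data[0] = %f\n", data[0]);

    CUDA_CHECK(cudaFree(data));
    return 0;
}
```

Operations within one stream still run in order (prefetch, kernel, prefetch back), so each chunk stays correct; the overlap comes from different streams being at different stages at the same time. A profiler timeline such as Nsight Systems is the usual way to check that transfers and kernels really do overlap.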