General Question

I am a programmer exploring both OpenCL and CUDA. First, I was curious whether the code actually runs on the GPU and reads from and writes to the host, or whether it runs on the host and reads from and writes to the GPU. My guess is the former.

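The kernel code runs on the GPU and, usually, reads from and writes to the GPU's own memory; the host copies data to and from the device around the kernel launch. Sometimes you can use mapped memory to have the GPU access the host's memory directly, at a performance cost.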
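
To make the two cases concrete, here is a minimal CUDA sketch, not a definitive implementation. The names `double_elements`, `run_with_device_memory`, and `run_with_mapped_memory` are made up for illustration. The first helper follows the usual path (kernel on the GPU, data in GPU memory); the second uses mapped (zero-copy) pinned host memory. It assumes a device that supports mapped host memory; some older setups may also need `cudaSetDeviceFlags(cudaDeviceMapHost)` before any other CUDA call.

```cuda
#include <algorithm>
#include <vector>
#include <cuda_runtime.h>

// Runs on the GPU: each thread doubles one element of `data`.
__global__ void double_elements(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Usual path: the kernel reads and writes GPU (device) memory;
// the host only copies the data in before and out after the launch.
bool run_with_device_memory(std::vector<float> &values) {
    if (values.empty()) return true;
    int n = static_cast<int>(values.size());
    size_t bytes = values.size() * sizeof(float);

    float *dev = nullptr;
    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) return false;

    cudaMemcpy(dev, values.data(), bytes, cudaMemcpyHostToDevice);
    double_elements<<<(n + 255) / 256, 256>>>(dev, n);
    // This blocking copy also waits for the kernel to finish.
    cudaError_t err =
        cudaMemcpy(values.data(), dev, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    return err == cudaSuccess;
}

// Mapped (zero-copy) path: the buffer lives in pinned host memory and the
// kernel accesses it directly over the bus, which is typically slower.
bool run_with_mapped_memory(std::vector<float> &values) {
    if (values.empty()) return true;
    int n = static_cast<int>(values.size());
    size_t bytes = values.size() * sizeof(float);

    float *host = nullptr;
    if (cudaHostAlloc((void **)&host, bytes, cudaHostAllocMapped) != cudaSuccess)
        return false;
    std::copy(values.begin(), values.end(), host);

    // Device-side alias for the same host buffer.
    float *dev_alias = nullptr;
    cudaError_t err = cudaHostGetDevicePointer((void **)&dev_alias, host, 0);
    if (err == cudaSuccess) {
        double_elements<<<(n + 255) / 256, 256>>>(dev_alias, n);
        // No copy back: just wait until the kernel's writes are complete.
        err = cudaDeviceSynchronize();
    }
    if (err == cudaSuccess) std::copy(host, host + n, values.begin());

    cudaFreeHost(host);
    return err == cudaSuccess;
}
```

Calling either helper from `main` with a `std::vector<float>` such as `{1, 2, 3}` should leave it doubled in place. In both cases the kernel executes on the GPU; only the location of the buffer changes. The OpenCL equivalent is analogous, with device buffers and host-pointer-backed buffers playing the same roles.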