Introductory Questions Starting with CUDA


I’d like to ask some basic questions if I may:

  1. Can one dynamically load compiled code programmatically into the GPU board?
  2. Is there a Windows API (DLL?) that one uses to init, load, unload, invoke, etc., CUDA code?
  3. Can a CUDA board itself, programmatically, access host memory under Vista?
  4. Can code on Windows access memory on the CUDA board easily?
  5. Must a process have Admin rights in order to load code into a CUDA board?
  6. Must a process have Admin rights in order to invoke code on said board?



  1. Yes.
  2. Yes.
  3. Not directly; the data has to be copied to the board through the CUDA API.
  4. Yes (copy from device to host).
  5. No, but the process must belong to the logged-in desktop user (at the moment; this may change in 2.1).
  6. Same as 5.
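
To make answers 1 and 2 a bit more concrete, here is a minimal sketch of loading compiled code at run time with the CUDA driver API (on Windows this API is exported by nvcuda.dll, which ships with the display driver). The helper name load_and_launch, the kernel signature, and the file and kernel names in the usage note are all assumptions for illustration, not anything from this thread. It also uses cuLaunchKernel and the primary-context calls, which arrived in toolkits newer than the one discussed here, so treat it as an outline rather than the exact calls available in every CUDA version.

```cuda
#include <cuda.h>    // CUDA driver API; link against the driver library (-lcuda)
#include <cstdio>

// Hypothetical helper: loads a compiled module (PTX or cubin) at run time,
// launches one kernel from it over n zeroed floats, then unloads the module.
// Assumes the kernel in the module has the signature
//   extern "C" __global__ void name(float *data, int n)
// Returns 0 on success and 1 on the first driver API error.
int load_and_launch(const char *module_path, const char *kernel_name, int n) {
    CUdevice dev;
    CUcontext ctx;
    if (cuInit(0) != CUDA_SUCCESS) return 1;
    if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return 1;
    if (cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS) return 1;

    CUmodule mod = NULL;
    CUfunction fn = NULL;
    CUdeviceptr d_data = 0;

    // Each step only runs if the previous one succeeded.
    bool ok = cuCtxSetCurrent(ctx) == CUDA_SUCCESS
           && cuModuleLoad(&mod, module_path) == CUDA_SUCCESS
           && cuModuleGetFunction(&fn, mod, kernel_name) == CUDA_SUCCESS
           && cuMemAlloc(&d_data, (size_t)n * sizeof(float)) == CUDA_SUCCESS
           && cuMemsetD32(d_data, 0, (size_t)n) == CUDA_SUCCESS;

    if (ok) {
        void *args[] = { &d_data, &n };          // pointers to each kernel argument
        unsigned int threads = 256;
        unsigned int blocks = (n + threads - 1) / threads;
        ok = cuLaunchKernel(fn, blocks, 1, 1, threads, 1, 1,
                            0, NULL, args, NULL) == CUDA_SUCCESS
          && cuCtxSynchronize() == CUDA_SUCCESS;  // wait for the kernel to finish
    }

    // Clean up whatever was created, whether or not the launch worked.
    if (d_data) cuMemFree(d_data);
    if (mod) cuModuleUnload(mod);
    cuDevicePrimaryCtxRelease(dev);

    if (!ok) fprintf(stderr, "load_and_launch: a driver API call failed\n");
    return ok ? 0 : 1;
}
```

A call might look like load_and_launch("kernels.ptx", "scale", 1024), where kernels.ptx is a hypothetical file produced beforehand with something like nvcc -ptx kernels.cu and scale is a hypothetical kernel inside it.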

Many thanks for those answers. I do have a few more, sorry to be vague:

  1. So if the host has, say, 50 pages of data in memory, can it copy a block of that data to the adapter and then ask the adapter to run some code on it?

  2. If so, does the adapter signal some system event in Windows when it completes? In other words, is this done synchronously or asynchronously?

  3. Is global memory the same as host memory, or have I got the wrong impression from the brief info I have read?


  1. Yes.
  2. There’s currently no way for the GPU to trigger an interrupt and wake a thread when it’s done. Completion is detected by polling the GPU (either spinning or yielding the thread after a failed poll). All kernel launches are asynchronous, though, so you can queue up a number of kernels, do other processing on the CPU, and then call cudaThreadSynchronize(), which does the polling, once you’ve done all the work you can.
  3. Global memory is the GDDR3 connected directly to the GPU. Host memory is the standard DDR2/DDR3 that your CPU uses.
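
The three answers above can be sketched together with the runtime API: copy a block of host memory into global memory on the board, launch a kernel (the launch returns immediately), poll for completion, and copy the result back. This is only a rough illustration. The names double_elements and double_on_gpu are made up for this example, and the polling is written out with an event and cudaEventQuery so that it is visible; a single blocking call such as cudaThreadSynchronize() (deprecated in later toolkits in favour of cudaDeviceSynchronize()) would do that waiting for you.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Illustrative kernel: doubles every element of an array in global (device) memory.
__global__ void double_elements(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: copies n floats to the board, doubles them there,
// and copies them back into the same host buffer.
// Returns the number of polls that found the GPU still busy, or -1 on error.
int double_on_gpu(float *host, int n) {
    float *dev = NULL;
    cudaEvent_t done;
    size_t bytes = (size_t)n * sizeof(float);

    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) return -1;
    if (cudaEventCreate(&done) != cudaSuccess) { cudaFree(dev); return -1; }

    int polls = 0;
    cudaError_t err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        double_elements<<<blocks, threads>>>(dev, n);  // asynchronous: returns at once
        cudaEventRecord(done, 0);                      // marker queued behind the kernel

        // Poll until the marker has been reached; other CPU work could go in the loop.
        while ((err = cudaEventQuery(done)) == cudaErrorNotReady) {
            ++polls;
        }
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    }

    cudaEventDestroy(done);
    cudaFree(dev);
    return err == cudaSuccess ? polls : -1;
}
```

With a host array such as float v[4] = {1, 2, 3, 4}, calling double_on_gpu(v, 4) on a machine with a working CUDA device should leave v holding the doubled values; the return value only says how many times the loop polled before the GPU reported it was finished.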