Does Emulation Mode Actually Use Threads? Does anyone know?

When executing a CUDA program in emulation mode, it doesn’t use the GPU.
But does it still launch threads that are executed on the CPU, or is it entirely sequential?

Anyone know?

To the best of my knowledge, emulation mode is

  1. Threaded
  2. Really discouraged in favor of the debugger
  3. Still useful from time to time.

Both. The pthreads library is used to launch one host thread per GPU thread, but the threads are run sequentially to completion.
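
For anyone trying to picture what "one host thread per GPU thread, but run sequentially" could look like, here is a minimal sketch in plain C with pthreads. It is only an illustration of the scheme described above, not the emulator's actual code. The names `emu_arg`, `emu_kernel`, `emu_launch`, and `MAX_THREADS` are all made up for this example, and the "kernel" just squares its thread index.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 64  /* hypothetical cap for this sketch */

/* Hypothetical stand-in for the per-thread state a kernel would see. */
typedef struct {
    int thread_idx;  /* plays the role of threadIdx.x */
    int *out;        /* output array shared by all emulated threads */
    int *order;      /* records the order in which threads ran */
    int *next_slot;  /* next free position in order[] */
} emu_arg;

/* Hypothetical "kernel" body: each emulated thread squares its index. */
static void *emu_kernel(void *p) {
    emu_arg *a = (emu_arg *)p;
    a->out[a->thread_idx] = a->thread_idx * a->thread_idx;
    a->order[(*a->next_slot)++] = a->thread_idx;
    return NULL;
}

/* Launch n emulated threads. Each one gets its own host thread, but the
   host thread is joined right away, so emulated thread i has finished
   before emulated thread i + 1 is created.
   Returns 0 on success, or the pthread error code on failure. */
int emu_launch(int n, int *out, int *order) {
    assert(n >= 0 && n <= MAX_THREADS);
    int next_slot = 0;
    for (int i = 0; i < n; ++i) {
        emu_arg arg = { i, out, order, &next_slot };
        pthread_t t;
        int rc = pthread_create(&t, NULL, emu_kernel, &arg);
        if (rc != 0) return rc;
        rc = pthread_join(t, NULL);  /* wait: run to completion */
        if (rc != 0) return rc;
    }
    return 0;
}
```

Calling `emu_launch(4, out, order)` from a `main` (build with `-pthread`) fills `out` with the squares and `order` with 0, 1, 2, 3, because no two host threads are ever alive at the same time. Note that a kernel needing a barrier between threads could not run under this simplified scheme, since each thread finishes before the next one starts.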