Device Emu on Quad Core CPU: How can I use all 4 cores?

When I use DebugEmu or ReleaseEmu, I can see the thread count in Windows Task Manager spike to the number of threads I have per block in my program. However, it appears I am only using one of the four cores on my CPU, because total CPU usage only goes up to 25%. Is there a way to get DeviceEmu mode to run on all the cores my CPU has to offer?

Thanks for any help.

Well, why do you need that? DebugEmu is there to make debugging easy, and debugging multithreaded code is not that easy.

As BarsMonster says: the emulation modes are designed for running kernels in the debugger. Performance in these modes is very slow.

If you want performance-oriented CPU execution of a CUDA kernel, read up on MCUDA, or wait for NVIDIA's promised release of a CUDA version with this as a compile-time option (no timeframe was mentioned for release; CUDA 2.1 if we are lucky, but maybe not till 3.0).
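In the meantime, for simple kernels you can get all four cores busy by hand. Below is a rough C++ sketch of the general idea: spread the blocks across host threads and run the threads of each block as an ordinary loop. It is not MCUDA's output and not anything NVIDIA ships. The names saxpy_body and launch_on_cpu are made up for illustration, and it assumes a 1-D grid and a kernel that never calls __syncthreads().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a CUDA kernel body: one call per logical thread.
// block_idx / thread_idx / block_dim mirror blockIdx.x / threadIdx.x / blockDim.x.
void saxpy_body(std::size_t block_idx, std::size_t thread_idx, std::size_t block_dim,
                float a, const float* x, float* y, std::size_t n) {
    std::size_t i = block_idx * block_dim + thread_idx;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Run a 1-D grid on the host. Blocks are split across worker threads, and the
// logical threads inside each block are executed one after another in a loop.
// Assumption: the kernel has no barriers, so serializing a block is safe.
void launch_on_cpu(std::size_t grid_dim, std::size_t block_dim,
                   float a, const float* x, float* y, std::size_t n) {
    assert(block_dim > 0);
    unsigned hw = std::thread::hardware_concurrency();  // may report 0 if unknown
    std::size_t workers = std::max<std::size_t>(1, hw);
    workers = std::min(workers, std::max<std::size_t>(1, grid_dim));

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([=] {
            // Worker w handles blocks w, w + workers, w + 2 * workers, ...
            for (std::size_t b = w; b < grid_dim; b += workers)
                for (std::size_t t = 0; t < block_dim; ++t)
                    saxpy_body(b, t, block_dim, a, x, y, n);
        });
    }
    for (auto& th : pool) th.join();
}
```

Calling launch_on_cpu(8, 128, a, x, y, n) plays the role of a kernel launch with 8 blocks of 128 threads. Kernels that use shared memory and barriers need more work than this (the loop over a block has to be split at each barrier), which is the kind of thing a real translator has to take care of.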

HOSTEMU does not currently have SMP support.