When I run the OpenCL Nbody demo, it fully utilizes one core of my Core 2 Quad machine. Why does a GPU application take up CPU resources?
This is because Nbody uses alternating memory buffers that are updated with blocking enabled. It constantly copies data to and from the device in order to integrate and draw.
A blocking copy makes the issuing thread wait until the transfer completes, and that wait is a spinlock (busy) wait, not a sleep wait, so the thread keeps one core fully busy while it waits.
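
To see why the kind of wait matters, here is a minimal sketch in plain Python. It is not OpenCL code and not taken from the Nbody demo: `fake_transfer`, `cpu_time_while_waiting`, and the 0.2 s `TRANSFER_SECONDS` are made-up names and values that stand in for a device copy. It only compares how much CPU time the process uses while a thread spins on a completion flag versus sleeping until it is signalled.

```python
import threading
import time

TRANSFER_SECONDS = 0.2  # hypothetical duration of one device transfer


def fake_transfer(done: threading.Event) -> None:
    """Stand-in for the device copy: sleeps, then signals completion."""
    time.sleep(TRANSFER_SECONDS)
    done.set()


def cpu_time_while_waiting(spin: bool) -> float:
    """Return the CPU seconds the process used while waiting for one transfer."""
    done = threading.Event()
    worker = threading.Thread(target=fake_transfer, args=(done,))
    start = time.process_time()  # process-wide CPU time, excludes sleeping
    worker.start()
    if spin:
        while not done.is_set():  # spin wait: polls the flag in a tight loop
            pass
    else:
        done.wait()  # sleep wait: the thread is parked until signalled
    worker.join()
    return time.process_time() - start


if __name__ == "__main__":
    print(f"spin wait : {cpu_time_while_waiting(spin=True):.3f} s CPU")
    print(f"sleep wait: {cpu_time_while_waiting(spin=False):.3f} s CPU")
```

The spin wait should report CPU time close to the full transfer time, while the sleep wait should report close to zero. In OpenCL itself, the usual way to avoid the blocking behaviour is to pass `CL_FALSE` as the blocking flag of the enqueue call and wait on the returned event instead, although whether that wait sleeps or spins may still depend on the driver.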