I have a graphics loop that uses OpenCL on the GPU for processing and OpenGL interop to display the result. Without any sleep it runs at around 400-500 FPS. In that case both my main thread (the one that enqueues all the tasks) and the nvopencl.dll thread (which seems to spend all its time in nvopencl.dll!clGetPlatformInfo) each use about a full CPU core, which is unsurprising.
What is surprising is that when I add a sleep in the main thread to bring the frame rate down from ~500 FPS to ~25 FPS, the main thread's usage goes down proportionally, but the nvopencl.dll thread keeps taking up 100% of a CPU core! Only below about 23 FPS does this thread's usage start to go down in proportion to any further increase in sleeping.
I can’t make sense of it: how can doing 1/20th of the work keep that thread running at 100%? Is there a busy-wait somewhere in there? Here’s the relevant code, if that helps:
while(1)
{
    // enqueues the OpenCL kernel stuff
    glClearTexImage(fb->gltex, 0, GL_RGBA, GL_UNSIGNED_BYTE, &z);   // clears GL/CL texture
    glFinish();
    clEnqueueAcquireGLObjects(command_queue, 1, &fb->cl_srgb, 0, 0, NULL);   // get the ownership of cl_srgb
    clEnqueueNDRangeKernel(command_queue, ...);
    clFlush(command_queue);
    ... // Host-side kernel data-preparing stuff goes on here
    clFinish(command_queue);
    clEnqueueReleaseGLObjects(command_queue, 1, &fb->cl_srgb, 0, 0, NULL);   // release the ownership (back to GL)

    // display the texture filled by OpenCL to the screen
    glBegin(GL_QUADS);
    glTexCoord2f(0.f, 0.f); glVertex2f(0.f, 0.f);
    glTexCoord2f(1.f, 0.f); glVertex2f(fb->w, 0.f);
    glTexCoord2f(1.f, 1.f); glVertex2f(fb->w, fb->h);
    glTexCoord2f(0.f, 1.f); glVertex2f(0.f, fb->h);
    glEnd();

    SDL_Delay(40);   // this brings down the main thread's usage but not nvopencl.dll's
    SDL_GL_SwapWindow(window);
    glFinish();
}
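
One way to test the busy-wait suspicion would be to swap the blocking clFinish() for a wait that polls for completion and sleeps between checks. The following is only a minimal sketch of that polling pattern, not something taken from the loop above: wait_with_sleep, poll_fn, sleep_ms and fake_done are hypothetical names, and fake_done merely stands in for the GPU so that the sketch compiles without OpenCL.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 199309L   /* for nanosleep() */
#endif
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
static void sleep_ms(unsigned ms) { Sleep(ms); }
#else
#include <time.h>
static void sleep_ms(unsigned ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}
#endif

/* Hypothetical callback type: returns true once the awaited work is complete. */
typedef bool (*poll_fn)(void *ctx);

/* Checks is_done() up to max_polls times, sleeping interval_ms after each
 * unsuccessful check. Returns the number of sleeps taken before completion
 * was observed, or -1 if the work never completed within max_polls checks. */
static int wait_with_sleep(poll_fn is_done, void *ctx,
                           unsigned interval_ms, int max_polls)
{
    for (int polls = 0; polls < max_polls; polls++) {
        if (is_done(ctx))
            return polls;
        sleep_ms(interval_ms);
    }
    return -1;
}

/* Stand-in for the GPU (an assumption for illustration only): reports
 * completion once the counter pointed to by ctx has reached zero. */
static bool fake_done(void *ctx)
{
    int *remaining = ctx;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

In the real loop, the callback would presumably query an event returned through the last argument of clEnqueueNDRangeKernel, using clGetEventInfo with CL_EVENT_COMMAND_EXECUTION_STATUS and comparing the result against CL_COMPLETE. Whether that actually changes what the nvopencl.dll thread does is exactly what I don't know.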