OpenCL auto-switch between CPU/GPU?

I see that the NVIDIA OpenCL driver doesn’t support CPU kernels, yet my reading of the OpenCL spec is that kernels can run on the CPU, the GPU, or a mix of the two. It was my understanding that I could write an OpenCL program and it would run on the GPU if one is present, fall back automatically to the CPU if necessary, and use both if both are present.

Is this a misunderstanding on my part, or will this be possible once some kind of CPU backend comes out? And who’s going to provide that backend? It seems to me that if this isn’t explicitly supported by the NVIDIA driver, mixing GPU and CPU really isn’t possible.

True heterogeneous computing is, unfortunately, NOT supported by OpenCL 1.0.

The spec has no provisions for it.
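Since the runtime won’t switch devices for you, the host program has to make that choice itself. Here is a minimal sketch of the “GPU if present, otherwise CPU” policy. It only models the decision: `Device` and `pick_device` are hypothetical names, not part of OpenCL. With the real C API the equivalent would be calling `clGetDeviceIDs` with `CL_DEVICE_TYPE_GPU` and retrying with `CL_DEVICE_TYPE_CPU` when it returns `CL_DEVICE_NOT_FOUND`, which only helps if the installed platform exposes a CPU device at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a device reported by the platform."""
    name: str
    kind: str  # "GPU" or "CPU"


def pick_device(available, preference=("GPU", "CPU")):
    """Return the first available device, trying each kind in preference order."""
    for kind in preference:
        for dev in available:
            if dev.kind == kind:
                return dev
    raise RuntimeError("no usable compute device found")


# Assumed example machines: one with a GPU, one with only a CPU.
gpu_box = [Device("host CPU", "CPU"), Device("discrete GPU", "GPU")]
cpu_box = [Device("host CPU", "CPU")]

print(pick_device(gpu_box).name)  # the GPU wins when it is present
print(pick_device(cpu_box).name)  # otherwise fall back to the CPU
```

Note that this picks one device per run; using both at once needs the application to split the work as well.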

The OpenCL folks are now aware of it. You may check the OpenCL forums (topic: True Heterogeneous computing).

Fact is, the current OpenCL 1.0 implementation is not mature and should only be considered a test-bed for future porting of CUDA applications rather than a real-world platform.

There are too many shortcomings and missing features to consider it seriously at this point.
There are also design flaws that may keep this EXCELLENT idea from ever becoming mainstream. You’d be better off focusing on CUDA for now and considering a switch to OpenCL if it does gain acceptance.

Anyway, to unleash OpenCL’s potential once a good implementation is available, you will need to:

  • code for nVidia GPUs (at the same difficulty level as CUDA); you will have to know CUDA and the underlying hardware
  • code for ATI GPUs (with the same problems as CTM); you will have to know CTM and the underlying ATI hardware
    (as ATI’s documentation and development tools lag far behind nVidia’s, I doubt it will be easy, not to mention ATI GPUs that weren’t designed for GPGPU)
  • code for your x86 platform, including SSE to get the maximum power out of the CPU
  • add a good layer of OpenCL around all that to enable the use of a heterogeneous platform!
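That last step, spreading one job over several devices, is the part the application has to do by hand. A rough sketch of one way to do it is to split the index range in proportion to each device’s measured speed. The function name `split_work` and the throughput numbers are assumptions for illustration; real weights would come from a warm-up run on your own hardware.

```python
def split_work(total_items, throughput):
    """Split total_items across devices in proportion to relative throughput.

    throughput maps a device name to a positive weight (for example, items
    per second from a warm-up run). Returns name -> (start, end) half-open
    ranges that together cover range(total_items) exactly once.
    """
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    if not throughput or any(w <= 0 for w in throughput.values()):
        raise ValueError("throughput needs at least one positive weight")

    total_weight = sum(throughput.values())
    names = list(throughput)
    ranges = {}
    start = 0
    cumulative = 0.0
    for i, name in enumerate(names):
        cumulative += throughput[name]
        if i == len(names) - 1:
            end = total_items  # last device absorbs any rounding remainder
        else:
            end = round(total_items * cumulative / total_weight)
        ranges[name] = (start, end)
        start = end
    return ranges


# Assumed weights: the GPU is taken to be 9x faster than the CPU here.
shares = split_work(1000, {"gpu": 9.0, "cpu": 1.0})
print(shares)  # {'gpu': (0, 900), 'cpu': (900, 1000)}
```

A static split like this only works well when the weights stay accurate; if other programs compete for a device, handing out smaller chunks on demand is usually the safer design.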

My point is that you’d be better off focusing on nVidia’s CUDA technology until OpenCL is usable for real-world applications with the same features and level of performance…

Personally I think OpenCL is on the right track. I think it’s better to leave “true heterogeneous computing” to the domain-specific application. The computing needs of your chess engine are far different from those of a restaurant’s cash register app. One thing those and games share is that calculations must be done quickly to avoid interfering with the user experience.

There are also applications like Folding@Home, which can easily use every cycle of CPU time you give it, with no end in sight (there are always more work units to compute). But it’s designed to give way to any other application with a higher priority. You can even play processor-hungry games with Folding@Home running in the background and you won’t notice a difference (or it will be barely noticeable).

It’s nice to think that users will run your application exclusively (and not have 18 other windows open at the same time), but it’s important to code the application so that it keeps functioning in a CPU/GPU-time-starved environment.
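One simple way to stay well-behaved when starved of time is to do the work in small slices and keep the progress in plain state, so the job can be paused and resumed at any point. This is only a sketch: `run_slice`, the 5 ms budget, and the sum-of-squares workload are placeholders, and lowering the process priority at the OS level is a separate step not shown here.

```python
import time


def run_slice(state, work_items, budget_seconds, clock=time.monotonic):
    """Process items from state["next"] onward until the time budget runs out.

    At least one item is processed per call, so progress is guaranteed even
    with a tiny budget. Returns True once every item has been processed.
    """
    deadline = clock() + budget_seconds
    while state["next"] < len(work_items):
        item = work_items[state["next"]]
        state["total"] += item * item  # stand-in for one real work unit
        state["next"] += 1
        if clock() >= deadline:
            break  # budget spent: hand control back to the caller
    return state["next"] >= len(work_items)


state = {"next": 0, "total": 0}
items = list(range(1000))
while not run_slice(state, items, budget_seconds=0.005):
    time.sleep(0)  # give other processes a chance to run between slices

print(state["total"])  # 332833500
```

The final result does not depend on how many slices it took, which is the property you want when you cannot predict how much time the machine will give you.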