printf in OpenCL?


I would like to know whether OpenCL supports printf (CUDA uses cuPrintf). If not, how do you debug an OpenCL kernel?


You can check out Appendix B of the Khronos OpenCL spec.
You will notice it points you either towards OpenCL on CPUs, where you can use printf, or towards extensions.
To me, the idea of printf in a “massively parallel” platform seems, well, unlikely to produce the information you need to repair any but fairly elementary mistakes.
Debugging has been announced for the Nexus program (integration with the VC9 workflow). There are some videos showing how it is done, but most people are still waiting to get their hands on it. Very promising!
In the meantime, you can define one or more arrays passed to your kernel for output (writing) to store intermediate values, but as a rule, you can’t tell which thread is responsible for the traced value, unless you put work into that as well.
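The debug-array idea can be sketched in plain C so that it runs without an OpenCL device. Everything here is an assumption made for illustration: `scale_and_offset` is a hypothetical kernel body, and the loop in `run_all` stands in for the work-items. In a real kernel, `gid` would come from `get_global_id(0)` and `debug` would be an extra `__global float *` argument. Indexing the debug array by `gid` is one way to keep track of which thread wrote which value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel body: out[gid] = in[gid] * 2 + 1.
 * debug[gid] receives the intermediate product, so slot i always
 * belongs to work-item i. */
static void scale_and_offset(size_t gid, const float *in, float *out, float *debug)
{
    float scaled = in[gid] * 2.0f;   /* intermediate value we want to inspect */
    debug[gid] = scaled;             /* trace slot owned by this work-item */
    out[gid] = scaled + 1.0f;
}

/* Stand-in for the kernel launch: one loop iteration per work-item. */
static void run_all(size_t n, const float *in, float *out, float *debug)
{
    assert(in && out && debug);
    for (size_t gid = 0; gid < n; ++gid)
        scale_and_offset(gid, in, out, debug);
}
```

After the run, the host can read `debug` back and inspect any slot, knowing which work-item it came from. The cost is one extra buffer of the same size as the output.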
What I found useful was to reprogram a kernel for the (serial) CPU, bending over backwards to work out the consequences of doing things in parallel. It helped me in two ways: first, it proved the essential correctness of what I was trying to do; second, and perhaps more importantly, it forced me to think in parallel, even though I am really most serially inclined (only good things, I hope).
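As an illustration of rewriting a kernel for a serial CPU, here is a rough sketch that emulates a work-group sum reduction. The function name `emulate_group_sum` and the power-of-two size restriction are assumptions for this example, not something taken from a particular kernel. The outer loop walks through the phases that barriers would separate on the device, and the inner loop plays the work-items active in each phase.

```c
#include <assert.h>
#include <stddef.h>

/* Serial emulation of a work-group tree reduction (sum).
 * Assumption: n is a power of two.
 * Each pass of the outer loop corresponds to the code between two
 * barriers; the inner loop plays the work-items active in that pass. */
static float emulate_group_sum(float *scratch, size_t n)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    for (size_t stride = n / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; ++lid)
            scratch[lid] += scratch[lid + stride];
        /* The end of the inner loop is where the device code would
         * need barrier(CLK_LOCAL_MEM_FENCE). */
    }
    return scratch[0];
}
```

Within one phase, every work-item reads a slot that no other work-item writes in that phase, which is why the serial order gives the same answer as the parallel one. Writing the loop this way makes it obvious where a missing barrier would break the real kernel.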
Lastly I offer you a phrase of another user of this forum, as I remember it:
debugging sucks, testing rocks.
Good luck,

Hi there,
Did you resolve this issue? The Apple OpenCL programming guide states that printf can be used, but it does not explicitly state that it is CPU only.

Debugging statements are useful. Certainly for getting up to speed with OpenCL using basic examples.

I believe most experienced developers are capable of judging for themselves what approach works best at any given moment.

Kind regards,

Two methods that I’ve found that work for me:

  1. If you have a CPU-based implementation of the code, you can run both and compare results… It’s not fun, but it does the trick.

  2. AMD’s Stream CPU implementation supports the cl_amd_printf extension, which lets you use printf in your kernels.
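For the first method, comparing against a CPU implementation, a small helper along these lines may be enough. It is only a sketch: the name `first_mismatch` and the absolute tolerance are assumptions, and `dev` is taken to be the result buffer already read back from the device. A tolerance is used because floating-point results from CPU and GPU rarely match bit for bit.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element where the device result differs
 * from the CPU reference by more than tol, or n if all elements agree. */
static size_t first_mismatch(const float *ref, const float *dev, size_t n, float tol)
{
    assert(tol >= 0.0f);
    for (size_t i = 0; i < n; ++i) {
        float d = ref[i] - dev[i];
        if (d < 0.0f)
            d = -d;                  /* absolute difference */
        if (!(d <= tol))             /* written this way so NaN is flagged too */
            return i;
    }
    return n;
}
```

Knowing the first index that disagrees often points straight at the work-item or work-group boundary where the kernel goes wrong.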

And another that I haven’t done myself:

  1. If you’re running on Windows with Visual Studio, check out NVIDIA’s Parallel Nsight software for VS-based debugging of GPU code.

As a previous poster mentioned, printf calls might be tricky. The output will probably appear in arbitrary order as the threads hit the printf statements, so you will probably want to include the thread ID in your print statements and sort the output to make it easier to read.
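The sorting step could look roughly like this on the host, assuming each printed line started with the value of `get_global_id(0)` and has already been parsed into records. The `trace_rec` layout and the function names are made up for this sketch. The arrival order is kept as a tie-breaker because `qsort` is not guaranteed to be stable.

```c
#include <assert.h>
#include <stdlib.h>

/* One captured trace line: which work-item wrote it and what it reported.
 * The layout is an assumption for this sketch. */
struct trace_rec {
    unsigned gid;    /* global ID the kernel included in its output */
    unsigned seq;    /* arrival order, keeps ties in their original order */
    float value;
};

/* Order by global ID first, then by arrival order. */
static int by_gid_then_seq(const void *a, const void *b)
{
    const struct trace_rec *x = a;
    const struct trace_rec *y = b;
    if (x->gid != y->gid)
        return (x->gid > y->gid) - (x->gid < y->gid);
    return (x->seq > y->seq) - (x->seq < y->seq);
}

static void sort_trace(struct trace_rec *recs, size_t n)
{
    assert(recs != NULL);
    qsort(recs, n, sizeof recs[0], by_gid_then_seq);
}
```

Once sorted, all lines from one thread sit together in the order that thread produced them, which is usually what you want when following a single work-item through the kernel.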