Kernel fails with non-immediate loop count

Hi!

The attached Mandelbrot program fails with error -5 (CL_OUT_OF_RESOURCES) if the maximum iteration count “params.imax” is set to 20 or more. With an immediate value (e.g., 1000) as the termination criterion, on the other hand, the kernel works perfectly (set “#define WORKAROUND 1” to demonstrate this). This looks to me like a bug in the OpenCL implementation, or am I missing something? Is it somehow related to loop unrolling, and can unrolling be controlled in OpenCL (as with “#pragma unroll” in CUDA)?
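For anyone who does not want to open the archive, the loop in question has roughly the following shape. This is only a minimal sketch in plain C, not the attached kernel: “params.imax” and “WORKAROUND” are the names used above, while “params_t”, “IMAX_IMMEDIATE” and “mandel_iter” are hypothetical names made up for illustration.

```c
#define WORKAROUND 0          /* 1: use the immediate loop bound instead */
#define IMAX_IMMEDIATE 1000   /* hypothetical name for the immediate value */

/* Hypothetical stand-in for the kernel's parameter struct. */
typedef struct { int imax; } params_t;

/* Escape-time iteration for one point c = (cr, ci).
   Returns the number of iterations completed before |z|^2 exceeds 4,
   or the loop bound if the point never escapes. */
static int mandel_iter(float cr, float ci, params_t params)
{
#if WORKAROUND
    const int imax = IMAX_IMMEDIATE;   /* bound known at compile time */
#else
    const int imax = params.imax;      /* bound known only at run time */
#endif
    float zr = 0.0f, zi = 0.0f;
    int i;
    for (i = 0; i < imax; ++i) {
        float t = zr * zr - zi * zi + cr;   /* real part of z^2 + c */
        zi = 2.0f * zr * zi + ci;           /* imaginary part of z^2 + c */
        zr = t;
        if (zr * zr + zi * zi > 4.0f)
            break;
    }
    return i;
}
```

The only difference between the two variants is whether the compiler can see the loop bound, which is why I suspect unrolling.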

Thanks & kind regards,
Markus
mandelbrot.zip (1.38 KB)

I would assert it’s a problem with the NVIDIA OpenCL implementation. This app compiles and runs fine for me (though the image is all black).