Error at clEnqueueReadBuffer

Hi,

I receive an error after calling clEnqueueReadBuffer(). The error code isn’t defined in the OpenCL specification, so I don’t know what it is and how to handle it. If I reduce the number of instructions in the kernel, the error doesn’t appear. I do a lot of round(), ceil() and floor() calls in my kernel. If I copy one of those calls a few times, the error occurs again. Any suggestions where this error comes from? Maybe it’s some kind of instruction limit, but that wouldn’t make much sense, since CUDA has no such limitation.

Thanks in advance.
Daniel

You can try replacing the round and floor functions with a cast; then you will see whether it’s a limitation.
Thanks
J
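
The cast replacement suggested above can be sketched as follows. This is only an illustration in plain C; the helper names (trunc_by_cast, floor_by_cast, round_by_cast) are made up for this example and do not come from the thread or from OpenCL. The same expressions should also be valid inside an OpenCL C kernel. Note that a bare cast truncates toward zero, so it matches floor() only for non-negative inputs.

```c
/* Hypothetical helpers: cast-based stand-ins for floor() and round().
 * Assumption: inputs are finite and fit in the range of int. */

/* A bare cast drops the fraction, i.e. it rounds toward zero. */
int trunc_by_cast(float x)
{
    return (int)x;
}

/* floor() via cast: step down by one when truncation rounded upward,
 * which happens for negative non-integers. */
int floor_by_cast(float x)
{
    int i = (int)x;
    return (x < (float)i) ? i - 1 : i;
}

/* round() via cast: add half with the sign of x, then truncate,
 * so halfway cases go away from zero. */
int round_by_cast(float x)
{
    return (int)(x + (x >= 0.0f ? 0.5f : -0.5f));
}
```

For example, trunc_by_cast(-1.5f) yields -1 while floor_by_cast(-1.5f) yields -2. The add-half trick in round_by_cast may differ from round() for inputs immediately below a halfway point (such as the largest float below 0.5), because the addition itself rounds.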

Ah, ok… I fixed it: I’m now using rint() instead of round(), because round() uses 8 instructions whereas rint() takes only 1. But why can’t I use more than 3 or 4 round() calls in one kernel? How do you implement more complex functions in an OpenCL kernel without exceeding OpenCL’s instruction limit?