clEnqueueReadBuffer and a still running kernel

If I call clEnqueueReadBuffer in non-blocking mode on a buffer that a still-running kernel is writing to, will the read command wait until the kernel has finished? I hope this is the case, but the OpenCL spec doesn’t really state this explicitly. I ask because I’m getting weird results from my kernel and I wonder whether it’s due to an incomplete copy.

It’s not guaranteed to wait for kernel completion and likely won’t. You need to explicitly synchronize or pass an event object around.

Is this synchronization required even with in-order queues?

Hm, the specs say

Which means with in-order queues you shouldn’t need to manually sync.

Looking at the sources from the SDK, they don’t manually synchronize before calling clEnqueueReadBuffer after a kernel launch, so with in-order execution there should be no need to.

Z98, are you using a single, in-order command queue?

I believe so, since I never set that flag, nor was I aware of its existence. I’ve only made it through maybe 15% of the spec in detail, and my searches through the PDF never turned up anything like that. Fortunately, the rest of my team and I have come up with a different theory for why the results are borked.