Event synchronization in a multi-device context

I am working on Windows 7 with driver 197.45.

I notice that when I try to synchronize two command queues that belong to different devices, the synchronization blocks the CPU even if the ‘blocking_write’/‘blocking_read’ flag is explicitly set to ‘CL_FALSE’.

I created a multi-device context and called ‘clEnqueueWriteBuffer’ with the ‘blocking_write’ parameter set to ‘CL_FALSE’. The command also depended on an event generated by a call to ‘clEnqueueReadBuffer’ on a command queue associated with a different GPU.

This is a major performance bottleneck, and it took me a week to figure out!
I hereby call on the OpenCL team to fix it at once!

Tzachi