I was wondering whether concurrent kernel execution in OpenCL works on the GTX480.
I create multiple command queues and put kernels on different queues.
Is there an example case? I made my own example in a similar fashion to the “concurrent kernel” example of the CUDA SDK.
I read out the start and end times of the OpenCL events, and there is no overlap between the kernels.
I don’t insert any synchronization between the clEnqueueNDRangeKernel calls.
I even tried with separate buffers used by each kernel, but still no luck.
Should the OpenCL events interface show the overlap?
I saw an older post but there doesn’t seem to be any consensus there.
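
To make the overlap check described above concrete, here is a minimal sketch of how such a test could look. It assumes the start and end timestamps of each kernel event have already been read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, which are reported in nanoseconds and require queues created with CL_QUEUE_PROFILING_ENABLE). The names `timestamp_ns`, `intervals_overlap` and `count_overlapping_pairs` are made up for illustration and are not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cl_ulong so this sketch compiles without the OpenCL headers. */
typedef uint64_t timestamp_ns;

/* Two half-open intervals [s0, e0) and [s1, e1) overlap exactly when
 * each one starts before the other one ends. Returns 1 or 0. */
int intervals_overlap(timestamp_ns s0, timestamp_ns e0,
                      timestamp_ns s1, timestamp_ns e1)
{
    return s0 < e1 && s1 < e0;
}

/* Counts how many pairs of events overlap in time.
 * start[i] and end[i] are assumed to hold the profiling timestamps
 * of the i-th kernel event. */
size_t count_overlapping_pairs(const timestamp_ns *start,
                               const timestamp_ns *end,
                               size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Sanity check: an event cannot end before it starts. */
        assert(start[i] <= end[i]);
        for (size_t j = i + 1; j < n; ++j) {
            if (intervals_overlap(start[i], end[i], start[j], end[j]))
                ++count;
        }
    }
    return count;
}
```

A result of 0 from `count_overlapping_pairs` corresponds to what I am seeing: every kernel's start time is at or after the previous kernel's end time.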