Does OpenCL's user event have any alternatives in CUDA?

So that I can submit some work that will wait until the work on the host side is done?

I found that there seems to be no way to signal a CUDA event from the host side. If that is so, I will have to submit the GPU-side work only after the host-side work it depends on is done, and I will end up having to submit everything that depends on that GPU work after that as well.

This is awkward. Did I miss something?

Do any of these help?

cudaEventCreate
cudaEventCreateWithFlags
cudaEventDestroy
cudaEventElapsedTime
cudaEventQuery
cudaEventRecord
cudaEventSynchronize

http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__EVENT.html#axzz4g8DoguGe

No :( cudaEventRecord can only signal from the GPU side.

You can just run the host code and launch a CUDA kernel after that, which I think does what you want. For example (a minimal sketch with the runtime API; doHostWork and dependentKernel are just placeholders for your host-side work and the GPU work that depends on it):
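
    // Minimal sketch of the "finish the host work first, then submit" approach.
    // doHostWork and dependentKernel are hypothetical placeholders.
    #include <cuda_runtime.h>

    __global__ void dependentKernel(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = 2.0f * in[i];
    }

    static void doHostWork(float *buf, int n)   // the host-side work
    {
        for (int i = 0; i < n; ++i) buf[i] = (float)i;
    }

    int main(void)
    {
        const int n = 1 << 20;
        float *hostBuf, *dIn, *dOut;
        cudaMallocHost((void **)&hostBuf, n * sizeof(float));
        cudaMalloc((void **)&dIn,  n * sizeof(float));
        cudaMalloc((void **)&dOut, n * sizeof(float));

        cudaStream_t stream;
        cudaStreamCreate(&stream);

        // The host work has to finish before the dependent GPU work
        // can even be enqueued...
        doHostWork(hostBuf, n);

        // ...and only then is the dependent GPU work submitted.
        cudaMemcpyAsync(dIn, hostBuf, n * sizeof(float),
                        cudaMemcpyHostToDevice, stream);
        dependentKernel<<<(n + 255) / 256, 256, 0, stream>>>(dIn, dOut, n);
        cudaStreamSynchronize(stream);

        cudaFree(dOut); cudaFree(dIn); cudaFreeHost(hostBuf);
        cudaStreamDestroy(stream);
        return 0;
    }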

Actually, what I want is to keep my old architecture if possible. It seems that it would be much harder to pipeline the entire job this way, and it would make the logic much more complex…

I think what you want is cuStreamWaitValue32 and cuStreamWriteValue32, the API that was developed for GPUDirect Async. Roughly like this (just a sketch with the driver API, most error checking omitted; it assumes the device and driver support stream memory operations, which on some driver versions also have to be enabled with the NVreg_EnableStreamMemOPs module option, as far as I know):
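
    // Rough sketch: a pinned host flag plays the role of the OpenCL user event.
    #include <cuda.h>

    int main(void)
    {
        CUdevice dev;
        CUcontext ctx;
        CUstream stream;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);
        cuStreamCreate(&stream, CU_STREAM_NON_BLOCKING);

        // Pinned, device-mapped host memory holding the "event status" flag.
        volatile unsigned int *hostFlag;
        CUdeviceptr devFlag;
        cuMemHostAlloc((void **)&hostFlag, sizeof(*hostFlag),
                       CU_MEMHOSTALLOC_DEVICEMAP);
        *hostFlag = 0;
        cuMemHostGetDevicePointer(&devFlag, (void *)hostFlag, 0);

        // Everything enqueued on the stream after this call stalls until the
        // flag becomes 1 -- the equivalent of waiting on the user event.
        cuStreamWaitValue32(stream, devFlag, 1, CU_STREAM_WAIT_VALUE_EQ);

        // ... enqueue the kernels / copies that depend on the host work ...

        // Host does its work, then "completes the event" by setting the flag.
        *hostFlag = 1;

        cuStreamSynchronize(stream);

        cuMemFreeHost((void *)hostFlag);
        cuStreamDestroy(stream);
        cuCtxDestroy(ctx);
        return 0;
    }

The host releases the wait just by storing 1 into the pinned flag, much like clSetUserEventStatus; cuStreamWriteValue32 lets another stream (or, with GPUDirect Async, a third-party device such as a NIC) do the same.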