After creating a stream, we can launch operations such as cudaMemcpyAsync or kernels into it. I want to know how many operations are in the stream at any given time. Is there a CUDA runtime (cudart) or driver API to do that? If CDP is used, does a similar API exist? Thanks!
Nothing is provided by CUDA to do that.
You could record events in the stream and query them to find out how far stream processing has progressed.
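
To make the event idea concrete, here is a minimal sketch. `TrackedStream`, `mark()` and `pending()` are hypothetical names made up for illustration, not part of CUDA; only the `cudaEvent*` calls are real runtime API. It assumes you call `mark()` after every operation you enqueue, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <deque>

// Hypothetical helper (not a CUDA type): records one event per enqueued
// operation so the number of unfinished operations can be estimated.
struct TrackedStream {
    cudaStream_t stream = nullptr;
    std::deque<cudaEvent_t> markers;  // one event per operation still in flight

    // Call right after enqueueing an operation into `stream`.
    void mark() {
        cudaEvent_t ev;
        cudaEventCreateWithFlags(&ev, cudaEventDisableTiming);
        cudaEventRecord(ev, stream);
        markers.push_back(ev);
    }

    // Number of marked operations that have not completed yet.
    // Events recorded in one stream complete in order, so it is enough
    // to retire finished events from the front of the queue.
    int pending() {
        while (!markers.empty() &&
               cudaEventQuery(markers.front()) == cudaSuccess) {
            cudaEventDestroy(markers.front());
            markers.pop_front();
        }
        return static_cast<int>(markers.size());
    }
};
```

Usage would look like `cudaMemcpyAsync(..., ts.stream); ts.mark();` followed later by `ts.pending()`. The count is only a snapshot: it can already be stale by the time the host reads it.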
You can of course also use cudaStreamQuery(). That’s a binary finished/not finished indication.
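
For reference, a small sketch of that check. `streamIsIdle` is a hypothetical wrapper name; `cudaStreamQuery` returns `cudaSuccess` when all work enqueued so far has finished and `cudaErrorNotReady` while work is still pending.

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper: true if everything enqueued in `stream` so far
// has completed, false if work is still pending or the query failed.
bool streamIsIdle(cudaStream_t stream) {
    return cudaStreamQuery(stream) == cudaSuccess;
}
```

This tells you only whether the stream is empty, not how many operations remain.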
Thanks! I’ll try that.
Also, cudaStreamAddCallback with a callback that just increments a counter (atomic or not?) could be a solution.
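
A rough sketch of the callback approach follows. `PendingCounter`, `onOpDone` and `trackOp` are hypothetical names for illustration. On the "atomic or not" point: the callback runs on a host thread managed by the runtime, not on the thread that enqueues the work, so this sketch uses `std::atomic` to be safe. Here the counter goes up on enqueue and down in the callback, so it holds the number of tracked operations still in flight.

```cuda
#include <cuda_runtime.h>
#include <atomic>

// Hypothetical counter of operations enqueued but not yet finished.
struct PendingCounter {
    std::atomic<int> pending{0};
};

// Invoked by the runtime once all work enqueued before it has completed.
// CUDA API calls must not be made from inside a stream callback.
static void CUDART_CB onOpDone(cudaStream_t, cudaError_t, void *userData) {
    static_cast<PendingCounter *>(userData)->pending.fetch_sub(1);
}

// Call right after enqueueing one operation into `stream`.
inline cudaError_t trackOp(cudaStream_t stream, PendingCounter *c) {
    c->pending.fetch_add(1);
    // The last argument (flags) is reserved and must be 0.
    cudaError_t err = cudaStreamAddCallback(stream, onOpDone, c, 0);
    if (err != cudaSuccess) c->pending.fetch_sub(1);  // undo if not enqueued
    return err;
}
```

Keep in mind that a callback blocks later work in the same stream until it returns, so one callback per operation may cost some throughput. Newer toolkits also offer `cudaLaunchHostFunc`, which could be used in much the same way.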