I am wondering which API can monitor streams.
For example, a CPU thread transfers data with an asynchronous memcpy (i.e., cudaMemcpyAsync)
and then launches a kernel.
The problem is that the CPU thread has to wait until the kernel is finished.
I want to know the callback API for monitoring.
Which API can invoke a callback when the kernel is finished?
I found this slide. Is the explanation on page 10 right?
The slide confuses me. Monitoring is the right idea, but I think it only covers monitoring events.
I need a monitoring API for the work queued in a stream.
Thank you for reading and helping me.
Have an awesome day guys.
Sounds like you’re looking for cudaStreamSynchronize.
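As a rough illustration of that suggestion, here is a minimal sketch of the pattern from the question (asynchronous copy, kernel, copy back) followed by cudaStreamSynchronize. The kernel `scale`, the helper `run_sync_demo`, and the buffer size are hypothetical names and values chosen for this example, not anything from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles every element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Returns 0 on success, non-zero on any failure.
int run_sync_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h = nullptr, *d = nullptr;
    cudaStream_t stream;

    // Pinned host memory, so cudaMemcpyAsync can really be asynchronous.
    if (cudaMallocHost((void **)&h, bytes) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d, bytes) != cudaSuccess) return -1;
    if (cudaStreamCreate(&stream) != cudaSuccess) return -1;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    // All three operations are queued in the same stream and run in order.
    cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d, n);
    cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, stream);

    // The CPU thread could do unrelated work here.

    // Block the CPU thread until everything queued in the stream is done.
    cudaError_t err = cudaStreamSynchronize(stream);

    int ok = (err == cudaSuccess) && (h[0] == 2.0f) && (h[n - 1] == 2.0f);

    cudaStreamDestroy(stream);
    cudaFree(d);
    cudaFreeHost(h);
    return ok ? 0 : 1;
}

int main() {
    int rc = run_sync_demo();
    printf("%s\n", rc == 0 ? "stream finished, result OK" : "failed");
    return rc;
}
```

If blocking is not wanted, cudaStreamQuery(stream) can be polled instead; it returns cudaErrorNotReady while work is still pending in the stream.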
Stream callbacks are not discussed in that presentation, because they did not exist in the 2009 timeframe.
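For reference, a stream callback might look like the sketch below, which uses cudaStreamAddCallback from the CUDA runtime API. The callback `on_stream_done`, the kernel `scale`, and the helper `run_callback_demo` are hypothetical names for this example, and the atomic flag is just one assumed way of passing the result back to the CPU thread.

```cuda
#include <atomic>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles every element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Called on a runtime-owned host thread once all work queued in the stream
// before the callback has finished. It must not call CUDA API functions.
void CUDART_CB on_stream_done(cudaStream_t stream, cudaError_t status,
                              void *userData) {
    auto *done = static_cast<std::atomic<int> *>(userData);
    done->store(status == cudaSuccess ? 1 : -1);
}

// Returns 0 on success, non-zero on any failure.
int run_callback_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h = nullptr, *d = nullptr;
    cudaStream_t stream;
    std::atomic<int> done(0);

    if (cudaMallocHost((void **)&h, bytes) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d, bytes) != cudaSuccess) return -1;
    if (cudaStreamCreate(&stream) != cudaSuccess) return -1;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d, n);
    cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, stream);

    // Queue the callback behind the copies and the kernel (flags must be 0).
    cudaError_t err = cudaStreamAddCallback(stream, on_stream_done, &done, 0);
    if (err != cudaSuccess) return -1;

    // The CPU thread is free here; this loop stands in for other work.
    while (done.load() == 0) { /* other CPU work */ }

    int ok = (done.load() == 1) && (h[0] == 2.0f) && (h[n - 1] == 2.0f);

    cudaStreamDestroy(stream);
    cudaFree(d);
    cudaFreeHost(h);
    return ok ? 0 : 1;
}

int main() {
    int rc = run_callback_demo();
    printf("%s\n", rc == 0 ? "callback fired, result OK" : "failed");
    return rc;
}
```

On CUDA 10.0 and later, cudaLaunchHostFunc is a similar mechanism with a simpler callback signature.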
Thank you so much Robert_Crovella!
I have a question.
Is there a visual profiler for the Linux terminal?
I want to see the work in my streams overlapping, but I can’t see any overlap.
The legacy profilers are nvvp and nvprof. nvvp is a visual profiler; nvprof is its command-line counterpart.
The new profilers are Nsight Compute and Nsight Systems. Both are installed with the latest version of CUDA (10.1) on Linux. To use the visual profilers on Linux you generally need to be running X somewhere.
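When no overlap shows up in the profiler, one thing worth checking is that the host buffers are pinned, since cudaMemcpyAsync on pageable memory does not overlap with other work. Here is a minimal sketch of a multi-stream program that should give the profiler something to show. The kernel `add_one`, the helper `run_overlap_demo`, the stream count, and the chunk size are all assumptions for this example, and how much overlap actually appears depends on the GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void add_one(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Returns 0 on success, non-zero on any failure.
int run_overlap_demo() {
    const int kStreams = 4;        // assumed stream count
    const int chunk = 1 << 20;     // assumed elements per stream
    const int n = kStreams * chunk;
    const size_t chunkBytes = chunk * sizeof(float);

    float *h = nullptr, *d = nullptr;
    // Pinned host memory is required for copies to overlap with kernels.
    if (cudaMallocHost((void **)&h, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d, n * sizeof(float)) != cudaSuccess) return -1;
    for (int i = 0; i < n; ++i) h[i] = 0.0f;

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) {
        if (cudaStreamCreate(&streams[s]) != cudaSuccess) return -1;
    }

    // Each stream handles its own chunk: copy in, compute, copy out.
    // Work in different streams is allowed to overlap.
    for (int s = 0; s < kStreams; ++s) {
        float *hp = h + s * chunk;
        float *dp = d + s * chunk;
        cudaMemcpyAsync(dp, hp, chunkBytes, cudaMemcpyHostToDevice, streams[s]);
        add_one<<<(chunk + 255) / 256, 256, 0, streams[s]>>>(dp, chunk);
        cudaMemcpyAsync(hp, dp, chunkBytes, cudaMemcpyDeviceToHost, streams[s]);
    }

    // Wait for all streams to finish.
    cudaError_t err = cudaDeviceSynchronize();

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h[i] != 1.0f) ++bad;
    }

    for (int s = 0; s < kStreams; ++s) cudaStreamDestroy(streams[s]);
    cudaFree(d);
    cudaFreeHost(h);
    return (err == cudaSuccess && bad == 0) ? 0 : 1;
}

int main() {
    int rc = run_overlap_demo();
    printf("%s\n", rc == 0 ? "all chunks processed" : "failed");
    return rc;
}
```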