I am wondering which API can be used to monitor streams.
For example, a CPU thread transfers data with an asynchronous memcpy (i.e., cudaMemcpyAsync)
and then launches a kernel.
The problem is that the CPU thread has to wait until the kernel is finished.
I want to know the callback API for this kind of monitoring.
Which API can invoke a callback when the kernel is finished?
I found this slide. Is the explanation on the 10th page right?
The slide confuses me. It does describe monitoring, but I think it only covers monitoring events.
I need a monitoring API for stream jobs.
Thank you for reading and helping me.
Have an awesome day guys.
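For what it's worth, here is a minimal sketch of the kind of stream callback the question describes. It assumes CUDA 10.0 or newer, where cudaLaunchHostFunc is available; on older toolkits cudaStreamAddCallback plays a similar role with a slightly different callback signature. The kernel `scale`, the callback `onStreamDone`, and the helper `runPipeline` are hypothetical names made up for illustration. The host function is enqueued after the copies and the kernel, so it runs only once all earlier work in that stream has finished, and the CPU thread does not have to block in cudaStreamSynchronize.

```cuda
#include <atomic>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical kernel, used only for illustration.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Host function executed by the runtime once all work enqueued before it
// in the stream has completed. It must not call CUDA API functions.
void CUDART_CB onStreamDone(void *userData) {
    static_cast<std::atomic<bool> *>(userData)->store(true);
}

// Enqueue copy -> kernel -> copy -> host callback and return the first result.
float runPipeline() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h = nullptr, *d = nullptr;
    CHECK(cudaMallocHost(&h, bytes));  // pinned memory, needed for truly async copies
    CHECK(cudaMalloc(&d, bytes));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));
    std::atomic<bool> done{false};

    CHECK(cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, stream));
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d, n, 2.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, stream));
    CHECK(cudaLaunchHostFunc(stream, onStreamDone, &done));

    // The CPU thread is free here; this loop stands in for other useful work.
    while (!done.load()) { /* do something else */ }

    float result = h[0];
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d));
    CHECK(cudaFreeHost(h));
    return result;
}

int main() {
    printf("h[0] = %f\n", runPipeline());
    return 0;
}
```

If a callback is more than you need, polling with cudaStreamQuery(stream) is another option: it returns cudaErrorNotReady while work is still pending and cudaSuccess once the stream is idle.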
The legacy profilers are nvvp and nvprof. nvvp is a visual profiler; nvprof is its command-line counterpart.
The new profilers are Nsight Compute and Nsight Systems. Both are installed with the latest version of CUDA (10.1) on Linux. To use the visual profilers on Linux, you generally need to have an X server running somewhere.