Please note that a Visual Profiler patch for the CUDA 4.0 release has been posted for Linux. The patch specifically addresses a problem seen when profiling applications that use multiple streams, where Visual Profiler reports the error: “In this profiling session some profiler output rows are dropped due to incorrect gpu time stamp values and the profiler output is incomplete.”
It is available on the NVIDIA Developer Zone: CUDA Toolkit 4.0 page. Look under the Linux downloads section of the page (search for “Visual Profiler Patch”).
If you have CUDA Toolkit version 4.0.17, install the patch as follows:
- Rename the existing Visual Profiler executable
($TOOLKIT_DIR points to the directory under which the CUDA Toolkit version 4.0.17 is installed)
mv computeprof computeprof.4.0.17
- Install the new Visual Profiler executable from the patch
tar xvf visualprofiler_4.0.51_linux*.tar.gz
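The two steps above could be wrapped in a small shell function that checks its inputs before touching anything. This is only a sketch: the function name and its three arguments are hypothetical, and the post does not say which directory holds computeprof or where the archive should be unpacked, so both are left for the caller to supply.

```shell
#!/bin/sh
# Sketch only: back up the existing Visual Profiler executable, then unpack the patch.
# apply_profiler_patch and its arguments are illustrative names, not part of the patch.
apply_profiler_patch() {
    profiler_bin_dir=$1   # directory containing the existing computeprof executable (assumed)
    patch_tarball=$2      # path to the downloaded visualprofiler_4.0.51_linux*.tar.gz
    extract_dir=$3        # where to unpack the archive (assumed; not stated in the post)

    if [ ! -f "$profiler_bin_dir/computeprof" ]; then
        echo "computeprof not found in $profiler_bin_dir" >&2
        return 1
    fi
    if [ ! -f "$patch_tarball" ]; then
        echo "patch archive not found: $patch_tarball" >&2
        return 1
    fi
    # Refuse to overwrite a backup left by an earlier run.
    if [ -e "$profiler_bin_dir/computeprof.4.0.17" ]; then
        echo "backup computeprof.4.0.17 already exists" >&2
        return 1
    fi

    # Step 1: rename the existing executable so it can be restored later.
    mv "$profiler_bin_dir/computeprof" "$profiler_bin_dir/computeprof.4.0.17" || return 1

    # Step 2: unpack the patched executable (GNU tar detects the gzip compression).
    tar xvf "$patch_tarball" -C "$extract_dir"
}
```

To roll back, move computeprof.4.0.17 back to computeprof in the same directory.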