I am trying to profile a CUDA program that uses two CUDA streams. The first stream runs a compute-intensive task (a simple vector addition), and the second performs reads and writes to global memory.
The intent of the second stream is to create memory interference and to disturb the compute kernel.
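For concreteness, here is a minimal sketch of the kind of two-stream setup described above. It is not the actual program: the kernel names (vec_add, mem_traffic), the helper run_two_streams, the buffer sizes, launch dimensions, and iteration count are all assumptions chosen for illustration, and error checking is left out for brevity. Whether the two kernels really overlap depends on the device and on how many resources each launch occupies.

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Compute kernel: element-wise vector addition, c = a + b.
__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Interference kernel (hypothetical): each thread repeatedly reads and
// writes global memory using a grid-stride loop.
__global__ void mem_traffic(float* buf, int n, int iters) {
    int stride = gridDim.x * blockDim.x;
    for (int it = 0; it < iters; ++it)
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
            buf[i] = buf[i] + 1.0f;
}

// Launches both kernels on separate streams and returns the maximum
// absolute error of the vector addition (inputs are 1.0 and 2.0).
float run_two_streams(int n) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n, 0.0f);

    float *a, *b, *c, *buf;
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMalloc(&c, bytes);
    cudaMalloc(&buf, bytes);
    cudaMemcpy(a, ha.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(b, hb.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(buf, 0, bytes);

    cudaStream_t s_compute, s_mem;
    cudaStreamCreate(&s_compute);
    cudaStreamCreate(&s_mem);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    // Assumed launch configuration: a small grid for the memory kernel so
    // that the compute kernel has room to run alongside it.
    mem_traffic<<<32, threads, 0, s_mem>>>(buf, n, 10);
    vec_add<<<blocks, threads, 0, s_compute>>>(a, b, c, n);
    cudaDeviceSynchronize();  // wait for both streams

    cudaMemcpy(hc.data(), c, bytes, cudaMemcpyDeviceToHost);
    float max_err = 0.0f;
    for (int i = 0; i < n; ++i)
        max_err = fmaxf(max_err, fabsf(hc[i] - 3.0f));

    cudaStreamDestroy(s_compute);
    cudaStreamDestroy(s_mem);
    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    cudaFree(buf);
    return max_err;
}

int main() {
    float err = run_two_streams(1 << 20);
    printf("max error: %f\n", err);
    return 0;
}
```

Compiled with something like nvcc -o add add.cu (the file name is assumed), this produces a binary with one kernel per stream, which is the shape of program being profiled here.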
I want to understand what actually happens in GPU memory (global memory and the L2 cache), so I profiled the program with:
$ nvprof -o out.nvprof ./add
and opened the output file with $ nvvp out.nvprof. The problem is that when I open the “Memory Statistics” tab, nvvp shows the following message:
“Insufficient Memory Statistics Data, The data needed to calculate Memory Statistics is not available for selected kernel”