Profile memory activity on Jetson TX2


I am trying to profile a CUDA program composed of two CUDA streams. The first stream runs a compute-intensive task (a simple vector addition), and the second performs reads from and writes to global memory.

The intent of the second stream is to create memory interference and to disturb the compute kernel.
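
For readers who want to reproduce the setup, here is a minimal sketch of a two-stream program of the kind described above. It is not the original code: the kernel names (vecAdd, memTraffic), the helper runDemo, the buffer sizes, and the launch configurations are all assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "CUDA error %s at %s:%d\n",            \
                    cudaGetErrorString(err_), __FILE__, __LINE__); \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Kernel under test: element-wise vector addition.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical interfering kernel: each pass reads and rewrites every
// element of a global buffer using a grid-stride loop.
__global__ void memTraffic(float *buf, int n, int iters) {
    int stride = gridDim.x * blockDim.x;
    for (int it = 0; it < iters; ++it)
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
            buf[i] = buf[i] + 1.0f;
}

// Launches both kernels in separate streams; returns true if both results check out.
bool runDemo(int n, int iters) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> hA(n), hB(n), hC(n), hBuf(n);
    for (int i = 0; i < n; ++i) { hA[i] = (float)i; hB[i] = 2.0f * i; }

    float *dA, *dB, *dC, *dBuf;
    CUDA_CHECK(cudaMalloc(&dA, bytes));
    CUDA_CHECK(cudaMalloc(&dB, bytes));
    CUDA_CHECK(cudaMalloc(&dC, bytes));
    CUDA_CHECK(cudaMalloc(&dBuf, bytes));
    CUDA_CHECK(cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemset(dBuf, 0, bytes));  // all-zero bytes == 0.0f

    cudaStream_t sCompute, sMemory;
    CUDA_CHECK(cudaStreamCreate(&sCompute));
    CUDA_CHECK(cudaStreamCreate(&sMemory));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    // A small grid for the interfering kernel leaves room for vecAdd to be
    // resident at the same time; actual overlap is not guaranteed.
    memTraffic<<<4, threads, 0, sMemory>>>(dBuf, n, iters);
    vecAdd<<<blocks, threads, 0, sCompute>>>(dA, dB, dC, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaStreamSynchronize(sCompute));
    CUDA_CHECK(cudaStreamSynchronize(sMemory));

    CUDA_CHECK(cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaMemcpy(hBuf.data(), dBuf, bytes, cudaMemcpyDeviceToHost));

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (hC[i] != hA[i] + hB[i]) ok = false;   // vecAdd result
        if (hBuf[i] != (float)iters) ok = false;  // one increment per pass
    }

    CUDA_CHECK(cudaStreamDestroy(sCompute));
    CUDA_CHECK(cudaStreamDestroy(sMemory));
    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));
    CUDA_CHECK(cudaFree(dBuf));
    return ok;
}

int main() {
    bool ok = runDemo(1 << 20, 100);
    printf("results correct: %s\n", ok ? "yes" : "no");
    return ok ? 0 : 1;
}
```

Assuming the file is saved as add.cu, it can be built with nvcc -o add add.cu. Whether the two kernels actually run concurrently depends on the resources each one occupies, so the overlap should be confirmed on the profiler timeline rather than assumed.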

I want to understand what actually happens in GPU memory (global memory and the L2 cache), so I tried:

$ nvprof -o out.nvprof ./add

and then opened the output file with $ nvvp out.nvprof. The problem is that when I open the “Memory Statistics” tab, nvvp shows the following message:

“Insufficient Memory Statistics Data, The data needed to calculate Memory Statistics is not available for selected kernel”



L2 cache profiling is not available on TX2.
However, you should still be able to get some general memory usage information with nvprof.

You may need some extra configuration to enable the feature.
Please check this document for the corresponding flag: