Is there a way to measure the number of bytes transmitted/received over NVLink? I can measure NVLink utilization as mentioned here, but I’m more interested in the raw number of bytes transferred.
Any pointers would be great here.
FYI, I use Nsight Systems for profiling. This is the command I use to profile my binary on a Hopper cluster with 8 GPUs interconnected by NVLink:
/tmp/nsight-systems-2023.3.1/bin/nsys profile --gpu-metrics-device=all ./<binary>
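For reference, the only workaround I can think of is a rough estimate that integrates the utilization samples over time, sketched below. Everything in it is an assumption for illustration: the function name `estimate_bytes` is hypothetical, the sample values are made up, and the peak bandwidth and sampling interval are placeholders rather than real Hopper or nsys figures. It also assumes the utilization metric is a percentage of peak link bandwidth, which I have not confirmed.

```python
def estimate_bytes(util_percent, interval_s, peak_bytes_per_s):
    """Approximate bytes moved by integrating utilization samples over time.

    util_percent     -- utilization samples, each as a percentage of peak (assumed meaning)
    interval_s       -- time between samples in seconds (placeholder value below)
    peak_bytes_per_s -- assumed peak bandwidth in bytes per second (placeholder value below)
    """
    return sum(u / 100.0 * peak_bytes_per_s * interval_s for u in util_percent)


# Made-up samples and placeholder constants, not real measurements.
samples = [0.0, 50.0, 100.0, 50.0]
estimated = estimate_bytes(samples, interval_s=0.001, peak_bytes_per_s=1e9)
print(f"estimated bytes: {estimated:.0f}")
```

This only gives an approximation that depends on the sampling rate and on the assumed peak bandwidth, which is why I would prefer an actual byte counter.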