I’m using the Nsight Systems profiler, and I found a mismatch between its results and the output of nvidia-smi.
When I move a DNN model from host memory to the device with a `model.cuda()` call, Nsight Systems reports 0% GPU utilization, like this:
However, when I check nvidia-smi during the same interval, it reports a non-zero GPU utilization, like this:
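For reference, here is a minimal sketch of what I’m doing (assuming PyTorch and a CUDA-capable GPU; the toy model below is a hypothetical stand-in for my actual network):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the DNN model; the real network is larger.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 10),
)

if torch.cuda.is_available():
    # Host-to-device copy of the model's parameters. This is the call
    # during which Nsight Systems shows 0% GPU utilization for me,
    # while nvidia-smi shows a non-zero value.
    model = model.cuda()
    # Wait for all queued CUDA work to finish before reading results.
    torch.cuda.synchronize()

print(next(model.parameters()).device)
```

I profile this script with `nsys profile python script.py` and watch `nvidia-smi` in a second terminal while it runs.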
Could you please help clarify this? Am I misunderstanding something? Thanks!