Analyzing GPU processing during training

Hi,

I would like to know how to measure system activity during training with an NGC container image, for example host-to-GPU memory copies or loading datasets from an NFS server. Is there a way to do this?
In particular, I would like to compare performance across different network speeds, such as 1G, 10G, and 100G.
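
For the network side only, a rough sketch of the kind of measurement meant here could look like the following. It assumes a Linux host where /proc/net/dev is readable from inside the container; the function names (parse_net_dev, throughput_mbit, sample) are made up for illustration and are not part of any NGC tooling. It does not cover host-to-GPU copies, which would need a GPU-side tool.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def throughput_mbit(before, after, seconds):
    """Per-interface (rx, tx) rate in Mbit/s between two counter snapshots."""
    rates = {}
    for name, (rx1, tx1) in after.items():
        if name not in before:
            continue  # interface appeared between the two samples
        rx0, tx0 = before[name]
        rates[name] = (
            (rx1 - rx0) * 8 / seconds / 1e6,
            (tx1 - tx0) * 8 / seconds / 1e6,
        )
    return rates


def sample(interval=1.0, path="/proc/net/dev"):
    """Read the counters twice, `interval` seconds apart, and return the rates."""
    with open(path) as f:
        before = parse_net_dev(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_net_dev(f.read())
    return throughput_mbit(before, after, interval)
```

Calling sample() in a loop while training runs, once per network speed, would give receive rates that can be compared between the 1G, 10G, and 100G setups. Whether those rates actually reflect NFS dataset traffic depends on which interface the NFS mount uses.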

Best regards.
Kaka