I use nvprof to get some information, such as startTime, Duration, and KernelName, while training a model. The command is:
nvprof --csv --log-file log.csv --print-gpu-trace python test1.py
The result is as follows:
I find that there is idle time between some transfers, for example:
8.238898s + 0.264026s = 8.502924s < 8.534692s
There is also idle time between some kernels, for example:
8.602683s + 0.0217733s = 8.6244563s < 8.624457s
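To make the arithmetic explicit, here is a minimal sketch of how these gaps can be computed. It assumes the start times and durations have already been read from log.csv and converted to seconds. The function name idle_gaps and the 0.0 placeholder durations for the second entry of each pair are only for illustration; they are not part of nvprof or of the trace.

```python
def idle_gaps(records):
    """Return the idle time (in seconds) between consecutive activities.

    records: list of (start_s, duration_s) tuples, sorted by start time.
    The gap is the next activity's start minus the current activity's end.
    """
    gaps = []
    for (start, duration), (next_start, _) in zip(records, records[1:]):
        end = start + duration
        gaps.append(next_start - end)
    return gaps


# Values taken from the two examples above. The duration of the second
# entry in each pair is not needed for the gap, so 0.0 is a placeholder.
transfers = [(8.238898, 0.264026), (8.534692, 0.0)]
kernels = [(8.602683, 0.0217733), (8.624457, 0.0)]

transfer_gaps = idle_gaps(transfers)
kernel_gaps = idle_gaps(kernels)

print("transfer gap (s):", transfer_gaps[0])
print("kernel gap (s):", kernel_gaps[0])
```

With the numbers above, the gap between the two transfers comes out to roughly 0.0318 s, while the gap between the two kernels is under a microsecond.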
Usually, the idle time between transfers is longer than the idle time between kernels. What happens during this idle time? In addition, there is an entry with the Name [CUDA memcpy DtoD]. What is the process doing there?
Any response would be greatly appreciated!