Hello,
I have a ResNet50 TensorRT model that I converted from a ResNet50 PyTorch model using the trtexec tool. Currently, I execute inference and log the timing information for each query into a JSON file using the following command:
$TRTEXEC --useRuntime=full --useSpinWait --duration=10 --dumpProfile --exportProfile=profile.json --verbose --exportTimes=time.log --separateProfileRun --shapes=images:1x3x224x224 --infStreams=1 --loadEngine=int8.engine
Upon examining the exported timing file (I renamed its extension from .json to .log to facilitate uploading), I observed that each entry contains startH2dMs, endH2dMs, startD2hMs, and endD2hMs timestamps, along with their corresponding differences. time.log (1.4 MB)
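A minimal sketch of how the per-query H2D and D2H durations can be summarized from the exported file. It assumes the file is a JSON array of objects, one per query, and that the timestamp keys are named as written in this post; the helper names `load_entries` and `summarize_transfers` are hypothetical and the keys should be adjusted to whatever the file actually contains.

```python
import json
import statistics


def load_entries(path):
    """Load the exported timing file (assumed to be a JSON array of objects)."""
    with open(path) as f:
        return json.load(f)


def summarize_transfers(entries, start_key, end_key):
    """Return (count, mean_ms, max_ms) of end - start across all entries."""
    durations = [entry[end_key] - entry[start_key] for entry in entries]
    return len(durations), statistics.mean(durations), max(durations)


# Example usage (key names taken from this post; they are an assumption):
# entries = load_entries("time.log")
# print(summarize_transfers(entries, "startH2dMs", "endH2dMs"))
# print(summarize_transfers(entries, "startD2hMs", "endD2hMs"))
```

If the transfers were truly skipped, the mean and maximum reported for both pairs of keys would be expected to be zero.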
However, to my understanding, the Jetson Orin Nano has a unified physical memory architecture shared by the CPU and GPU, which should eliminate the need for explicit data transfers between them. Consequently, I would expect the startH2dMs, endH2dMs, startD2hMs, and endD2hMs values to be either absent or to have differences of zero. The data does not match this expectation.
Could you please provide an explanation for this discrepancy?
Note: I am using a Jetson Orin Nano Dev Board.