I have built a CNN using TensorFlow with GPU support and run it on a Jetson Nano. I want to log some metrics, such as CPU load, GPU load, and RAM usage, to a .csv file. Most importantly, I want to measure how long the GPU-accelerated part of my code takes during the validation stage. How can I achieve all of this?
The simplest way is to profile your application with the built-in CUDA profiler, nvprof:
$ sudo /usr/local/cuda-10.2/bin/nvprof python3 test.py
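For the .csv part of the question, one option is to parse the text that `tegrastats` prints on the Jetson and to time the validation stage yourself. The following is only a rough sketch: the function names (`parse_tegrastats_line`, `timed`, `write_metrics_csv`) are made up for illustration, and the regular expressions assume a line layout like `RAM 1500/3956MB ... CPU [20%@1479,off] ... GR3D_FREQ 75%`, which can differ between L4T releases, so check it against the output on your board.

```python
import csv
import re
import time
from contextlib import contextmanager

# Assumed tegrastats line layout; the fields differ between L4T releases.
RAM_RE = re.compile(r"RAM (\d+)/(\d+)MB")
CPU_RE = re.compile(r"CPU \[([^\]]*)\]")
GPU_RE = re.compile(r"GR3D_FREQ (\d+)%")


def parse_tegrastats_line(line):
    """Return RAM use (MB), mean CPU load (%) and GPU load (%) from one line."""
    ram = RAM_RE.search(line)
    cpu = CPU_RE.search(line)
    gpu = GPU_RE.search(line)
    # Cores reported as "off" carry no load figure and are skipped.
    loads = [int(v) for v in re.findall(r"(\d+)%@", cpu.group(1))] if cpu else []
    return {
        "ram_used_mb": int(ram.group(1)) if ram else None,
        "ram_total_mb": int(ram.group(2)) if ram else None,
        "cpu_load_pct": sum(loads) / len(loads) if loads else None,
        "gpu_load_pct": int(gpu.group(1)) if gpu else None,
    }


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def write_metrics_csv(path, rows):
    """Write a list of parsed-metric dicts to a CSV file with a header row."""
    fields = ["ram_used_mb", "ram_total_mb", "cpu_load_pct", "gpu_load_pct"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

In your script you would wrap the validation call, e.g. `with timed("validation", results): ...`, and feed each captured `tegrastats` line to `parse_tegrastats_line` before writing the rows out. Note that this measures wall-clock time for the whole block, including CPU work and data transfers, so it complements the per-kernel breakdown from nvprof rather than replacing it.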