Currently I’m running some Docker containers on multiple Jetson Nanos. To monitor them, each one runs a small container in which a cron job runs a few commands to collect system stats and send them to a web page. One of the commands I use to monitor system resources is Tegrastats. This small container runs smoothly most of the time, since it is only a mix of small system commands (grep, sed, etc.). But at some point the Tegrastats command stops reporting the memory and GPU frequency fields (EMC_FREQ and GR3D_FREQ in Tegrastats’ output), which makes the command pipeline fail from then on.
Just for reference, Tegrastats normally outputs something like:
When the Docker container starts, Tegrastats inside the container works normally.
Tegrastats can keep working normally inside the container for days.
While the container fails to get the normal Tegrastats output, Tegrastats runs normally if it is run on the host. (Running Tegrastats manually inside the Docker container gives output that is missing both frequency fields.)
The command used is “tegrastats”, not “sudo tegrastats”, since it gives the information I need anyway.
The Docker container runs with “runtime nvidia”.
Is there something I could be missing that makes Tegrastats stop reporting the GPU usage? Is there anything I could check for information on errors? Is there any service that might need a restart?
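Until the cause is found, the parsing step could at least be made tolerant of the missing fields so the rest of the pipeline keeps running. The following is only a minimal sketch in Python rather than grep/sed. The two sample lines are hypothetical and shortened, the exact Tegrastats fields vary by Jetson model and L4T release, and the function name `parse_freqs` is made up for illustration.

```python
import re

# Hypothetical, shortened sample lines (assumption: real output has more fields
# and differs between Jetson models and L4T releases).
SAMPLE_OK = "RAM 1500/3956MB CPU [2%@1479,1%@1479] EMC_FREQ 3%@1600 GR3D_FREQ 0%@921"
SAMPLE_MISSING = "RAM 1500/3956MB CPU [2%@1479,1%@1479]"

# Matches "EMC_FREQ 3%@1600" as well as "EMC_FREQ 3%" (frequency part optional).
FIELD_RE = re.compile(r"\b(EMC_FREQ|GR3D_FREQ) (\d+)%(?:@(\d+))?")


def parse_freqs(line):
    """Return a dict for both fields; a field absent from the line stays None."""
    result = {"EMC_FREQ": None, "GR3D_FREQ": None}
    for name, percent, mhz in FIELD_RE.findall(line):
        result[name] = {
            "percent": int(percent),
            # The optional group yields an empty string when "@<MHz>" is absent.
            "mhz": int(mhz) if mhz else None,
        }
    return result
```

With this shape, a line without EMC_FREQ and GR3D_FREQ yields `None` for both fields instead of breaking the pipeline, and the web page can show the gap explicitly.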
So, I was having the same issue. I wanted to get the GPU frequency inside a Docker container and stumbled upon (what I believe to be) the root cause:
/usr/bin/jetson_clocks not having access to clocks
# /usr/bin/jetson_clocks
cat: /sys/kernel/debug/bpmp/debug/clk/emc/max_rate: No such file or directory
/usr/bin/jetson_clocks: line 328: /sys/kernel/debug/bpmp/debug/clk/emc/mrq_rate_locked: No such file or directory
/usr/bin/jetson_clocks: line 329: /sys/kernel/debug/bpmp/debug/clk/emc/rate: No such file or directory
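To compare the host and the container, a quick check of whether those debugfs entries are visible at all may help. This is a small sketch that only uses the three paths from the error output above. The helper name `check_paths` is made up, and the check covers existence and read access only, while jetson_clocks also writes to some of these files.

```python
import os

# Paths copied from the jetson_clocks error messages above.
CLOCK_PATHS = [
    "/sys/kernel/debug/bpmp/debug/clk/emc/max_rate",
    "/sys/kernel/debug/bpmp/debug/clk/emc/mrq_rate_locked",
    "/sys/kernel/debug/bpmp/debug/clk/emc/rate",
]


def check_paths(paths):
    """Map each path to 'missing', 'not readable', or 'ok' for the current user."""
    status = {}
    for path in paths:
        if not os.path.exists(path):
            status[path] = "missing"
        elif not os.access(path, os.R_OK):
            status[path] = "not readable"
        else:
            status[path] = "ok"
    return status


if __name__ == "__main__":
    for path, state in check_paths(CLOCK_PATHS).items():
        print(f"{state:>12}  {path}")
```

Running it once on the host and once inside the container should show whether the files are missing only in the container (for example because debugfs is not mounted there) or on both sides.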
I am still working through this but wanted to inquire if @rlienlafsoto was able to solve the original issue.