Monitoring PX2 (telemetry)

Is there an existing solution for adding telemetry to the PX2? I want to install DCGM (https://github.com/NVIDIA/gpu-monitoring-tools/tree/master/exporters/prometheus-dcgm) or an NVML-based Prometheus exporter (https://github.com/mindprince/nvidia_gpu_prometheus_exporter) on the PX2, but I am not sure whether that is doable.

Thanks

Dear @rren,
I do not see aarch64 among the NVML supported platforms at https://docs.nvidia.com/deploy/nvml-api/nvml-api-reference.html#nvml-api-reference

Thanks!
Do you know how I can check which specific GPU is used in my PX2 (GP106, etc.)? I saw that GP100 is on the supported list at the link you provided.

Or does “GP100” mean the whole “GP10X” series, which would mean all Pascal GPUs are supported?

I found that I could use “deviceQuery” to get the GPU models (see: How to Monitor Discrete GPU Workload/Usage on PX2?).

I am also trying to find the units of the temperature metrics from “tegrastats”. When I run “tegrastats GPU-therm”, it prints “GPU-therm 55000”. Is the unit millidegrees Celsius here (i.e. should the temperature be interpreted as 55 °C)?
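For reference, here is a minimal sketch of the conversion I have in mind. It assumes the value really is in millidegrees Celsius, which is the convention for the Linux thermal sysfs `temp` files but is not confirmed for tegrastats here. The helper name `parse_therm_reading` is made up for illustration.

```python
def parse_therm_reading(line):
    """Split a 'name value' reading such as 'GPU-therm 55000'.

    ASSUMPTION: the value is in millidegrees Celsius (unconfirmed
    for tegrastats), so dividing by 1000 gives degrees Celsius.
    """
    name, raw = line.split()
    return name, int(raw) / 1000.0


name, celsius = parse_therm_reading("GPU-therm 55000")
print(f"{name}: {celsius:.1f} C")
```

If the unit turns out to be something else, only the divisor needs to change.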

Dear @rren,
GP100 does not mean the GP10x series. The “tegrastats GPU-therm” command does not work on my machine. Did you check the tegrastats documentation? It is at https://docs.nvidia.com/drive/active/5.0.10.3L/nvvib_docs/index.html#page/NVIDIA%20DRIVE%20Linux%20SDK%20Development%20Guide%2FUtilities%2FAppendixTegraStats.html%23wwpID0EMHA