This is on a Rocky Linux 9 system. The nvidia-smi utility works fine, but I'm getting these errors when starting dcgm-exporter:
dcgm-exporter[156745]: time="2024-06-05T18:22:12-07:00" level=info msg="Starting dcgm-exporter"
dcgm-exporter[156745]: time="2024-06-05T18:22:12-07:00" level=info msg="DCGM successfully initialized!"
dcgm-exporter[156745]: time="2024-06-05T18:22:12-07:00" level=info msg="Not collecting DCP metrics: Error getting supported metrics: API version mismatch"
dcgm-exporter[156745]: time="2024-06-05T18:22:12-07:00" level=fatal msg="Error getting device information: API version mismatch"
systemd[1]: gpu_exporter.service: Main process exited, code=exited, status=1/FAILURE

$ dcgmi --version
dcgmi version: 3.3.6

$ nv-hostengine --version
Version : 3.3.6
Build ID : 16
Build Date : 2024-05-06
Build Type : Release
Commit ID : 8793bc2208ee01b403711c2be0ad73525a852706
Branch Name : rel_dcgm_3_3
CPU Arch : x86_64
Build Platform : Linux 4.15.0-180-generic #189-Ubuntu SMP Wed May 18 14:13:57 UTC 2022 x86_64
CRC : 30f05a4dc46b8cd955de8fd2c86d2af2

$ nvidia-smi -L
GPU 0: NVIDIA RTX 6000 Ada Generation (UUID: GPU-7216c5b5-29a7-032d-9e69-be8a00cd897a)

$ dcgm-exporter --version
DCGM Exporter version Filled by the build system
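
For what it's worth, dcgmi and nv-hostengine report the same version (3.3.6), while dcgm-exporter prints only a build placeholder, so its version can't be read from this output. Below is a minimal sketch of that comparison in Python, using the output strings pasted above as sample data. The parse_version helper is a hypothetical name of my own and not part of DCGM, and the check only compares the reported version strings; it says nothing about which DCGM library the exporter was actually built against.

```python
import re


def parse_version(text):
    """Return the first dotted version in text as (major, minor, patch), or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


# Sample outputs copied from this post. On a live system they could be captured
# with subprocess.run([...], capture_output=True, text=True) instead.
dcgmi_out = "dcgmi version: 3.3.6"
hostengine_out = "Version : 3.3.6\nBuild ID : 16\nBuild Date : 2024-05-06"
exporter_out = "DCGM Exporter version Filled by the build system"

dcgmi_ver = parse_version(dcgmi_out)
hostengine_ver = parse_version(hostengine_out)
exporter_ver = parse_version(exporter_out)  # None: no version string was filled in

# True when both DCGM components report the same version.
versions_match = dcgmi_ver == hostengine_ver
print(dcgmi_ver, hostengine_ver, exporter_ver, versions_match)
```

With the strings above, the two DCGM components match and the exporter version comes back as None, which is why I can't tell from this output which DCGM release the exporter binary expects.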