Hello,
[This is probably not the correct category, but the "other tools" category seems to be unavailable to me.]
We have a Slurm cluster with an existing node with A100-80G cards, whose MIG partitions are managed with NVIDIA/mig-parted (MIG Partition Editor for NVIDIA GPUs, on GitHub) and integrated into Slurm with nvidia/hpc/slurm-mig-discovery (on GitLab).
The same process, however, does not work on a new server with RTX PRO 6000 Blackwell Server Edition cards. The MIG partitions are correctly enumerated by nvidia-smi, but the discovery tool errors out with:
GPU count 4
Error in nvmlDeviceGetName()
Is there a newer version of the tool available, or are we missing something?
Regards