Hi,
I did some testing, and with the DAC plugged in, temperatures are roughly 6 °C higher and power usage goes up as well. This holds even if I disable the Mellanox devices by unbinding them from mlx5_core and removing them from the PCI bus:
meanaverage@dgx2:~$ for dev in 0000:01:00.0 0000:01:00.1 0002:01:00.0 0002:01:00.1; do
echo "$dev" | sudo tee /sys/bus/pci/drivers/mlx5_core/unbind
done
0000:01:00.0
0000:01:00.1
0002:01:00.0
0002:01:00.1
meanaverage@dgx2:~$ for dev in 0000:01:00.0 0000:01:00.1 0002:01:00.0 0002:01:00.1; do
echo 1 | sudo tee /sys/bus/pci/devices/$dev/remove
done
1
1
1
1
Is there a command that can truly override CX7 power control? Nothing but actually ejecting the DAC brings the temperatures way down.
(immediately upon DAC pull)
Often I’m processing jobs on multiple DGXs, and I don’t always need 200GbE between them for parallel workflows. I also have redundant paths to the NAS and to each other via 10GbE, so I’d love a way to just switch off the CX7 when it’s not needed during these long, hot runs.
Thanks

