NVLink not activating on VM

We are using an HGX GPU server with 8 A100 GPUs.
We run VMs with GPU pass-through in a qemu-kvm environment.

However, NVLink is not enabled inside the VM, so we need a workaround. (On bare metal, without virtualization, NVLink works normally.)

$ nvidia-smi nvlink --status -i 0
GPU 0 A100-SXM4-40GB ~~~~~
Link 0: (inactive)
Link 1: (inactive)
Link 2: (inactive)
Link 3: (inactive)
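
For anyone who wants to check this across several GPUs or VMs, here is a minimal sketch that parses the status text above and lists the inactive links per GPU. The function names (parse_nvlink_status, inactive_links, query_nvlink_status) are hypothetical helpers, not part of any NVIDIA tool. It assumes the output keeps the "GPU N ..." / "Link N: ..." layout shown in this post, and that any link whose status text contains "inactive" is down; other driver versions may print something different.

```python
import re
import subprocess

# Assumed line layout, taken from the output quoted in this thread.
GPU_RE = re.compile(r"^\s*GPU\s+(\d+)\b")
LINK_RE = re.compile(r"^\s*Link\s+(\d+):\s*(.+?)\s*$")


def parse_nvlink_status(text):
    """Return {gpu_index: {link_index: is_active}} from nvlink status text."""
    result = {}
    current_gpu = None
    for line in text.splitlines():
        gpu = GPU_RE.match(line)
        if gpu:
            current_gpu = int(gpu.group(1))
            result[current_gpu] = {}
            continue
        link = LINK_RE.match(line)
        if link and current_gpu is not None:
            # Assumption: a status containing "inactive" means the link is down.
            is_active = "inactive" not in link.group(2).lower()
            result[current_gpu][int(link.group(1))] = is_active
    return result


def inactive_links(status):
    """Return {gpu_index: sorted list of inactive link indices}."""
    return {
        gpu: sorted(idx for idx, ok in links.items() if not ok)
        for gpu, links in status.items()
    }


def query_nvlink_status(gpu_index=0):
    """Run the same nvidia-smi command as above and parse its output."""
    proc = subprocess.run(
        ["nvidia-smi", "nvlink", "--status", "-i", str(gpu_index)],
        capture_output=True, text=True, check=True,
    )
    return parse_nvlink_status(proc.stdout)
```

Running inactive_links(query_nvlink_status(0)) inside the VM and again on the bare-metal host would give a quick side-by-side of which links drop out under pass-through.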

Our environment is as follows.

Hypervisor kernel : 3.10.0-1160.21.1
qemu-kvm : 1.5.3-141
libvirtd : 3.2.0-14

VM environment
OS: Ubuntu 18.04 (4.15.0-126)
Framework : TensorFlow 2.4.1
Python: 3.7
CPU: AMD 7402 x 2
CUDA : 11 (Driver 450.102.04)
cuDNN : 8.0.2
TensorRT : 7.2.x

I know it has been a while, but did you ever figure this out?

Hi @jhshin1, did you find a solution to activate NVLink?

I have two H100 NVL GPUs and couldn't activate NVLink on Ubuntu 22.04. Do you have any suggestions?

Thank you

Hi,

H100 NVL is supported with AI Enterprise 5.0:

Please open a support ticket to investigate your issue.

Best regards
Simon