NVIDIA A40 does not show mdev_supported_types and I can't create vGPU instances

Thanks Stefan,

your solution worked!

After running “sriov-manage -e”, new directories appeared in /sys/bus/pci/devices/$bus:

root@a40# bus=$(nvidia-smi -q |grep ^GPU |awk -F " 0000" '{print tolower($2)}')
root@a40# /usr/lib/nvidia/sriov-manage -e $bus
root@a40# ls /sys/bus/pci/devices/$bus/ | grep ^virtfn | wc -l
32

This layout differs from what NVIDIA describes in its documentation. There is no “mdev_supported_types” directory directly under the GPU's device directory; instead, 32 directories appear, “virtfn0” to “virtfn31”.

Each virtfn* directory contains an mdev_supported_types directory that lists all the vGPU types available on this card.

For example:

root@a40# cat "/sys/bus/pci/devices/0000:41:00.0/virtfn0/mdev_supported_types/nvidia-557/name"
NVIDIA A40-1Q
root@a40# cat "/sys/bus/pci/devices/0000:41:00.0/virtfn0/mdev_supported_types/nvidia-557/available_instances" 
1
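
To see every vGPU type under one virtual function at a glance, the per-type files shown above can be read in a loop. This is only a sketch: the function name list_vgpu_types is made up for illustration, and it assumes each type directory has the same name and available_instances files as nvidia-557 above.

```shell
# Print "type-id<TAB>name<TAB>available_instances" for every vGPU type
# exposed by one virtual function directory.
# list_vgpu_types is a hypothetical helper name, not part of any NVIDIA tool.
list_vgpu_types() {
    vf_dir=$1
    for type_dir in "$vf_dir"/mdev_supported_types/*; do
        # Skip anything that does not look like a type directory.
        [ -r "$type_dir/name" ] || continue
        printf '%s\t%s\t%s\n' \
            "$(basename "$type_dir")" \
            "$(cat "$type_dir/name")" \
            "$(cat "$type_dir/available_instances")"
    done
}
```

For example: list_vgpu_types /sys/bus/pci/devices/0000:41:00.0/virtfn0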

If you create an mdev device instance, available_instances for that virtual function drops to 0:

root@a40# uid=$(uuidgen)
root@a40# echo $uid > "/sys/bus/pci/devices/0000:41:00.0/virtfn0/mdev_supported_types/nvidia-557/create"
root@a40# cat "/sys/bus/pci/devices/0000:41:00.0/virtfn0/mdev_supported_types/nvidia-557/available_instances"
0

You can create more instances on other virtual functions, for example:

root@a40# uid=$(uuidgen)
root@a40# echo $uid > "/sys/bus/pci/devices/0000:41:00.0/virtfn1/mdev_supported_types/nvidia-557/create"
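
Since each virtual function accepts only one instance, picking the next free virtfn* by hand gets tedious. A possible sketch of automating it follows. The function name create_vgpu is hypothetical, and it relies only on the available_instances and create files shown above. The fallback to /proc/sys/kernel/random/uuid is my own addition for hosts without uuidgen.

```shell
# Create one mdev of the given type on the first virtual function that
# still reports a free instance, and print the UUID that was used.
# create_vgpu is a hypothetical helper name, not an NVIDIA command.
create_vgpu() {
    gpu_dir=$1      # e.g. /sys/bus/pci/devices/0000:41:00.0
    vgpu_type=$2    # e.g. nvidia-557
    for vf in "$gpu_dir"/virtfn*; do
        type_dir="$vf/mdev_supported_types/$vgpu_type"
        [ -r "$type_dir/available_instances" ] || continue
        if [ "$(cat "$type_dir/available_instances")" -gt 0 ]; then
            # Assumption: fall back to the kernel's UUID source if uuidgen is missing.
            uuid=$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)
            echo "$uuid" > "$type_dir/create" || return 1
            echo "$uuid"
            return 0
        fi
    done
    echo "no free virtual function for $vgpu_type" >&2
    return 1
}
```

For example: create_vgpu /sys/bus/pci/devices/0000:41:00.0 nvidia-557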

Once we have created 3 instances, they are listed under /sys/bus/mdev/devices:

root@a40# ls /sys/bus/mdev/devices/ |wc -l 
3

Out of a maximum of 32 instances of type NVIDIA A40-1Q, 29 are still available across all the virtfn* directories:

root@a40# cat /sys/bus/pci/devices/0000\:41\:00.0/virtfn*/mdev_supported_types/nvidia-557/available_instances |grep 1 |wc -l 
29
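
The count above works by matching lines that contain "1". A slightly more defensive variant, sketched here, sums the values instead, so it would still be correct if a virtual function ever reported more than one free instance (not something I have observed on this card). The function name count_free_vgpus is hypothetical.

```shell
# Sum available_instances for one vGPU type across all virtual functions
# of a GPU. count_free_vgpus is a hypothetical helper name.
count_free_vgpus() {
    gpu_dir=$1      # e.g. /sys/bus/pci/devices/0000:41:00.0
    vgpu_type=$2    # e.g. nvidia-557
    total=0
    for f in "$gpu_dir"/virtfn*/mdev_supported_types/"$vgpu_type"/available_instances; do
        # Skip the unexpanded glob pattern when no virtual function matches.
        [ -r "$f" ] || continue
        total=$((total + $(cat "$f")))
    done
    echo "$total"
}
```

For example: count_free_vgpus /sys/bus/pci/devices/0000:41:00.0 nvidia-557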

I hope this strange behaviour will be explained by NVIDIA or resolved in a future release of the vGPU software.
