NVMe Driver not registered with nvidia-fs on Ubuntu 20.04

Hi,
(Sorry for reposting; my previous post got flagged as spam.)

I am trying to run GDS with NVMe on Ubuntu 20.04 but am running into the error below.

20-03-2024 23:17:57:472 [pid=3142 tid=3142] ERROR  cufio-fs:199 NVMe Driver not registered with nvidia-fs!!!

gdscheck -p also shows NVMe: Unsupported

Would appreciate any help with this!

This seems similar to NVMe Driver not registered with nvidia-fs - GDS NVMe unsupported on Rocky 8.6 - #26 by kmodukuri, but following the instructions there did not resolve my issue. Here is some information about my system:

OS version

$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

ext4 NVMe mount in journalling mode

$ mount | grep ext4
/dev/sda1 on / type ext4 (rw,relatime)
/dev/nvme0n1 on /mnt type ext4 (rw,relatime,data=ordered)

gdscheck -p

$ /usr/local/cuda/gds/tools/gdscheck -p
 GDS release version: 1.9.0.20
 nvidia_fs version:  2.19 libcufile version: 2.12
 Platform: x86_64
 ============
 ENVIRONMENT:
 ============
 =====================
 DRIVER CONFIGURATION:
 =====================
 NVMe               : Unsupported
 NVMeOF             : Unsupported
 SCSI               : Unsupported
 ScaleFlux CSD      : Unsupported
 NVMesh             : Unsupported
 DDN EXAScaler      : Unsupported
 IBM Spectrum Scale : Unsupported
 NFS                : Unsupported
 BeeGFS             : Unsupported
 WekaFS             : Unsupported
 Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Disabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
 --rdma_device_status  : Up: 0 Down: 0
 =====================
 CUFILE CONFIGURATION:
 =====================
 properties.use_compat_mode : true
 properties.force_compat_mode : false
 properties.gds_rdma_write_support : true
 properties.use_poll_mode : false
 properties.poll_mode_max_size_kb : 4
 properties.max_batch_io_size : 128
 properties.max_batch_io_timeout_msecs : 5
 properties.max_direct_io_size_kb : 16384
 properties.max_device_cache_size_kb : 131072
 properties.max_device_pinned_mem_size_kb : 33554432
 properties.posix_pool_slab_size_kb : 4 1024 16384 
 properties.posix_pool_slab_count : 128 64 32 
 properties.rdma_peer_affinity_policy : RoundRobin
 properties.rdma_dynamic_routing : 0
 fs.generic.posix_unaligned_writes : false
 fs.lustre.posix_gds_min_kb: 0
 fs.beegfs.posix_gds_min_kb: 0
 fs.weka.rdma_write_support: false
 fs.gpfs.gds_write_support: false
 profile.nvtx : false
 profile.cufile_stats : 0
 miscellaneous.api_check_aggressive : false
 execution.max_io_threads : 4
 execution.max_io_queue_depth : 128
 execution.parallel_io : true
 execution.min_io_threshold_size_kb : 8192
 execution.max_request_parallelism : 4
 properties.force_odirect_mode : false
 properties.prefer_iouring : false
 =========
 GPU INFO:
 =========
 GPU index 0 Tesla T4 bar:1 bar size (MiB):256 supports GDS, IOMMU State: Disabled
 ==============
 PLATFORM INFO:
 ==============
 IOMMU: disabled
 Nvidia Driver Info Status: Supported(Nvidia Open Driver Installed)
 Cuda Driver Version Installed:  12040
 Platform: Google Compute Engine, Arch: x86_64(Linux 5.15.0-1054-gcp)
 Platform verification succeeded

nvcc version

$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0

ofed_info

$ ofed_info
MLNX_OFED_LINUX-24.01-0.3.3.1 (OFED-24.01-0.3.3):

clusterkit:
mlnx_ofed_clusterkit/clusterkit-1.12.449-1.src.rpm

dpcp:
/sw/release/sw_acceleration/dpcp/dpcp-1.1.46-1.src.rpm

hcoll:
mlnx_ofed_hcol/hcoll-4.8.3227-1.src.rpm

ibarr:
https://github.com/Mellanox/ip2gid master
commit 44ac1948d0d604c723bc36ade0af02c54e7fc7d2
ibdump:
https://github.com/Mellanox/ibdump master
commit d0a4f5aabf21580bee9ba956dfff755b1dd335c3
ibsim:
mlnx_ofed_ibsim/ibsim-0.12.tar.gz

ibutils2:
ibutils2/ibutils2-2.1.1-0.1.MLNX20240128.g605c7811.tar.gz

iser:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

isert:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

kernel-mft:
mlnx_ofed_mft/kernel-mft-4.27.0-83.src.rpm

knem:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/knem.git mellanox-master
commit 0984cf2a2de70db5c6e6fff375b070eece37c39e
libvma:
vma/source_rpms//libvma-9.8.51-1.src.rpm

libxlio:
/sw/release/sw_acceleration/xlio/libxlio-3.21.2-1.src.rpm

mlnx-dpdk:
https://github.com/Mellanox/dpdk.org mlnx_dpdk_22.11_last_stable
commit 6e315c6a32e2b382665887deb8bd96882a0327ef
mlnx-en:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

mlnx-ethtool:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/ethtool.git mlnx_ofed_24_01
commit 1ad54ff7f13f7d081945803e9547e879f825b6a4
mlnx-iproute2:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/iproute2.git mlnx_ofed_24_01
commit c76d3cd57a92e0ffb2183449282cf433a2dd6205
mlnx-nfsrdma:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

mlnx-nvme:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

mlnx-ofa_kernel:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

mlnx-tools:
https://github.com/Mellanox/mlnx-tools mlnx_ofed
commit 92b5e0b5db37dc407238f55158926e8a5a3e5006
mlx-steering-dump:
https://github.com/Mellanox/mlx_steering_dump mlnx_ofed_23_04
commit fc616d9a8f62113b0da6fc5a8948b11177d8461e
mpitests:
mlnx_ofed_mpitest/mpitests-3.2.22-8f11314.src.rpm

mstflint:
mlnx_ofed_mstflint/mstflint-4.16.1-2.tar.gz

multiperf:
https://git-nbu.nvidia.com/r/a/Performance/multiperf rdma-core-support
commit d3fad92dc6984e43cc5377ba0a3126808432ce2d
ofed-docs:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/ofed-docs.git mlnx_ofed-4.0
commit 3d1b0afb7bc190ae5f362223043f76b2b45971cc

openmpi:
mlnx_ofed_ompi_1.8/openmpi-4.1.7a1-1.src.rpm

opensm:
mlnx_ofed_opensm/opensm-5.18.0.MLNX20240128.3f266a48.tar.gz

openvswitch:
https://gitlab-master.nvidia.com/sdn/ovs doca_2_6
commit e92ac078db9c15d836a0d2124ffce06dc39a1c7f
perftest:
mlnx_ofed_perftest/perftest-24.01.0-0.38.gd185c9b.tar.gz

rdma-core:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/rdma-core.git mlnx_ofed_24_01
commit c77bba30e179bbbda8459e8ca3f67b7f05ad0e50
rshim:
mlnx_ofed_soc/rshim-2.0.19-0.gbf7f1f2.src.rpm

sharp:
mlnx_ofed_sharp/sharp-3.6.0.MLNX20240128.e669b4e8.tar.gz

sockperf:
sockperf/sockperf-3.10-0.git5ebd327da983.src.rpm

srp:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/mlnx-ofa_kernel-4.0.git mlnx_ofed_24_01
commit 480e4c34a835edfe0415642160c424b7e9d09fee

ucx:
mlnx_ofed_ucx/ucx-1.16.0-1.src.rpm

xpmem-lib:
/sw/release/mlnx_ofed/IBHPC/MLNX_OFED_LINUX-23.10-0.5.5/SRPMS/xpmem-lib-2.7-0.2310055.src.rpm

xpmem:
https://git-nbu.nvidia.com/r/a/mlnx_ofed/xpmem.git mellanox-master
commit 1e704ce4d2043c5ac45502c934e27fc2e1f07c93

Installed Packages:
-------------------
ii  dpcp                                  1.1.46-1.2401033                         amd64        Direct Packet Control Plane (DPCP) is a library to use Devx
ii  hcoll                                 4.8.3227-1.2401033                       amd64        Hierarchical collectives (HCOLL)
ii  ibacm                                 2307mlnx47-1.2401033                     amd64        InfiniBand Communication Manager Assistant (ACM)
ii  ibarr:amd64                           0.1.3-1.2401033                          amd64        Nvidia address and route userspace resolution services for Infiniband
ii  ibdump                                6.0.0-1.2401033                          amd64        Mellanox packets sniffer tool
ii  ibsim                                 0.12-1.2401033                           amd64        InfiniBand fabric simulator for management
ii  ibsim-doc                             0.12-1.2401033                           all          documentation for ibsim
ii  ibutils2                              2.1.1-0.1.MLNX20240128.g605c7811.2401033 amd64        OpenIB Mellanox InfiniBand Diagnostic Tools
ii  ibverbs-providers:amd64               2307mlnx47-1.2401033                     amd64        User space provider drivers for libibverbs
ii  ibverbs-utils                         2307mlnx47-1.2401033                     amd64        Examples for the libibverbs library
ii  infiniband-diags                      2307mlnx47-1.2401033                     amd64        InfiniBand diagnostic programs
ii  iser-dkms                             24.01.OFED.24.01.0.3.3.1-1               all          DKMS support fo iser kernel modules
ii  isert-dkms                            24.01.OFED.24.01.0.3.3.1-1               all          DKMS support fo isert kernel modules
ii  kernel-mft-dkms                       4.27.0.83-1                              all          DKMS support for kernel-mft kernel modules
ii  knem                                  1.1.4.90mlnx3-OFED.23.10.0.2.1.1         amd64        userspace tools for the KNEM kernel module
ii  knem-dkms                             1.1.4.90mlnx3-OFED.23.10.0.2.1.1         all          DKMS support for mlnx-ofed kernel modules
ii  libibmad-dev:amd64                    2307mlnx47-1.2401033                     amd64        Development files for libibmad
ii  libibmad5:amd64                       2307mlnx47-1.2401033                     amd64        Infiniband Management Datagram (MAD) library
ii  libibnetdisc5:amd64                   2307mlnx47-1.2401033                     amd64        InfiniBand diagnostics library
ii  libibumad-dev:amd64                   2307mlnx47-1.2401033                     amd64        Development files for libibumad
ii  libibumad3:amd64                      2307mlnx47-1.2401033                     amd64        InfiniBand Userspace Management Datagram (uMAD) library
ii  libibverbs-dev:amd64                  2307mlnx47-1.2401033                     amd64        Development files for the libibverbs library
ii  libibverbs1:amd64                     2307mlnx47-1.2401033                     amd64        Library for direct userspace use of RDMA (InfiniBand/iWARP)
ii  libibverbs1-dbg:amd64                 2307mlnx47-1.2401033                     amd64        Debug symbols for the libibverbs library
ii  libopensm                             5.18.0.MLNX20240128.3f266a48-0.1.2401033 amd64        Infiniband subnet manager libraries
ii  libopensm-devel                       5.18.0.MLNX20240128.3f266a48-0.1.2401033 amd64        Development files for OpenSM
ii  librdmacm-dev:amd64                   2307mlnx47-1.2401033                     amd64        Development files for the librdmacm library
ii  librdmacm1:amd64                      2307mlnx47-1.2401033                     amd64        Library for managing RDMA connections
ii  mlnx-ethtool                          6.4-1.2401033                            amd64        This utility allows querying and changing settings such as speed,
ii  mlnx-iproute2                         6.4.0-1.2401033                          amd64        This utility allows querying and changing settings such as speed,
ii  mlnx-nfsrdma-dkms                     24.01.OFED.24.01.0.3.3.1-1               all          DKMS support for NFS RDMA kernel module
ii  mlnx-nvme-dkms                        24.01.OFED.24.01.0.3.3.1-1               all          DKMS support for nvme kernel module
ii  mlnx-ofed-kernel-dkms                 24.01.OFED.24.01.0.3.3.1-1               all          DKMS support for mlnx-ofed kernel modules
ii  mlnx-ofed-kernel-utils                24.01.OFED.24.01.0.3.3.1-1               amd64        Userspace tools to restart and tune mlnx-ofed kernel modules
ii  mlnx-tools                            24.01.0-1.2401033                        amd64        Userspace tools to restart and tune MLNX_OFED kernel modules
ii  mpitests                              3.2.22-8f11314.2401033                   amd64        Set of popular MPI benchmarks and tools IMB 2018 OSU benchmarks ver 4.0.1 mpiP-3.3
ii  mstflint                              4.16.1-2.2401033                         amd64        Mellanox firmware burning application
ii  openmpi                               4.1.7a1-1.2401033                        all          Open MPI
ii  opensm                                5.18.0.MLNX20240128.3f266a48-0.1.2401033 amd64        An Infiniband subnet manager
ii  opensm-doc                            5.18.0.MLNX20240128.3f266a48-0.1.2401033 amd64        Documentation for opensm
ii  perftest                              24.01.0-0.38.gd185c9b.2401033            amd64        Infiniband verbs performance tests
ii  rdma-core                             2307mlnx47-1.2401033                     amd64        RDMA core userspace infrastructure and documentation
ii  rdmacm-utils                          2307mlnx47-1.2401033                     amd64        Examples for the librdmacm library
ii  rshim                                 2.0.19-0.gbf7f1f2.2401033                amd64        driver for Mellanox BlueField SoC
ii  sharp                                 3.6.0.MLNX20240128.e669b4e8-1.2401033    amd64        SHArP switch collectives
ii  srp-dkms                              24.01.OFED.24.01.0.3.3.1-1               all          DKMS support fo srp kernel modules
ii  srptools                              2307mlnx47-1.2401033                     amd64        Tools for Infiniband attached storage (SRP)
ii  ucx                                   1.16.0-1.2401033                         amd64        Unified Communication X
ii  ucx-cuda                              1.16.0-1.2401033                         amd64        Unified Communication X - CUDA support

lspci information

$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
00:01.0 ISA bridge [0601]: Intel Corporation 82371AB/EB/MB PIIX4 ISA [8086:7110] (rev 03)
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
00:03.0 Non-VGA unclassified device [0000]: Red Hat, Inc. Virtio SCSI [1af4:1004]
00:04.0 Non-Volatile memory controller [0108]: Google, Inc. Device [1ae0:001f] (rev 01)
00:05.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
00:06.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
00:07.0 Unclassified device [00ff]: Red Hat, Inc. Virtio RNG [1af4:1005]
$ lspci -tv | egrep -i "nvidia | google"
           +-04.0  Google, Inc. Device 001f
           +-05.0  NVIDIA Corporation TU104GL [Tesla T4]

modinfo and grep for nvme_nvfs

$ modinfo nvme
filename:       /lib/modules/5.15.0-1054-gcp/updates/dkms/nvme.ko
version:        1.0
license:        GPL
file:           drivers/nvme/host/nvme
author:         Matthew Wilcox <willy@linux.intel.com>
parm:           use_threaded_interrupts:int
parm:           use_cmb_sqes:use controller's memory buffer for I/O SQes (bool)
parm:           max_host_mem_size_mb:Maximum Host Memory Buffer (HMB) size per controller (in MiB) (uint)
parm:           sgl_threshold:Use SGLs when average request segment size is larger or equal to this size. Use 0 to disable SGLs. (uint)
parm:           io_queue_depth:set io queue depth, should >= 2 and < 4096
parm:           write_queues:Number of queues to use for writes. If not set, reads and writes will share a queue set.
parm:           poll_queues:Number of queues to use for polled IO.
parm:           noacpi:disable acpi bios quirks (bool)
$ objdump -t /lib/modules/5.15.0-1054-gcp/updates/dkms/nvme.ko | grep nvme_nvfs
00000000000018f0 l     F .text  0000000000000087 nvme_nvfs_unmap_data.isra.0
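
If it would help with debugging, I can also share output from the checks below. I believe these are the standard nvidia-fs checks from the GDS troubleshooting guide, but please correct me if the paths are off for this release:

$ lsmod | grep nvidia_fs                  # confirm the nvidia_fs module itself is loaded
$ cat /proc/driver/nvidia-fs/stats        # driver registration/stats (path assumed from the GDS docs)
$ dmesg | grep -i -e nvidia_fs -e nvme    # kernel log messages from both drivers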

The platform looks like a VM in GCP. The problem is most likely that the NVMe driver is built into the kernel rather than as a dynamic module, so the nvidia-fs patches cannot be applied.
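
One way to confirm this on the VM (a quick sketch, assuming the stock Ubuntu module layout; adjust paths as needed):

$ grep "nvme.ko" /lib/modules/$(uname -r)/modules.builtin   # built-in drivers are listed here
$ lsmod | grep -w nvme                                      # a built-in nvme driver will not show up in lsmod even though /dev/nvme* devices exist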

Also, there is a known issue where GCP does not support NVMe peer-to-peer (P2P) with local disks on VMs.

Got it. Do you know of a solution here (e.g., is rebuilding the kernel an option)?

Further, do you know if AWS might work?

Rebuilding the kernel currently will not help with GCP.

With a kernel rebuild, Azure and OCI have worked with their local disk solutions. However, we have not tested these environments at large scale for production deployments.

Got it. Do you know whether a bare-metal instance from one of these cloud providers might help? Is the nvme module usually loadable on bare metal but built-in for VMs?

Ubuntu and Red Hat distributions build NVMe as a loadable kernel module.
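
For reference, whether a given kernel builds nvme as a module can be checked from the kernel config (a sketch; the config path is assumed for Ubuntu/RHEL-style layouts):

$ grep CONFIG_BLK_DEV_NVME /boot/config-$(uname -r)
# CONFIG_BLK_DEV_NVME=m: nvme is a loadable module, so a patched DKMS build can replace it
# CONFIG_BLK_DEV_NVME=y: nvme is built into the kernel, as on the GCP image above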