Hi,
I want to use nvidia-fs. I ran sudo bash cuda_12.4.1_550.54.15_linux.run, but the installation failed.
This is the cuda-installer.log:
[INFO]: Driver installation detected by command: apt list --installed | grep -e nvidia-driver-[0-9][0-9][0-9] -e nvidia-[0-9][0-9][0-9]
[INFO]: Cleaning up window
[INFO]: Complete
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.3)
[INFO]: Initializing menu
[INFO]: nvidia-fs.setKOVersion(2.19.7)
[INFO]: Setup complete
[INFO]: Installing: Kernel Objects
[INFO]: cudaItem::install - kernelobjects
[INFO]: Able to write to /usr/local/kernelobjects/
[INFO]: Installing: nvidia-fs
[INFO]: cudaItem::install - kernelobjects
[INFO]: dkms found
[INFO]: DKMS is installed,proceed with dkms install
[INFO]: previous version of nvidia-fs is not installed, nvidia-fs version: 2.19.7 will be installed.
[INFO]: getting mofed Status
[INFO]: installation status shows that mofed is not installed,please install mofed before continuing nvidia_fs install.
[ERROR]: Install of nvidia-fs failed, quitting
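For anyone reading a longer cuda-installer.log, a small sketch like the one below can pull out the first error together with the steps just before it. It assumes the log keeps the `[LEVEL]: message` line format shown above. The function name `last_steps_before_error` and the `context` parameter are my own and not part of any NVIDIA tool.

```python
import re

# Matches lines such as "[INFO]: dkms found" or "[ERROR]: Install of nvidia-fs failed, quitting".
LOG_LINE = re.compile(r"^\[(?P<level>[A-Z]+)\]:\s*(?P<msg>.*)$")


def last_steps_before_error(log_text, context=2):
    """Return (context_messages, error_message) for the first [ERROR] line.

    context_messages holds up to `context` parsed messages that precede the error.
    Returns ([], None) when the log has no [ERROR] line.
    """
    parsed = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if not match:
            continue  # skip lines that do not follow the assumed format
        if match.group("level") == "ERROR":
            before = parsed[-context:] if context > 0 else []
            return before, match.group("msg")
        parsed.append(match.group("msg"))
    return [], None


# Sample lines copied from the log above.
sample = """\
[INFO]: dkms found
[INFO]: getting mofed Status
[INFO]: installation status shows that mofed is not installed,please install mofed before continuing nvidia_fs install.
[ERROR]: Install of nvidia-fs failed, quitting
"""

steps, error = last_steps_before_error(sample)
print("error:", error)
for step in steps:
    print("before:", step)
```

On this log, the step right before the error is the message saying mofed is not installed, which is why I suspect MOFED.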
My environment: Ubuntu 20.04, NVIDIA driver version 550.135, NVIDIA GeForce MX150, gcc/g++ 9.4.0, Cuda compilation tools, release 12.4, V12.4.131.
nvcc compiles fine and nvidia-smi works fine.
I couldn't find any information about installing MOFED. Please help me: the failure to install nvidia-fs seems to be related to MOFED.
Also, if my CUDA program runs directly on the host machine, can nvidia-fs really speed it up?
Download the tgz MOFED package from: http://www.mellanox.com/page/software_overview_ib
Untar it on your node.
Run ./mlnxofedinstall --add-kernel-support --skip-repo
I did not reboot the laptop.
I still got an error when installing nvidia_fs:
sudo bash cuda_12.4.1_550.54.15_linux.run
Secure Boot not enabled on this system.
modprobe: ERROR: could not insert 'nvidia_fs': Unknown symbol in module, or unknown parameter (see dmesg)
Installation failed. See log at /var/log/cuda-installer.log for details.
cat /var/log/cuda-installer.log shows this:
[INFO]: Driver installation detected by command: apt list --installed | grep -e nvidia-driver-[0-9][0-9][0-9] -e nvidia-[0-9][0-9][0-9]
[INFO]: Cleaning up window
[INFO]: Complete
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.3)
[INFO]: Initializing menu
[INFO]: nvidia-fs.setKOVersion(2.19.7)
[INFO]: Setup complete
[INFO]: Installing: Kernel Objects
[INFO]: cudaItem::install - kernelobjects
[INFO]: Able to write to /usr/local/kernelobjects/
[INFO]: Installing: nvidia-fs
[INFO]: cudaItem::install - kernelobjects
[INFO]: dkms found
[INFO]: DKMS is installed,proceed with dkms install
[INFO]: previous version of nvidia-fs is not installed, nvidia-fs version: 2.19.7 will be installed.
[INFO]: getting mofed Status
[INFO]: Installing nvidia-fs version: 2.19.7.../usr/src/nvidia-fs-2.19.7/
[INFO]: Module nvidia-fs with version 2.19.7 will be installed.
[INFO]: install path is writable, proceeding with file copy
[INFO]: Module nvidia-fs could not be not installed
[ERROR]: Install of nvidia-fs failed, quitting
After a reboot, I tried again.
I ran sudo bash cuda_12.4.1_550.54.15_linux.run
and got these errors:
Error! Could not locate dkms.conf file.
File: /var/lib/dkms/nvidia-fs/2.19.7/source/dkms.conf does not exist.
Error! DKMS tree already contains: nvidia-fs-2.19.7
You cannot add the same module/version combo more than once.
modprobe: ERROR: could not insert 'nvidia_fs': Unknown symbol in module, or unknown parameter (see dmesg)
Installation failed. See log at /var/log/cuda-installer.log for details.
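The two DKMS errors above suggest that the failed attempt left a nvidia-fs/2.19.7 entry in the DKMS tree without a usable dkms.conf. Below is a minimal, read-only sketch for spotting such entries. It assumes the tree lives under /var/lib/dkms with a `<module>/<version>/source/dkms.conf` layout, as the error message implies. The function name `find_broken_dkms_entries` is hypothetical, and skipping symlinks at the version level is my own assumption about how per-kernel links appear there.

```python
from pathlib import Path


def find_broken_dkms_entries(tree_root="/var/lib/dkms"):
    """Return 'module/version' strings whose source/dkms.conf is missing."""
    broken = []
    root = Path(tree_root)
    if not root.is_dir():
        return broken  # no DKMS tree at this location
    for module_dir in sorted(root.iterdir()):
        if not module_dir.is_dir():
            continue  # skip plain files in the tree root
        for version_dir in sorted(module_dir.iterdir()):
            # Assumption: version entries are real directories; symlinks
            # (such as per-kernel links) are skipped.
            if version_dir.is_symlink() or not version_dir.is_dir():
                continue
            # exists() follows symlinks, so a dangling "source" link counts as missing.
            if not (version_dir / "source" / "dkms.conf").exists():
                broken.append(f"{module_dir.name}/{version_dir.name}")
    return broken


if __name__ == "__main__":
    for entry in find_broken_dkms_entries():
        print("missing dkms.conf:", entry)
```

This only reads the tree and changes nothing. Any entry it prints is a candidate for cleanup before running the installer again.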
Running dpkg -l | grep nvidia shows:
ii libnvidia-cfg1-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA binary OpenGL/GLX configuration library
ii libnvidia-common-535 535.183.01-0ubuntu0.20.04.1 all Shared files used by the NVIDIA libraries
rc libnvidia-compute-450:amd64 450.51.05-0ubuntu1 amd64 NVIDIA libcompute package
rc libnvidia-compute-450-server:amd64 450.248.02-0ubuntu0.20.04.1 amd64 NVIDIA libcompute package
rc libnvidia-compute-470:amd64 470.42.01-0ubuntu1 amd64 NVIDIA libcompute package
rc libnvidia-compute-525:amd64 525.125.06-0ubuntu0.20.04.3 amd64 NVIDIA libcompute package
ii libnvidia-compute-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA libcompute package
ii libnvidia-compute-535:i386 535.183.01-0ubuntu0.20.04.1 i386 NVIDIA libcompute package
rc libnvidia-compute-535-server:amd64 535.104.05-0ubuntu0.20.04.1 amd64 NVIDIA libcompute package
ii libnvidia-decode-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-decode-535:i386 535.183.01-0ubuntu0.20.04.1 i386 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 NVENC Video Encoding runtime library
ii libnvidia-encode-535:i386 535.183.01-0ubuntu0.20.04.1 i386 NVENC Video Encoding runtime library
ii libnvidia-extra-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 Extra libraries for the NVIDIA driver
ii libnvidia-fbc1-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-fbc1-535:i386 535.183.01-0ubuntu0.20.04.1 i386 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-gl-535:amd64 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii libnvidia-gl-535:i386 535.183.01-0ubuntu0.20.04.1 i386 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
rc nvidia-compute-utils-450-server 450.248.02-0ubuntu0.20.04.1 amd64 NVIDIA compute utilities
rc nvidia-compute-utils-470 470.42.01-0ubuntu1 amd64 NVIDIA compute utilities
rc nvidia-compute-utils-525 525.125.06-0ubuntu0.20.04.3 amd64 NVIDIA compute utilities
ii nvidia-compute-utils-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA compute utilities
rc nvidia-cuda-toolkit 10.1.243-3 amd64 NVIDIA CUDA development toolkit
rc nvidia-dkms-450 450.51.05-0ubuntu1 amd64 NVIDIA DKMS package
rc nvidia-dkms-450-server 450.248.02-0ubuntu0.20.04.1 amd64 NVIDIA DKMS package
rc nvidia-dkms-470 470.42.01-0ubuntu1 amd64 NVIDIA DKMS package
rc nvidia-dkms-525 525.125.06-0ubuntu0.20.04.3 amd64 NVIDIA DKMS package
ii nvidia-dkms-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA DKMS package
ii nvidia-driver-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA driver metapackage
ii nvidia-firmware-535-535.183.01 535.183.01-0ubuntu0.20.04.1 amd64 Firmware files used by the kernel module
rc nvidia-kernel-common-450 450.51.05-0ubuntu1 amd64 Shared files used with the kernel module
rc nvidia-kernel-common-450-server 450.248.02-0ubuntu0.20.04.1 amd64 Shared files used with the kernel module
rc nvidia-kernel-common-470 470.42.01-0ubuntu1 amd64 Shared files used with the kernel module
rc nvidia-kernel-common-525 525.125.06-0ubuntu0.20.04.3 amd64 Shared files used with the kernel module
ii nvidia-kernel-common-535 535.183.01-0ubuntu0.20.04.1 amd64 Shared files used with the kernel module
ii nvidia-kernel-source-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA kernel source package
ii nvidia-prime 0.8.16~0.20.04.2 all Tools to enable NVIDIA's Prime
ii nvidia-settings 470.57.01-0ubuntu0.20.04.3 amd64 Tool for configuring the NVIDIA graphics driver
ii nvidia-utils-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA driver support binaries
ii screen-resolution-extra 0.18build1 all Extension for the nvidia-settings control panel
ii xserver-xorg-video-nvidia-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA binary Xorg driver
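In this listing the first column is the dpkg status: ii means the package is installed, and rc means it was removed but its configuration files are still present. To make the long output easier to read, here is a small sketch that groups package names by that status. It assumes whitespace-separated rows like the ones above, and the function name `group_by_status` is my own.

```python
from collections import defaultdict


def group_by_status(dpkg_lines):
    """Group package names by the dpkg status code in the first column."""
    groups = defaultdict(list)
    for line in dpkg_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # not a package row (blank line or header)
        status, name = fields[0], fields[1]
        groups[status].append(name)
    return dict(groups)


# A few rows copied from the listing above.
sample = [
    "ii nvidia-dkms-535 535.183.01-0ubuntu0.20.04.1 amd64 NVIDIA DKMS package",
    "rc nvidia-dkms-525 525.125.06-0ubuntu0.20.04.3 amd64 NVIDIA DKMS package",
    "rc nvidia-cuda-toolkit 10.1.243-3 amd64 NVIDIA CUDA development toolkit",
]

groups = group_by_status(sample)
print("installed:", groups.get("ii", []))
print("removed, config files left:", groups.get("rc", []))
```

Feeding it the full output (for example, saved to a file and read with `splitlines()`) separates the currently installed packages from leftovers of older driver versions.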