NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Hi,

I am running the Deep Learning AMI (Ubuntu 16.04) Version 24.3 (ami-0efc80b44966d8854) on an AWS server, and I am unable to use the GPU with spaCy for model training. When I start spaCy training, I get the error “CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected”.

Please suggest how to fix this. The output of the relevant diagnostic commands is below.

Thanks in advance!

nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

dkms status
bbswitch, 0.8, 4.4.0-1095-aws, x86_64: installed
nvidia-384, 384.130, 4.4.0-1095-aws, x86_64: installed

uname -a
Linux ip-49-36-195-85 4.4.0-1095-aws #106-Ubuntu SMP Wed Sep 18 13:33:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

dpkg -l | grep nvidia
ii libnvidia-container-tools 1.0.5-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.0.5-1 amd64 NVIDIA container runtime library
ii nvidia-384 384.130-0ubuntu0.16.04.2 amd64 NVIDIA binary driver - version 384.130
ii nvidia-container-runtime 3.1.4-1 amd64 NVIDIA container runtime
rc nvidia-container-runtime-hook 1.4.0-1 amd64 NVIDIA container runtime hook
ii nvidia-container-toolkit 1.0.5-1 amd64 NVIDIA container runtime hook
ii nvidia-cuda-dev 7.5.18-0ubuntu1 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 7.5.18-0ubuntu1 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 7.5.18-0ubuntu1 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 7.5.18-0ubuntu1 amd64 NVIDIA CUDA development toolkit
ii nvidia-docker2 2.2.2-1 all nvidia-docker CLI wrapper
ii nvidia-opencl-dev:amd64 7.5.18-0ubuntu1 amd64 NVIDIA OpenCL development files
ii nvidia-opencl-icd-384 384.130-0ubuntu0.16.04.2 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA’s Prime
ii nvidia-profiler 7.5.18-0ubuntu1 amd64 NVIDIA Profiler for CUDA and OpenCL
ii nvidia-settings 361.42-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
ii nvidia-visual-profiler 7.5.18-0ubuntu1 amd64 NVIDIA Visual Profiler for CUDA and OpenCL

inxi -G
Graphics: Card: Amazon.com Device 1111
Display Server: N/A drivers: vesa (unloaded: fbdev) tty size: 166x41 Advanced Data: N/A out of X

lspci | egrep 'VGA|3D'
00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111

whereis nvidia
nvidia: /usr/share/nvidia /usr/src/nvidia-384-384.130/nvidia

modinfo nvidia_384
filename: /lib/modules/4.4.0-1095-aws/updates/dkms/nvidia_384.ko
alias: char-major-195-*
version: 384.130
supported: external
license: NVIDIA
srcversion: 586B4E0CF4083A6BC41A33B
alias: pci:v000010DEd00000E00sv*sd*bc04sc80i00*
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends:
retpoline: Y
vermagic: 4.4.0-1095-aws SMP mod_unload modversions
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_UpdateMemoryTypes:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_CheckPCIConfigSpace:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_UseThreadedInterrupts:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_AssignGpus:charp

Hello,

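As the device listing shows, no NVIDIA GPU is allocated to your AWS instance: lspci reports only the Amazon VGA controller and no NVIDIA device, so the installed driver has nothing to bind to. Please launch your AWS VM on a GPU instance type.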

lspci | egrep 'VGA|3D'
00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111
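
The check above can be scripted so it runs before any driver debugging. This is a minimal sketch, not an official tool: has_nvidia_gpu is a hypothetical helper name, and it assumes that an NVIDIA device appears in lspci output either by name or by NVIDIA's PCI vendor ID 10de (the same ID seen in the modinfo aliases above).

```shell
#!/bin/sh
# has_nvidia_gpu: hypothetical helper, not part of any NVIDIA or AWS tool.
# Reads lspci output on stdin and succeeds if an NVIDIA device is listed.
has_nvidia_gpu() {
    # Match the vendor name, or the [10de:xxxx] ID printed by `lspci -nn`.
    grep -qiE 'nvidia|\[10de:'
}

# Only run the live check where lspci is available.
if command -v lspci >/dev/null 2>&1; then
    if lspci -nn | has_nvidia_gpu; then
        echo "NVIDIA GPU present: continue with driver troubleshooting"
    else
        echo "No NVIDIA GPU on this instance: switch to a GPU instance type"
    fi
fi
```

If this reports no GPU, reinstalling the driver will not help; the instance itself has to be changed. On EC2 the instance type can usually also be read from the instance metadata service, for example with curl http://169.254.169.254/latest/meta-data/instance-type, assuming that endpoint is reachable without a session token on your instance.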