I am using an AWS instance (p2.xlarge) for my GPU experiments. I downloaded the CUDA 9.0 installer with ‘wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run’ and made it executable with ‘chmod +x cuda_9.0.176_384.81_linux-run’.
I was able to install the driver successfully and work with it. However, after I stopped the AWS instance and started it again the next day, ./deviceQuery gives the following error:
[b]
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
[/b]
As a result, I am not able to use the GPU.
Here is the output of ‘nvcc --version’:
‘nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176’
Here is the output of ‘nvidia-smi’:
‘NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.’
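In case it helps narrow things down, here is a minimal sketch of a check for whether the NVIDIA kernel module is loaded after the restart. It assumes a Linux instance where /proc/modules is readable. The function name nvidia_module_loaded is hypothetical and not part of any NVIDIA tool. A missing module is only one possible cause of the message above, not a confirmed diagnosis.

```shell
#!/bin/sh
# Sketch: check whether the main "nvidia" kernel module is listed as loaded.
# Assumption: the listing uses the /proc/modules format, where each line
# starts with the module name followed by a space.

# $1: path to a modules listing (defaults to /proc/modules).
# Returns 0 if a line for the "nvidia" module is present, non-zero otherwise.
nvidia_module_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^nvidia ' "$modules_file" 2>/dev/null
}

if nvidia_module_loaded; then
    echo "nvidia kernel module is loaded"
else
    # uname -r prints the running kernel release, which is useful to compare
    # against the kernel the driver was originally installed under.
    echo "nvidia kernel module is NOT loaded (running kernel: $(uname -r))"
fi
```

If the module is reported as not loaded, the running kernel release printed by the script can be compared with the one that was in use when the driver was installed.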
Please help me solve this issue.