I pulled the latest PyTorch image, nvcr.io/nvidia/l4t-pytorch:r32.5.0-pth1.7-py3, and started it with --runtime nvidia.
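For reference, the run command was roughly as follows (the image tag and --runtime nvidia are as above; the remaining flags are just a typical invocation, not necessarily the exact command used):

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-pytorch:r32.5.0-pth1.7-py3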
I am trying to build a library that uses CUDA (iou3d_nms: CenterPoint/det3d/ops/iou3d_nms/src at master · tianweiy/CenterPoint · GitHub),
but it says that no GPU is found and no CUDA runtime is found:
RuntimeError: No CUDA GPUs are available
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
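For context, this is the standard PyTorch C++/CUDA extension build, roughly like this (the directory is the one linked above; the exact command may differ from the repo's install instructions):

cd CenterPoint/det3d/ops/iou3d_nms
python3 setup.py build_ext --inplace

The "No CUDA runtime is found" line comes from torch.utils.cpp_extension when PyTorch cannot see a GPU during the build.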
Hi @yosha.morheg, do you get that error when you are compiling the library, or when you are trying to run it?
If you import torch and run torch.cuda.is_available() inside the container, does it detect the GPU?
On your Xavier, do you see the CUDA toolkit installed under /usr/local/cuda?
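For example, inside the container something like this is a quick check (standard PyTorch calls, nothing specific to this image):

python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"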
I got this error while trying to compile it.
torch.cuda.is_available() returns False.
Yes, the CUDA toolkit is installed under /usr/local/cuda.
Can you run this outside of the container?
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
Then inside the container, run this:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
./deviceQuery
Does it report the GPU successfully? Does torch.cuda.is_available() return True after you have run deviceQuery in the container?
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL
It fails to detect the GPU inside the container.
The user settings in the Dockerfile are causing this problem.
Any suggestions?
Without adding a user, everything works well!
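For anyone else who hits this, a rough sketch of the kind of Dockerfile change involved (builduser is only a placeholder; the video-group line is an assumed fix, since on Jetson the GPU device nodes are typically only accessible to root and the video group):

FROM nvcr.io/nvidia/l4t-pytorch:r32.5.0-pth1.7-py3
# adding and switching to a non-root user is what triggered the problem
RUN useradd -m builduser
# assumed fix: on Jetson the GPU device nodes are usually group "video", so a
# non-root user typically needs to be in that group to see the GPU
RUN usermod -aG video builduser
USER builduser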
Thank you for your response!
That was the problem! I fixed it!