Thank you for the quick reply and for pointing out this test script, @dusty_nv. It seems the container is having trouble finding the CUDA driver on my system. Here is the output of running your test script on my Jetson (R36.2) after pulling your container with docker pull dustynv/l4t-pytorch:r36.2.0, starting it with docker run --name test -it dustynv/l4t-pytorch:r36.2.0, and typing python3:
print('testing PyTorch...')
testing PyTorch...
import torch
print('PyTorch version: ' + str(torch.__version__))
PyTorch version: 2.2.0
print('CUDA available: ' + str(torch.cuda.is_available()))
CUDA available: False
print('cuDNN version: ' + str(torch.backends.cudnn.version()))
cuDNN version: 8904
print(torch.__config__.show())
PyTorch built with:
- GCC 11.4
- C++ Version: 201703
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: NO AVX
- Build settings: BLAS_INFO=open, BUILD_TYPE=Release, CUDA_VERSION=12.2, CUDNN_VERSION=8.9.4, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=open, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=0, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,
# fail if CUDA isn't available
assert(torch.cuda.is_available())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError
# check that version can be parsed
from packaging import version
from os import environ
print('PACKAGING_VERSION=' + str(version.parse(torch.__version__)))
PACKAGING_VERSION=2.2.0
print('TORCH_CUDA_ARCH_LIST=' + environ.get('TORCH_CUDA_ARCH_LIST', 'None') + '\n')
TORCH_CUDA_ARCH_LIST=8.7
# quick cuda tensor test
a = torch.cuda.FloatTensor(2).zero_()
<stdin>:1: UserWarning: The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at /opt/pytorch/torch/csrc/tensor/python_tensor.cpp:83.)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
print('Tensor a = ' + str(a))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
b = torch.randn(2).cuda()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
print('Tensor b = ' + str(b))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'b' is not defined
c = a + b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
print('Tensor c = ' + str(c))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'c' is not defined
# LAPACK test
print('testing LAPACK (OpenBLAS)...')
testing LAPACK (OpenBLAS)...
a = torch.randn(2, 3, 1, 4, 4)
b = torch.randn(2, 3, 1, 4, 4)
x, lu = torch.linalg.solve(b, a)
print('done testing LAPACK (OpenBLAS)')
done testing LAPACK (OpenBLAS)
# torch.nn test
print('testing torch.nn (cuDNN)...')
testing torch.nn (cuDNN)...
import torch.nn
model = torch.nn.Conv2d(3,3,3)
data = torch.zeros(1,3,10,10)
model = model.cuda()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 911, in cuda
return self._apply(lambda t: t.cuda(device))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 825, in _apply
param_applied = fn(param)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 911, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
data = data.cuda()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 302, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
out = model(data)
#print(out)
print('done testing torch.nn (cuDNN)')
done testing torch.nn (cuDNN)
# CPU test (https://github.com/pytorch/pytorch/issues/47098)
print('testing CPU tensor vector operations...')
testing CPU tensor vector operations...
import torch.nn.functional as F
cpu_x = torch.tensor([12.345])
cpu_y = F.softmax(cpu_x)
<stdin>:1: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
print('Tensor cpu_x = ' + str(cpu_x))
Tensor cpu_x = tensor([12.3450])
print('Tensor softmax = ' + str(cpu_y))
Tensor softmax = tensor([1.])
if cpu_y != 1.0:
... raise ValueError('PyTorch CPU tensor vector test failed (softmax)\n')
...
# https://github.com/pytorch/pytorch/issues/61110
t_32 = torch.ones((3,3), dtype=torch.float32).exp()
t_64 = torch.ones((3,3), dtype=torch.float64).exp()
diff = (t_32 - t_64).abs().sum().item()
print('Tensor exp (float32) = ' + str(t_32))
Tensor exp (float32) = tensor([[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183]])
print('Tensor exp (float64) = ' + str(t_64))
Tensor exp (float64) = tensor([[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183],
[2.7183, 2.7183, 2.7183]], dtype=torch.float64)
print('Tensor exp (diff) = ' + str(diff))
Tensor exp (diff) = 7.429356050359104e-07
if diff > 0.1:
... raise ValueError(f'PyTorch CPU tensor vector test failed (exp, diff={diff})')
...
... print('PyTorch OK\n')
Outside the container, when I run nvcc --version on my AGX Orin, I get the following:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:08:11_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
My Jetson system version is: nvidia-l4t-core 36.2.0-20231218214829
I have previously been able to install torch2.2+cuda on this system using the wheel torch-2.2.0a0+81ea7a4.nv24.01-cp310-cp310-linux_aarch64.whl, so I believe my system has a valid CUDA driver. Am I missing some configuration step needed to use your container with torch2.2+cuda?
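In case it helps narrow this down, here is a minimal sketch of a check I could run both on the host and inside the container to see whether the driver library is visible at all. It assumes the CUDA driver library is named libcuda.so.1 (my assumption, it is not part of the test script), and the helper name driver_library_visible is made up for illustration. It only checks whether the dynamic loader can open the library, not whether the GPU is actually usable.

```python
import ctypes


def driver_library_visible(name: str = "libcuda.so.1") -> bool:
    """Return True if the dynamic loader can open the named shared library.

    The default name is an assumption about what the CUDA driver library
    is called on this system; adjust it if your setup differs.
    """
    try:
        ctypes.CDLL(name)
    except OSError:
        # The loader could not find or open the library.
        return False
    return True


if __name__ == "__main__":
    print("libcuda.so.1 loadable:", driver_library_visible())
```

If this reported False inside the container but True on the host, that would suggest the driver libraries are not being exposed to the container, rather than being missing from the system.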
Thank you for all your help with this issue!