"torch.cuda" is not available inside the docker container which is running on Jetson Orin

I tried with FROM dustynv/pytorch:1.13-r35.3.1, using torchvision==0.13.1 in my requirements file. I am facing the same runtime error:

File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data/models/torch/hub/ultralytics_yolov5_master/models/common.py", line 721, in forward
y = non_max_suppression(y if self.dmb else y[0],
File "/data/models/torch/hub/ultralytics_yolov5_master/utils/general.py", line 959, in non_max_suppression
i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
File "/usr/local/lib/python3.8/dist-packages/torchvision/ops/boxes.py", line 40, in nms
_assert_has_ops()
File "/usr/local/lib/python3.8/dist-packages/torchvision/extension.py", line 33, in _assert_has_ops
raise RuntimeError(
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.

Did you just try installing torchvision with pip? That will install the torchvision wheel from PyPI, which doesn't have CUDA enabled (and also uninstalls your previous PyTorch wheel, as you found).

Instead, try building the torchvision wheel from source with CUDA enabled, like here:

Not able to clone torchvision:

Step 7/11 : RUN git clone --branch ${TORCHVISION_VERSION} --recursive --depth=1 https://github.com/pytorch/vision torchvision && cd torchvision && git checkout ${TORCHVISION_VERSION} && python3 setup.py bdist_wheel && cp dist/torchvision*.whl /opt && pip3 install --no-cache-dir --verbose /opt/torchvision*.whl && cd ../ && rm -rf torchvision

---> Running in 3e61886a6a3e
Cloning into 'torchvision'...
warning: Could not find remote branch 0.15 to clone.
fatal: Remote branch 0.15 not found in upstream origin

@rakesh.thykkoottathil.jay look at the tags available on https://github.com/pytorch/vision repo - substitute v0.15.1 for ${TORCHVISION_VERSION}
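
For example (a sketch with the tag substituted; pick the tag that matches your PyTorch build):

# Sketch: clone a tagged release instead of the non-existent "0.15" branch
git clone --branch v0.15.1 --recursive --depth=1 https://github.com/pytorch/vision torchvision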

@dusty_nv I installed torchvision 0.15 with CUDA as you mentioned above, but I am still facing runtime issues:

File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/models/torch/hub/ultralytics_yolov5_master/models/common.py", line 721, in forward
y = non_max_suppression(y if self.dmb else y[0],
File "/data/models/torch/hub/ultralytics_yolov5_master/utils/general.py", line 959, in non_max_suppression
i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
File "/usr/local/lib/python3.8/dist-packages/torchvision/ops/boxes.py", line 41, in nms
return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
File "/usr/local/lib/python3.8/dist-packages/torch/_ops.py", line 502, in __call__
return self._op(*args, **kwargs or {})
NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].

CPU: registered at /app/torchvision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 [kernel]
QuantizedCPU: registered at /app/torchvision/torchvision/csrc/ops/quantized/cpu/qnms_kernel.cpp:124 [kernel]
BackendSelect: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at /opt/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:144 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /opt/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:491 [backend fallback]
Functionalize: registered at /opt/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:280 [backend fallback]
Named: registered at /opt/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at /opt/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at /opt/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:19 [backend fallback]
ZeroTensor: registered at /opt/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:63 [backend fallback]
AutogradOther: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:30 [backend fallback]
AutogradCPU: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:34 [backend fallback]
AutogradCUDA: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:42 [backend fallback]
AutogradXLA: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:46 [backend fallback]
AutogradMPS: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:54 [backend fallback]
AutogradXPU: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:38 [backend fallback]
AutogradHPU: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:67 [backend fallback]
AutogradLazy: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:50 [backend fallback]
AutogradMeta: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:58 [backend fallback]
Tracer: registered at /opt/pytorch/pytorch/torch/csrc/autograd/TraceTypeManual.cpp:294 [backend fallback]
AutocastCPU: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:487 [backend fallback]
AutocastCUDA: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:354 [backend fallback]
FuncTorchBatched: registered at /opt/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:815 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /opt/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1073 [backend fallback]
VmapMode: fallthrough registered at /opt/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /opt/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:210 [backend fallback]
PythonTLSSnapshot: registered at /opt/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:152 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /opt/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:487 [backend fallback]
PythonDispatcher: registered at /opt/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:148 [backend fallback]

@dusty_nv

I used TORCHVISION_VERSION="v0.15.1" for cloning the git repo.
The above build has these torch and torchvision versions:
Torch version: 2.0.0+nv23.05
Torchvision version: 0.15.1a0+42759b1

Hmm, it would appear it was not built with CUDA. You should see messages about that during the build if it's enabled.

Could you please guide me on this? I don't know how to proceed. The torchvision version shows as 0.15.1a0+42759b1.
Does this mean torchvision is CUDA enabled?

I used 0.15.1 as you mentioned.

Build messages from the console start with this:

Building wheel torchvision-0.15.1a0+42759b1
Compiling extensions with following flags:
FORCE_CUDA: False
DEBUG: False
TORCHVISION_USE_PNG: True
TORCHVISION_USE_JPEG: True
TORCHVISION_USE_NVJPEG: True
TORCHVISION_USE_FFMPEG: True
TORCHVISION_USE_VIDEO_CODEC: True
NVCC_FLAGS:

One more build message that I found:

setup.py:10: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
from pkg_resources import DistributionNotFound, get_distribution, parse_version
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
Emitting ninja build file /app/torchvision/build/temp.linux-aarch64-cpython-38/build.ninja...
Compiling objects...

Sorry, I'm not exactly sure why this happens. Do you see nvcc actually being invoked to compile torchvision's .cu files during the build? You can also try newer/older tags of torchvision.
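
One quick runtime check (a sketch, assuming the container is started with GPU access) is to call NMS on CUDA tensors directly; if the wheel was built CPU-only it fails with the same NotImplementedError:

# Sketch: verify torchvision's compiled CUDA ops from inside the container
import torch
import torchvision

print(torch.__version__, torchvision.__version__)
print(torch.cuda.is_available())

boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]], device="cuda")
scores = torch.tensor([0.9, 0.8], device="cuda")
print(torchvision.ops.nms(boxes, scores, 0.5))  # raises if CUDA ops are missing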

@dusty_nv I think I am missing the CUDA toolkit in the container. There is no message showing nvcc being invoked for the compilation.

@rakesh.thykkoottathil.jay do you see it in the container under /usr/local/cuda?

Also, try setting your default docker runtime to nvidia:
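
On Jetson this is usually done in /etc/docker/daemon.json (a sketch of the common config; restart the docker daemon afterwards):

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}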

Unless you explicitly set TORCH_CUDA_ARCH_LIST, it may try to use the CUDA runtime during the build to detect your CUDA SM version, and setting your default docker runtime to nvidia will enable CUDA to be run during builds (i.e. actually using the GPU, as opposed to just compiling with nvcc).
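
Alternatively, a sketch of pinning the architecture in the Dockerfile, which avoids the build-time GPU probe entirely (8.7 is Orin's compute capability; adjust for other devices):

# Sketch: target Orin's SM version explicitly instead of probing a GPU at build time
ENV TORCH_CUDA_ARCH_LIST="8.7"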

@dusty_nv I cannot set the docker runtime to nvidia. I am actually building the image with the Azure Container Registry build command:

az acr build --registry registry_name --file Dockerfile --platform linux/arm64/v8 --image imagename:tag .

I set the runtime to nvidia (in the config) when I push the container to the Azure IoT Edge runtime (on the Jetson) through Azure IoT Hub.

During this build I don't have the flexibility to set that.

Under /usr/local:
usr/local# ls
bin cuda cuda-11 cuda-11.4 etc games include lib man sbin share src
root:/usr/local# cd cuda-11.4
root:/usr/local/cuda-11.4# ls
DOCS EULA.txt README bin compute-sanitizer extras include lib64 nvml nvvm samples share targets tools version.json
root@:/usr/local/cuda-11.4# cd bin/
root@:/usr/local/cuda-11.4/bin# cd nvcc
bash: cd: nvcc: Not a directory
root@:/usr/local/cuda-11.4/bin# ls
bin2c compute-sanitizer crt cu++filt cuda-gdb cuda-gdbserver cuda-install-samples-11.4.sh cudafe++ cuobjdump fatbinary nvcc nvcc.profile nvdisasm nvlink nvprune ptxas
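
Since nvcc is present, a quick sanity check (sketch) is:

/usr/local/cuda/bin/nvcc --version   # should print the CUDA 11.4 toolkit version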

Oh, some packages actually need GPU access during the build. I would try it first on actual Jetson hardware to get the container working.

I used the ENV variable FORCE_CUDA=1 to build it with nvcc.
Now I can run inference.
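
For reference, a sketch of the Dockerfile lines for this working build (the tag v0.15.1 is from earlier in the thread; TORCH_CUDA_ARCH_LIST=8.7 is an assumption matching Orin's SM version, added so the build doesn't need to probe a GPU):

# Sketch: build torchvision with CUDA even when no GPU is visible at build time
ENV FORCE_CUDA=1
ENV TORCH_CUDA_ARCH_LIST="8.7"
RUN git clone --branch v0.15.1 --recursive --depth=1 https://github.com/pytorch/vision torchvision && \
    cd torchvision && python3 setup.py bdist_wheel && \
    pip3 install --no-cache-dir dist/torchvision*.whl && \
    cd ../ && rm -rf torchvision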
