Why doesn't the Docker container see CUDA?

Hi, Team!
We have a Jetson Nano 4GB
Ubuntu 18.04
JetPack 4.6.0
OpenCV 4.5.3

When we run the following command, CUDA works great:

	docker run -it --rm --runtime nvidia --network host -v $(pwd):/app/src -v /usr/local/cuda-10.2/:/usr/local/cuda-10.2/:ro -v /usr/lib/aarch64-linux-gnu/:/usr/lib/aarch64-linux-gnu -v /usr/include/aarch64-linux-gnu:/usr/include/aarch64-linux-gnu jetson-build bash

But without manually mounting the CUDA paths in this command, it doesn't work. We get this error:

>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 196, in <module>
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 149, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
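For reference, torch dlopen()s libcurand.so.10 at import time, so this error means the library simply isn't visible inside the container. A quick, hedged sanity check you can run inside the container (ldconfig is standard on L4T images; nothing here is specific to this setup):

```shell
# torch dlopen()s libcurand.so.10 when it is imported; ask the dynamic
# linker directly whether that library is visible in this environment:
if ldconfig -p 2>/dev/null | grep -q libcurand; then
    echo "libcurand visible - the runtime mounted the CUDA libraries"
else
    echo "libcurand NOT visible - the runtime did not mount CUDA"
fi
```

If this prints "NOT visible" inside a `--runtime nvidia` container, the problem is in the container runtime's mounting, not in PyTorch.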

When we run the following command, the cuDNN header files come up empty, so CUDA isn't being mounted into the Docker container:

	docker run -it --rm --runtime nvidia --gpus all nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3 bash

root@ec3f7b8ddc55:/usr/include/aarch64-linux-gnu# ls
NvCaffeParser.h       NvInferRuntime.h        NvUtils.h         cblas.h               cudnn_ops_infer_v8.h  fpu_control.h      sys
NvInfer.h             NvInferRuntimeCommon.h  a.out.h           cudnn_adv_infer_v8.h  cudnn_ops_train_v8.h  gnu
NvInferImpl.h         NvInferVersion.h        asm               cudnn_adv_train_v8.h  cudnn_v8.h            ieee754.h
NvInferLegacyDims.h   NvOnnxConfig.h          bits              cudnn_backend_v8.h    cudnn_version_v8.h    jconfig.h
NvInferPlugin.h       NvOnnxParser.h          c++               cudnn_cnn_infer_v8.h  expat_config.h        openblas_config.h
NvInferPluginUtils.h  NvUffParser.h           cblas-openblas.h  cudnn_cnn_train_v8.h  f77blas.h             python3.6m
root@ec3f7b8ddc55:/usr/include/aarch64-linux-gnu# cat cudnn_v8.h
root@ec3f7b8ddc55:/usr/include/aarch64-linux-gnu# cat cudnn_version_v8.h
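Some background that may help with debugging: on JetPack 4.x the NVIDIA container runtime decides what to bind-mount into the container from CSV lists under /etc/nvidia-container-runtime/host-files-for-container.d/ on the host (cuda.csv, cudnn.csv, l4t.csv, ...). If those lists are missing or point at files that no longer exist, you get exactly this picture: the paths appear in the container but are empty. A minimal sketch of the CSV line format (the sample line is illustrative, in the cuda.csv style; inspect the real files on your host):

```shell
# Each CSV line is "<type>, <host path>"; the runtime bind-mounts
# <host path> into the container. Parse a sample line (illustrative):
csv_line="lib, /usr/local/cuda-10.2/targets/aarch64-linux/lib/libcurand.so.10"
host_path="${csv_line#*, }"
echo "would mount: ${host_path}"
# On the host, inspect the real mount lists with e.g.:
#   ls /etc/nvidia-container-runtime/host-files-for-container.d/
#   grep curand /etc/nvidia-container-runtime/host-files-for-container.d/cuda.csv
```

If `cuda.csv` on your host lists paths that don't exist, that would explain why the mounts come up empty.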

So, where should we find JetPack on the Ubuntu install on our Jetson Nano 4GB? (/etc/fstab - I don't have that file, and /media/nvidia/NVME - we don't have that directory either.)
Or do you have any other ideas on how to fix this issue?
Thanks.

@s.pinchuk as per your last post:

something with your nvidia-container-runtime had gotten misconfigured/broken at some point and is no longer mounting CUDA/cuDNN/etc. into the container correctly. The most reliable recommendation is to reflash the device with JetPack and check that --runtime nvidia works properly for you from the beginning.

I have already reinstalled the system, and CUDA is present on the host machine, but it still doesn't get mounted into the Docker container.
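One thing worth checking after a reinstall is that the nvidia runtime is actually registered with Docker. On JetPack this lives in /etc/docker/daemon.json. A sketch using a sample config written to a temp file (the sample mirrors the stock JetPack entry; compare it against your real /etc/docker/daemon.json rather than overwriting it):

```shell
# Write a sample of the stock JetPack /etc/docker/daemon.json to a temp
# file (illustrative copy - inspect the real file on your device):
cat > /tmp/daemon.sample.json <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
# Verify the nvidia runtime entry is present in the config:
python3 -c "import json; cfg = json.load(open('/tmp/daemon.sample.json')); print('nvidia runtime registered:', 'nvidia' in cfg.get('runtimes', {}))"
```

If your real daemon.json is missing the `nvidia` entry, `--runtime nvidia` falls back to failing or to the default runtime, and nothing gets mounted.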

Can I reinstall or upgrade JetPack from terminal?

@s.pinchuk please try this sequence, starting after flashing the SD card with a fresh install. The SD card image already comes with CUDA Toolkit and the NVIDIA Container Runtime installed. Try it with the latest JetPack 4.6.4 instead.

  1. Flash the SD card image.
  2. Test the CUDA samples outside of the container:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
  3. Test the CUDA samples inside the l4t-base container (note the r32.7.1 tag is for JetPack >= 4.6.1):
sudo docker run --runtime nvidia -it --rm nvcr.io/nvidia/l4t-base:r32.7.1
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
./deviceQuery

If step 3 doesn’t work, reflash again to get your system environment in a known-good state.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.