Disable mount plugins

It took me two days to figure out why the filesystem of every Docker container I start on the Jetson is messed up: libnvidia-container/mount_plugins.md at jetson · NVIDIA/libnvidia-container · GitHub

How can I disable this? Preferably via an argument I can add to docker run --gpus all.
Worst case, I would have to delete the *.csv files in /etc/nvidia-container-runtime/host-files-for-container.d/ on each Jetson in our CI…

EDIT: apparently those files are provided by nvidia-container-csv-cuda, and without them, CUDA does not work inside the container
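
For reference, this is roughly what inspecting (and, worst case, disabling) those mount plugins looks like on a Jetson host. The cuda.csv file name is just an example; the actual set depends on the installed nvidia-container-csv-* packages, and moving the CSVs aside also removes CUDA/cuDNN from the container, as noted in the EDIT above:

# list the CSV mount plugins the NVIDIA container runtime injects into containers
ls /etc/nvidia-container-runtime/host-files-for-container.d/
# check which Debian package provides a given CSV (file name is an example)
dpkg -S /etc/nvidia-container-runtime/host-files-for-container.d/cuda.csv
# worst-case workaround: move the CSVs aside instead of deleting them
sudo mkdir -p /etc/nvidia-container-runtime/host-files-for-container.d.disabled
sudo mv /etc/nvidia-container-runtime/host-files-for-container.d/*.csv \
        /etc/nvidia-container-runtime/host-files-for-container.d.disabled/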

Hi,

Could you explain your question in more detail?
Do you want to enable GPU access without using the nvidia runtime?

If yes, please check the following topic:

Thanks.

So our situation is this:
We are in the process of migrating from CUDA 10.0 and libtorch 1.4.0 + cuDNN 7 to CUDA 10.2 and TensorRT 7 + cuDNN 8.

Our CI has 6 Jetson AGX units, and we want to use a dockerized environment. With JetPack 4.4, due to the “design decision” to mount files from the host OS into the container to save space, we get an unwanted dependency on the host OS. Ideally, the content of the Docker image is defined by the Dockerfile ONLY, and no other files get pushed into the image.
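
To illustrate what that mounting mechanism looks like: each *.csv under /etc/nvidia-container-runtime/host-files-for-container.d/ lists host paths that the runtime bind-mounts into every container it starts. The entries below are only illustrative of what a cudnn.csv roughly contains; the real files differ per JetPack release:

lib, /usr/lib/aarch64-linux-gnu/libcudnn.so.8.0.0
sym, /usr/lib/aarch64-linux-gnu/libcudnn.so.8
sym, /usr/lib/aarch64-linux-gnu/libcudnn.so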

When building and running our current code (CUDA 10.0 and libtorch 1.4.0 + cudnn7) in a nvcr.io/nvidia/l4t-base:r32.2.1 base image on the AGX, we hit the following torch bug: Linux libtorch c++ cmake not compatible with newer versions of cuDNN (for Caffe2) · Issue #40965 · pytorch/pytorch · GitHub

This is because, in addition to the cuDNN 7 installed by the Dockerfile (see below), we also get cuDNN 8 files from the host OS :-(

Dockerfile

# https://ngc.nvidia.com/catalog/containers/nvidia:l4t-base
FROM nvcr.io/nvidia/l4t-base:r32.2.1
...
# PyTorch 1.4.0 wheel for Jetson (aarch64)
RUN wget https://nvidia.box.com/shared/static/ncgzus5o23uck9i5oth2n8n06k340l6k.whl -O torch-1.4.0-cp36-cp36m-linux_aarch64.whl
RUN pip3 install setuptools==42.0.2 wheel==0.34.2
RUN pip3 install numpy==1.17.4
RUN pip3 install torch-1.4.0-cp36-cp36m-linux_aarch64.whl

# Add the NVIDIA Jetson apt repository and install CUDA 10.0 + cuDNN 7 inside the image
RUN apt-key adv --fetch-key https://repo.download.nvidia.com/jetson/jetson-ota-public.asc
RUN echo "deb https://repo.download.nvidia.com/jetson/common r32 main" >> /etc/apt/sources.list
RUN apt-get update && apt-get install -y --no-install-recommends \
    cuda-toolkit-10-0 \
    libcudnn7-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
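
To make the clash from the torch issue above visible, one can build this image and list which cuDNN copies end up inside a running container. The image tag below is just a placeholder; on a JetPack 4.4 host I would expect to see both the libcudnn.so.7 installed by the Dockerfile and the libcudnn.so.8 mounted in from the host:

# build the image from the Dockerfile above (tag is a placeholder)
docker build -t l4t-ci-test .
# run it with the NVIDIA mounts active and list the cuDNN libraries
docker run --rm --gpus all l4t-ci-test \
    sh -c "ls -l /usr/lib/aarch64-linux-gnu/libcudnn*"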

Hi,

Not sure if I understand your question correctly.

Our L4T Docker setup will mount libraries (e.g. CUDA, cuDNN, …) from the host, so you will have the same CUDA version in the container as on the host.

It is possible to separate the CUDA version from the host.
However, the CUDA toolkit has some dependencies on the GPU driver.
This may prevent you from running CUDA libraries other than the one originally supported on the Jetson.

Thanks.

Yes, you understood correctly: I want to decouple from the host as much as possible, to be able to run different cuDNN and CUDA versions INSIDE the container.

Right now we additionally install cuDNN 7 inside the Docker image, and so far it seems to work.
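
As a quick sanity check (a sketch; the binary path and image tag are placeholders for our application), one can verify that the binary resolves the libcudnn.so.7 installed in the image rather than the libcudnn.so.8 mounted from the host:

# inspect which libcudnn the application resolves at runtime (paths are placeholders)
docker run --rm --gpus all l4t-ci-test \
    sh -c "ldd /opt/app/bin/our_app | grep libcudnn"
# expected (illustrative): libcudnn.so.7 => /usr/lib/aarch64-linux-gnu/libcudnn.so.7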