Jetson Xavier NX (eMMC), CUDA, and Docker

Hi NVIDIA team,

My company and I are using the Xavier NX module on our robot platform, but we are having trouble loading our Docker image onto the module so that we can run our environment. Much of this comes down to the 16GB of storage space, but we might be able to overcome that with a bit more understanding of the system. In the FAQ I saw a link to Jetson/L4T/Boot From External Device, but I am still unsure about the best way to think about the interaction between the module, CUDA, and Docker.

My first attempt was to treat the Xavier NX module (eMMC) just like the Developer Kit: install CUDA and all SDK components onto it, then load our Docker image (which also contained CUDA). After flashing everything onto the eMMC, something like 6GB of storage remained, and our image was less than 5GB, so I figured this would be enough. However, it seems that loading a Docker image requires roughly double the space the image itself takes up (the archive plus the extracted layers), so this did not come close to working.
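For anyone hitting the same wall, here is a rough space check I could have done before running `docker load`. The archive name is just a placeholder, and I'm assuming a gzipped tar and the default Docker data root at /var/lib/docker:

```shell
# `docker load` needs room for the extracted layers on top of the archive
# itself, so compare the archive's uncompressed size against the free
# space on the Docker data root before loading.
IMAGE=our-robot-image.tar.gz   # placeholder filename
need_mb=$(gzip -l "$IMAGE" | awk 'NR==2 {print int($2/1024/1024)}')
free_mb=$(df -m /var/lib/docker | awk 'NR==2 {print $4}')
echo "need roughly ${need_mb} MB of free space, have ${free_mb} MB"
```

Keeping the tar.gz on external storage (USB/NVMe) instead of the eMMC at least avoids paying for the archive and the layers on the same small disk.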

Because CUDA was already installed inside our Docker image, my next thought was not to install it (or any other SDK components) on the Xavier NX module when flashing the OS. After a fresh flash, loading the Docker image eventually worked. However, when I then tried to run our CUDA-dependent code inside the container, an error was thrown which suggested, at least to me, that it won't work unless CUDA is also flashed onto the "host" system.
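For context, this is roughly how I was launching the container (image name and test command are placeholders for our setup, not anything official). The key part is `--runtime nvidia`, which on JetPack has the NVIDIA container runtime mount the host's CUDA libraries and GPU device nodes into the container at start-up:

```shell
# --runtime nvidia asks the NVIDIA container runtime to bind-mount the
# host's CUDA libraries into the container; without CUDA on the host,
# there is nothing to mount and GPU code inside the container fails.
sudo docker run -it --rm --runtime nvidia our-robot-image:latest \
    python3 -c "import torch; print(torch.cuda.is_available())"
```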

Then I thought I would try to save space inside the Docker image by removing CUDA from it. However, when I ran the commands I have previously used to uninstall CUDA (sudo apt-get --purge remove [nsight, cublas, cuda, nvidia]), apt indicated that most of these packages were never installed in the first place, even though /usr/local/cuda and the associated folders still existed. Does this mean that CUDA was not actually installed inside the container, and it was just using the "host" machine's installation? When I tried this I was using the Developer Kit, which had CUDA installed in "both" places.
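A quick way to tell the two cases apart from inside the container (sketch only; the cuda-10.2 path is an assumption from our JetPack version) is to compare what apt knows about with what the mount table shows:

```shell
# Was CUDA installed via apt inside this container?
dpkg -l 'cuda*' 2>/dev/null | grep '^ii' \
    || echo "no cuda-* packages installed via apt"
# A bind-mounted directory shows up in the mount table;
# an apt-installed one does not.
grep /usr/local/cuda /proc/mounts \
    && echo "CUDA is mounted in from the host" \
    || echo "/usr/local/cuda is really part of this image"
```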

So, with this background, my main questions are:

1) How should we approach running CUDA-dependent code from within a Docker container on the Jetson Xavier NX?
2) Should CUDA be installed outside the container (on the host), inside the container/image, or both?
3) Will future versions of similar Jetson modules have larger storage, so that these issues are easier to overcome?

Please let me know if you have any insight into these matters.

Best wishes,

Daniel Freer


In general, we install the CUDA library on the Jetson directly, and mount the library into the container to save image size.

This can be done by launching the container with --runtime nvidia.
You can find some examples of creating a container for Jetson below:
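As a minimal sketch of the workflow (the l4t-base tag must match the L4T version flashed on the device; r32.4.3 here is just an example):

```shell
# Run one of the NGC base containers for Jetson with the NVIDIA runtime.
# The tag must match the installed JetPack/L4T release.
sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.4.3
# Inside the container, /usr/local/cuda is mounted from the host:
#   ls /usr/local/cuda/bin
```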


Hi AastaLLL,

Thanks for your reply. Yes, I actually used these containers as the base environment, but modified them to include other packages and SDKs that we need (such as ROS and PyTorch), and I used --runtime nvidia when creating the new container. I eventually got an image/container that worked on the Xavier Developer Kit, so I committed the container and saved the image to a tar.gz file so that I could transfer it to the Xavier NX (eMMC). When I ran this image to create a new container on the eMMC version, cuda and cuda-10.2 are still present in the container (at /usr/local/), but it seems they can't be detected. Should I remove these to save space in the Docker image, so that I can install CUDA via the flashing process? What is the best way to remove them (apt --purge remove did not work)? Does using --runtime nvidia actually alter the container/image the next time you commit it?
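In case it helps diagnose this, here is the check I ran on the NX (image name is a placeholder): start the committed image without the NVIDIA runtime and see whether /usr/local/cuda-10.2 actually holds files, or is just a leftover directory where the mount used to be:

```shell
# docker commit does not capture the contents of mounted paths, so a
# directory that was only ever a mount point can survive as an empty
# (or near-empty) folder in the committed image.
sudo docker run --rm our-committed-image:latest \
    sh -c 'du -sh /usr/local/cuda-10.2 2>/dev/null; ls /usr/local/cuda-10.2'
```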



If you use --runtime nvidia with the container shared above, you will find the CUDA library in /usr/local/ because Docker has mounted it from the Jetson host.
It cannot be removed with apt, since it is mounted rather than installed.

It’s recommended to install CUDA via the flashing process and access it within Docker through the mount.
This reduces the image size and also avoids some CUDA dependency issues.
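If you want the mounts to apply without passing the flag every time (for example so that docker build and docker commit see the same libraries), one common approach is to make nvidia the default runtime in /etc/docker/daemon.json, a sketch assuming the standard JetPack runtime path:

```shell
# Make the NVIDIA runtime the Docker default, then restart the daemon.
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
sudo systemctl restart docker
```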