Jetson Xavier NX (eMMC), CUDA, and Docker

Hi NVIDIA team,

My company and I are using the Xavier NX module for our robot platform, but we are having issues loading our Docker image onto the module in order to run our environment. Much of this is related to the 16 GB of storage space, but we might be able to overcome it with a bit more understanding of the system. On the FAQ, I saw a link to this: Jetson/L4T/Boot From External Device - eLinux.org, but I am still wondering about the best way to think about the interaction between the module, CUDA, and Docker.

My first attempt was to treat the Xavier NX module (eMMC) just like the Developer Kit: install CUDA and all SDK components onto it, then load the Docker container (which also contained CUDA). After flashing everything onto the eMMC module, something like 6 GB of storage space remained, and our image was less than 5 GB, so I figured this would be enough. However, it seems that loading a Docker image requires roughly double the space the image ultimately occupies (presumably because the archive and the extracted layers coexist on disk during the load), so this was not even close to working.
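
For anyone following along, the space situation can be checked with something like this (the archive name here is just our own):

$ df -h /                                # free space on the eMMC root filesystem
$ sudo docker load -i our_image.tar.gz   # docker load also accepts compressed tars
$ sudo docker system df                  # space actually consumed by images and layers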

Because CUDA was already installed within our Docker image, my next thought was not to install it (or any other SDK components) on the Xavier NX module when flashing the OS. After a fresh flash of the OS, loading the Docker image eventually worked. However, when I then tried to run our CUDA-dependent code inside the container, an error was thrown which suggested, at least to me, that it won't work unless CUDA is also flashed onto the "host" system.

Then I thought I would save space within the Docker image by removing CUDA from it. However, when I ran the commands I have previously used to uninstall CUDA (sudo apt-get --purge remove [nsight, cublas, cuda, nvidia]), apt indicated that most of these packages were never installed in the first place, even though /usr/local/cuda and its associated folders still existed. Does this mean that CUDA was not actually installed inside the container, and it was just using the "host" machine's installation? When I tried this, I was using the Developer Kit, which had CUDA installed in "both" places.
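
For what it's worth, a way to check what apt actually knows about (rather than trusting the folders' existence) would be something like:

$ dpkg -l | grep -i -E 'cuda|cublas|nsight'
$ ls -d /usr/local/cuda*    # these directories can exist even when no package owns them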

So, with this background, my main questions are:

1) How should we approach running CUDA-dependent code from within a Docker image on the Jetson Xavier NX?
2) Should CUDA be installed outside the Docker container (on the host), inside the container/image, or both?
3) Will future versions of similar Jetson modules have larger storage, so that these issues will be easier to overcome?

Please let me know if you have any insight into these matters.

Best wishes,

Daniel Freer

Hi,

In general, we install the CUDA library on the Jetson directly and mount it into the Docker container to save space.

This can be done by launching the container with --runtime nvidia. You can find some examples of creating a container for Jetson below:
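
For example, a minimal launch looks something like this (the l4t-base tag is just an illustration; pick the one matching your JetPack/L4T version):

$ sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.4.3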

Thanks.

Hi AastaLLL,

Thanks for your reply. Yes, I actually used these containers as the base environment, but modified them to include other packages and SDKs that we need (such as ROS and PyTorch), and I also used the --runtime nvidia flag when creating a new container. I eventually got an image/container that worked on the Xavier Developer Kit, so I committed the container and saved the Docker image to a tar.gz file so that I could upload it to the Xavier NX (eMMC). When I ran this image to create a new container on the eMMC version, cuda and cuda-10.2 were still inside the container (at /usr/local/), but it seems they can't be detected. Should I remove these to save space in the image, so that I can install CUDA via the flashing process? What is the best way to remove them (apt --purge remove did not work)? Does using the --runtime nvidia flag actually alter the container/image the next time you commit it?
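
For completeness, the workflow I used to move the image over was roughly the following (image and archive names are just placeholders):

$ sudo docker commit <container-id> our_image:latest           # on the Developer Kit
$ sudo docker save our_image:latest | gzip > our_image.tar.gz
# ...copy the archive over to the Xavier NX (eMMC), then:
$ sudo docker load -i our_image.tar.gz
$ sudo docker run -it --runtime nvidia our_image:latest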

Cheers.

Hi,

If you use --runtime nvidia with the container shared above, you will find the CUDA library in /usr/local/ because Docker has mounted it from the Jetson host. It cannot be removed from inside the container, since it is mounted rather than installed.
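
You can check exactly which host files get mounted by looking at the CSV lists used by the runtime; on JetPack 4.x these typically live here:

$ ls /etc/nvidia-container-runtime/host-files-for-container.d/
cuda.csv  cudnn.csv  l4t.csv  tensorrt.csv

Or, from inside a running container:

$ mount | grep cuda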

We recommend installing CUDA via the flashing process and accessing it within Docker through mounting. This reduces the image size and also avoids some CUDA dependency issues.
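
A quick sanity check, assuming CUDA was installed on the host via SDK Manager, is to query nvcc from inside a container (again, the tag should match your L4T release):

$ sudo docker run --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.4.3 /usr/local/cuda/bin/nvcc --version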

Thanks.

Hi AastaLLL,

Thanks for your reply. Now, after re-flashing all components (including CUDA, TensorRT, all vision tools, etc.) from SDK Manager to the module, there was no space at all remaining. I was able to free some up by removing the entire apt cache, but that only recovered about 1 GB, not nearly enough for ROS, PyTorch, or any other crucial packages or code. It truly seems to me that 16 GB just isn't enough storage to run a CUDA-based AI system on the module for robotics applications. Could you please confirm whether there is a way to access more storage when using the Xavier NX module (not the Developer Kit), or whether any components of the OS, CUDA, or other flashed software are unnecessary and take up a large amount of space? We are also using TensorRT in our desktop version, so we would prefer to keep that installed as well.
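
For reference, the cleanup I ran was along these lines:

$ sudo apt-get clean
$ sudo apt-get autoremove --purge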

Best wishes,

Daniel Freer

Hi,

There are some possible alternatives for this issue:

1. Add an external SSD to enlarge the storage space (see the sketch after this list).

2. Manually install only the packages you need. You can find an example for DeepStream below:
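
Regarding option 1, a common approach is to move Docker's data directory onto the SSD so that images and containers no longer live on the eMMC. A sketch, assuming the drive is mounted at /mnt/ssd:

$ sudo mkdir -p /mnt/ssd/docker
$ sudo vi /etc/docker/daemon.json     # add "data-root", keeping any existing keys such as the nvidia runtime entry:
{
    "data-root": "/mnt/ssd/docker"
}
$ sudo systemctl restart docker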

Thanks.
