I am trying to understand what the Nvidia CDI hooks do, as configured by nvidia-ctk cdi generate.
Having analyzed the generated nvidia.yaml, I see that the spec mainly does the following (see the sketch after this list):
- mounts lots of devices - makes sense
- mounts lots of library paths from the host
- creates lots of symlinks inside the container
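For concreteness, here is a minimal sketch of how I am reading the spec, assuming the standard CDI layout (containerEdits blocks containing deviceNodes, mounts, and hooks) and assuming the spec was written to /etc/cdi/nvidia.yaml; it just summarizes those three categories of edits:

```python
# Rough sketch: summarize what a generated CDI spec asks the runtime to do.
# Assumes the standard CDI layout (containerEdits blocks with deviceNodes,
# mounts, and hooks) and that the spec lives at /etc/cdi/nvidia.yaml.
import yaml

with open("/etc/cdi/nvidia.yaml") as f:
    spec = yaml.safe_load(f)

# Collect the top-level edits plus the per-device edits.
edits = [spec.get("containerEdits", {})]
edits += [d.get("containerEdits", {}) for d in spec.get("devices", [])]

device_nodes, mounts, hooks = [], [], []
for e in edits:
    device_nodes += [n["path"] for n in e.get("deviceNodes", [])]
    mounts += [m.get("hostPath") for m in e.get("mounts", [])]
    hooks += [" ".join(h.get("args", [])) for h in e.get("hooks", [])]

print(f"{len(device_nodes)} device nodes, {len(mounts)} host mounts, {len(hooks)} hooks")
for h in hooks:
    print("hook:", h)
```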
First, since the approach in more recent container images is not to mount lots of libraries (like CUDA) in from the host, but rather to have them inside the container (much more aligned with the container philosophy), why is the generator finding all of these hundreds of libraries on the host OS and mounting them in, as opposed to having them available as part of the container image?
For that matter, how can it even know that they exist? There is no guarantee that the host OS will be 100% JetPack.
Second, are all of the links sections just adding various links needed inside the container to many of those mounted libraries? Coming back to the “container philosophy”, why would those not already exist in the container image?
I suppose these are hardware-dependent libraries rather than user-space libraries like CUDA.
These libraries are included in the OS (e.g., L4T r35.4.1) and need to be mounted into the container to ensure functionality.
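On Jetson, as far as I can tell, the generator works in CSV mode from mount lists shipped with L4T (typically under /etc/nvidia-container-runtime/host-files-for-container.d/). A rough sketch, assuming the usual "<type>, <path>" line format (types like dev, lib, sym, dir), that shows which host files those lists declare:

```python
# Rough sketch: list the host files the Jetson CSV mount lists declare.
# Assumes lines of the form "<type>, <path>" (dev, lib, sym, dir) in CSV
# files under this directory; the location/format may vary by L4T release.
import collections
import glob
import os

CSV_DIR = "/etc/nvidia-container-runtime/host-files-for-container.d"

counts = collections.Counter()
for csv_file in sorted(glob.glob(os.path.join(CSV_DIR, "*.csv"))):
    with open(csv_file) as f:
        for line in f:
            if "," not in line:
                continue
            kind, path = (part.strip() for part in line.split(",", 1))
            counts[kind] += 1
            print(f"{kind:4s} {path}  ({os.path.basename(csv_file)})")

print(dict(counts))  # e.g. how many dev/lib/sym entries in total
```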
I just read the libnvidia-container architecture doc in depth. It explains some of the dependencies better, but now I am confused as to how the requirements it describes interact with CDI.
Is the compatibility tied to the OS, or to the specific GPU driver? I read the architecture doc, which implied the latter, but I am unclear.
Either way, I think you are saying, “if you want to use a Jetson, you need the device (obviously), the correct kernel and driver for the specific GPU on the device, and very specific versions of user-space libraries that match that specific GPU and specific kernel drivers.” Is that correct? If so, is there a maintained mapping somewhere that says, “device → kernel driver → userspace libs” for each one?
In development, it is somewhat reasonable to assume everything will just use the ready-to-run JetPack; in production, systems are generally very tightly controlled and not running these full-blown OSes and packages. We need some way to get the right elements on there.
For L4T, the GPU driver and some hardware-related drivers are integrated into the OS.
These need to be compatible (i.e., from the same L4T branch), so mounting them from the host is a good way to ensure this.
But user-space libraries, like CUDA and cuDNN, might not have such constraints.
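To make the "same L4T branch" point concrete, here is a minimal sketch (assuming the usual /etc/nv_tegra_release format, e.g. "# R35 (release), REVISION: 4.1, ...") that reports the host's L4T release so you can compare it against what a container image was built for:

```python
# Minimal sketch: read the host's L4T release so it can be compared with
# the L4T branch a container image was built against.
# Assumes the usual /etc/nv_tegra_release format, e.g.
#   "# R35 (release), REVISION: 4.1, GCID: ..., BOARD: ..."
import re

with open("/etc/nv_tegra_release") as f:
    first_line = f.readline()

match = re.search(r"R(\d+).*?REVISION:\s*([\d.]+)", first_line)
if match:
    major, revision = match.groups()
    print(f"Host L4T release: r{major}.{revision}")  # e.g. r35.4.1
else:
    print("Could not parse /etc/nv_tegra_release:", first_line.strip())
```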
Thanks.
“For L4T, the GPU driver and some hardware-related drivers are integrated into the OS. These need to be compatible (i.e., from the same L4T branch), so mounting them from the host is a good way to ensure this.”
That makes sense. But what do you do when you aren’t using the full-blown JetPack distribution? Sure, it is great in development, but in secure production, lots of places are going to use their own hardened and custom-built OS. How do I get the right libraries for the specific kernel version, driver version, and hardware?
“user-space libraries, like CUDA and cuDNN, might not have such constraints”
That part wasn’t so clear to me; based on the architecture doc, it may even be different.
When you set up the device, there are two steps: “flash” and “install components”.
The packages installed at the “install components” stage are the user-space packages.
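If it helps, one way to see what that “install components” step actually put on a stock JetPack system is to query the package database. A minimal sketch, assuming a Debian/Ubuntu-based rootfs and that the components are packaged under names starting with nvidia-, cuda-, or libcudnn:

```python
# Minimal sketch: list the JetPack/L4T user-space packages installed on the
# host, assuming a Debian/Ubuntu-based rootfs (dpkg) and that the components
# are packaged under names starting with "nvidia-", "cuda-", or "libcudnn".
import subprocess

out = subprocess.run(
    ["dpkg-query", "-W", "--showformat=${Package} ${Version}\n"],
    check=True, capture_output=True, text=True,
).stdout

for line in out.splitlines():
    if line.startswith(("nvidia-", "cuda-", "libcudnn")):
        print(line)
```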
Using a pre-existing root filesystem based on Ubuntu is not a “custom OS”; it is just Ubuntu with some package changes. What do you do when an enterprise has standardized on a specific build of ArchLinux or Alpine or RHEL and allows no customized filesystems? In that case you absolutely must use that filesystem + kernel (i.e., that operating system) when deploying anything in production. Sure, there are ways to get specific files approved for adding to the “blessed golden image”, but an entire rootfs generated by some outside script? Not a chance.
Let’s address that scenario: I am an enterprise, I have my own custom OS build that will never be replaced by your root filesystem or kernel. I can install certain binaries and kernel drivers and userspace libraries, but that is it.
- drivers: covered (at least as far as I can tell)
- user-space binaries: covered (all are OSS on GitHub; I managed to build them)
- user-space libraries: unsuccessful.
For this to work, you need to make available either the source for the user-space libraries that are required at the host level, or those libraries as standalone binaries with clear lists of dependencies and of when/where they work.
The rootfs provided is Ubuntu-based, which is but a single distribution (“distro”) of the hundreds (thousands?) of Linux-based distros out there. Is it more correct to state, “we currently only support Ubuntu-based custom OSes”?
avi24,
Your question seems to have become more general over the course of this forum issue: how can one use an alternate distro with Jetson?
Until JetPack 6, the main answer we’ve had for custom Linux distros has been Yocto; we and Jetson partners have put effort into enabling Yocto support for Jetson. Other than that, NVIDIA simply offers a reference filesystem based on Ubuntu. The Jetson Linux page has the files available for download, e.g., the BSP package, which includes conf files for the various combinations of Jetson reference carrier board + Jetson module.
JetPack 6 will enable customers and partners to bring their own kernel and will better enable custom distro support. We will continue to provide an Ubuntu-based reference filesystem and Debian packages. If you need to use a custom distro, part of the task will be, as you said above, getting specific files from our reference approved for repackaging and adding to your “blessed golden image.”
Digging into Yocto should be instructive, and of course the JetPack 6 Developer Preview is expected at the end of the month. Please open a new forum issue anytime with a specific question about whatever issue crops up.