What is unique about the Nvidia container runtime? Does CDI replace it?

I have been trying to understand what is unique about the NVIDIA container runtime. The NVIDIA Container Toolkit installs it as a possible alternate runtime in the Docker or containerd config. It appears to be a modified version of runc.

What is “modified” about it? What does it do that regular upstream runc does not?

Relatedly, version 1.12.0 and above support the Container Device Interface (CDI); does using CDI obviate the forked runtime (which would be great)?

I have a few more questions on containers; I will put them in their own topics.

Hi,

We need to check with our internal team.
We will update you with more information later.

Thanks.


I found this excellent post from 2.5 years ago on a containerd issue. The post is by @klueska1; is it still up to date? And what is the long-term goal: is it to focus on CDI, or to keep this alive? I imagine this is a maintenance headache, as the interface has to match the runc one precisely.

Hi,

Here is the info from our internal team.

The NVIDIA Container Runtime is a wrapper around runc and does not replace low-level runtimes such as runc.
It makes modifications to the incoming OCI runtime specification before invoking runc.
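
To make the "wrapper around runc" idea concrete, here is a minimal Go sketch of that pattern. It is an illustration only, not NVIDIA's actual implementation: flag parsing and error handling are heavily simplified, and the injected hook path and arguments are assumptions based on the toolkit's nvidia-container-runtime-hook. The shim reads the bundle's OCI config.json, adds a prestart hook, and then hands off to stock runc with the original arguments unchanged.

```go
// wrapper.go: an illustrative "wrapper runtime" shim, NOT the real
// nvidia-container-runtime. It edits the OCI spec, then delegates to runc.
package main

import (
	"encoding/json"
	"os"
	"os/exec"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// bundleDir returns the value following --bundle/-b, if present.
// (A real shim would also honour runc's default of the current directory.)
func bundleDir(args []string) string {
	for i, a := range args {
		if (a == "--bundle" || a == "-b") && i+1 < len(args) {
			return args[i+1]
		}
	}
	return ""
}

func main() {
	args := os.Args[1:]

	// If a bundle was passed (as it is for "create"), edit its config.json.
	if dir := bundleDir(args); dir != "" {
		path := dir + "/config.json"
		if data, err := os.ReadFile(path); err == nil {
			var spec specs.Spec
			if json.Unmarshal(data, &spec) == nil {
				if spec.Hooks == nil {
					spec.Hooks = &specs.Hooks{}
				}
				// Add a prestart hook that will set up GPU device nodes,
				// driver libraries, etc. inside the container at start time.
				spec.Hooks.Prestart = append(spec.Hooks.Prestart, specs.Hook{
					Path: "/usr/bin/nvidia-container-runtime-hook",
					Args: []string{"nvidia-container-runtime-hook", "prestart"},
				})
				if out, err := json.Marshal(&spec); err == nil {
					_ = os.WriteFile(path, out, 0o644)
				}
			}
		}
	}

	// Hand everything off to the real low-level runtime, unmodified.
	cmd := exec.Command("runc", args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		if exitErr, ok := err.(*exec.ExitError); ok {
			os.Exit(exitErr.ExitCode())
		}
		os.Exit(1)
	}
}
```

So the binary that Docker or containerd is pointed at when nvidia is configured as an alternate runtime only rewrites the spec; the low-level runtime underneath stays stock runc.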

Moving to CDI could allow the NVIDIA Container Runtime to be replaced, assuming that a container runtime that supports CDI is being used.
Note that the NVIDIA Container Toolkit includes an nvidia-ctk command-line tool that allows CDI specs to be generated for supported NVIDIA platforms, with support for Jetson systems added in version 1.14.0.
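
For contrast, here is a rough sketch of what a CDI-aware engine does with those generated specs: it resolves a device name such as nvidia.com/gpu=0 against the CDI spec files on disk and applies the resulting edits to the OCI spec itself, so no GPU-specific runtime sits between it and runc. This assumes the Cache/InjectDevices API of the CNCF CDI Go package (tags.cncf.io/container-device-interface/pkg/cdi); the spec directories and device name are just conventional examples.

```go
// cdi_inject.go: an illustrative sketch of how a CDI-aware engine might
// apply device edits to an OCI spec, with no NVIDIA-specific runtime involved.
package main

import (
	"fmt"
	"log"

	oci "github.com/opencontainers/runtime-spec/specs-go"
	"tags.cncf.io/container-device-interface/pkg/cdi"
)

func main() {
	// Build a cache of CDI specs from the usual spec directories
	// (e.g. the files that "nvidia-ctk cdi generate" writes under /etc/cdi).
	cache, err := cdi.NewCache(cdi.WithSpecDirs("/etc/cdi", "/var/run/cdi"))
	if err != nil {
		log.Fatalf("loading CDI specs: %v", err)
	}

	// The OCI spec the engine is about to hand to runc/crun/etc.
	spec := &oci.Spec{Version: oci.Version}

	// Resolve a fully-qualified CDI device name into concrete edits:
	// device nodes, mounts, environment variables, hooks.
	unresolved, err := cache.InjectDevices(spec, "nvidia.com/gpu=0")
	if err != nil {
		log.Fatalf("injecting devices (unresolved: %v): %v", unresolved, err)
	}

	// The spec now carries the GPU edits; any OCI runtime can run it as-is.
	fmt.Printf("process: %+v\n", spec.Process)
	fmt.Printf("linux:   %+v\n", spec.Linux)
}
```

The GPU-specific knowledge lives in declarative spec files produced by nvidia-ctk rather than in a wrapped runtime, which is why CDI support in the engine can make the NVIDIA Container Runtime unnecessary.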

Thanks.


This is pretty helpful, thank you.

The point about CDI support is especially important. Podman mostly supports it already, but containerd only partially: client-side as of 2.0, and I am not sure about server-side.
