docker build starts ignoring default runtime setting after initially working

After flashing my device with JetPack 4.3, I am able to set the default Docker runtime to nvidia via /etc/docker/daemon.json. This gives me access to NVIDIA tooling such as cicc when running docker build. Initially, this works perfectly fine. However, at some point it appears to stop working.
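For reference, my daemon.json follows the typical Jetson configuration (the runtime path below is the one JetPack installs; adjust if yours differs):

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}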

A simple test is the following Dockerfile:

FROM nvcr.io/nvidia/l4t-base:r32.3.1
# Make sure cuda compiler is present
RUN test -e /usr/local/cuda/nvvm/bin/cicc

Initially this builds successfully. Once the issue appears, the build fails (the image is also rebuilt from scratch instead of using the cache, which is another indicator that the runtime has changed).

Notably, docker run still works using the correct default runtime setting.
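For example, the following still exits 0, which suggests the daemon-level default runtime is intact:

docker run --rm nvcr.io/nvidia/l4t-base:r32.3.1 test -e /usr/local/cuda/nvvm/bin/cicc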

Follow-up: the issue only seems to occur when the DOCKER_BUILDKIT=1 environment variable is set.
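With the test Dockerfile above in the current directory, the two cases can be reproduced like this (the image tag is arbitrary):

DOCKER_BUILDKIT=0 docker build -t cicc-test .   # classic builder: the RUN step succeeds
DOCKER_BUILDKIT=1 docker build -t cicc-test .   # BuildKit: the RUN step fails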

Hi,
We have a release for Jetson platforms:

Please give it a try.

Hi Dane,

I am using the container runtime from the SDK Manager. The issue is that docker build with BuildKit does not use the nvidia runtime, even though it is configured as the default in /etc/docker/daemon.json.

Thanks,
Daniel

Hi,

This looks like a permissions issue to me.
Could you check if this comment also works for you?
https://devtalk.nvidia.com/default/topic/1069993/jetson-nano/nemo-toolkit-on-jetson-nano/post/5420605/#5420605

Thanks.

Hi Aasta,

Unfortunately running the command in that comment did not help.

Thanks,
Daniel

Hi,

Please try this Dockerfile:

FROM nvcr.io/nvidia/l4t-base:r32.3.1
# Make sure cuda compiler is present
CMD ["test -e /usr/local/cuda/nvvm/bin/cicc"]

Thanks.

When running a container built from that Dockerfile, I get:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: \"test -e /usr/local/cuda/nvvm/bin/cicc\": stat test -e /usr/local/cuda/nvvm/bin/cicc: no such file or directory": unknown.
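I believe that particular error is just the JSON (exec) form of CMD treating the whole string as a single executable path instead of splitting it into arguments. Either of these forms avoids it:

CMD ["test", "-e", "/usr/local/cuda/nvvm/bin/cicc"]

or the shell form:

CMD test -e /usr/local/cuda/nvvm/bin/cicc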

Hi,

Have you changed any software from the initial installation?

We can run the Dockerfile without issue in a JetPack 4.3 environment.
Would you mind giving it another try?

Thanks.

Oh, I mistyped the last Dockerfile. I used RUN instead of CMD for the last line. It builds fine with CMD, but that doesn’t address my issue, which is not being able to access that binary during docker build.
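To spell out the distinction as I understand it:

# CMD only records the command to run at container start; docker run honors
# the nvidia default runtime, so this works:
CMD test -e /usr/local/cuda/nvvm/bin/cicc

# RUN executes during docker build; under BuildKit the nvidia runtime is
# apparently not applied there, so this is the step that fails:
RUN test -e /usr/local/cuda/nvvm/bin/cicc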

Hi,

Sorry for the delay.

Please note that a binary that calls into the GPU driver cannot run on an x86 host.
However, if you are using a Jetson device, it should be fine.

May I know the system you are using?
Thanks.

This is on a Jetson device, specifically the AGX Xavier.

Hi,

We found that this issue is specific to cicc.

The nvcc binary can be accessed without issue:

FROM nvcr.io/nvidia/l4t-base:r32.3.1
  
RUN test -e /usr/local/cuda/bin/nvcc
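You can compare the two paths in a running container, where both binaries should be visible under the nvidia runtime:

docker run --rm nvcr.io/nvidia/l4t-base:r32.3.1 sh -c 'ls -l /usr/local/cuda/bin/nvcc /usr/local/cuda/nvvm/bin/cicc'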

We are passing this issue to our internal team and will share more information with you later.

Thanks.