Compiling darknet in the l4t-base:r32.3.1 docker image, getting error: cicc: not found

Hello,

Platform: Jetson Nano
JetPack: latest

I have this simple Dockerfile to build an image with darknet installed:

FROM nvcr.io/nvidia/l4t-base:r32.3.1

RUN apt-get update -y && apt-get install -y pkg-config \
 zlib1g-dev libwebp-dev \
 libtbb2 libtbb-dev \
 libgtk2.0-dev libavcodec-dev libavformat-dev libswscale-dev libv4l-dev \
 cmake
RUN apt-get install -y \
  autoconf \
  autotools-dev \
  build-essential \
  gcc \
  git

RUN git clone --depth 1 https://github.com/AlexeyAB/darknet /var/local/darknet

WORKDIR /var/local/darknet

# Set GPU=1 in the Makefile
RUN sed -i -e 's/GPU=0/GPU=1/' Makefile

RUN make

Unfortunately, when I try to build the image, I get this error when it reaches the point of compiling darknet:

sh: 1: cicc: not found
Makefile:168: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 127

I tried googling around but didn't find an answer. Any ideas?

Thanks

Thibault

Hi,

Thanks for your report.

We are trying to reproduce this issue.
We will update you with more information as soon as possible.

OK, I also just tried with the nvcr.io/nvidia/deepstream-l4t:4.0.2-19.12-base image and I get the same error.

Hi,

There are some issues with accessing cicc in the l4t-base:r32.3.1 image.

We are checking this internally.
We will update you with more information later.

Thanks.

Hi,

Thanks for the answer ;-). And thanks in advance for pinging me when you figure out a fix for this.

Best,

Thibault

Hi,

Thanks for your patience.

This issue is still under investigation.
The latest status is that we can find other binaries (e.g. nvcc) in the l4t-base:r32.3.1 image.
However, we cannot find the corresponding cicc binary.

We will update you with more information once we make any progress.
Thanks.

Thanks for the update! And good luck debugging this ;-)

Hi,

This issue comes from a limitation of our docker image.

https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-Container-Runtime-on-Jetson#usrlocalcuda-is-readonly

/usr/local/cuda is readonly

One of the limitations of the beta is that we are mounting the cuda directory from the host. This was done with size in mind as a development CUDA container weighs 3GB, on Nano it’s not always possible to afford such a huge cost. We are currently working towards creating smaller CUDA containers.

As a result, the CUDA libraries and tools are only mounted into the container at launch.
So cicc is accessible during “docker run” but cannot be accessed at build time.
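
For example, a quick way to see this (a rough sketch; it assumes JetPack's CUDA toolkit is installed on the host, the nvidia container runtime is available, and the image tag is just an example):

# At "docker run" time the nvidia runtime mounts the host's CUDA toolkit, so cicc
# (which lives under nvvm/bin of the toolkit) is visible inside the container
# (the cuda-10.0 path assumes a JetPack 4.3 host):
docker run --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.3.1 ls /usr/local/cuda-10.0/nvvm/bin

# During a plain "docker build", no such mount happens, so a RUN step that invokes
# nvcc fails with "cicc: not found":
docker build -t darknet-l4t .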

To overcome this, you will need to copy the CUDA toolkit into the container.

Thanks.

Hi,

OK, thanks! I guess I will build darknet outside the docker image and copy in the compiled version for now.
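
Something along these lines, I suppose (a rough sketch; the paths are just examples, and it assumes darknet and libdarknet.so were already built on the Jetson host with GPU=1 and placed in the build context):

FROM nvcr.io/nvidia/l4t-base:r32.3.1

# Copy the darknet binary and shared library that were built on the host.
# The CUDA libraries themselves are mounted by the nvidia runtime at "docker run",
# so nothing needs to be compiled inside the image.
COPY darknet /usr/local/bin/darknet
COPY libdarknet.so /usr/local/lib/libdarknet.so
RUN ldconfig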

Do you know if this will be possible with the next version of nvidia-docker? Or is it something that will come later?

Thanks.

Hi,

Sorry, we cannot disclose a concrete schedule here.
But we are trying to extract the essential parts of the CUDA toolkit for the l4t docker image.
We will let you know once it is publicly available.

Thanks.

Hi, I haven't tested it yet, but do you think the new nvidia-l4t-base r32.4.2 version (based on JetPack 4.4 DP, I guess) solves this?

Hi,

Sorry, this issue isn't solved in nvcr.io/nvidia/l4t-base:r32.4.2.

We had a discussion with our internal team.
Since the folder isn't mounted at build time, there are two possible workarounds:

  1. Copy the entire CUDA directory into the image in the Dockerfile (see the sketch after this list).
    You can delete the directory at the end to reduce the image size.

  2. Compile the app on the host and copy it into the container.
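
For workaround 1, a minimal sketch might look like the following. It assumes a JetPack 4.x host where the toolkit lives in /usr/local/cuda-10.0 (the exact version and paths depend on your JetPack release), and that the toolkit was first copied into the build context on the host, e.g. with "cp -r /usr/local/cuda-10.0 ./cuda". Since deleting a directory in a later Dockerfile layer does not actually reduce the image size, the sketch uses a multi-stage build and keeps only the build output; further Makefile tweaks (e.g. the CUDA arch for Nano, or linker paths) may still be needed.

# --- build stage: copy the host's CUDA toolkit in so nvcc/cicc exist at build time ---
FROM nvcr.io/nvidia/l4t-base:r32.3.1 AS builder

RUN apt-get update -y && apt-get install -y build-essential git && rm -rf /var/lib/apt/lists/*

# "cuda" is the copy of /usr/local/cuda-10.0 made on the host (see above)
COPY cuda/ /usr/local/cuda-10.0/
ENV PATH=/usr/local/cuda-10.0/bin:${PATH}

RUN git clone --depth 1 https://github.com/AlexeyAB/darknet /var/local/darknet
WORKDIR /var/local/darknet
RUN sed -i 's/GPU=0/GPU=1/' Makefile && make

# --- final stage: the toolkit is not copied here, so it adds nothing to the image size ---
FROM nvcr.io/nvidia/l4t-base:r32.3.1
COPY --from=builder /var/local/darknet /var/local/darknet
WORKDIR /var/local/darknet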

Thanks.

Hi

Thanks for looking into this.

  1. Do you have an example of copying the CUDA directory in the Dockerfile and then deleting it?

  2. Yes, this is what I'm doing right now, but workaround 1 seems better.

Thanks

Hi,

Our developer suggests another workaround:

Another possible workaround: depending on what part of the Dockerfile fails, one can remove the failing portion from the Dockerfile, build the container, then launch it, execute the build steps removed from the Dockerfile (which would now succeed), and then commit the container.
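
In practice that would look something like this (a sketch; the image and container names are just examples):

# 1. Remove "RUN make" (and any other step that needs cicc) from the Dockerfile, then build:
docker build -t darknet-l4t:base .

# 2. Launch the container with the nvidia runtime so the host's CUDA toolkit gets mounted:
docker run -it --runtime nvidia --name darknet-build darknet-l4t:base /bin/bash

# 3. Inside the container, run the build steps that were removed from the Dockerfile:
cd /var/local/darknet && make && exit

# 4. Back on the host, commit the container as a new image:
docker commit darknet-build darknet-l4t:latest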

Maybe this would be a better way to go.
Thanks.

Hi, I’m sorry, but I don’t quite understand this. In the l4t-base:r32.3.1 image the /usr/local/cuda directory is writable, and it’s local to the container. For example, I can create the directory /usr/local/cuda/a, and it’s not reflected on the host file system, so it’s internal to the container.
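
The check I did was roughly this:

docker run -it nvcr.io/nvidia/l4t-base:r32.3.1 /bin/bash
# inside the container this succeeds, and /usr/local/cuda/a never shows up on the host:
mkdir /usr/local/cuda/a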

Could you please elaborate on what you mean by saying that the /usr/local/cuda directory is read-only? Read-only how?

Thank you.
Best regards,

Frankly, this is a terrible way to go. The purpose of a Dockerfile is to encode the steps required to build an image; having to manually run build steps that ought to reside within the Dockerfile itself undermines that.

Committing is generally not considered best practice. Consider scenarios where you push your Dockerfile to a remote server that builds the image before deploying it to a different server.

There’s an alternative, which is writing a script to execute build steps on deployment. But that would defeat the purpose of a containerized approach.