Could not build docker image in DL4AGX

When possible, please provide the following info:
**Hardware Platform:** DRIVE AGX Xavier™ Developer Kit
**Software Version:** DRIVE Software 10
**Host Machine Version:** native Ubuntu 18.04.4
**SDK Manager Version:** 1.1.0.6343
Hi,

I want to try out TensorRT by running the DL4AGX project (NVIDIA/DL4AGX: Deep Learning tools and applications for NVIDIA AGX platforms) on my DRIVE AGX Xavier. Following the instructions, I downloaded DRIVE Software 10.0 using SDK Manager, and the DRIVE OS version appears to be 5.1.6.0. I then tried to build the Docker image following the instructions here (https://github.com/NVIDIA/DL4AGX/blob/master/docker/README.md), but got an error. My console output is below.

…DL4AGX$ docker build -t nvidia/drive_os_pdk:5.1.6.0-linux -f docker/DRIVE/Dockerfile.aarch64-linux.5.1.6.0 docker/DRIVE
Sending build context to Docker daemon 33.54GB
Step 1/53 : FROM nvidia/cuda:10.1-devel-ubuntu18.04
—> 9e47e9dfcb9a
Step 2/53 : ARG pdk_version=5.1.6.0
—> Using cache
—> 69b0661391e3
Step 3/53 : ARG base_os_version=1804
—> Using cache
—> a4a95411f651
Step 4/53 : ENV CUDA_VERSION=10.2
—> Using cache
—> 928f3875b6fc
Step 5/53 : ARG cuda_version_dash=10-2
—> Using cache
—> b6be2dd2df84
Step 6/53 : ARG cuda_version_long=10.2.19
—> Using cache
—> b8d01f5335f7
Step 7/53 : ARG driver_version=430.17
—> Using cache
—> 7bc651b42dd7
Step 8/53 : ARG cudnn_version=7.5
—> Using cache
—> ff762d609e54
Step 9/53 : ARG cudnn_version_long=7.5.1.14
—> Using cache
—> 52f2d249aa7d
Step 10/53 : ARG trt_version=5.1
—> Using cache
—> 9840d212d0c6
Step 11/53 : ARG trt_version_short=5.1.4
—> Using cache
—> aae8d0b3da78
Step 12/53 : ARG trt_version_long=5.1.4.2
—> Using cache
—> bc4eea42f927
Step 13/53 : ARG target_driver=10.2-r430
—> Using cache
—> b89096a8b4fa
Step 14/53 : ARG cuda_repo_x86_64=cuda-repo-ubuntu${base_os_version}-${cuda_version_dash}-local-${cuda_version_long}-${driver_version}_1.0-1_amd64.deb
—> Using cache
—> a5551aec80cb
Step 15/53 : ARG cuda_repo_cross_aarch64_linux=cuda-repo-cross-aarch64-${cuda_version_dash}-local-${cuda_version_long}_1.0-1_all.deb
—> Using cache
—> 6c1cfd97af04
Step 16/53 : ENV CUDNN_x86_64_DEBS="libcudnn7_${cudnn_version_long}-1+cuda${CUDA_VERSION}_amd64.deb libcudnn7-dev_${cudnn_version_long}-1+cuda${CUDA_VERSION}_amd64.deb"
—> Using cache
—> f518e05c20ce
Step 17/53 : ENV CUDNN_AARCH64_LINUX_DEBS="libcudnn7-cross-aarch64_${cudnn_version_long}-1+cuda${CUDA_VERSION}_all.deb libcudnn7-dev-cross-aarch64_${cudnn_version_long}-1+cuda${CUDA_VERSION}_all.deb"
—> Using cache
—> 81ca36fd0c26
Step 18/53 : ENV tensorrt_repo_x86_64="nv-tensorrt-repo-ubuntu${base_os_version}-cuda${CUDA_VERSION}-trt${trt_version_long}-ga-20190506_1-1_amd64.deb"
—> Using cache
—> 0c3f98a5e202
Step 19/53 : ENV TENSORRT_x86_64_DEBS="libnvinfer5_${trt_version_short}-1+cuda${CUDA_VERSION}_amd64.deb libnvinfer-dev_${trt_version_short}-1+cuda${CUDA_VERSION}_amd64.deb python-libnvinfer_${trt_version_short}-1+cuda${CUDA_VERSION}_amd64.deb python3-libnvinfer_${trt_version_short}-1+cuda${CUDA_VERSION}_amd64.deb uff-converter-tf_${trt_version_short}-1+cuda${CUDA_VERSION}_amd64.deb graphsurgeon-tf_${trt_version_short}-1+cuda${CUDA_VERSION}_amd64.deb"
—> Using cache
—> 9351fd961890
Step 20/53 : ENV TRT_AARCH64_LINUX_DEBS="libnvinfer5-cross-aarch64_${trt_version_short}-1+cuda${CUDA_VERSION}_all.deb libnvinfer-dev-cross-aarch64_${trt_version_short}-1+cuda${CUDA_VERSION}_all.deb"
—> Using cache
—> fbe4f8f0c8a8
Step 21/53 : RUN apt-get update && apt-get upgrade -y && rm -rf /var/lib/apt/lists/*
—> Using cache
—> f32cf668be15
Step 22/53 : RUN apt-get update && apt-get install -y --no-install-recommends libtool rsync pkg-config python python-dev python3 python3-dev x264 v4l-utils gcc-aarch64-linux-gnu g++-aarch64-linux-gnu libjpeg-dev curl ca-certificates wget unzip git nasm pkg-config dh-autoreconf make g++ libboost-all-dev unzip && rm -rf /var/lib/apt/lists/*
—> Using cache
—> 4e5baee5fe81
Step 23/53 : RUN curl -O https://bootstrap.pypa.io/get-pip.py && python get-pip.py && rm get-pip.py
—> Using cache
—> 697f5de263c0
Step 24/53 : RUN curl -O https://bootstrap.pypa.io/get-pip.py && python3 get-pip.py && rm get-pip.py
—> Using cache
—> 71122024a5b4
Step 25/53 : RUN pip install --upgrade --no-cache-dir numpy
—> Using cache
—> 317d69b98c74
Step 26/53 : RUN pip install --upgrade --no-cache-dir pillow pip protobuf pycuda setuptools
—> Using cache
—> 5a81e0fc78d8
Step 27/53 : RUN pip3 install --upgrade --no-cache-dir numpy
—> Using cache
—> 2f95552da1ab
Step 28/53 : RUN pip3 install --upgrade --no-cache-dir pillow pip protobuf pycuda setuptools
—> Using cache
—> 5166b0a1f307
Step 29/53 : COPY pdk_files /pdk_files
—> 30d254601fbb
Step 30/53 : RUN mv /usr/local/cuda /tmp/cuda-backup
—> Running in f49b597336ca
Removing intermediate container f49b597336ca
—> 8e83f3a13202
Step 31/53 : RUN mv /pdk_files/${cuda_repo_x86_64} cuda.deb
—> Running in d8474991ff00
Removing intermediate container d8474991ff00
—> fd826f768c55
Step 32/53 : RUN mv /pdk_files/${cuda_repo_cross_aarch64_linux} cuda-repo-cross-aarch64.deb
—> Running in e3647058f63f
Removing intermediate container e3647058f63f
—> 014c2c6e7571
Step 33/53 : ENV REPO_DEBS="cuda.deb cuda-repo-cross-aarch64.deb"
—> Running in 29f8114103ab
Removing intermediate container 29f8114103ab
—> 3dbf7c9fb7dc
Step 34/53 : RUN dpkg -i $REPO_DEBS
—> Running in ae35ba559584
Selecting previously unselected package cuda-repo-ubuntu1804-10-2-local-10.2.19-430.17.
(Reading database … 38096 files and directories currently installed.)
Preparing to unpack cuda.deb …
Unpacking cuda-repo-ubuntu1804-10-2-local-10.2.19-430.17 (1.0-1) …
Selecting previously unselected package cuda-repo-cross-aarch64-10-2-local-10.2.19.
Preparing to unpack cuda-repo-cross-aarch64.deb …
Unpacking cuda-repo-cross-aarch64-10-2-local-10.2.19 (1.0-1) …
Setting up cuda-repo-ubuntu1804-10-2-local-10.2.19-430.17 (1.0-1) …
Setting up cuda-repo-cross-aarch64-10-2-local-10.2.19 (1.0-1) …
Removing intermediate container ae35ba559584
—> 5b10f8296a60
Step 35/53 : ENV CUDA_PACKAGES="nvrtc nvgraph cusolver cufft curand cusparse npp nvjpeg cudart cupti compiler misc-headers command-line-tools nvrtc-dev nvml-dev nvgraph-dev cusolver-dev cufft-dev curand-dev cusparse-dev npp-dev nvjpeg-dev cudart-dev driver-dev nvcc toolkit libraries-dev tools visual-tools"
—> Running in 90e2a87a5d14
Removing intermediate container 90e2a87a5d14
—> 44094e739cb0
Step 36/53 : RUN echo “for i in $CUDA_PACKAGES; do echo "cuda-$i-${cuda_version_dash}=${cuda_version_long}-1";done” | bash > /tmp/cuda-packages.txt
—> Running in 65bced154d2c
Removing intermediate container 65bced154d2c
—> 7af17a2878ac
Step 37/53 : RUN apt-get update && apt-get -o Dpkg::Options::="--force-overwrite" install -y $(cat /tmp/cuda-packages.txt) --reinstall --allow-downgrades && apt-get install -y libcublas-dev --reinstall --allow-downgrades && apt-mark hold $(cat /tmp/cuda-packages.txt) && rm -rf /var/lib/apt/lists/* && rm -rf /tmp/cuda-packages.txt
—> Running in 15324d4fa08b
Get:1 file:/var/cuda-repo-10-2-local-10.2.19-430.17 InRelease
Ign:1 file:/var/cuda-repo-10-2-local-10.2.19-430.17 InRelease
Get:2 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 InRelease
Ign:2 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 InRelease
Get:3 file:/var/cuda-repo-10-2-local-10.2.19-430.17 Release [574 B]
Get:4 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 Release [574 B]
Get:3 file:/var/cuda-repo-10-2-local-10.2.19-430.17 Release [574 B]
Get:4 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 Release [574 B]
Get:5 file:/var/cuda-repo-10-2-local-10.2.19-430.17 Release.gpg [833 B]
Get:5 file:/var/cuda-repo-10-2-local-10.2.19-430.17 Release.gpg [833 B]
Get:6 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 Release.gpg [819 B]
Get:6 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 Release.gpg [819 B]
Get:7 file:/var/cuda-repo-10-2-local-10.2.19-430.17 Packages [24.4 kB]
Ign:8 Index of /compute/cuda/repos/ubuntu1804/x86_64 InRelease
Ign:9 Index of /compute/machine-learning/repos/ubuntu1804/x86_64 InRelease
Get:10 Index of /compute/cuda/repos/ubuntu1804/x86_64 Release [564 B]
Get:11 Index of /ubuntu bionic InRelease [242 kB]
Get:12 Index of /ubuntu bionic-security InRelease [88.7 kB]
Get:13 Index of /compute/machine-learning/repos/ubuntu1804/x86_64 Release [564 B]
Get:14 file:/var/cuda-repo-10-2-local-10.2.19-cross-aarch64 Packages [4082 B]
Get:15 Index of /compute/cuda/repos/ubuntu1804/x86_64 Release.gpg [819 B]
Get:16 Index of /compute/machine-learning/repos/ubuntu1804/x86_64 Release.gpg [833 B]
Get:17 Index of /compute/cuda/repos/ubuntu1804/x86_64 Packages [141 kB]
Get:18 Index of /compute/machine-learning/repos/ubuntu1804/x86_64 Packages [31.7 kB]
Get:19 Index of /ubuntu bionic-updates InRelease [88.7 kB]
Get:20 Index of /ubuntu bionic-security/multiverse amd64 Packages [8505 B]
Get:21 Index of /ubuntu bionic-security/universe amd64 Packages [844 kB]
Get:22 Index of /ubuntu bionic-backports InRelease [74.6 kB]
Get:23 Index of /ubuntu bionic/restricted amd64 Packages [13.5 kB]
Get:24 Index of /ubuntu bionic/universe amd64 Packages [11.3 MB]
Get:25 Index of /ubuntu bionic-security/restricted amd64 Packages [52.4 kB]
Get:26 Index of /ubuntu bionic-security/main amd64 Packages [908 kB]
Get:27 Index of /ubuntu bionic/multiverse amd64 Packages [186 kB]
Get:28 Index of /ubuntu bionic/main amd64 Packages [1344 kB]
Get:29 Index of /ubuntu bionic-updates/universe amd64 Packages [1376 kB]
Get:30 Index of /ubuntu bionic-updates/main amd64 Packages [1205 kB]
Get:31 Index of /ubuntu bionic-updates/multiverse amd64 Packages [19.8 kB]
Get:32 Index of /ubuntu bionic-updates/restricted amd64 Packages [66.6 kB]
Get:33 Index of /ubuntu bionic-backports/main amd64 Packages [8286 B]
Get:34 Index of /ubuntu bionic-backports/universe amd64 Packages [8158 B]
Fetched 18.1 MB in 4s (4494 kB/s)
Reading package lists…
Reading package lists…
Building dependency tree…
Reading state information…
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
cuda-command-line-tools-10-2 : Depends: cuda-cupti-dev-10-2 (>= 10.2.19) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
The command '/bin/sh -c apt-get update && apt-get -o Dpkg::Options::="--force-overwrite" install -y $(cat /tmp/cuda-packages.txt) --reinstall --allow-downgrades && apt-get install -y libcublas-dev --reinstall --allow-downgrades && apt-mark hold $(cat /tmp/cuda-packages.txt) && rm -rf /var/lib/apt/lists/* && rm -rf /tmp/cuda-packages.txt' returned a non-zero code: 100

It seems the Dockerfile is not configured correctly. Any suggestions?
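
In a log this long, the one line that names the unmet dependency is easy to miss. As a minimal sketch, the helper below reads apt output on stdin and prints each package that apt says "is not going to be installed". The function name `missing_deps` is hypothetical and is not part of apt or DL4AGX.

```shell
# Hypothetical helper (not part of apt or DL4AGX): read apt output on stdin
# and print the name of each dependency that apt reports it will not install.
missing_deps() {
  sed -n 's/.*Depends: \([^ ]*\).*but it is not going to be installed.*/\1/p'
}
```

Assuming the build output was saved to a file (for example `build.log`, a name chosen here for illustration), `missing_deps < build.log` prints `cuda-cupti-dev-10-2` for the output above.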

Hello @James22,

From the log you attached, it seems a dependency cannot be satisfied, and therefore the rest of the installation failed. The package cuda-cupti-dev-10-2 is required by cuda-command-line-tools-10-2, but it is not in the list of packages selected for installation.

Please try adding the string cupti-dev to the environment variable CUDA_PACKAGES at line 131 of the Dockerfile docker/DRIVE/Dockerfile.aarch64-linux.5.1.6.0.

You can add it right after cupti, for example:
ENV CUDA_PACKAGES="nvrtc nvgraph cusolver cufft curand cusparse npp nvjpeg cudart cupti cupti-dev compiler misc-headers command-line-tools nvrtc-dev nvml-dev nvgraph-dev cusolver-dev cufft-dev curand-dev cusparse-dev npp-dev nvjpeg-dev cudart-dev driver-dev nvcc toolkit libraries-dev tools visual-tools"

Then please try again.
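
If you would rather script the change than edit the file by hand, something like the following could work. It is only a sketch: the function name `add_cupti_dev` is hypothetical, it assumes GNU sed (as shipped with Ubuntu 18.04), and it assumes the `ENV CUDA_PACKAGES=` assignment sits on a single line, as it does in the build log.

```shell
# Hypothetical helper: insert "cupti-dev" right after "cupti" on the
# ENV CUDA_PACKAGES line of the Dockerfile given as $1.
# Assumes GNU sed (-i without a suffix) and a single-line ENV assignment.
add_cupti_dev() {
  dockerfile="$1"
  # Do nothing if cupti-dev is already listed, so repeated runs are harmless.
  if grep -q '^ENV CUDA_PACKAGES=.*cupti-dev' "$dockerfile"; then
    return 0
  fi
  # Only touch the CUDA_PACKAGES line; match "cupti" as a whole word
  # delimited by spaces so that names such as "cupti-dev" are not affected.
  sed -i '/^ENV CUDA_PACKAGES=/ s/ cupti / cupti cupti-dev /' "$dockerfile"
}
```

It would be called as `add_cupti_dev docker/DRIVE/Dockerfile.aarch64-linux.5.1.6.0` from the repository root before rerunning `docker build`.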

Thank you @shayNV,

This solved the current issue, but now I’ve got another one:


Get:6 file:/var/nv-tensorrt-repo-cuda10.2-trt5.1.4.2-ga-20190506 uff-converter-tf 5.1.4-1+cuda10.2 [37.5 kB]
Get:7 file:/var/nv-tensorrt-repo-cuda10.2-trt5.1.4.2-ga-20190506 libcudnn7-dev 7.5.1.14-1+cuda10.2 [148 MB]
Get:8 file:/var/nv-tensorrt-repo-cuda10.2-trt5.1.4.2-ga-20190506 libnvinfer-dev 5.1.4-1+cuda10.2 [47.8 MB]
Fetched 714 kB in 4s (186 kB/s)
W: Download is performed unsandboxed as root as file '/pdk_files/graphsurgeon-tf_5.1.4-1+cuda10.2_amd64.deb' couldn't be accessed by user '_apt'. - pkgAcquire::Run (13: Permission denied)
Selecting previously unselected package libcudnn7.
(Reading database … 55526 files and directories currently installed.)
Preparing to unpack libcudnn7_7.5.1.14-1+cuda10.2_amd64.deb …
Unpacking libcudnn7 (7.5.1.14-1+cuda10.2) …
Selecting previously unselected package libcudnn7-dev.
Preparing to unpack libcudnn7-dev_7.5.1.14-1+cuda10.2_amd64.deb …
Unpacking libcudnn7-dev (7.5.1.14-1+cuda10.2) …
Setting up libcudnn7 (7.5.1.14-1+cuda10.2) …
Setting up libcudnn7-dev (7.5.1.14-1+cuda10.2) …
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v7.h to provide /usr/include/cudnn.h (libcudnn) in auto mode
Processing triggers for libc-bin (2.27-3ubuntu1) …
dpkg: error: cannot access archive 'python-libnvinfer_5.1.4-1+cuda10.2_amd64.deb': No such file or directory
The command '/bin/sh -c cd /pdk_files && dpkg -i ${tensorrt_repo_x86_64} && rm -rf ${tensorrt_repo_x86_64} && apt-get update && apt-get download libcudnn7=${cudnn_version_long}-1+cuda${CUDA_VERSION} libcudnn7-dev=${cudnn_version_long}-1+cuda${CUDA_VERSION} libnvinfer5=${trt_version_short}-1+cuda${CUDA_VERSION} libnvinfer-dev=${trt_version_short}-1+cuda${CUDA_VERSION} uff-converter-tf graphsurgeon-tf python3-libnvinfer python-libnvinfer && dpkg -i ${CUDNN_x86_64_DEBS} && rm -rf ${CUDNN_x86_64_DEBS} && dpkg -i ${TENSORRT_x86_64_DEBS} && rm -rf ${TENSORR_x86_64_DEBS}' returned a non-zero code: 2

The whole output is very long, so I'm attaching it here: Console output.log (96.2 KB)
It seems the build is trying to find a package named 'python-libnvinfer_5.1.4-1+cuda10.2_amd64.deb', which is not included in the files downloaded by SDK Manager. In fact, some other packages such as "python3-libnvinfer…", "uff-converter…" and "graphsurgeon-tf…" are also not found in the downloaded files. Could you suggest how to fix this?
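
One way to see exactly which of the expected x86_64 TensorRT packages are absent from a directory is sketched below. The package names and versions are taken from the build log; the function name `list_missing_trt_debs` is hypothetical, and the directory to inspect (for example `/pdk_files` inside the container) is an assumption.

```shell
# Versions copied from the build log above.
trt_version_short="5.1.4"
cuda_version="10.2"

# Hypothetical helper: print every expected x86_64 TensorRT .deb file name
# that does not exist in the directory given as $1.
list_missing_trt_debs() {
  dir="$1"
  for pkg in libnvinfer5 libnvinfer-dev python-libnvinfer python3-libnvinfer \
             uff-converter-tf graphsurgeon-tf; do
    # Debian archives are named <package>_<version>_<architecture>.deb.
    deb="${pkg}_${trt_version_short}-1+cuda${cuda_version}_amd64.deb"
    [ -e "$dir/$deb" ] || echo "$deb"
  done
}
```

An empty result means all six archives that `dpkg -i ${TENSORRT_x86_64_DEBS}` expects are present in that directory.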

Hello @James22,
From the log, it seems the container downloaded a python-libnvinfer version other than the expected one.

Please try updating the Dockerfile at line 175 to set the specific version needed:
instead of: python3-libnvinfer python-libnvinfer
write: python3-libnvinfer=${trt_version_short}-1+cuda${CUDA_VERSION} python-libnvinfer=${trt_version_short}-1+cuda${CUDA_VERSION}
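
To make the pinning concrete, here is a small sketch of how those `package=version` strings expand with the values from the build log. The helper name `pinned_specs` is hypothetical; in the Dockerfile the two variables come from `ARG` and `ENV` rather than being set by hand.

```shell
# Values copied from the build log; in the Dockerfile they are set by ARG/ENV.
trt_version_short="5.1.4"
CUDA_VERSION="10.2"

# Hypothetical helper: print "package=version" for each package name given,
# using the same version string the Dockerfile uses for libnvinfer5.
pinned_specs() {
  for pkg in "$@"; do
    printf '%s=%s-1+cuda%s\n' "$pkg" "$trt_version_short" "$CUDA_VERSION"
  done
}
```

Running `pinned_specs python3-libnvinfer python-libnvinfer` prints `python3-libnvinfer=5.1.4-1+cuda10.2` and `python-libnvinfer=5.1.4-1+cuda10.2`. Since `apt-get download` saves each archive as `<package>_<version>_<architecture>.deb`, pinning the version this way should yield the file name `python-libnvinfer_5.1.4-1+cuda10.2_amd64.deb` that the later `dpkg -i` step looks for.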

With these changes, I can now build the Docker image. However, when I try to compile the DL4AGX project, some packages still seem to be missing.

…DL4AGX$ dazel build //MultiDeviceInferencePipeline/... //plugins/... --config=D5L-toolchain
INFO: Analyzed 32 targets (35 packages loaded, 207 targets configured).
INFO: Found 32 targets...
ERROR: missing input file '@turbojpeg_aarch64_linux//:libturbojpeg.so'
ERROR: /home/XXX/.cache/bazel/_bazel_alien/external/turbojpeg_aarch64_linux/BUILD.bazel:4:1: @turbojpeg_aarch64_linux//:turbojpeg: missing input file '@turbojpeg_aarch64_linux//:libturbojpeg.so'
ERROR: /home/XXX/.cache/bazel/_bazel_alien/external/turbojpeg_aarch64_linux/BUILD.bazel:4:1 1 input file(s) do not exist
INFO: Elapsed time: 0.586s, Critical Path: 0.01s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

I also got this type of error:

ERROR: missing input file '@dali_aarch64_linux//:lib/libdali.so'

It seems that at least the aarch64 versions of libturbojpeg and DALI are missing from the Docker image. Could you suggest how to fix this?
Thank you!

Hello @James22,

I was able to reproduce the problem on my side.

I am sorry for the inconvenience. It seems there is a problem with the master branch of the repository, and the Dockerfile there is not working as expected.

We will work on getting the git repo updated soon.

In the meantime, I suggest you use the app/RetinaNetDALITRT branch of the git repository, as this one is working.
Please use the attached Dockerfile instead of the Dockerfile in that branch.
Dockerfile.aarch64-linux.5.1.6.0.zip (4.2 KB)

I have tested it on my side, and it works for building the libtensorrtinferop.so library.
Here are the commands I executed:
docker build -t nvidia/drive_os_pdk:5.1.6.0-linux -f docker/DRIVE/Dockerfile.aarch64-linux.5.1.6.0 docker/DRIVE
dazel build //plugins/dali/TensorRTInferOp:libtensorrtinferop.so --config=D5L-toolchain