Docker issue

Hi all,

I am new to Docker and not very familiar with it. I am trying to use Docker to build the TensorRT repository on my Jetson AGX device, following the steps from the TensorRT GitHub repository.

Here are my steps:

git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
export TRT_SOURCE=`pwd`

sudo ./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-ubuntu-jetpack --os 18.04 --cuda 10.2

Here is the error output:

Building container:
> docker build -f docker/ubuntu-cross-aarch64.Dockerfile --build-arg OS_VERSION=18.04 --build-arg CUDA_VERSION=10.2  --build-arg uid=0 --build-arg gid=0 --tag=tensorrt-ubuntu-jetpack .
Sending build context to Docker daemon  7.137MB
Step 1/32 : ARG CUDA_VERSION=10.2
Step 2/32 : ARG OS_VERSION=18.04
Step 3/32 : ARG NVCR_SUFFIX=
Step 4/32 : FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${OS_VERSION}${NVCR_SUFFIX}
10.2-devel-ubuntu18.04: Pulling from nvidia/cuda
d7c3167c320d: Pull complete 
131f805ec7fd: Pull complete 
322ed380e680: Pull complete 
6ac240b13098: Pull complete 
006053f4160f: Pull complete 
7f180dbff330: Pull complete 
36adb6171cb0: Pull complete 
d7cf778701d8: Pull complete 
19ebebfc9562: Pull complete 
335ede31fbe9: Pull complete 
0d6c72ebfc16: Pull complete 
Digest: sha256:66981475ea20f9763ad04386d18a2316ff9e2ace393a75cdf4b1ff1b931e1ffc
Status: Downloaded newer image for nvidia/cuda:10.2-devel-ubuntu18.04
 ---> 930b6e8bb78c
Step 5/32 : LABEL maintainer="NVIDIA CORPORATION" 
 ---> Running in ba48e1b24c96
Removing intermediate container ba48e1b24c96
 ---> 4b7ec8376182
Step 6/32 : ARG uid=1000
 ---> Running in 22ae45f7d788
Removing intermediate container 22ae45f7d788
 ---> 1b6e693fb3e1
Step 7/32 : ARG gid=1000
 ---> Running in 9c046d82a5ed
Removing intermediate container 9c046d82a5ed
 ---> 3b1de79b19cf
Step 8/32 : RUN groupadd -r -f -g ${gid} trtuser && useradd -r -u ${uid} -g ${gid} -ms /bin/bash trtuser
 ---> Running in 4a09b4e77709
standard_init_linux.go:211: exec user process caused "exec format error" 
The command '/bin/sh -c groupadd -r -f -g ${gid} trtuser && useradd -r -u ${uid} -g ${gid} -ms /bin/bash trtuser' returned a non-zero code: 1

Does anyone have an idea what is causing this?
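
A note on the error above: "exec format error" generally means the container is trying to run binaries built for a different CPU architecture than the host. The Dockerfile name (ubuntu-cross-aarch64) suggests it is meant for cross-compiling on an x86_64 host rather than for running on the Jetson itself, though that is an inference from the name and is not confirmed in this thread. The following is a minimal sketch for comparing the two architectures. The helper names `docker_arch_for` and `check_arch` are made up for illustration, and the Docker query only runs if Docker is installed and the image is already present locally.

```shell
#!/bin/sh
# Map a `uname -m` machine name to the architecture name Docker reports.
docker_arch_for() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *)             echo "$1" ;;
    esac
}

# Compare the host architecture with the image architecture and print a verdict.
check_arch() {
    host_arch=$1
    image_arch=$2
    if [ "$host_arch" = "$image_arch" ]; then
        echo "match: $host_arch"
    else
        echo "mismatch: host=$host_arch image=$image_arch"
    fi
}

host=$(docker_arch_for "$(uname -m)")
echo "host architecture: $host"

# Base image named in the build log above.
image=nvidia/cuda:10.2-devel-ubuntu18.04

# Query the image only when Docker exists and the inspect call succeeds.
if command -v docker >/dev/null 2>&1 &&
   img_arch=$(docker image inspect --format '{{.Architecture}}' "$image" 2>/dev/null); then
    check_arch "$host" "$img_arch"
fi
```

If this reports a mismatch, the image cannot run natively on that host, and a base image built for the host architecture is needed instead.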

Everything works fine directly on the device when I run TensorRT without Docker.
My goal is to run TensorRT inside Docker on the AGX device.

I have a second question: do I still need to install nvidia-docker? (I know Docker is already installed after flashing the device with SDK Manager.)
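
On the nvidia-docker question, one quick check is whether Docker already lists an `nvidia` runtime. The sketch below assumes that a registered runtime shows up under that name in `docker info`; the helper `has_nvidia_runtime` is a hypothetical name used only for this example, and the check is skipped when Docker is not available.

```shell
#!/bin/sh
# Return success if the given runtime listing mentions an "nvidia" runtime.
has_nvidia_runtime() {
    case "$1" in
        *nvidia*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Ask the Docker daemon for its registered runtimes as JSON, if possible.
if command -v docker >/dev/null 2>&1 &&
   runtimes=$(docker info --format '{{json .Runtimes}}' 2>/dev/null); then
    if has_nvidia_runtime "$runtimes"; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime not found"
    fi
fi
```

If the runtime is listed, containers can typically be started with `--runtime nvidia` without installing anything further.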

Thank you in advance.

Sincerely,
Chieh


Environment

TensorRT Version : 7.1 with Jetpack 4.4
GPU Type : (Jetson AGX Xavier)
Nvidia Driver Version : Jetpack 4.4
CUDA Version : 10.2
CUDNN Version : 8.0
Operating System + Version : Jetpack 4.4 (Customized Ubuntu 18.04)
Python Version (if applicable) : 3.6
PyTorch Version (if applicable) : 1.5.0
cmake version : 3.13.0
opencv version : 4.1.1

Hi,

TensorRT is a CUDA-related library, so you will need to handle the GPU driver when building it inside Docker.
The simplest way is to use our package that has TensorRT installed directly.

Please let us know if this does not meet your requirements.
Thanks.

Thanks for your advice!
I have tested it, and it works. I will try to build the repository in this container.

BTW, are there any particular differences between nvidia-l4t-ml and nvidia:deepstream-l4t? I am not sure which one I should choose…

Thanks.

I have tested it and successfully ran TensorRT inference with NVIDIA Docker.
Thank you so much.

Good to know this.

The nvidia-l4t-ml and nvidia:deepstream-l4t images have different libraries installed.
You can check the descriptions on NGC for details.

In general, use nvidia:deepstream-l4t if you plan to use our DeepStream SDK.
Otherwise, you can use nvidia-l4t-ml, which has several popular ML libraries installed.
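
The choice above can be sketched as a small helper that prints the matching `docker run` command without executing it. The repository paths follow NGC naming as I understand it, the tag is a deliberate placeholder that has to be taken from the NGC page for your L4T release, and `image_for` and `run_command` are hypothetical names for this example only.

```shell
#!/bin/sh
# Map a use case to an L4T container repository (assumed NGC paths).
image_for() {
    case "$1" in
        deepstream) echo "nvcr.io/nvidia/deepstream-l4t" ;;
        ml)         echo "nvcr.io/nvidia/l4t-ml" ;;
        *)          echo "unknown use case: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) a docker command that uses the NVIDIA runtime on Jetson.
run_command() {
    image=$(image_for "$1") || return 1
    echo "docker run -it --rm --runtime nvidia $image:$2"
}

# TAG is a placeholder; replace it with a tag listed on NGC for your release.
run_command ml "${TAG:-TAG_FROM_NGC}"
```

Printing the command first makes it easy to review the image and tag before pulling a large container onto the device.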

Thanks.
