TensorRT Installation and Running Error on AWS EC2 Deep Learning AMI Instance

Description

Hello,

I have a Deep Learning AMI on the AWS EC2 (Deep Learning AMI (Ubuntu 18.04) Version 48.0).

I need to use TensorRT on this.

I set up the NGC configuration (API key), and then pulled the TensorRT container (docker pull nvcr.io/nvidia/tensorrt:20.11-py3).

After that, when I try to run the Docker image with this command,

docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorrt:20.11-py3

I got this error (see the attached screenshot).

When I try to run the Docker image with another command, I get a different error.

What should I do to solve this problem and run TensorRT on my EC2 instance?

This is an urgent problem, so please help me as soon as possible.

Thanks

Environment

TensorRT Version: TensorRT 7.2.1
GPU Type: Tesla K80
Nvidia Driver Version: 450.142.00
CUDA Version: 11.1.0 (included in the container)
CUDNN Version: 8.0.4 (included in the container)
Operating System + Version: (Ubuntu 18.04) Version 48.0
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.15.5
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Container (nvcr.io/nvidia/tensorrt:20.11-py3)

Relevant Files

Steps To Reproduce

docker pull nvcr.io/nvidia/tensorrt:20.11-py3
docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorrt:20.11-py3
docker run nvcr.io/nvidia/tensorrt:20.11-py3

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,
Please refer to the installation steps at the link below in case you are missing anything:
https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html
Also, we suggest you use the TRT NGC containers to avoid any system-dependency-related issues:
https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt

Thanks!

Hi,

I have already used the TRT NGC container, and my problem is not solved.

I get this error when I try to run the pulled TRT container.

According to this page (Container Release Notes :: NVIDIA Deep Learning TensorRT Documentation),

the TRT container includes:
- NVIDIA CUDA 11.1.0
- NVIDIA cuDNN 8.0.4
- NVIDIA NCCL 2.8.2

So, to begin with, I only need NVIDIA driver 455 or later.

Since I am using an EC2 Deep Learning AMI instance, it already comes with an NVIDIA driver. I don't need any pre-installation for TensorRT, right? Or do I need extra installation for this purpose?
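One thing worth checking before anything else: the release notes quoted above ask for driver 455 or later, while the environment section of this thread lists 450.142.00. A minimal sketch of that check, assuming the version threshold from the container release notes (the hardcoded version is the one reported in this thread; on the instance itself nvidia-smi supplies it):

```shell
# Version the 20.11 container release notes ask for.
REQUIRED_MAJOR=455

# Driver version as reported in this thread; prefer the live value
# from nvidia-smi when it is available and returns something.
DRIVER_VERSION="450.142.00"
if command -v nvidia-smi >/dev/null 2>&1; then
    SMI_OUT=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n1)
    [ -n "$SMI_OUT" ] && DRIVER_VERSION="$SMI_OUT"
fi

# Compare only the major component (e.g. "450" from "450.142.00").
DRIVER_MAJOR=${DRIVER_VERSION%%.*}
if [ "$DRIVER_MAJOR" -lt "$REQUIRED_MAJOR" ]; then
    echo "Driver $DRIVER_VERSION is older than release $REQUIRED_MAJOR; the 20.11 container may refuse to start."
fi
```

If the check fails, updating the AMI's driver (or picking an older container tag matched to driver 450) would be the two obvious options.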

Thanks

@NVES

Hi,

Any suggestions for this issue? It is urgent for me.

thanks.

Hi @skilic ,
In the error shown in the screenshot, I see you are using the command as-is:

docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/tensorrt:xx.xx-py3

However, here you need to replace local_dir:container_dir with your host directory and mount directory, respectively.
You need to mount a path on your host machine into the container.
Can you please try that and let us know?
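For example (a sketch only; the directory names below are placeholders, and any existing folder on the EC2 host works as the left-hand side of -v):

```shell
# Host side of the mount: any folder you want visible inside the container.
HOST_DIR="$HOME/trt_workspace"    # placeholder path on the EC2 host
CONTAINER_DIR=/workspace/host     # where it will appear inside the container
mkdir -p "$HOST_DIR"

# Only attempt the run in an interactive shell where docker is installed.
if command -v docker >/dev/null 2>&1 && [ -t 0 ]; then
    docker run --gpus all -it --rm \
        -v "$HOST_DIR:$CONTAINER_DIR" \
        nvcr.io/nvidia/tensorrt:20.11-py3
fi
```

The -v flag's syntax is host_path:container_path; neither side has to pre-exist inside the image, Docker creates the container-side directory at the mount point.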

Thanks!

Hello, @AakankshaS,

I tried and I failed.

I am working on AWS EC2 Deep learning AMI.

I couldn't find the host dir and mount dir. How can I get these directories?

By the way,

Here is the output of the docker images command.

(screenshot: docker images output)

And, when I run this command;

docker run --gpus all nvcr.io/nvidia/tensorrt:20.11-py3

I get the TensorRT information banner as output, but the container doesn't keep running.
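One plausible explanation for "prints the banner but doesn't run" (an assumption, since the screenshot isn't visible here): without -i and -t the container has no interactive terminal attached, so its shell exits right after the startup banner. A sketch of the corrected command:

```shell
# The command from the post, plus -i (keep stdin open) and -t (allocate
# a pseudo-TTY) so the container's shell does not exit immediately.
CMD="docker run --gpus all -it --rm nvcr.io/nvidia/tensorrt:20.11-py3"
echo "$CMD"

# Only actually run it in a real interactive session with docker installed.
if command -v docker >/dev/null 2>&1 && [ -t 0 ]; then
    $CMD
fi
```

With -it you should land at a shell prompt inside the container instead of being returned to the host.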


I think I am missing something easy but haven't found it yet.

Could you help me please?

Thanks

Hi @skilic ,
I believe for the error you are getting, you can just point the host dir side of the mount to any folder on the server.
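Concretely, even the current working directory is fine as the host side; a minimal sketch (the container path /workspace/host is an arbitrary choice, not something the image requires):

```shell
# Mount whatever directory you are standing in on the EC2 host.
MOUNT="$(pwd):/workspace/host"

# Guarded so this only fires in an interactive shell with docker present.
if command -v docker >/dev/null 2>&1 && [ -t 0 ]; then
    docker run --gpus all -it --rm \
        -v "$MOUNT" \
        nvcr.io/nvidia/tensorrt:20.11-py3
fi
```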