Description
I run the code in run-cifar-engine.py (see attached files) to execute the engine arch_00000.trt (also attached) with the TensorRT Python API.
The engine was created from the attached ONNX file via:
trtexec --onnx=arch_00000.onnx --saveEngine=arch_00000.trt
Both the engine build and the inference run performed by trtexec passed.
Upon execution of the following line (line 35 of run-cifar-engine.py) I get the error below:
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
[TensorRT] ERROR: ../rtSafe/cuda/genericReformat.cu (1294) - Cuda Error in executeMemcpy: 1 (invalid argument)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
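For context, here is a condensed sketch of the buffer setup around the failing call; run-cifar-engine.py follows the standard pycuda allocation pattern from the TensorRT samples. Names are simplified here and the attached file is authoritative:

import pycuda.autoinit  # creates and activates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine built by trtexec.
with open("arch_00000.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
stream = cuda.Stream()

# One pinned host buffer and one device buffer per binding,
# sized from the engine (the model's shapes are static).
bindings, host_bufs, dev_bufs = [], [], []
for i in range(engine.num_bindings):
    size = trt.volume(engine.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host_bufs.append(cuda.pagelocked_empty(size, dtype))
    dev_bufs.append(cuda.mem_alloc(host_bufs[-1].nbytes))
    bindings.append(int(dev_bufs[-1]))

# host_bufs[0][:] = ...  # CIFAR input preprocessing omitted;
# binding 0 is assumed to be the input, the last binding the output.
cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)  # fails here
cuda.memcpy_dtoh_async(host_bufs[-1], dev_bufs[-1], stream)
stream.synchronize()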
Things I have unsuccessfully tried:
- Running it on a different machine:
  - Machine 1 is a container using a V100.
  - Machine 2 is a Jetson Nano.
- Searching for similar reports; I found the following GitHub issue:
  https://github.com/NVIDIA/TensorRT/issues/421
  As far as I can tell, however, my inputs and outputs are correctly sized (see the check sketched after this list).
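To verify the sizing, the engine's binding shapes and dtypes can be compared against the host arrays; a minimal sketch of that check, reusing the engine object from the sketch above:

# Print every binding as TensorRT sees it, to compare against
# the shapes and dtypes of the numpy arrays fed into the bindings.
for i in range(engine.num_bindings):
    print(i,
          engine.get_binding_name(i),
          "input" if engine.binding_is_input(i) else "output",
          engine.get_binding_shape(i),
          trt.nptype(engine.get_binding_dtype(i)))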
Questions:
- Does anyone know why this happens and how to fix it?
- Alternatively: how do I go about debugging this? One thing I could try is enabling TensorRT's verbose logger (sketched below); is that a sensible direction?

I would be thankful for any advice.
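For reference, this is what I mean by the verbose logger: deserializing the engine with trt.Logger.VERBOSE instead of the default severity, so TensorRT prints more detail around the failure:

# Same deserialization as above, but with verbose logging enabled.
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
with open("arch_00000.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())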
Environment
TensorRT Version: 7.2.3-1+cuda11.1
GPU Type: Nvidia Tesla V100 32GB
Nvidia Driver Version: 465.27
CUDA Version: 11.3
CUDNN Version:
Operating System + Version: Ubuntu 20.04.2 LTS
Python Version (if applicable): 3.8.5
TensorFlow Version (if applicable):
PyTorch Version (if applicable): Model was exported to ONNX from PyTorch v1.5.0
Baremetal or Container (if container which image + tag): Container: nvcr.io/nvidia/tensorrt:21.05-py3
Relevant Files
https://drive.google.com/drive/folders/1hUyXW3nWyH8cEodsqB0LpmmqYAjFVS7u?usp=sharing
Steps To Reproduce
- Download the files and change into that directory.
- Use docker/podman to start the container:
  podman run -it --rm -v $(pwd):/workdir -w /workdir nvcr.io/nvidia/tensorrt:21.05-py3
  or
  docker run -it --rm -v $(pwd):/workdir -w /workdir nvcr.io/nvidia/tensorrt:21.05-py3
- Inside the container, run:
  python3 run-cifar-engine.py