I am running the code in run-cifar-engine.py (see attached files) to use the engine arch_00000.trt (also attached) via the Python API.
The engine was created from the attached ONNX file via:
trtexec --onnx=arch_00000.onnx --saveEngine=arch_00000.trt
Both the engine build and trtexec's own inference run completed successfully.
Upon executing the following line (line 35 of the script), I get the error below:
35: context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
[TensorRT] ERROR: ../rtSafe/cuda/genericReformat.cu (1294) - Cuda Error in executeMemcpy: 1 (invalid argument)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
Things I have tried without success:
- Running it on a different machine:
  - Machine 1: a container using a V100
  - Machine 2: a Jetson Nano
I have also searched and found the following issue on GitHub:
As far as I can tell, my inputs and outputs are correctly sized.
- Does anyone know why this happens and how to fix it?
- How do I go about debugging this?
Any advice would be appreciated.
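Since the error is an invalid-argument memcpy, one way to double-check the sizing claim is to dump each binding's shape, dtype, and byte size from the engine and compare against the buffers run-cifar-engine.py allocates. This is a sketch using the TensorRT 7.x Python API; the helper name `binding_nbytes` is my own, not from the attached script:

```python
import numpy as np

def binding_nbytes(shape, np_dtype):
    # Bytes a host/device buffer must hold for a binding of this
    # shape and dtype; compare against what the script allocates.
    return int(np.prod(shape)) * np.dtype(np_dtype).itemsize

if __name__ == "__main__":
    import tensorrt as trt  # TensorRT 7.x Python API

    logger = trt.Logger(trt.Logger.WARNING)
    with open("arch_00000.trt", "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    for i in range(engine.num_bindings):
        shape = engine.get_binding_shape(i)
        dtype = trt.nptype(engine.get_binding_dtype(i))
        kind = "input " if engine.binding_is_input(i) else "output"
        print(f"{kind} {engine.get_binding_name(i)}: shape={tuple(shape)}, "
              f"dtype={np.dtype(dtype).name}, "
              f"bytes={binding_nbytes(shape, dtype)}")
```

If any printed byte count differs from the corresponding allocation in the script, that mismatch would explain the `executeMemcpy: 1 (invalid argument)` failure.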
TensorRT Version: 7.2.3-1+cuda11.1
GPU Type: Nvidia Tesla V100 32GB
Nvidia Driver Version: 465.27
CUDA Version: 11.3
Operating System + Version: Ubuntu 20.04.2 LTS
Python Version (if applicable): 3.8.5
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.5.0 (the model was exported to ONNX from this version)
Baremetal or Container (if container which image + tag): Container: nvcr.io/nvidia/tensorrt:21.05-py3
Steps to reproduce:
1. Download the attached files and change into that directory.
2. Start the container with Podman or Docker:
   podman run -it --rm -v $(pwd):/workdir -w /workdir nvcr.io/nvidia/tensorrt:21.05-py3
   or
   docker run -it --rm -v $(pwd):/workdir -w /workdir nvcr.io/nvidia/tensorrt:21.05-py3
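Inside the container, the remaining steps would be the following (file names taken from the attachments; `python3` as the interpreter is an assumption):

```shell
# Rebuild the engine from the attached ONNX model, then run the
# Python inference script that triggers the error.
trtexec --onnx=arch_00000.onnx --saveEngine=arch_00000.trt
python3 run-cifar-engine.py
```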