My problem with --loadInputs in trtexec

I am trying to use a specific test image (for example: checkerboard.png) as inference input to trtexec. I followed the instructions here: About --loadInputs in trtexec - #2 by NVES

  1. convert my input file (checkerboard.png) to binary data (checkerboard.dat); the size is 1 x 3 x 1500 x 1500 = 67500000 bytes
  2. if I run trtexec on the GPU: trtexec …--loadInputs=input:0:checkerboard.dat…, it runs without problem
  3. but if I run trtexec on GPU+DLA: trtexec --useDLACore=0 --loadInputs=input:0:checkerboard.dat…, it fails to read checkerboard.dat with the message “Note: Expected: 72000000 bytes but only read: 67500000 bytes”

How can the same input data (checkerboard.dat) work when running on the GPU but fail when running on GPU+DLA?

Hi,

We need to reproduce this issue to understand where the error comes from.
Could you share a reproducible sample with us so we can check it with our internal team?

Thanks.

Sure, here is how you can reproduce it:

  1. take the simple model (demo-bs1.onnx)
    demo-bs1.onnx (1.5 KB)

  2. take the test image (checkerboard.png) and its binary file (checkerboard.dat)
    checkerboard.dat (64.4 MB)

  3. follow the instructions in GitHub - NVIDIA-AI-IOT/jetson_benchmarks: Jetson Benchmark to profile this model on AGX Orin, using the options: --precision int8 --loadInputs=input_18:0:checkerboard.dat (you will have to make a minor change to jetson_benchmarks/utils/load_store_engine.py, basically appending a string with the --loadInputs option to engine_CMD; see the sketch right after these steps)
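
For reference, this is roughly the change I mean. It is only a sketch: it assumes engine_CMD is assembled as a plain string of trtexec arguments (the actual structure of load_store_engine.py may differ), and append_load_inputs is just an illustrative name:

```python
# Sketch only: append --loadInputs to the trtexec command that
# jetson_benchmarks builds in utils/load_store_engine.py.
# Assumes engine_CMD is a plain string of trtexec arguments.

LOAD_INPUTS = "--loadInputs=input_18:0:checkerboard.dat"  # tensor name comes from the ONNX model

def append_load_inputs(engine_cmd: str, load_inputs: str = LOAD_INPUTS) -> str:
    """Return the trtexec command string with --loadInputs tacked on the end."""
    return engine_cmd.rstrip() + " " + load_inputs

if __name__ == "__main__":
    # Illustrative command; the real one is generated by the benchmark script.
    cmd = "trtexec --onnx=demo-bs1.onnx --int8 --useDLACore=0 --allowGPUFallback"
    print(append_load_inputs(cmd))
```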

After that, you should be able to run without problem with the GPU-only configuration in benchmark_csv/orin-benchmarks.csv:
demo, onnx,1,1,0,2048,0,NA,NA

but you will hit the problem if GPU + 2 DLA is used:
demo, onnx,3,1,0,2048,0,NA,NA

Hi,

Have you tried to use the trtexec binary directly?
Could you try to feed the same dat file as input to trtexec to see if it works?

Thanks.

Yes, I have tried the trtexec binary directly, and I hit exactly the same problem.

I checked more on this today and here is what I found:

  1. I tested with the trtexec binary directly. My simple model is ONNX with a float32 input data type. For example, if I want to pass a test image (3 x 200 x 200) to trtexec via --loadInputs, I have to convert the image into binary data first, which ends up as either 120000 bytes (int8) or 480000 bytes (float32); see the conversion sketch after this list
  2. if I run the model on GPU only (with precision int8), it expects the input to be 120000 bytes (1 x 3 x 200 x 200). No issue here.
  3. if I run the model on GPU+DLA, the error log complains that it expects the input data to be 1280000 bytes, so it seems 1 x 32 x 200 x 200 is used
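
Here is roughly how I produce the binary file. This is a sketch only: the file names are mine, and the int8 cast just fixes the element size, it does not apply any quantization or scaling:

```python
# Convert a test image into raw NCHW bytes for trtexec --loadInputs.
# Sketch only: assumes a 3-channel image; no quantization/scaling is applied.
import numpy as np
from PIL import Image

img = Image.open("checkerboard.png").convert("RGB").resize((200, 200))
chw = np.asarray(img).transpose(2, 0, 1)                 # HWC -> CHW, shape (3, 200, 200)

chw.astype(np.int8).tofile("checkerboard_int8.dat")      # 3*200*200   = 120000 bytes
chw.astype(np.float32).tofile("checkerboard_fp32.dat")   # 3*200*200*4 = 480000 bytes
```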

I tested other images with various resolutions and the issue is the same. So the question is: where does this “32” come from, and why is the expected test image size different for GPU vs GPU+DLA?

Hi,

There are some constraints based on the data type you configured.
For example, C must be padded to the next multiple of 16 for the kCHW16 format and to the next multiple of 32 for the kCHW32 format.
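
For example, assuming the DLA path selects the kCHW32 format for int8 input, the expected sizes in this thread follow from padding C up to the next multiple of 32 (the helper below is only illustrative):

```python
import math

def dla_chw32_bytes(c: int, h: int, w: int, elem_size: int = 1) -> int:
    """Buffer size when C is padded to the next multiple of 32 (kCHW32, int8 elements by default)."""
    padded_c = math.ceil(c / 32) * 32
    return padded_c * h * w * elem_size

print(dla_chw32_bytes(3, 200, 200))    # 1280000  -> the GPU+DLA expectation for a 3 x 200 x 200 input
print(dla_chw32_bytes(3, 1500, 1500))  # 72000000 -> the expectation in the original error message
```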

For more details, please check the doc below:

Thanks.

