This is what I do not understand: why doesn't the output have the same size? I think this is the error, but it occurs when the engine is generated. When I export the ONNX model, the output layer is predictions/Softmax.
You can refer to TRTEXEC with Classification TF1/TF2/PyT - NVIDIA Docs to generate the engine on your current device, then configure DeepStream to use this engine instead of generating one itself.
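As a minimal sketch, assuming the prebuilt engine is saved as model.engine and was built for batch size 1, the relevant nvinfer [property] keys would look roughly like this (adjust names and values to your setup):

[property]
# point nvinfer at the prebuilt engine so DeepStream does not build one itself
model-engine-file=model.engine
# must match the batch size the engine was built for
batch-size=1
# 0=FP32, 1=INT8, 2=FP16; match the precision used when building the engine
network-mode=0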
I get the same output when the engine is loaded:
INFO: [FullDims Engine Info]: layers num: 2
0 INPUT kFLOAT input_1 3x224x224 min: 1x3x224x224 opt: 2x3x224x224 Max: 64x3x224x224
1 OUTPUT kFLOAT predictions 35 min: 0 opt: 0 Max: 0
Also, the inference outputs the same labels as before (wrong label), and practically the same label every time.
Did you save the log from the export? It looks like your ONNX does not match tao_tensorflow1_backend/nvidia_tao_tf1/cv/makenet/export/classification_exporter.py at main · NVIDIA/tao_tensorflow1_backend · GitHub.
Yes, and at line 21 it has:
Using output nodes: ['predictions/Softmax']
log.txt (4.2 KB)
The problem is in generating the engine, which takes the output layer as predictions.
I exported the pretrained model to ONNX and then generated the engine, and I get the same error: the TensorRT engine takes predictions as the output. So there is some error in the pipeline that makes the generated engine not work.
This output name is expected; see tao_tensorflow1_backend/nvidia_tao_tf1/core/export/_onnx.py at 2ec95cbbe0d74d6a180ea6e989f64d2d97d97712 · NVIDIA/tao_tensorflow1_backend · GitHub.
For the TensorRT engine, you can run trtexec as below. For example:
$ trtexec --onnx=forum_303165.onnx --maxShapes=input_1:1x3x224x224 --minShapes=input_1:1x3x224x224 --optShapes=input_1:1x3x224x224 --saveEngine=forum_303165.engine --workspace=20480
You can check the engine with polygraphy. For example,
polygraphy inspect model forum_303165.engine
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Loading bytes from /localhome/local-morganh/bak_x11-0002/tf1_forum_298861/forum_303165.engine
[I] ==== TensorRT Engine ====
Name: Unnamed Network 0 | Explicit Batch Engine
---- 1 Engine Input(s) ----
{input_1 [dtype=float32, shape=(1, 3, 224, 224)]}
---- 1 Engine Output(s) ----
{predictions [dtype=float32, shape=(1, 2)]}
---- Memory ----
Device Memory: 136225280 bytes
---- 1 Profile(s) (2 Tensor(s) Each) ----
- Profile: 0
Tensor: input_1 (Input), Index: 0 | Shapes: min=(1, 3, 224, 224), opt=(1, 3, 224, 224), max=(1, 3, 224, 224)
Tensor: predictions (Output), Index: 1 | Shape: (1, 2)
---- 30 Layer(s) ----
For the accuracy issue, it should be the same as Inference with tensorrt engine file has different results compared with trained hdf5 model. As a workaround, you can train the model in nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5 and rerun the export.
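As a rough sketch of that workaround, assuming docker with the NVIDIA container runtime, an NGC login, and a hypothetical /path/to/workspace mount for your data and specs:

$ docker pull nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
$ docker run --runtime=nvidia -it --rm -v /path/to/workspace:/workspace nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5 /bin/bash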
After investigating, I found the bug. When I exported the model to ONNX using tao export, it created the nvinfer_config file with the offsets and the net-scale-factor. I was using the torch preprocessing to train my net in TAO, but the TAO export returned the preprocessing for the caffe type. So, after doing some math, I modified:
net-scale-factor: 0.017507
offsets: 123.675;116.28;103.53
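For reference, a reconstruction of that math, assuming the standard torch ImageNet normalization (mean = [0.485, 0.456, 0.406], std ≈ [0.229, 0.224, 0.225], applied to images scaled to [0, 1]):

y = (x/255 - mean) / std = (1 / (255 * std)) * (x - 255 * mean)

DeepStream applies y = net-scale-factor * (x - offsets) with a single scalar scale, so taking std ≈ 0.224 gives net-scale-factor = 1 / (255 * 0.224) ≈ 0.017507, and offsets = 255 * mean = 123.675; 116.28; 103.53.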
Now it works well and I get the same results as TAO.
OK, thanks for the info.