Environment
TensorRT Version: 7.2.1
NVIDIA GPU: T4
NVIDIA Driver Version: 450.51.06
CUDA Version: 11.1
CUDNN Version: 8.0.4
Operating System: Ubuntu 18.04
Python Version (if applicable): 1.8
Tensorflow Version (if applicable):
PyTorch Version (if applicable): container image nvcr.io/nvidia/pytorch:20.11-py3
Baremetal or Container (if so, version): container image deepstream:5.1-21.02-triton
Regarding the "=== Build and Inference Batch Options ===" section in trtexec: which options should I use to build an engine with dynamic input shapes, so that it can later be deployed with DS-Triton at batch size > 1? Right now I am getting the error "TensorRT engine only supports max-batch 1" with DeepStream.
See below:
Build the engine with dynamic shapes:
$ /usr/src/tensorrt/bin/trtexec --onnx=yolov4_-1_3_608_608_dynamic.onnx --explicitBatch --minShapes='input':1x3x608x608 --optShapes='input':4x3x608x608 --maxShapes='input':8x3x608x608 --workspace=4096 --saveEngine=yolov4_-1_3_608_608_dynamic_int8_.engine --int8
Run inference with trtexec at the default batch size:
$ /usr/src/tensorrt/bin/trtexec --loadEngine=yolov4_-1_3_608_608_dynamic_onnx_int8_trtexec_4.engine --int8
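(For reference, a dynamic-shape engine can also be exercised at an explicit batch size with trtexec's --shapes option, as long as the shape falls inside the min/max range of the optimization profile the engine was built with. A sketch, reusing the engine name above:)

```shell
# Run the loaded engine at batch 4 (must lie within the
# minShapes..maxShapes range used at build time)
/usr/src/tensorrt/bin/trtexec \
    --loadEngine=yolov4_-1_3_608_608_dynamic_onnx_int8_trtexec_4.engine \
    --shapes='input':4x3x608x608 \
    --int8
```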
Result:
[03/16/2021-00:23:14] [I] Host Latency
[03/16/2021-00:23:14] [I] min: 7.01904 ms (end to end 12.0889 ms)
[03/16/2021-00:23:14] [I] max: 7.89343 ms (end to end 13.8339 ms)
[03/16/2021-00:23:14] [I] mean: 7.15021 ms (end to end 12.3533 ms)
[03/16/2021-00:23:14] [I] median: 7.09982 ms (end to end 12.2517 ms)
[03/16/2021-00:23:14] [I] percentile: 7.88986 ms at 99% (end to end 13.818 ms at 99%)
[03/16/2021-00:23:14] [I] throughput: 160.912 qps
[03/16/2021-00:23:14] [I] walltime: 3.02029 s
[03/16/2021-00:23:14] [I] Enqueue Time
[03/16/2021-00:23:14] [I] min: 1.4646 ms
[03/16/2021-00:23:14] [I] max: 1.79004 ms
[03/16/2021-00:23:14] [I] median: 1.48828 ms
[03/16/2021-00:23:14] [I] GPU Compute
[03/16/2021-00:23:14] [I] min: 6.0675 ms
[03/16/2021-00:23:14] [I] max: 6.93729 ms
[03/16/2021-00:23:14] [I] mean: 6.1978 ms
[03/16/2021-00:23:14] [I] median: 6.14783 ms
[03/16/2021-00:23:14] [I] percentile: 6.9351 ms at 99%
[03/16/2021-00:23:14] [I] total compute time: 3.01213 s
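(As a quick sanity check on the log above: trtexec pipelines host and device work, which is why end-to-end latency is roughly double GPU compute time while throughput tracks GPU compute. The reported figures are self-consistent:)

```python
# Figures copied from the trtexec log above
throughput_qps = 160.912   # reported throughput, queries per second
walltime_s = 3.02029       # reported wall time of the benchmark
mean_gpu_ms = 6.1978       # reported mean GPU compute latency per query

queries = throughput_qps * walltime_s            # ~486 inferences executed
total_compute_s = queries * mean_gpu_ms / 1000   # should match "total compute time"
print(round(queries), round(total_compute_s, 3)) # → 486 3.012
```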
Print the engine’s input and output shapes:
input shape : (-1, 3, 608, 608)
out shape : (-1, 22743, 1, 4)
Deploy the engine with DS
Run inference on DS with max_batch_size=1
$ deepstream-app -c source1_primary_yolov4.txt
I0316 01:15:35.232182 159 model_repository_manager.cc:810] loading: yolov4_nvidia:1
I0316 01:15:46.895954 159 plan_backend.cc:333] Creating instance yolov4_nvidia_0_0_gpu0 on GPU 0 (7.5) using yolov4_-1_3_608_608_dynamic_onnx_int8_trtexec_4.engine
I0316 01:15:47.333165 159 plan_backend.cc:666] Created instance yolov4_nvidia_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0316 01:15:47.334265 159 model_repository_manager.cc:983] successfully loaded 'yolov4_nvidia' version 1
INFO: infer_trtis_backend.cpp:206 TrtISBackend id:1 initialized model: yolov4_nvidia
Runtime commands:
h: Print this help
q: Quit
p: Pause
r: Resume
NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
To go back to the tiled display, right-click anywhere on the window.
**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:181>: Pipeline ready
** INFO: <bus_callback:167>: Pipeline running
**PERF: 138.28 (138.17)
**PERF: 141.00 (139.60)
** INFO: <bus_callback:204>: Received EOS. Exiting ...
Quitting
I0316 01:16:00.336337 159 model_repository_manager.cc:837] unloading: yolov4_nvidia:1
I0316 01:16:00.338973 159 server.cc:280] Waiting for in-flight requests to complete.
I0316 01:16:00.338986 159 server.cc:295] Timeout 30: Found 1 live models and 0 in-flight non-inference requests
I0316 01:16:00.378079 159 model_repository_manager.cc:966] successfully unloaded 'yolov4_nvidia' version 1
I0316 01:16:01.339052 159 server.cc:295] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
App run successful
Run inference on DS with max_batch_size=4
$ deepstream-app -c source1_primary_yolov4.txt
Error:
E0316 01:20:12.879238 195 model_repository_manager.cc:1705] unable to autofill for 'yolov4_nvidia', configuration specified max-batch 4 but TensorRT engine only supports max-batch 1
ERROR: infer_trtis_server.cpp:1044 Triton: failed to load model yolov4_nvidia, triton_err_str:Internal, err_msg:failed to load 'yolov4_nvidia', no version is available
ERROR: infer_trtis_backend.cpp:45 failed to load model: yolov4_nvidia, nvinfer error:NVDSINFER_TRTIS_ERROR
ERROR: infer_trtis_backend.cpp:184 failed to initialize backend while ensuring model:yolov4_nvidia ready, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:14.484600140 195 0x56007e1c7cf0 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in createNNBackend() <infer_trtis_context.cpp:246> [UID = 1]: failed to initialize trtis backend for model:yolov4_nvidia, nvinfer error:NVDSINFER_TRTIS_ERROR
I0316 01:20:12.879481 195 server.cc:280] Waiting for in-flight requests to complete.
I0316 01:20:12.879488 195 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
0:00:14.484704250 195 0x56007e1c7cf0 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:81> [UID = 1]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:14.484716684 195 0x56007e1c7cf0 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Failed to initialize InferTrtIsContext
0:00:14.484722696 195 0x56007e1c7cf0 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Config file path: /workspace/Deepstream_5.1_Triton/samples/configs/deepstream-app-trtis/config_infer_primary_yolov4.txt
0:00:14.485106084 195 0x56007e1c7cf0 WARN nvinferserver gstnvinferserver.cpp:460:gst_nvinfer_server_start:<primary_gie> error: gstnvinferserver_impl start failed
** ERROR: <main:655>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to initialize InferTrtIsContext
Debug info: gstnvinferserver_impl.cpp(439): start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie:
Config file path: /workspace/Deepstream_5.1_Triton/samples/configs/deepstream-app-trtis/config_infer_primary_yolov4.txt
ERROR from primary_gie: gstnvinferserver_impl start failed
Debug info: gstnvinferserver.cpp(460): gst_nvinfer_server_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
App run failed
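For completeness, the only change between the two runs is the batch size in the model configuration. A sketch of the relevant config.pbtxt fragment in the Triton model repository (field names per Triton's model configuration schema; the input name and dims below are assumptions based on the shapes printed earlier, and the remaining fields are normally autofilled):

```
# config.pbtxt (sketch) for the Triton model repository entry
name: "yolov4_nvidia"
platform: "tensorrt_plan"
max_batch_size: 4          # the value that triggers the error above
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 608, 608 ]  # batch dim is handled by max_batch_size
  }
]
```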