Please provide the following information when requesting support.
• Hardware (T4)
• Network Type (Dino FAN small)
• How to reproduce the issue?
Convert the .pth weights to ONNX (config attached below), convert the ONNX model to a TensorRT engine, and run inference inside a Docker container on a T4 GPU in a cloud VM, with batch size 1.
Running inference with DINO FAN small (FP32) on 80 images takes around 45 seconds (~0.56 s per image), which is very close to the inference time of DINO FAN large (FP32).
However, according to the NVIDIA docs referenced below, DINO FAN small should be roughly 2x faster than DINO FAN large.
The same is true for all the other small DINO variants compared to DINO FAN large.
Could you please advise why this is the case and guide me on how to resolve it?
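For reference, here is roughly how I plan to isolate the pure engine latency with trtexec, so that data loading and post-processing are excluded (the engine filename and the input tensor name "inputs" are assumptions from my export and may need adjusting):

trtexec --loadEngine=dino_fan_small_fp32.engine \
    --shapes=inputs:1x3x544x960 \
    --warmUp=1000 --iterations=100 --avgRuns=100

Running the same command against the DINO FAN large engine should show whether the gap is really in GPU compute time or in my pre/post-processing.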
Configuration used for the .pth-to-ONNX conversion:
export:
  gpu_id: 0
  input_width: 960
  input_height: 544
  opset_version: 17
  on_cpu: False
dataset:
  num_classes: 91
  batch_size: -1
model:
  backbone: fan_small
  num_feature_levels: 4
  dec_layers: 6
  enc_layers: 6
  num_queries: 900
  num_select: 100
  dropout_ratio: 0.0
  dim_feedforward: 2048
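For completeness, the export step was run roughly as below (a sketch assuming the TAO Toolkit 5.x launcher CLI; the spec, checkpoint, and output paths are placeholders for my setup):

tao model dino export -e /workspace/specs/export.yaml \
    export.checkpoint=/workspace/results/train/dino_fan_small.pth \
    export.onnx_file=/workspace/results/export/dino_fan_small.onnx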
Configuration used for the ONNX-to-TensorRT engine conversion:
gen_trt_engine:
  gpu_id: 0
  input_width: 960
  input_height: 544
  tensorrt:
    data_type: fp32
    workspace_size: 4096
    min_batch_size: 1
    opt_batch_size: 8
    max_batch_size: 8
dataset:
  num_classes: 91
  batch_size: 1
model:
  backbone: fan_small
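The engine itself was generated roughly as below (again a sketch assuming the TAO 5.x tao-deploy CLI; paths are placeholders):

tao deploy dino gen_trt_engine -e /workspace/specs/gen_trt_engine.yaml \
    gen_trt_engine.onnx_file=/workspace/results/export/dino_fan_small.onnx \
    gen_trt_engine.trt_engine=/workspace/results/export/dino_fan_small_fp32.engine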