Clarifications needed on tao-converter

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc): GeForce 3090
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) : Yolo_v4
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): 3.22.05

I have a few questions about using tao-converter to build a TensorRT engine from an exported model file:

In model export, I see the following options:

(source)

In tao-converter, I see the following options:

(source)

  1. What is the role of -m / --max_batch_size? Are they only relevant in int8 mode?
  2. The -b value in tao-converter should be equal to the --batch_size value used during calibration file export, right?
  3. When setting -p for optimization profiles, if my app always has X input sources, should I set the <n> value (in <n>x<c>x<h>x<w>) of <opt_shape> to X, or should I have identical profiles for min/opt/max?
  4. What is the role of the -s option? I know it sets the strict_type_constraints flag for int8 mode, but I’m not sure what that means.
  1. The max_batch_size is used to allocate memory when generating the TensorRT engine.
    It is not only relevant in int8 mode.
  2. No. Running “export” and running “tao-converter” are separate steps. “export” generates the .etlt model (and optionally a TensorRT engine), while “tao-converter” generates a TensorRT engine from that .etlt model. The “-b” values can be the same or different; they are not related.
  3. Usually your model already has a known width and height after training, for example a 3x544x960 YOLOv4 model. If you run it with batch-size 4 in DeepStream, you can find in the log that min_shape is 1x3x544x960 and max_shape is 4x3x544x960. With tao-converter, you can generate a TensorRT engine by setting min_shape to 1x3x544x960, max_shape to 4x3x544x960, and opt_shape to any batch in between (1x3x544x960, 2x3x544x960, 3x3x544x960, or 4x3x544x960); see the example command after this list.
  4. int8 + “-s” → sets builder flags kINT8 and kSTRICT_TYPES
    int8 alone → sets builder flags kINT8 and kFP16 (if supported)
    With kSTRICT_TYPES, TensorRT must keep the requested precision for each layer even when a higher-precision implementation would be faster; without it, TensorRT may fall back to FP16/FP32 kernels for layers where int8 is slower or unsupported. See more info about kSTRICT_TYPES in TensorRT: nvinfer1 Namespace Reference.
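
For reference, here is a minimal sketch of a tao-converter invocation that ties these options together, assuming the 3x544x960 YOLOv4 model above; the key variable, file names, and the Input tensor name are placeholders rather than values from this thread:

    # INT8 engine with a dynamic-shape profile up to batch 4.
    # -p <input_name>,<min_shape>,<opt_shape>,<max_shape>
    # -t int8 with -c selects int8 mode; adding -s also sets kSTRICT_TYPES.
    tao-converter -k $KEY \
        -p Input,1x3x544x960,4x3x544x960,4x3x544x960 \
        -t int8 \
        -c cal.bin \
        -s \
        -e yolov4_int8_b4.engine \
        yolov4_resnet18.etlt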

@Morganh,

Thank you for the clarification. I have a few follow-up questions:

  1. If my deepstream app always has X input sources, should I set max_batch_size = X to get a more optimised engine?
  2. The tao-converter says max_batch_size is not needed for .etlt models generated with dynamic shape. Are all models in TAO generated with dynamic shape, or does it depend on the architecture? In other words, how do I know which .etlt models are generated with dynamic shape?
  3. So --batch_size and -b are not related to one another? In that case, the tao-converter doc is wrong, because it says “-b: The batch size used during the export step for INT8-calibration cache generation (default: 8)”, implying -b should be set to --batch_size.
  1. If you use the deepstream app, it is not a must to use tao-converter to generate the TensorRT engine; you can deploy the .etlt model directly in deepstream-app. If you have X input sources, you can set batch-size=X in the deepstream config file (see the config sketch after this list).
  2. Refer to YOLOv4 — TAO Toolkit 3.22.05 documentation and Integrating TAO Models into DeepStream — TAO Toolkit 3.22.05 documentation. Dynamic shape can be set when running tao-converter against models exported as “Encrypted ONNX”.
  3. The “-b” can be set to the same value as --batch_size.
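
If you deploy the .etlt directly, the relevant nvinfer config entries look roughly like the sketch below, assuming 4 input sources; the file names and the key are placeholders:

    [property]
    tlt-encoded-model=yolov4_resnet18.etlt
    tlt-model-key=<your_key>
    int8-calib-file=cal.bin
    batch-size=4
    # network-mode: 0=FP32, 1=INT8, 2=FP16
    network-mode=1

With these settings, deepstream-app builds the TensorRT engine from the .etlt on first run, using the configured batch size.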
