Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc) :
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc): Yolo_v4
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): 3.21.11_trt8.4_x86
• Training spec file(If have, please share here): NA
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.): NA
Starting from DeepStream 6.1.1, DeepStream supports building mixed-precision engines via the layer-device-precision property, as shown in this config.
I’m currently using tao-converter to build engines from TAO models:
tao-converter -h
usage: tao-converter [-h] [-e ENGINE_FILE_PATH]
[-k ENCODE_KEY] [-c CACHE_FILE]
[-o OUTPUTS] [-d INPUT_DIMENSIONS]
[-b BATCH_SIZE] [-m MAX_BATCH_SIZE]
[-w MAX_WORKSPACE_SIZE] [-t DATA_TYPE]
[-i INPUT_ORDER] [-s] [-u DLA_CORE]
[-l engineLayerVerbose]
[-v TensorRT version]
[--precisionConstraints PRECISIONCONSTRAINTS]
[--layerPrecisions layerName:precision]
[--layerOutputTypes layerName:precision]
input_file
How do I specify the layer precisions listed in DeepStream’s yolov4_tao config using tao-converter?
layer-device-precision: cls/mul:fp32:gpu;box/mul_6:fp32:gpu;box/add:fp32:gpu;box/mul_4:fp32:gpu;box/add_1:fp32:gpu;cls/Reshape_reshape:fp32:gpu;box/Reshape_reshape:fp32:gpu;encoded_detections:fp32:gpu;bg_leaky_conv1024_lrelu:fp32:gpu;sm_bbox_processor/concat_concat:fp32:gpu;sm_bbox_processor/sub:fp32:gpu;sm_bbox_processor/Exp:fp32:gpu;yolo_conv1_4_lrelu:fp32:gpu;yolo_conv1_3_1_lrelu:fp32:gpu;md_leaky_conv512_lrelu:fp32:gpu;sm_bbox_processor/Reshape_reshape:fp32:gpu;conv_sm_object:fp32:gpu;yolo_conv5_1_lrelu:fp32:gpu;concatenate_6:fp32:gpu;yolo_conv3_1_lrelu:fp32:gpu;concatenate_5:fp32:gpu;yolo_neck_1_lrelu:fp32:gpu
The closest option in tao-converter seems to be --layerPrecisions, but I can’t find any concrete example, and I don’t know whether --layerPrecisions is equivalent to layer-device-precision.
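For reference, here is how I’m currently trying to translate the layer-device-precision string. This assumes (unverified on my part) that tao-converter’s --layerPrecisions follows the same comma-separated layerName:precision syntax as trtexec’s flag of the same name, and that the :gpu device suffix is simply dropped:

```python
def to_layer_precisions(spec: str) -> str:
    """Translate a DeepStream layer-device-precision string
    ("name:precision:device;...") into a comma-separated
    "name:precision" list (trtexec-style --layerPrecisions).

    ASSUMPTION: tao-converter accepts the trtexec syntax; this
    is a guess based on the flag name, not documented behavior.
    """
    entries = []
    for item in spec.split(";"):
        # rsplit tolerates layer names that themselves contain ':'
        name, precision, _device = item.rsplit(":", 2)
        entries.append(f"{name}:{precision}")
    return ",".join(entries)

# A few entries from the DeepStream config above, as a sanity check:
print(to_layer_precisions("cls/mul:fp32:gpu;box/mul_6:fp32:gpu;box/add:fp32:gpu"))
# cls/mul:fp32,box/mul_6:fp32,box/add:fp32
```

If that assumption holds, the result would presumably be passed together with a --precisionConstraints value (trtexec uses obey/prefer/none, but I haven’t confirmed what tao-converter expects there either).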