TensorRT warning from tao-converter "Half2 support requested on hardware without native FP16 support" when converting YOLOv4 models with INT8 precision

Problem

We get the following TensorRT warning when converting YOLOv4 .etlt models to TensorRT engines with INT8 precision using tao-converter.

[WARNING] Half2 support requested on hardware without native FP16 support, performance will be negatively affected

This happens only on GPUs that don't support FP16. However, since the engine is being built with INT8 precision, only INT8 support should be required; FP16 support should not be.

Does anyone know what the reason for this is? And should it be a concern, given that the warning also says performance will be negatively affected?

Information

• Hardware - GTX 1060 (GPU_ARCHS = 6.1)
• Network Type - Yolo_v4(CSPDarknet53)
• TAO Version - 3.21.08

What I found so far/ Other info

I have seen the same warning from TensorRT when using trtexec to convert an ONNX model to a TensorRT engine with FP16 precision on a GPU that doesn't support FP16. But here we are converting to INT8, so that shouldn't apply. So I am wondering why we get this TensorRT warning with tao-converter, i.e. how tao-converter calls the TensorRT engine builder under the hood.

TensorRT documentation on mixed precision mentions the following:

For the best performance, you can choose to enable all three precisions by enabling FP16 and INT8 precision mode explicitly. You can also choose to execute trtexec with the "--best" option directly, which would enable all supported precisions for inference resulting in best performance.

Is it possible that tao-converter is doing something like this under the hood (just my speculation)?

This warning comes from TRT.
It indicates that the hardware platform does not support FP16 acceleration. See the support matrix here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#hardware-precision-matrix

You can look up your GPU's compute capability in the tables here (CUDA GPUs | NVIDIA Developer) and compare it against the support matrix to see whether FP16 is supported on your card.
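
If it is easier, the same information can also be queried programmatically with the TensorRT Python API (a minimal sketch; the builder properties below are from the TensorRT 7/8 Python bindings):

```python
import tensorrt as trt

# Ask TensorRT whether the current GPU has native (fast) FP16 and INT8 support.
# On a GTX 1060 (compute capability 6.1) this typically reports INT8 but not FP16.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

print("fast FP16:", builder.platform_has_fast_fp16)
print("fast INT8:", builder.platform_has_fast_int8)
```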

Thanks for the quick reply.

This warning comes from TRT. It indicates the hardware platform does not support FP16 acceleration
Yes, this is true, as I mentioned in my question.

See the support matrix here
The GTX 1060 (GPU_ARCHS = 6.1) has INT8 support but no FP16 support. However, as I mentioned in my question,
this happens only on GPUs that don't support FP16, and since the engine is being built with INT8 precision, only INT8 support should be required; FP16 support should not be.

To clarify, my question is: why are we getting a warning about FP16 not being supported when converting a model with INT8 precision, when only INT8 support should be required for this?

I am wondering why we get this warning, how the engine builder is configured such that it appears, and whether it is okay (since the warning suggests performance will be negatively affected).

In tao-converter, if you set “-t int8”, it will enable INT8 layer selection, and kFP16 is also specified because it is necessary to allow FP16/FP32 fallback if the device does not support INT8.
So this warning is prompted because your device does not support FP16.
If you want to avoid this message, you can add “-s” to the command line.
“-s” is a Boolean flag that applies TensorRT strict type constraints when building the TensorRT engine.
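
Roughly speaking, in terms of the TensorRT builder API this corresponds to something like the following (just a sketch of the behaviour described above, not the actual tao-converter source):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# "-t int8": request INT8 kernels, and also enable FP16 so the builder can
# fall back to FP16/FP32 where INT8 is not available. Setting the FP16 flag
# is what triggers the "Half2 support requested..." warning on GPUs without
# native FP16. (A real INT8 build also needs a calibration cache or
# per-tensor dynamic ranges.)
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.FP16)

# "-s": apply strict type constraints, so the builder does not silently fall
# back to other precisions (it still warns per layer when it must ignore this).
config.set_flag(trt.BuilderFlag.STRICT_TYPES)
```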


Thanks. This is exactly what I wanted to know: whether it is configured to fall back to FP16 in addition to FP32 when we set -t int8.

In my understanding, based on the TensorRT documentation, setting strict INT8 types (without letting it fall back to FP32) is not optimal, so I thought it would be good if there were an option to set INT8 as the precision without setting -s, so that it uses INT8/FP32 mixed precision. Please let me know if my understanding is not correct.
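
Just to illustrate what I mean: with the native TensorRT builder API, an INT8/FP32 mixed-precision build would roughly amount to setting only the INT8 flag (my own sketch, not something tao-converter currently exposes); layers without an INT8 implementation then fall back to FP32.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# INT8 only: no FP16 flag and no strict types, so unsupported layers
# fall back to FP32 (calibration data is still required for INT8).
config.set_flag(trt.BuilderFlag.INT8)
```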

However, I tested the -s option (during both the export and convert steps) with a model trained with QAT and got the same accuracy. So setting strict types didn't have the negative impact on results that I assumed, though I'm not sure whether that's because the model was trained with QAT.
Using -s made the Half2 support warning go away, but it prompted additional warnings (in addition to the ones mentioned here: Missing dynamic range for tensor warnings when converting YOLOv4 models with INT8 precision - #6 by kgksl).

The additional warnings are as follows:

[WARNING] No implementation of layer b3_final_trans_bn obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer b3_final_trans_mish_clip obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer b4_final_trans_bn obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer b4_final_trans_mish_clip obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer b5_final_trans_mish/Relu6:0_pooling_1 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer b5_final_trans_mish/Relu6:0_pooling_2 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer conv_big_object obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer Resize obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer conv_mid_object obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer Resize1 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer conv_sm_object obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer bg_anchor/Const:0 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer bg_anchor/Identity:0_tile obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer bg_anchor/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer Transpose + bg_reshape + bg_bbox_processor/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(PWN(bg_bbox_processor/Sigmoid, bg_bbox_processor/mul/y:0 + (Unnamed Layer* 294) [Shuffle] + bg_bbox_processor/mul), PWN(bg_bbox_processor/sub/y:0 + (Unnamed Layer* 299) [Shuffle], bg_bbox_processor/sub)) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(PWN(PWN(bg_bbox_processor/add/y:0 + (Unnamed Layer* 290) [Shuffle] + bg_bbox_processor/add, bg_bbox_processor/sub_1), bg_bbox_processor/Minimum_min), bg_bbox_processor/Exp) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer bg_bbox_processor/Reshape_1_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer md_anchor/Const:0 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer md_anchor/Identity:0_tile obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer md_anchor/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer Transpose1 + md_reshape + md_bbox_processor/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(PWN(md_bbox_processor/Sigmoid, md_bbox_processor/mul/y:0 + (Unnamed Layer* 410) [Shuffle] + md_bbox_processor/mul), PWN(md_bbox_processor/sub/y:0 + (Unnamed Layer* 415) [Shuffle], md_bbox_processor/sub)) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(PWN(PWN(md_bbox_processor/add/y:0 + (Unnamed Layer* 406) [Shuffle] + md_bbox_processor/add, md_bbox_processor/sub_1), md_bbox_processor/Minimum_min), md_bbox_processor/Exp) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer md_bbox_processor/Reshape_1_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer sm_anchor/Const:0 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer sm_anchor/Identity:0_tile obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer sm_anchor/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer Transpose2 + sm_reshape + sm_bbox_processor/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(PWN(sm_bbox_processor/Sigmoid, sm_bbox_processor/mul/y:0 + (Unnamed Layer* 513) [Shuffle] + sm_bbox_processor/mul), PWN(sm_bbox_processor/sub/y:0 + (Unnamed Layer* 517) [Shuffle], sm_bbox_processor/sub)) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(PWN(PWN(sm_bbox_processor/add/y:0 + (Unnamed Layer* 510) [Shuffle] + sm_bbox_processor/add, sm_bbox_processor/sub_1), sm_bbox_processor/Minimum_min), sm_bbox_processor/Exp) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer sm_bbox_processor/Reshape_1_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer cls/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(cls/Sigmoid_1, PWN(cls/Sigmoid, cls/mul)) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer box/Reshape_reshape obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(box/mul_1, box/add_1) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(box/mul_3, box/mul_4/x:0 + (Unnamed Layer* 774) [Shuffle] + box/mul_4) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer box/sub obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(box/mul, box/add) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer PWN(box/mul_2, box/mul_6/x:0 + (Unnamed Layer* 778) [Shuffle] + box/mul_6) obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer box/sub_1 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer box/add_2 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer box/add_3 obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation of layer BatchedNMS_N obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[WARNING] No implementation obeys reformatting-free rules, at least 1 reformatting nodes are needed, now picking the fastest path instead.

However, as I mentioned, despite the above warnings I got the same accuracy with -s as without -s, and it was similar to the original FP32 accuracy when the model was trained with QAT.

That warning info will not affect your model's accuracy.
When you set -t int8 and your dGPU supports INT8, the TRT engine generation will enable INT8 layer selection and allow FP16/FP32 fallback.
When you set -t int8 and your dGPU does not support INT8, the TRT engine generation will fall back to FP16/FP32.
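
In other words, the behaviour described above could be sketched like this (again only an illustration with the TensorRT Python API, not the actual tao-converter code):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

if builder.platform_has_fast_int8:
    # "-t int8" on an INT8-capable GPU: INT8 layer selection with
    # FP16/FP32 fallback allowed.
    config.set_flag(trt.BuilderFlag.INT8)
    config.set_flag(trt.BuilderFlag.FP16)
else:
    # No INT8 support on the GPU: the build falls back to FP16/FP32.
    config.set_flag(trt.BuilderFlag.FP16)
```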