YOLOv9 - Assistance Needed for Implementing Quantization in ADown Class


Hello community / NVIDIA,

I recently implemented quantization for the YOLOv9 model using TensorRT (GitHub - levipereira/yolov9-qat: Implementation of YOLOv9 QAT optimized for deployment on TensorRT platforms.) as part of an effort to improve its efficiency and performance. However, I have run into a specific challenge when trying to quantize the ADown class of the model.

The ADown class performs downsampling, but after quantizing it I observed a significant increase in the model's latency, possibly caused by the reformat operations TensorRT generates around the quantized layers.
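To illustrate the kind of issue that can force extra reformat/requantize kernels at a concat, here is a minimal numerical sketch. It is NumPy-only and does not model actual TensorRT behavior: ADown concatenates two branches, and if each branch's output is calibrated with a different int8 scale, the int8 tensors cannot be joined directly (a requantization step is needed), whereas a shared scale lets the concat stay in int8. The branch values and scales below are made up for illustration.

```python
import numpy as np

def quantize(x, scale):
    """Symmetric per-tensor int8 fake quantization."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
# Stand-ins for the outputs of ADown's two branches, which have
# different dynamic ranges and therefore different calibrated scales.
b1 = rng.normal(0, 1.0, 8).astype(np.float32)
b2 = rng.normal(0, 4.0, 8).astype(np.float32)
s1 = np.abs(b1).max() / 127
s2 = np.abs(b2).max() / 127

# Mismatched scales: q1 and q2 are in different quantized units, so
# they cannot be concatenated as-is; a requantize/reformat is needed.
q1, q2 = quantize(b1, s1), quantize(b2, s2)

# Shared scale: both branches use one common scale, so the concat can
# stay in int8 with no extra conversion kernel.
s = max(s1, s2)
cat_q = np.concatenate([quantize(b1, s), quantize(b2, s)])
cat = dequantize(cat_q, s)
# Round-to-nearest keeps the error within half a quantization step.
err = np.abs(cat - np.concatenate([b1, b2])).max()
```

This is why quantization toolkits often share a single quantizer (one scale) across the inputs of a concat: it removes the scale mismatch that would otherwise require conversion kernels.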

To overcome this obstacle, I am seeking help and insights from the community. If anyone has experience with model quantization and can offer guidance on how to approach quantizing the ADown class, I would greatly appreciate it.

Here is the link to the issue in the repository: Issue #3 - Implement Quantization in ADown Class

Any contributions, suggestions, or shared experiences are welcome. Thank you for your attention and collaboration!

Best regards

We have successfully implemented it. Closing the discussion.
