SAHI - Slicing Aided Hyper Inference

I hope this message finds you well. I am writing to suggest the implementation of an exciting technology, SAHI (Slicing Aided Hyper Inference), in NVIDIA’s DeepStream platform.
Object detection and instance segmentation are among the most important application areas in computer vision. However, detecting small objects and running inference on large images remain major problems in practical use. SAHI helps developers overcome these real-world problems and comes with many vision utilities.

Before suggesting SAHI, I would like to inquire if any similar functionality already exists within DeepStream. If DeepStream already offers comparable capabilities for object detection and instance segmentation, please accept my apologies for any redundancy in this request. However, if these features are not yet available or can be further improved, I believe integrating SAHI into DeepStream would be highly beneficial.

DeepStream is an SDK. If you integrate object detection and instance segmentation models through the DeepStream APIs, they will work. We already have samples for some object detection and instance segmentation models: NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream (github.com)

Hi,
SAHI is not a model.
Slicing Aided Hyper Inference is a technique developed to tackle the challenge of small object detection. It is built around a generic framework that applies slicing in both the fine-tuning and inference stages. Dividing input images into overlapping patches significantly increases the relative pixel area that small objects occupy in the images fed to the network, compared with the original images.
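To make the slicing step concrete, here is a minimal sketch in plain Python of how overlapping patch windows could be computed. The function name `slice_boxes`, its parameters, and the default 512x512 patch size with 20% overlap are my own illustrative assumptions; this is not the SAHI library's API or anything provided by DeepStream.

```python
def slice_boxes(image_w, image_h, slice_w=512, slice_h=512, overlap=0.2):
    """Return overlapping patch windows as (x0, y0, x1, y1) in image pixels.

    Hypothetical helper for illustration only. Patches at the right and
    bottom edges are shifted back inside the image so they keep full size.
    """
    # Stride between patch origins; a larger overlap gives a smaller stride.
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))

    boxes = []
    y = 0
    while True:
        y1 = min(y + slice_h, image_h)
        y0 = max(0, y1 - slice_h)
        x = 0
        while True:
            x1 = min(x + slice_w, image_w)
            x0 = max(0, x1 - slice_w)
            boxes.append((x0, y0, x1, y1))
            if x1 >= image_w:  # this patch reached the right edge
                break
            x += step_x
        if y1 >= image_h:  # this row reached the bottom edge
            break
        y += step_y
    return boxes


if __name__ == "__main__":
    for box in slice_boxes(1000, 600):
        print(box)
```

Each window would then be cropped, resized up to the network input size, and passed to the detector, which is what gives small objects more pixels than they get in the downscaled full frame.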

During inference, the image is divided into smaller patches, each patch is resized to a larger size, and predictions are generated from these resized patches. Working with the enlarged patches helps even very small objects be adequately represented in the predictions. Afterwards, the overlapping predictions are merged back into the original image coordinates using Non-Maximum Suppression (NMS). Optionally, predictions from inference on the full image can be included in the merge to improve overall detection accuracy.
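The merge step can be sketched as follows, again in plain Python. All names here (`to_image_coords`, `nms`, the `(x0, y0, x1, y1, score)` tuple layout, the 0.5 IoU threshold) are hypothetical and chosen for illustration; the sketch assumes boxes are first mapped to the original image frame and then filtered with a simple class-agnostic greedy NMS, which is one reasonable reading of the description above rather than the exact procedure in the paper.

```python
def to_image_coords(dets, origin, scale):
    """Map detections from an upscaled patch back to original image pixels.

    dets:   list of (x0, y0, x1, y1, score) in upscaled-patch coordinates
    origin: (x, y) of the patch's top-left corner in the original image
    scale:  factor by which the patch was enlarged before inference
    """
    ox, oy = origin
    return [(x0 / scale + ox, y0 / scale + oy,
             x1 / scale + ox, y1 / scale + oy, s)
            for x0, y0, x1, y1, s in dets]


def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1, ...)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    kept = []
    for d in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(d, k) < iou_thr for k in kept):
            kept.append(d)
    return kept


if __name__ == "__main__":
    # Two overlapping patches, each enlarged 2x, that both see the same object.
    patch_a = to_image_coords([(800, 200, 900, 300, 0.9)], (0, 0), 2)
    patch_b = to_image_coords([(200, 200, 300, 300, 0.8),
                               (600, 600, 700, 700, 0.7)], (300, 0), 2)
    for det in nms(patch_a + patch_b):
        print(det)
```

Full-image detections, if used, would simply be appended to the list with `origin=(0, 0)` and the full-frame resize factor before calling `nms`. A real pipeline would also run NMS per class rather than across all classes.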

The SAHI technique, embedded within this framework, offers a robust and efficient solution for small object detection. By integrating slicing into the fine-tuning and inference stages, it enhances the network's ability to handle the challenges posed by small objects, leading to significant improvements in object detection performance.

Paper

Thank you for sharing the good idea of SAHI!
