I hope this message finds you well. I am writing to suggest the implementation of an exciting technology, SAHI (Slicing Aided Hyper Inference), in NVIDIA’s DeepStream platform.
Object detection and instance segmentation are among the most widely used applications of computer vision. However, detecting small objects and running inference on large images remain major issues in practice. SAHI helps developers overcome these real-world problems and comes with many vision utilities.
Before suggesting SAHI, I would like to inquire if any similar functionality already exists within DeepStream. If DeepStream already offers comparable capabilities for object detection and instance segmentation, please accept my apologies for any redundancy in this request. However, if these features are not yet available or can be further improved, I believe integrating SAHI into DeepStream would be highly beneficial.
Hi,
SAHI is not a model.
Slicing Aided Hyper Inference is a technique developed to tackle the challenge of small object detection. It is a generic framework that applies slicing in both the fine-tuning and inference stages. Dividing input images into overlapping patches significantly increases the relative pixel area that small objects occupy in the images fed to the network.
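To make the slicing step concrete, here is a minimal sketch of how overlapping patch coordinates could be computed. It is plain Python and not SAHI's or DeepStream's actual code; the function names (`slice_starts`, `slice_boxes`) and the default 20% overlap are assumptions chosen for illustration.

```python
def slice_starts(length, slice_len, overlap_ratio):
    """Start offsets of overlapping windows that cover [0, length)."""
    if slice_len >= length:
        return [0]  # the image is smaller than one slice along this axis
    # Distance between consecutive window starts; at least 1 pixel.
    step = max(1, int(slice_len * (1.0 - overlap_ratio)))
    starts = list(range(0, length - slice_len, step))
    starts.append(length - slice_len)  # last window sits flush with the edge
    return starts


def slice_boxes(image_w, image_h, slice_w, slice_h, overlap_ratio=0.2):
    """Return (x0, y0, x1, y1) for every slice, row by row."""
    return [
        (x, y, min(x + slice_w, image_w), min(y + slice_h, image_h))
        for y in slice_starts(image_h, slice_h, overlap_ratio)
        for x in slice_starts(image_w, slice_w, overlap_ratio)
    ]


boxes = slice_boxes(1024, 1024, 512, 512, overlap_ratio=0.2)
print(len(boxes), boxes[0], boxes[-1])
```

With these values, a 1024x1024 image is covered by a 3x3 grid of 512x512 slices. Each slice would then be cropped and resized to the network input size, which is where small objects gain relative pixel area.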
During the inference phase, the image is divided into smaller patches, and predictions are generated from larger, resized versions of these patches. Working with the upscaled patches means that small objects cover more pixels at the network input. Afterwards, the predictions are mapped back to the original image coordinates, and overlapping detections are merged using Non-Maximum Suppression (NMS). Optionally, predictions from inference on the full image can be included to improve overall detection accuracy.
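The merge step can be sketched as follows. This is an illustrative, class-agnostic version in plain Python, not SAHI's or DeepStream's implementation; the helper names (`to_image_coords`, `nms`), the `scale` parameter, and the 0.5 IoU threshold are assumptions. Detections are represented as `((x0, y0, x1, y1), score)` pairs.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def to_image_coords(dets, slice_x0, slice_y0, scale=1.0):
    """Map slice-local detections into full-image coordinates.

    `scale` is the factor by which the slice was enlarged before inference.
    """
    return [
        ((x0 / scale + slice_x0, y0 / scale + slice_y0,
          x1 / scale + slice_x0, y1 / scale + slice_y0), score)
        for (x0, y0, x1, y1), score in dets
    ]


def nms(dets, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop heavy overlaps."""
    kept = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_threshold for k in kept):
            kept.append((box, score))
    return kept


# The same object is seen by two neighbouring slices in their overlap region.
slice_a = to_image_coords([((400, 100, 480, 180), 0.9)], 0, 0)
slice_b = to_image_coords([((0, 100, 70, 180), 0.6),
                           ((200, 300, 240, 340), 0.8)], 409, 0)
merged = nms(slice_a + slice_b)
print(merged)
```

In this example the duplicate detection from the second slice is suppressed, so two detections remain. A real pipeline would normally apply NMS per class; full-image predictions, if used, would simply be appended to the list before the `nms` call.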
The SAHI technique offers a robust and efficient approach to small object detection. By integrating slicing into the fine-tuning and inference stages, it enhances the network’s ability to handle small objects, which can lead to significant improvements in detection performance.