Accelerating Inference with NVIDIA Triton Inference Server and NVIDIA DALI

Originally published at: https://developer.nvidia.com/blog/accelerating-inference-with-triton-inference-server-and-dali/

When you are working on optimizing inference scenarios for the best performance, you may underestimate the effect of data preprocessing: the operations required before forwarding an input sample through the model. This post highlights the impact of data preprocessing on inference performance and how you can easily speed it up on the…

The article explains very well how to decode images on the server side, and it worked like a charm, but I am struggling to find a similar approach for videos.

Inference on video datasets is even more network intensive. With tools like ffmpeg-python, I am able to encode 32/64-frame sequences into an H.264 byte stream (roughly as sketched below). It would be great if this stream could be decoded by DALI on the server side to produce the required NTHWC tensor.
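For context, this is roughly what I do on the client side today; the frame shape, frame rate, and codec settings are illustrative placeholders, not my exact configuration:

```python
import ffmpeg
import numpy as np

# A 32-frame RGB sequence standing in for real input; the shape, frame rate,
# and codec settings here are illustrative only.
frames = np.random.randint(0, 255, (32, 224, 224, 3), dtype=np.uint8)

# Pipe the raw frames through ffmpeg and capture the H.264 elementary stream.
encoded, _ = (
    ffmpeg
    .input("pipe:", format="rawvideo", pix_fmt="rgb24", s="224x224", framerate=30)
    .output("pipe:", format="h264", vcodec="libx264")
    .run(input=frames.tobytes(), capture_stdout=True, quiet=True)
)
# `encoded` holds the bytes I would like to send in the Triton request.
```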
After spending hours searching, I could only find video readers that read from files (like the pipeline sketched below), but nothing that can decode video coming from external_source.
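For reference, this is the kind of file-based pipeline I mean; the filename and sequence length are placeholders:

```python
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

# File-based video reader: it decodes on the GPU, but only takes file paths.
# "clip.mp4" and sequence_length=32 are placeholders for my use case.
@pipeline_def(batch_size=1, num_threads=2, device_id=0)
def video_pipe():
    return fn.readers.video(
        device="gpu",
        filenames=["clip.mp4"],
        sequence_length=32,  # one 32-frame window per sample
    )

pipe = video_pipe()
pipe.build()
(sequences,) = pipe.run()  # batched FHWC sequences, i.e. the NTHWC layout I need
```

What I am missing is a way to feed the encoded byte stream itself, e.g. through fn.external_source, instead of a file path.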

Hi @sufiyan,

Thank you for checking out DALI. That is true: it doesn’t currently support video decoding with Triton.
What you can do is check out DeepStream, which is designed to handle video streaming and provides integration with Triton.
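As a rough sketch of that route, using GStreamer’s Python bindings: DeepStream’s hardware decoder handles the H.264 stream and its Gst-nvinferserver plugin forwards batched frames to a Triton model. The element parameters and config file path below are placeholders, not a tested setup:

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Decode H.264 with DeepStream's hardware decoder, batch frames with
# nvstreammux, and let nvinferserver hand them to a Triton model.
pipeline = Gst.parse_launch(
    "filesrc location=sample.h264 ! h264parse ! nvv4l2decoder ! "
    "mux.sink_0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! "
    "nvinferserver config-file-path=config_infer_triton.txt ! fakesink"
)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
# Block until the stream ends or an error is raised.
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)
```

The nvinferserver config file is where the Triton model repository and model name would be specified.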