I read in the Seldon Core documentation that multi-model serving with overcommit is available out of the box on NVIDIA Triton:
https://docs.seldon.io/projects/seldon-core/en/v2/contents/models/mms/mms.html?highlight=multi%20modal%20serving
Can you please share documentation on how to configure and implement multi-model serving with overcommit using NVIDIA Triton?
Hi,
The link below might be useful for you.
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
For multi-threading/streaming, we suggest using DeepStream or Triton.
For more details, we recommend raising the query in the DeepStream forum, or in the issues section of the Triton Inference Server GitHub repository.
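Note that in the Seldon Core v2 page you linked, overcommit is handled by the Seldon scheduler/agent layer sitting in front of Triton, not by Triton itself: each model declares the memory it needs, the server has a memory budget, and an overcommit margin lets more models be registered than fit at once, with idle models evicted and reloaded on demand. A rough sketch of what the resources might look like is below; the exact resource kinds, API versions, and field names are assumptions based on my reading of the Seldon Core v2 docs and should be verified against that page:

```yaml
# Hypothetical sketch (verify names/fields against the Seldon Core v2 docs):
# a Triton-backed Server, plus a Model that declares its memory footprint
# so the Seldon scheduler can pack and overcommit models onto the server.
apiVersion: mlops.seldon.io/v1alpha1
kind: Server
metadata:
  name: triton-mms
spec:
  serverConfig: triton   # use the Triton inference server runtime
  replicas: 1
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: simple-model
spec:
  storageUri: "gs://seldon-models/triton/simple"  # placeholder model location
  requirements:
    - tensorflow
  memory: 100Ki   # declared footprint the scheduler uses when packing models
```

The overcommit margin itself is typically a server/agent-level setting (e.g. an overcommit percentage on top of the server's memory budget) rather than something configured per model, so the Seldon Core v2 server configuration docs are the place to confirm the exact knob.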
Thanks!