Is there any way to use different VLM models on a single VSS instance?

I would like to use different versions of the GPT model for different requests. Is it possible to do this on a single VSS instance, or do I need to set up my own VSS instance for each version? Specifically, I would like to use different models for processing video streams and files.

Thanks!

Try setting the VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME environment variable before calling the /summarize API, then specify the api_type in the request parameters.

This usually works. If it does not, try starting multiple instances, one per model version.
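
If you do end up running multiple instances, a minimal sketch of routing requests by input type could look like the following. Everything specific here is an assumption for illustration: the base URLs and ports, the model names, the `INSTANCES` table, the helper names, and the `id` and `model` body fields. Only the /summarize endpoint and the api_type parameter come from the reply above, and the valid api_type values are not given there, so check the API reference for your VSS version before relying on this.

```python
import json
import urllib.request

# Hypothetical mapping: one VSS instance per model version.
# URLs, ports, and model names are placeholders, not values from the VSS docs.
INSTANCES = {
    "stream": {"base_url": "http://localhost:8100", "model": "gpt-4o"},
    "file": {"base_url": "http://localhost:8200", "model": "gpt-4o-mini"},
}


def build_summarize_request(kind, media_id, api_type):
    """Return (url, body) for a /summarize call routed by input kind."""
    target = INSTANCES[kind]  # raises KeyError for an unknown kind
    body = {
        "id": media_id,            # assumed field name
        "model": target["model"],  # assumed field name
        "api_type": api_type,      # named in the reply; valid values not given
    }
    return target["base_url"] + "/summarize", body


def send(url, body):
    """POST the JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Each instance would be started with its own VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME value, and the client only decides which instance receives the request. Building the request separately from sending it lets you inspect the URL and body without a running server.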

Refer to these links.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.