Is it possible to query a Riva server for a list of deployed models from a client?

Since it is possible to deploy multiple ASR models in a single Riva instance (according to the riva-build documentation), and a client can select a specific model (according to the protobuf definitions), how can a client be made aware of the available models? I thought I would find the answer in health.proto, but it does not seem to expose such an API.

Or is it not a good practice to deploy multiple ASR models per Riva instance?

I believe this is possible using Triton’s API (see the official docs: server/extension_model_repository.md in the triton-inference-server/server repository on GitHub).

But that returns all the models in the repo and my understanding is that Riva uses ensemble models, so there are more Triton “models” than there are actual ASR models deployed.
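For anyone who wants to try this, here is a rough sketch of the Triton route. It assumes the Triton HTTP endpoint inside the Riva container is reachable from the client (the `http://localhost:8000` address is an assumption, and the port may not be exposed in a default Riva deployment). It calls the model repository index endpoint and then checks each model's config to keep only ensembles. The helper names are mine, and whether an ensemble name matches the model name a Riva client should request is something I have not verified.

```python
import json
import urllib.request


def fetch_json(url, payload=None):
    """GET the URL, or POST a JSON payload when one is given, and decode the JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    method = "GET" if payload is None else "POST"
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def ready_model_names(index):
    """Return the sorted names of models whose state is READY in a repository index."""
    return sorted({entry["name"] for entry in index if entry.get("state") == "READY"})


def is_ensemble(config):
    """Heuristic: treat a model config as an ensemble if it says so explicitly."""
    return config.get("platform") == "ensemble" or "ensemble_scheduling" in config


def list_ready_ensembles(base_url):
    """Query Triton for all READY models and keep only the ensembles."""
    # Model repository extension: POST v2/repository/index
    index = fetch_json(f"{base_url}/v2/repository/index", payload={})
    names = ready_model_names(index)
    # Model configuration extension: GET v2/models/<name>/config
    return [
        name for name in names
        if is_ensemble(fetch_json(f"{base_url}/v2/models/{name}/config"))
    ]


if __name__ == "__main__":
    # Assumed address; adjust to wherever Triton's HTTP port is exposed.
    for name in list_ready_ensembles("http://localhost:8000"):
        print(name)
```

This only narrows the list down; it still reports Triton ensembles rather than Riva's own notion of a deployed ASR model.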

It would be great to have this as first party support within Riva since choosing ensemble models via Triton is not the best way to go about it IMO.

Thanks for giving me something to work with! I’m surprised Riva has no support for this. Hope they consider it.

Hi @alena.kazakova

Thanks for your interest in Riva

Apologies, we currently do not have any client-side API or query to get a list of deployed models.

However, on the server side, you can get the list of models using docker logs riva-speech
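If you want to script the server-side approach, a minimal sketch could look like the one below. It assumes the container is named riva-speech as above and that the logs contain Triton's startup table with Model, Version and Status columns drawn with `|` separators. That table layout is an assumption about the log format, so adjust the pattern if your logs look different. The function names are hypothetical.

```python
import re
import subprocess

# Assumed row layout: | <model> | <version> | <status> |
ROW = re.compile(
    r"\|\s*(\S+)\s*\|\s*(\S+)\s*\|\s*(READY|LOADING|UNLOADING|UNAVAILABLE)\b"
)


def parse_model_table(log_text):
    """Extract (model, version, status) tuples from table rows in the log text."""
    models = []
    for line in log_text.splitlines():
        match = ROW.search(line)
        if match:
            models.append((match.group(1), match.group(2), match.group(3)))
    return models


def read_container_logs(container="riva-speech"):
    """Return the combined stdout and stderr of `docker logs <container>`."""
    result = subprocess.run(
        ["docker", "logs", container],
        capture_output=True, text=True, check=True,
    )
    # Server logs may be written to stderr, so keep both streams.
    return result.stdout + result.stderr


if __name__ == "__main__":
    for model, version, status in parse_model_table(read_container_logs()):
        print(model, version, status)
```

Like the raw log output, this lists every Triton model in the container, not only the ASR models a client can select.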

Thanks @pineapple9011 for your valuable inputs