As a follow up to this thread - Inference Broken - Long Form Audio and gRPC max message sizes - #4 by shantanu1
We can and have tried using the streaming inference API, but that means we completely lose out on the benefits of offline/batch inference. Is there a workaround we could use, such as sending the audio in 4 MB chunks? I’m not sure how that would affect, say, Citrinet’s attention mechanism.
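For context, here is a rough sketch of the chunking workaround we have in mind — plain Python, no Riva API calls, just splitting the raw audio payload so each piece stays under the default 4 MB gRPC message limit (the constants and helper name are our own, not anything from the SDK):

```python
GRPC_LIMIT = 4 * 1024 * 1024          # default gRPC max message size (4 MB)
CHUNK_BYTES = GRPC_LIMIT - 64 * 1024  # leave some headroom for message framing

def audio_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield successive slices of the audio payload, each <= chunk_bytes."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
```

Each chunk would become one request, but our worry is exactly that slicing mid-utterance like this defeats the point of attention over the full audio.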
Are we not expected to run long-form audio on Riva as a use case? I thought long-form transcription was exactly where attention-based models like Citrinet would shine.
If such batch/offline processing is not supported (at least for files over 4 MB), or is not a pipeline you intend to support, please let us know, as we’re looking for a platform to serve that purpose primarily, and Jarvis/Riva seemed like a good fit for exactly that.
We would appreciate any indication of what you have in mind for the future regarding this. Yes, for now we can use streaming inference, but that only makes sense for us as a stand-in if long-form transcripts are eventually possible. Streaming and offline aren’t interchangeable for our purposes.
It also looks like Triton supports configuring a larger message size, and most other inference/training platforms expose this gRPC setting. Can we talk to Triton directly?
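To be concrete about what we mean by exposing the gRPC config: on the client side the limits are just channel options (the server would of course need a matching setting). A minimal sketch, assuming Triton’s default gRPC port 8001 — the target address and the 64 MB figure are our own assumptions:

```python
import grpc

# Raise the client-side gRPC limits above the 4 MB default so a whole
# long-form audio file fits in a single unary request/response.
MAX_MESSAGE_BYTES = 64 * 1024 * 1024  # 64 MB; sized to fit our longest files

channel_options = [
    ("grpc.max_send_message_length", MAX_MESSAGE_BYTES),
    ("grpc.max_receive_message_length", MAX_MESSAGE_BYTES),
]

def make_channel(target: str = "localhost:8001") -> grpc.Channel:
    """Open a channel to the server's gRPC endpoint with larger limits.
    This only helps if the server is also configured to accept them."""
    return grpc.insecure_channel(target, options=channel_options)
```

If Riva simply surfaced these options (or let us set the server-side equivalents), the 4 MB ceiling on offline recognition would go away.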