Unexpected High Egress from Riva Pods

We are running a deployment of Riva ASR using Triton Inference Server in a Kubernetes environment. After moving from a cloud provider that does not charge for egress to one that does, we noticed that the Riva pods send more traffic than they receive. We use gRPC for communication between the client and the Riva server. Apart from enabling compression, is there any way to control or minimize the amount of traffic returned by the Riva deployment?


Hardware - GPU (RTX A4000)
Hardware - CPU (4 Core)
Operating System (Official Riva Docker Image/Kubernetes)
Riva Version (2.13.1)
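
To make the sent/received comparison concrete per stream, the payload sizes could be tallied on the client side roughly as follows. This is only a sketch under stated assumptions: `TrafficCounter` and `egress_ratio` are hypothetical helper names, not part of Riva or gRPC, and the count covers serialized message payloads only, so HTTP/2 framing and TLS overhead are not included.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator


@dataclass
class TrafficCounter:
    """Accumulates message count and payload bytes for one direction."""
    messages: int = 0
    payload_bytes: int = 0

    def wrap(self, stream: Iterable[Any],
             size_of: Callable[[Any], int] = len) -> Iterator[Any]:
        """Yield items from `stream` unchanged while tallying their sizes.

        `size_of` defaults to len(), which suits raw bytes. For protobuf
        messages, pass `lambda m: m.ByteSize()` instead.
        """
        for item in stream:
            self.messages += 1
            self.payload_bytes += size_of(item)
            yield item


def egress_ratio(sent: TrafficCounter, received: TrafficCounter) -> float:
    """Payload bytes returned by the server per payload byte sent to it."""
    if sent.payload_bytes == 0:
        raise ValueError("no bytes were sent")
    return received.payload_bytes / sent.payload_bytes
```

With a real client, the request iterator passed to the streaming call and the response iterator it returns would each be wrapped by their own counter. A ratio above 1.0 for a stream would confirm that the responses, rather than something else running in the pod, account for the extra egress.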

Regards, Yunus

Hi @yunus.andreasson

Thanks for your interest in Riva.

I will check on this request with the Engineering team and get back to you.

Thanks