TTS Synthesize Online randomly fails with a "Streaming timed out" error


  • Hardware - GPU A40
  • Hardware - CPU Intel(R) Xeon(R) Gold 6354 CPU @ 3.00GHz
  • Operating System - Ubuntu 22.04.4
  • Nvidia driver version: 550.54.15
  • Cuda version: 12.4
  • Riva Version - 2.15.0
  • Riva Python Client Version - 2.15.0
  • Docker version - 25.0.4

Steps to reproduce:

  1. Deploy Riva Quickstart
  2. Start sending synthesize_online requests to the Riva server (sample rate - 44100)

Some of the TTS requests fail with a "Streaming timed out" error. It seems to happen mostly with short texts (only a few words). Previously, we used Riva 2.11.0 on this VM, and it worked without any issues.
Here is the error from the client side:

grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = "Error: Triton model failed during inference. Error message: Streaming timed out"
        debug_error_string = "UNKNOWN:Error received from peer ipv4:*.*.*.*:50051 {grpc_message:"Error: Triton model failed during inference. Error message: Streaming timed out", grpc_status:2, created_time:"2024-03-29T08:25:10.895145573+00:00"}"

Hope these files could help during the investigation.
nvidia-smi.log (1.7 KB)
riva-speech.log (4.4 MB) (18.9 KB)

Any help on this would be much appreciated.
Thanks in advance!

Hi @vbilous, unfortunately this is a known issue, as you can see here.
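
Until a fix lands, one possible client-side mitigation is to retry the request when this specific error appears. The sketch below is only an illustration, not an official workaround: `collect_with_retry`, its parameters, and the backoff values are hypothetical names chosen for this example. It assumes the error text "Streaming timed out" shows up in `str(exc)`, as it does in the traceback above, and it buffers the whole response before returning, which gives up the latency benefit of streaming but is probably acceptable for the short texts that trigger the failure.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def collect_with_retry(
    start_stream: Callable[[], Iterable[T]],
    max_attempts: int = 3,
    backoff_s: float = 0.5,
    marker: str = "Streaming timed out",
) -> List[T]:
    """Consume a streaming call fully, retrying if it fails with `marker`.

    `start_stream` must start a fresh request each time it is called.
    All names and defaults here are illustrative assumptions.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        chunks: List[T] = []
        try:
            # Errors in a streaming RPC surface while iterating, so the
            # whole loop sits inside the try block.
            for chunk in start_stream():
                chunks.append(chunk)
            return chunks
        except Exception as exc:
            # Re-raise anything that is not the known timeout, and give up
            # once the last attempt has failed.
            if marker not in str(exc) or attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)  # simple linear backoff
    raise AssertionError("unreachable")


# Hypothetical usage, assuming `tts` is an already constructed Riva
# speech synthesis client and `text` is the string to synthesize:
#   responses = collect_with_retry(lambda: tts.synthesize_online(text))
```

Partial chunks from a failed attempt are discarded, so the caller never receives a mix of audio from two different attempts.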
