[Riva] - Is it possible to cache inferences for TTS?

arnab2 · March 4, 2022, 2:36pm

Hardware - GPU (T4)

We are currently trying to use Riva to synthesize audio files from phrases. Is it possible to cache the result for a repeated phrase at the Triton server level? If not, is there a recommended way to do this type of caching?

sjunkin · March 8, 2022, 3:30pm

Caching is fairly application-specific; the solution would probably differ between mobile, call center, and web deployments. The Riva team does not currently have a best practice for this. Can you outline your use case in more detail?

shivakumar.madhavan · March 9, 2022, 12:17pm

@sjunkin

Our use case is a mobile application accessing a Riva (or Triton) server for TTS.
arnab's question is this:

If a specific phrase has already been synthesized, is there a recommended approach to caching the synthesized output so that it can be reused when exactly the same phrase is requested again?

For example, with “Hello, how are you?”:
The first request results in synthesis.
Subsequent requests return the cached response.
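
To make the intended behavior concrete, here is a minimal sketch of an exact-match cache placed in front of the TTS call at the application level. It is not a Riva or Triton feature. The `synthesize` callable is a hypothetical stand-in for whatever client call actually produces audio, and `TTSCache`, `get_audio`, and the `voice` parameter are illustrative names, not part of any NVIDIA API. The cache is an in-memory dictionary with no eviction, which is an assumption made for brevity.

```python
import hashlib
from typing import Callable, Dict


class TTSCache:
    """Exact-match cache keyed on (voice, text); audio is kept in memory."""

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        # `synthesize` is a hypothetical placeholder for the real TTS request.
        self._synthesize = synthesize
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text: str, voice: str) -> str:
        # Hash the voice and the exact text so that any difference in
        # either one produces a different cache entry.
        payload = f"{voice}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_audio(self, text: str, voice: str = "default") -> bytes:
        key = self._key(text, voice)
        audio = self._store.get(key)
        if audio is None:
            # First request for this phrase: synthesize and remember the result.
            self.misses += 1
            audio = self._synthesize(text, voice)
            self._store[key] = audio
        else:
            # Repeat request: serve the stored audio without calling TTS.
            self.hits += 1
        return audio


def fake_synthesize(text: str, voice: str) -> bytes:
    # Dummy synthesizer used only so the sketch can run without a server.
    return f"{voice}:{text}".encode("utf-8")


if __name__ == "__main__":
    cache = TTSCache(fake_synthesize)
    cache.get_audio("Hello, how are you ?")  # synthesized
    cache.get_audio("Hello, how are you ?")  # served from the cache
    print(cache.hits, cache.misses)
```

Because the key is built from the exact text, phrases that differ only in whitespace or punctuation are treated as different entries. Whether to normalize the text first, and where to keep the cache (on the device, in a gateway service, or in a shared store), depends on the application.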

Is this a supported or recommended approach?

Thanks,
Shiva