Model optimizations to consider for Citrinet (and speech recognition in general)

Hardware: GPU T4
Operating System: Docker (Riva Server Image)
Riva Version: 1.6.0

Beyond the default optimizations already applied to speech recognition models (dynamic batching and sequencing), are there any recommendations for setting instance group counts and increasing the max batch size?

The Riva docs highlight the potential performance of each model (here) but don’t go into detail about which flags to pass to riva-build for each model.

Some insight would be appreciated!

Hi @pineapple9011,
I am not sure I understood your query correctly.
Could you please check the link below in case it helps?