Similar latency in load tests with 1x T4 and 2x T4 GPUs for Riva Streaming ASR

I have deployed Riva ASR, and it is working perfectly. But when I scale it to 2 T4 GPUs, I’m not getting any performance speedup.

I performed the load test using the riva_streaming_asr_client tool that is mentioned in the ASR performance documentation.

I ran this tool against 1 T4 GPU and against 2 T4 GPUs, but in both cases I’m getting similar average latency. Could you please give us any clue as to what could be going wrong?
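
For reference, here is a minimal sketch of how the two runs could be compared side by side. It assumes the per-request latencies (in milliseconds) from each run have already been collected into Python lists; how they are extracted from the client output is not shown. The names `summarize` and `speedup` are hypothetical helpers for illustration, not part of Riva or riva_streaming_asr_client.

```python
import math
import statistics


def summarize(latencies_ms):
    """Return average, median, and 95th-percentile latency in milliseconds."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)
    # Nearest-rank 95th percentile: smallest value with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "avg": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


def speedup(baseline_summary, scaled_summary):
    """Ratio of average latencies; a value near 1.0 means no latency improvement."""
    return baseline_summary["avg"] / scaled_summary["avg"]
```

Calling `summarize` on the latencies from the 1x T4 run and from the 2x T4 run, then passing both results to `speedup`, gives a single number that shows whether the second GPU changed the average latency at all.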

I checked the logs of both GPU nodes using kubectl and found that traffic is reaching both GPU nodes.

I’m using
Riva Version: 2.12.1