TX2 + TensorRT Benchmarks for RNN/LSTM

I am at stage 0 (research) of figuring out how many audio streams a Jetson TX2 could transcribe to text simultaneously. I’m assuming compute performance will cap that number, but I cannot find any benchmarks for common speech recognition algorithms like the ones for image recognition on page 18 here: https://images.nvidia.com/content/pdf/inference-technical-overview.pdf

Has this been done, and if so, can somebody point me to the work?

Is there any reason why simultaneous streams could not be interpreted?



Sorry, there is no benchmark report available for RNNs on Jetson.
We recommend measuring RNN performance directly with our cuDNN sample.

cp -r /usr/src/cudnn_samples_v7/ .
cd cudnn_samples_v7/RNN/

Modes: 0 = RNN_RELU, 1 = RNN_TANH, 2 = LSTM, 3 = GRU
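
Once you have a timing from the sample, a rough way to turn it into a stream count is to compare the time spent per RNN time step with the amount of audio each step covers. The sketch below is only a back-of-the-envelope estimate, not a benchmark: the function name, the 10 ms frame shift, and the headroom factor are assumptions for illustration, and it assumes cost grows linearly with the number of streams (batching on the GPU may do better, and feature extraction and decoding will add cost).

```python
import math


def max_realtime_streams(step_latency_ms: float,
                         frame_shift_ms: float = 10.0,
                         headroom: float = 1.0) -> int:
    """Estimate how many audio streams can be processed in real time.

    step_latency_ms: measured time to run one RNN time step for ONE stream
                     (take this from your own measurement; nothing here
                     supplies a real TX2 number).
    frame_shift_ms:  audio duration covered by one time step
                     (10 ms is an assumed, commonly used value).
    headroom:        fraction of the device you are willing to spend on
                     the RNN itself, between 0 (exclusive) and 1.
    """
    if step_latency_ms <= 0 or frame_shift_ms <= 0:
        raise ValueError("latency and frame shift must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    # Each stream needs step_latency_ms of compute for every
    # frame_shift_ms of audio, so the budget per frame divided by the
    # cost per stream gives the stream count (assumes linear scaling).
    return math.floor(headroom * frame_shift_ms / step_latency_ms)


# Hypothetical numbers, not measurements: 0.25 ms per step, 10 ms frames,
# half of the device reserved for everything else.
print(max_realtime_streams(0.25, frame_shift_ms=10.0, headroom=0.5))  # 20
```

If the estimate comes out at 0, a single stream already takes longer to process than it takes to record, so real-time transcription would not be possible with that model on that device.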