Speech recognition in TensorRT 5

Hi, I want to accelerate DeepSpeech2 (CNN+RNN+FC) with TRT 5, and I have some questions:

  1. I found that RNN layers in TRT only support TensorFlow models. Is that right? If so, is it necessary to retrain the network in TensorFlow?
  2. I read the CharRNN sample, and it needs to dump the TF weights, create the RNN layers with parameters, convert the TF weights to TRT weights, and so on, which is completely different from converting a CNN model directly by parsing it and building a TRT engine with a few API calls. So I wonder whether it is really necessary to convert an RNN network this way, which seems complex.
  3. Since the input shapes of the CNN and the RNN are different, can I build the network (CNN+RNN) as one TRT engine, or do I need two separate engines? And how can I implement it?
    Thanks a lot.

Hello,

  1. No. TRT can accept trained models in .pb, ONNX, and several other framework formats, but you'll have to extract the weights from the model and convert them to TRT weights as well.

  2. This is because the weights for each gate and layer need to be set separately on the RNN layer, whereas TensorFlow exports the weights with each layer concatenated into a single WTS file. That example starts with a model trained in TensorFlow, but a similar workflow should work to bring in weights from any framework of your choice.

  3. I'm not familiar with DeepSpeech2. Assuming your convolutional network extracts features and then feeds them to an LSTM/RNN cell, I think you have to reshape the CNN output into a time-series sequence. Basically, to connect the CNN to the LSTM, the CNN output needs to be distributed across time. So I think you can and should build one TRT engine; see the sketch after this list.
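
Not DeepSpeech2-specific, but here is a rough sketch of that wiring with the TRT 5 Python API: a conv layer, a shuffle layer that moves the time axis to the front, and an RNNv2 layer whose weights are set per gate and per layer (point 2 above in practice). Every dimension and weight value below is a made-up placeholder, not DeepSpeech2's real configuration.

```python
import numpy as np
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
HIDDEN, LAYERS, SEQ = 256, 2, 20  # made-up sizes for illustration

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()

# CNN front end over a (channels, freq, time) spectrogram; shapes are placeholders.
data = network.add_input("data", trt.float32, (1, 161, 50))
conv = network.add_convolution(data, 32, (11, 11),
                               np.zeros((32, 1, 11, 11), np.float32),
                               np.zeros(32, np.float32))
conv.stride = (2, 2)  # output here is (32, 76, 20)

# Distribute the CNN output across time: move the time axis to the front,
# then flatten channels x freq into one feature vector per time step.
shuffle = network.add_shuffle(conv.get_output(0))
shuffle.first_transpose = trt.Permutation([2, 0, 1])   # -> (20, 32, 76)
shuffle.reshape_dims = (SEQ, 32 * 76)                  # -> (seq, features)

rnn = network.add_rnn_v2(shuffle.get_output(0), LAYERS, HIDDEN, SEQ,
                         trt.RNNOperation.LSTM)

# Weights must be set separately per layer and per gate; this is where
# TensorFlow's concatenated weights get split apart.
gates = (trt.RNNGateType.INPUT, trt.RNNGateType.FORGET,
         trt.RNNGateType.CELL, trt.RNNGateType.OUTPUT)
for layer in range(LAYERS):
    in_size = 32 * 76 if layer == 0 else HIDDEN
    for gate in gates:
        # Placeholder zeros; the real values come from the TF checkpoint.
        rnn.set_weights_for_gate(layer, gate, True,
                                 np.zeros((HIDDEN, in_size), np.float32))
        rnn.set_weights_for_gate(layer, gate, False,
                                 np.zeros((HIDDEN, HIDDEN), np.float32))
        rnn.set_bias_for_gate(layer, gate, True, np.zeros(HIDDEN, np.float32))
        rnn.set_bias_for_gate(layer, gate, False, np.zeros(HIDDEN, np.float32))

network.mark_output(rnn.get_output(0))
builder.max_batch_size = 1
builder.max_workspace_size = 1 << 20
engine = builder.build_cuda_engine(network)  # one engine for CNN + RNN
```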

+1

Thanks for your patient reply, but I am still confused about how to implement it.
Suppose the network is 2 CNN layers + reshape layers + 2 RNN layers, where all ops are supported by TRT and the network is trained in TensorFlow. When I want to convert the .uff file to a TRT engine, which of these do I need to do?
(1) Create the network definition from scratch using the TRT API (network->addInput, network->addConvolution, network->addPooling, network->addRNNv2, as in the CharRNN sample from 2 above), and load and convert the weights from the TF model into the TRT layers.
(2) Just use the UffParser and converter API directly:
engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 1, 1 << 20)

I think you can just go with #2.
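
Concretely, #2 looks something like this with the TRT 5 Python API (the trt.utils.uff_to_trt_engine helper in your snippet is from the older, pre-5.0 docs). The file name, node names, and input shape are placeholders for your model:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()

# Parse the UFF file directly into the network definition.
parser = trt.UffParser()
parser.register_input("input_node", (1, 161, 50))  # placeholder name/shape
parser.register_output("logits")                   # placeholder output node
parser.parse("model.uff", network)

builder.max_batch_size = 1
builder.max_workspace_size = 1 << 20
engine = builder.build_cuda_engine(network)
```

The catch is that the parser will fail on any op it does not support.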

OK, I see, thanks a lot.
But DeepSpeech2 contains several operations that are not supported by TRT, so I am going to go with #1 instead. Now I have some questions:

  1. How do I generate the WTS file? The TRT sample just loads the weights from an 'xxx.wts' file but doesn't show how to generate it.
  2. I found that both the CNN and RNN samples that create a network definition and load weights are implemented with the C++ API. Can I create the RNN definition and load and convert weights from a TF model with the TRT Python API? (See the sketch after this list for what I mean.)
  3. I only found a Python sample of creating a CNN definition, network_api_pytorch_mnist, and it looks very easy: just load the weights with self.network.state_dict() and define the CNN network. But the C++ sample sampleMNISTAPI needs a loadWeights function and a .wts file, which is more complex than the Python sample. So I wonder whether the Python API is simply easier for creating a network definition from scratch, or whether it is only easier for PyTorch models, not TF.
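
For reference, here is the kind of direct Python-side weight loading I have in mind instead of a .wts file (a rough sketch; the checkpoint path and variable names are made up):

```python
import tensorflow as tf

# Read every variable of a TF checkpoint into a dict of numpy arrays.
# "model.ckpt" and the variable names below are placeholders.
reader = tf.train.NewCheckpointReader("model.ckpt")
weights = {name: reader.get_tensor(name)
           for name in reader.get_variable_to_shape_map()}

# Each entry is a plain numpy array, so it can be handed to the TRT builder
# API directly, e.g. network.add_convolution(..., weights["conv1/kernel"],
# weights["conv1/bias"]), after transposing TF's HWIO kernel layout to the
# KCRS (out, in, h, w) layout that TRT expects.
```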

Any suggestions?