Customize TensorRT 4 LSTM

Hi,

I’m using TensorRT 4.0 with an LSTM on our Tesla cluster, and I’ve run into the following problem.

We are using the kLSTM cell, defined by the following equations:
i[t] := sigmoid(W[i].X[t] + R[i].H[t-1] + Wb[i] + Rb[i])
f[t] := sigmoid(W[f].X[t] + R[f].H[t-1] + Wb[f] + Rb[f])
o[t] := sigmoid(W[o].X[t] + R[o].H[t-1] + Wb[o] + Rb[o])
c[t] := tanh(W[c].X[t] + R[c].H[t-1] + Wb[c] + Rb[c])

C[t] := f[t]*C[t-1] + i[t]*c[t]
H[t] := o[t]*tanh(C[t])

But there is one small difference: we use ReLU instead of tanh in the candidate
c[t] := ReLU(W[c].X[t] + R[c].H[t-1] + Wb[c] + Rb[c])
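To make the modification concrete, here is a minimal NumPy sketch of a single LSTM step with the ReLU candidate. The random weights and dimension sizes are purely illustrative; only the `relu` call in the candidate differs from the stock kLSTM equations above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def lstm_step_relu(x, h_prev, c_prev, W, R, Wb, Rb):
    """One LSTM step where the candidate c[t] uses ReLU instead of tanh.
    W, R, Wb, Rb are dicts keyed by gate name: 'i', 'f', 'o', 'c'."""
    gate = lambda g: W[g] @ x + R[g] @ h_prev + Wb[g] + Rb[g]
    i = sigmoid(gate('i'))
    f = sigmoid(gate('f'))
    o = sigmoid(gate('o'))
    c = relu(gate('c'))          # the only change vs. the standard kLSTM cell
    C = f * c_prev + i * c       # cell state update, unchanged
    H = o * np.tanh(C)           # hidden state, unchanged
    return H, C

# Tiny smoke test with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
hs, xs = 4, 3
W  = {g: rng.standard_normal((hs, xs)) for g in 'ifoc'}
R  = {g: rng.standard_normal((hs, hs)) for g in 'ifoc'}
Wb = {g: rng.standard_normal(hs) for g in 'ifoc'}
Rb = {g: rng.standard_normal(hs) for g in 'ifoc'}
H, C = lstm_step_relu(rng.standard_normal(xs), np.zeros(hs), np.zeros(hs),
                      W, R, Wb, Rb)
print(H.shape, C.shape)  # (4,) (4,)
```

Note that H stays bounded in (-1, 1) because the output gate and the final tanh are unchanged; only the cell state C can grow without the tanh squashing on the candidate.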

We found the original equations on this page:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/namespacenvinfer1.html#ace7b656a1c0537ea0edd17cf61121200

Some models, including Baidu’s DeepSpeech 2, use ReLU instead of tanh here (you can find the code in their PaddlePaddle GitHub repo).

So my questions are:

  1. How can we use ReLU instead of tanh in the equation of c[t] in TensorRT 4.x?
  2. If it’s impossible in 4.x, is there any plan or future version for TensorRT to support this customization?
  3. If TRT won’t support this customization, how can we implement it?

Thank you

Hello,

NVIDIA engineering is reviewing your questions and will keep you updated.

Hi hiprince,

For question 3, it should be possible to build your own LSTM implementation with the customization you need, though it would take a little bit of work. For example, all you need is IActivationLayer (with the appropriate ActivationTypes) for the sigmoids, tanh, and ReLU, IFullyConnectedLayer for the matrix multiplies, and IElementWiseLayer for the element-wise additions and multiplications.

The Caffe implementation might be worth a look: caffe/lstm_layer.cpp at master · BVLC/caffe · GitHub
They build their LSTM layers out of simpler pieces, as suggested above.
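The decomposition Tom describes can be sketched in NumPy, with each helper standing in for one TensorRT layer type. The mapping from helper to layer is an assumption based on his list (it is not actual TensorRT code), but it shows that one unrolled LSTM step really does reduce to just those three primitives:

```python
import numpy as np

# NumPy stand-ins for the three TensorRT layer types mentioned above.
def fully_connected(W, b, x):           # ~ IFullyConnectedLayer: W.x + b
    return W @ x + b

def elementwise(a, b, op):              # ~ IElementWiseLayer: SUM or PROD
    return a + b if op == 'sum' else a * b

def activation(x, kind):                # ~ IActivationLayer: kSIGMOID / kTANH / kRELU
    if kind == 'sigmoid':
        return 1.0 / (1.0 + np.exp(-x))
    if kind == 'tanh':
        return np.tanh(x)
    return np.maximum(0.0, x)           # 'relu'

def lstm_step(x, h_prev, c_prev, W, R, b):
    """One unrolled LSTM step built only from the three primitives above;
    the candidate uses ReLU where the stock kLSTM uses tanh."""
    def pre(g):  # W[g].x + R[g].h_prev + b[g]
        return elementwise(fully_connected(W[g], b[g], x),
                           fully_connected(R[g], np.zeros_like(b[g]), h_prev),
                           'sum')
    i = activation(pre('i'), 'sigmoid')
    f = activation(pre('f'), 'sigmoid')
    o = activation(pre('o'), 'sigmoid')
    c = activation(pre('c'), 'relu')    # the customized candidate
    C = elementwise(elementwise(f, c_prev, 'prod'),
                    elementwise(i, c, 'prod'), 'sum')
    H = elementwise(o, activation(C, 'tanh'), 'prod')
    return H, C
```

In a real TensorRT network you would add one layer per call above and chain the output tensors across time steps; the point here is only that no operation outside those three primitives is required.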

Cheers,
Tom

Hello,

We are sorry, but ReLU is not supported in the equation of c[t] in TensorRT 4. We are always reviewing requests from the user community and discussing them internally; unfortunately, we cannot share more information about future releases here. Please stay tuned for future announcements.

Regarding a workaround: you could create a plugin layer that implements this support (per tom.peters’ suggestion above).