SampleNMT computes incorrect attention vectors in TensorRT

It seems that sampleNMT incorrectly computes the attention vectors from the context vectors. There should be a tanh non-linear activation after W[c;h], as shown in Equation 3. To fix the bug, you should add a tanh activation at the end of SLPAttention::addToModel.
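For reference, here is a minimal NumPy sketch of the attentional hidden state being described (the function name, dimensions, and weights are illustrative, not taken from sampleNMT). It computes tanh(W_c[c;h]); dropping the tanh call reproduces the reported bug:

```python
import numpy as np

def attention_vector(context, hidden, W_c):
    """Luong-style attentional hidden state: a = tanh(W_c [c; h]).

    context: (d_c,) context vector c
    hidden:  (d_h,) decoder hidden state h
    W_c:     (d_out, d_c + d_h) projection weights
    The tanh is the activation missing from SLPAttention::addToModel.
    """
    concat = np.concatenate([context, hidden])
    return np.tanh(W_c @ concat)

rng = np.random.default_rng(0)
c, h = rng.normal(size=8), rng.normal(size=8)
W = rng.normal(size=(8, 16))
a = attention_vector(c, h, W)
print(a.shape)                    # (8,)
print(np.all(np.abs(a) <= 1.0))   # True: tanh bounds every component
```

Without the tanh, the output is an unbounded linear projection, which is what the sample currently produces.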

Thank you for the feedback. I’ll bring this to our engineering team’s attention.


Our engineers have committed the fix ("Add missing tanh to sampleNMT attention"); it should be available in a future release.

Thank you.