sample_nmt segmentation fault


I am trying to run the sample_nmt provided in TensorRT 4. I followed the README and downloaded the vocab data and weights. When I attempt to run the sample, I get the following:

./sample_nmt --data_dir=/home/tensorrt/TensorRT-
data_dir: /home/tensorrt/TensorRT-
Component Info:

  • Data Reader: Text Reader, vocabulary size = 36548
  • Input Embedder: SLP Embedder, num inputs = 36548, num outputs = 512
  • Output Embedder: SLP Embedder, num inputs = 36548, num outputs = 512
  • Encoder: LSTM Encoder, num layers = 2, num units = 512
  • Decoder: LSTM Decoder, num layers = 2, num units = 512
  • Alignment: Multiplicative Alignment, source states size = 512, attention keys size = 512
  • Context: Ragged softmax + Batch GEMM
  • Attention: SLP Attention, num inputs = 1024, num outputs = 512
  • Projection: SLP Projection, num inputs = 512, num outputs = 36548
  • Likelihood: Softmax Likelihood
  • Search Policy: Beam Search Policy, beam = 5
  • Data Writer: BLEU Score Writer, max order = 4
End of Component Info
Segmentation fault (core dumped)

I checked the md5sum of the vocab data. The and en are the same, but the vocab.bpe.32000.en and de are b748c9ac3f3aefa5e2286397f03dfdfb (not c1d0ca6d4994c75574f28df7c9e8253f, per the instructions).

My initial guess is that the vocab is slightly different from when the sample was last tested (according to the manual, March 26, 2018). I tried downloading the data directly from google/seq2seq, but the packaged vocab is again different (md5sum: 2f2dea8696324078749b750d0ceff8c2).
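For anyone who wants to repeat the checksum comparison on their own download, here is a minimal Python sketch. It only uses the standard library. The file names and the expected hash are the ones quoted above; the function names `md5_of_file` and `check_vocab` are made up for this example, and it assumes the vocab files sit directly in the directory you pass in, which may not match your layout.

```python
import hashlib
import sys
from pathlib import Path

# Checksum quoted above as the one the instructions expect.
EXPECTED_MD5 = "c1d0ca6d4994c75574f28df7c9e8253f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_vocab(data_dir,
                names=("vocab.bpe.32000.en", "vocab.bpe.32000.de"),
                expected=EXPECTED_MD5):
    """Map each file name to (digest, matches_expected).

    Assumption: the files live directly under data_dir.
    Missing files map to (None, False).
    """
    results = {}
    for name in names:
        path = Path(data_dir) / name
        if not path.is_file():
            results[name] = (None, False)
            continue
        digest = md5_of_file(path)
        results[name] = (digest, digest == expected)
    return results


if __name__ == "__main__":
    # Hypothetical usage: pass the directory holding the vocab files.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, (digest, ok) in check_vocab(data_dir).items():
        print(f"{name}: {digest or 'missing'} {'OK' if ok else 'MISMATCH'}")
```

A mismatch here only confirms that the files differ from what the instructions describe. It does not show that the difference is what causes the segmentation fault.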

Has anyone else tried to run sample_nmt?

Does anyone have the vocab files whose md5sums match the docs?


The TensorRT team is looking into this. We will post back on this thread as soon as we complete our investigation.

