sample_nmt segmentation fault

Hello,

I am trying to run the sample_nmt provided in TensorRT 4. I followed the README and downloaded the vocab data and weights. When I attempt to run the sample, I get the following:

./sample_nmt --data_dir=/home/tensorrt/TensorRT-4.0.1.6/samples/sampleNMT/data/deen/
data_dir: /home/tensorrt/TensorRT-4.0.1.6/samples/sampleNMT/data/deen/
Component Info:

  • Data Reader: Text Reader, vocabulary size = 36548
  • Input Embedder: SLP Embedder, num inputs = 36548, num outputs = 512
  • Output Embedder: SLP Embedder, num inputs = 36548, num outputs = 512
  • Encoder: LSTM Encoder, num layers = 2, num units = 512
  • Decoder: LSTM Decoder, num layers = 2, num units = 512
  • Alignment: Multiplicative Alignment, source states size = 512, attention keys size = 512
  • Context: Ragged softmax + Batch GEMM
  • Attention: SLP Attention, num inputs = 1024, num outputs = 512
  • Projection: SLP Projection, num inputs = 512, num outputs = 36548
  • Likelihood: Softmax Likelihood
  • Search Policy: Beam Search Policy, beam = 5
  • Data Writer: BLEU Score Writer, max order = 4
End of Component Info
Segmentation fault (core dumped)

I checked the md5sums of the vocab data. The checksums of newstest2015.tok.bpe.32000.de and newstest2015.tok.bpe.32000.en match the documentation, but vocab.bpe.32000.en and vocab.bpe.32000.de both come out as b748c9ac3f3aefa5e2286397f03dfdfb, not c1d0ca6d4994c75574f28df7c9e8253f as given in the instructions (https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#nmt_prepare).
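For anyone who wants to repeat the checksum comparison, here is a minimal sketch in Python. It assumes the data directory from the command above and takes c1d0ca6d4994c75574f28df7c9e8253f as the expected digest for both vocab files, which is my reading of the instructions. The helper names md5_of and check_files are made up for this post and are not part of TensorRT.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_files(data_dir, expected):
    """Map each file name to (actual_md5_or_None, matches_expected)."""
    results = {}
    for name, want in expected.items():
        path = Path(data_dir) / name
        # A missing file is reported as None rather than raising.
        got = md5_of(path) if path.is_file() else None
        results[name] = (got, got == want)
    return results


if __name__ == "__main__":
    # Assumption: the data directory used on the command line in this post.
    data_dir = "/home/tensorrt/TensorRT-4.0.1.6/samples/sampleNMT/data/deen/"
    # Assumption: the digest quoted from the docs applies to both vocab files.
    expected = {
        "vocab.bpe.32000.de": "c1d0ca6d4994c75574f28df7c9e8253f",
        "vocab.bpe.32000.en": "c1d0ca6d4994c75574f28df7c9e8253f",
    }
    for name, (got, ok) in check_files(data_dir, expected).items():
        print(f"{name}: {got or 'missing'} {'OK' if ok else 'MISMATCH'}")
```

This only confirms whether the files on disk match the documented digests. It does not show whether the mismatch is what causes the segmentation fault.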

My initial guess is that the vocab has changed slightly since the sample was last tested (according to the manual, March 26, 2018). I also tried downloading the data directly from google/seq2seq (seq2seq/nmt.md at master in the google/seq2seq GitHub repository), but the packaged vocab is different again (md5sum: 2f2dea8696324078749b750d0ceff8c2).

Any ideas?

Thanks,

Andy

bump?

Has anyone else tried to run the sample_nmt?

Does anyone have the vocab files whose md5sums match the docs?

@andrewschenck,

The TensorRT team is looking into this. We will post back on this thread as soon as the investigation is complete.

best regards,
Siddharth