Nvidia docker python Deep learning example GNMT

Hi
I was able to build the GNMT model with a V100 following the steps in
https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Translation/GNMT

However, when I ran the inference test with the following command line:
python3 translate.py --input data/wmt16_de_en/newstest2014.tok.bpe.32000.en \
--reference data/wmt16_de_en/newstest2014.de --output /tmp/output \
--model results/gnmt/model_best.pth --batch-size 32 128 512 \
--beam-size 1 2 5 10 --math fp16 fp32

I get the following error (the script prints the environment information first):
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: Tesla T4

Nvidia driver version: 410.78
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.1

Versions of relevant libraries:
[pip] Could not collect
[conda] torch 1.0.0a0
[conda] torchtext 0.3.0
[conda] torchvision 0.2.1
0: Run arguments: Namespace(batch_first=True, batch_size=[32, 128, 512], beam_size=[1, 2, 5, 10], bleu=True, cov_penalty_factor=0.1, cuda=True, cudnn=True, dataset_dir='data/wmt16_de_en/', env=True, input='data/wmt16_de_en/newstest2014.tok.bpe.32000.en', len_norm_const=5.0, len_norm_factor=0.6, local_rank=0, math=['fp32'], max_seq_len=80, model='results/gnmt/model_best.pth', output='/tmp/output', print_freq=1, rank=0, reference='data/wmt16_de_en/newstest2014.de', sort=True)
0: Restoring state of the tokenizer
0: math: fp32, batch size: 32, beam size: 1
0: Processing data from data/wmt16_de_en/newstest2014.tok.bpe.32000.en
0: Running evaluation on test set
Traceback (most recent call last):
  File "translate.py", line 193, in <module>
    main()
  File "translate.py", line 189, in main
    reference_path=args.reference, summary=True)
  File "/workspace/gnmt/seq2seq/inference/inference.py", line 120, in run
    output = self.evaluate(epoch, iteration, summary)
  File "/workspace/gnmt/seq2seq/inference/inference.py", line 208, in evaluate
    detok = self.tokenizer.detokenize(pred)
  File "/workspace/gnmt/seq2seq/data/tokenizer.py", line 98, in detokenize
    detok = delim.join([self.idx2tok[idx] for idx in inputs])
  File "/workspace/gnmt/seq2seq/data/tokenizer.py", line 98, in <listcomp>
    detok = delim.join([self.idx2tok[idx] for idx in inputs])
KeyError: 2
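
The failing line looks up every predicted index in self.idx2tok, so the KeyError means index 2 has no entry in that mapping. A minimal sketch of how this could be checked is below. It assumes idx2tok is a plain dict keyed by Python ints, which the traceback suggests but does not confirm. The helper names find_missing_indices and detokenize_checked and the toy vocabulary are hypothetical and are not part of the GNMT code.

```python
def find_missing_indices(idx2tok, inputs):
    """Return the sorted indices from `inputs` that have no entry in `idx2tok`."""
    # int() normalizes index-like values (assumption: each element converts to int)
    return sorted({int(idx) for idx in inputs if int(idx) not in idx2tok})


def detokenize_checked(idx2tok, inputs, delim=" "):
    """Same join as the failing line, but report all missing indices first."""
    missing = find_missing_indices(idx2tok, inputs)
    if missing:
        raise KeyError(
            "indices missing from vocabulary: {} (vocabulary size: {})".format(
                missing, len(idx2tok)
            )
        )
    return delim.join(idx2tok[int(idx)] for idx in inputs)


# Toy vocabulary with a hole at index 2 (hypothetical data, for illustration only)
toy_idx2tok = {0: "<pad>", 1: "<unk>", 3: "hello"}
print(find_missing_indices(toy_idx2tok, [0, 3, 2]))  # prints [2]
```

If a check like this reports many missing indices or a very small vocabulary size, the restored tokenizer state would be the first thing to inspect. This is only one possible cause.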