RIVA 2.11.0, nemo:23.03, unable to deploy NMT

Hardware: NVIDIA T4 GPU
Riva version: 2.11.0
NeMo version: 23.03

Steps to reproduce:

  • start the nemo:23.03 container
  • install Riva QuickStart 2.11.0
  • download en_de_24x6.nemo from NMT En - De Transformer24x6 | NVIDIA NGC
  • run nemo2riva en_de_24x6.nemo --out en_de_24x6.riva; it fails with:
Traceback (most recent call last):
  File "/usr/local/bin/nemo2riva", line 8, in <module>
    sys.exit(nemo2riva())
  File "/usr/local/lib/python3.8/dist-packages/nemo2riva/cli/nemo2riva.py", line 49, in nemo2riva
    Nemo2Riva(args)
  File "/usr/local/lib/python3.8/dist-packages/nemo2riva/convert.py", line 79, in Nemo2Riva
    artifacts, manifest = get_artifacts(restore_path=nemo_in, model=model, passphrase=key, **patch_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nemo2riva/artifacts.py", line 104, in get_artifacts
    patch(model, artifacts, **patch_kwargs)
TypeError: change_tokenizer_names() got an unexpected keyword argument 'import_config'

Expected behaviour: a successful export of a .riva file, as described in Custom Models — NVIDIA Riva.

The issue is in /usr/local/lib/python3.8/dist-packages/nemo2riva/patches/mtencdec.py, where change_tokenizer_names does not accept **kwargs; the signature should be

def change_tokenizer_names(model, artifacts, **kwargs):
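
For anyone hitting the same error before a fixed release lands, the change can be applied to the installed package with a one-off script along these lines (a sketch: it assumes the original line reads def change_tokenizer_names(model, artifacts): and uses the dist-packages path from the traceback above; adjust both if your install differs):

from pathlib import Path

# Path taken from the traceback above; adjust for your environment.
target = Path("/usr/local/lib/python3.8/dist-packages/nemo2riva/patches/mtencdec.py")
old = "def change_tokenizer_names(model, artifacts):"  # assumed original signature
new = "def change_tokenizer_names(model, artifacts, **kwargs):"
source = target.read_text()
if old in source:
    target.write_text(source.replace(old, new, 1))
    print(f"patched {target}")
else:
    print("signature not found; apply the edit by hand")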

With that fix in place I can deploy en_de_24x6.nemo without issues, but I still have problems deploying a custom NMT model built with nemo:23.03.

Riva does not start up; it logs the following error:

UNAVAILABLE: Invalid argument: model 'nmt-classifier', tensor 'log_probs': the model expects 3 dimensions (shape [-1,-1,64000]) but the model configuration specifies 3 dimensions (an initial batch dimension because max_batch_size > 0 followed by the explicit tensor shape, making complete shape [-1,-1,32000])

I’ve tracked the issue down to line 96 in /usr/local/lib/python3.8/dist-packages/servicemaker/triton/nmt.py, which hardcodes the classifier output shape as [-1,32000].

However, I cannot work out why my custom-built model expects 64000 here (if I modify the classifier's .pbtxt by hand, the model loads and works as it should). Is this the size of the target-language tokenizer (mine uses 64000 tokens)?
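
For what it's worth, the decoder-side vocabulary size of the custom model can be checked inside the nemo:23.03 container with something like the following (a sketch, assuming the model is a standard NeMo MTEncDecModel; the .nemo file name is illustrative):

from nemo.collections.nlp.models import MTEncDecModel

# Restore the custom model and print the target-side (decoder) vocab size.
# If this prints 64000, the last dimension of log_probs matches the target tokenizer.
model = MTEncDecModel.restore_from("my_custom_nmt.nemo")  # illustrative path
print(model.decoder_tokenizer.vocab_size)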

Any help appreciated.