Error in Riva deployment: deployment aborted

Hardware - GPU: RTX 3080 (driver version 515.86.01)
Hardware - CPU: Intel® Xeon® W-2255 CPU @ 3.70GHz × 20
Operating System: Ubuntu 22.04.1 LTS
Riva Version: 2.7.0
NeMo Version: 1.15

Note: This may take some time, depending on the speed of your Internet connection.

Pulling Riva Speech Server images.
Image nvcr.io/nvidia/riva/riva-speech:2.7.0 exists. Skipping.
Image nvcr.io/nvidia/riva/riva-speech:2.7.0-servicemaker exists. Skipping.

  • [[ non-tegra != \t\e\g\r\a ]]
  • [[ non-tegra == \t\e\g\r\a ]]
  • echo 'Converting RMIRs at /home/user/riva/byom/270model_repository/rmir to Riva Model repository.'
    Converting RMIRs at /home/user/riva/byom/270model_repository/rmir to Riva Model repository.
  • docker run --init -it --rm --gpus '"device=0"' -v /home/user/riva/byom/270model_repository:/data -e MODEL_DEPLOY_KEY=tlt_encode --name riva-service-maker nvcr.io/nvidia/riva/riva-speech:2.7.0-servicemaker deploy_all_models /data/rmir /data/models
    /bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

==========================
=== Riva Speech Skills ===

NVIDIA Release (build 46434655)
Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

To install Python sample dependencies, run /opt/tensorrt/python/python_setup.sh

To install the open-source samples corresponding to this TensorRT release version
run /opt/tensorrt/install_opensource.sh. To build the open source parsers,
plugins, and samples for current top-of-tree on master or a different branch,
run /opt/tensorrt/install_opensource.sh -b
See https://github.com/NVIDIA/TensorRT for more information.

/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
2023-02-24 17:22:13,738 [INFO] Writing Riva model repository to '/data/models'...
2023-02-24 17:22:13,738 [INFO] The riva model repo target directory is /data/models
2023-02-24 17:22:23,173 [INFO] Using obey-precision pass with fp16 TRT
2023-02-24 17:22:23,174 [INFO] Extract_binaries for nn -> /data/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1
2023-02-24 17:22:23,174 [INFO] extracting {'onnx': ('nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE', 'model_graph.onnx')} -> /data/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1
2023-02-24 17:22:23,781 [INFO] Printing copied artifacts:
2023-02-24 17:22:23,781 [INFO] {'onnx': '/data/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1/model_graph.onnx'}
2023-02-24 17:22:23,781 [INFO] Building TRT engine from ONNX file
[02/24/2023-17:22:26] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[02/24/2023-17:22:26] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
(warning repeated 35 more times)
[02/24/2023-17:27:42] [TRT] [W] Output type must be INT32 for shape outputs
2023-02-24 17:27:43,097 [INFO] Mixed-precision net: 7115 layers, 7115 tensors, 1 outputs…
2023-02-24 17:27:43,273 [INFO] Mixed-precision net: 1592 layers / 1592 outputs fixed
[02/24/2023-17:30:14] [TRT] [W] Myelin graph with multiple dynamic values may have poor performance if they differ. Dynamic values are:
[02/24/2023-17:30:14] [TRT] [W] (# 0 (SHAPE audio_signal))
[02/24/2023-17:30:14] [TRT] [W] (# 0 (SHAPE length))
python3: /root/gpgpu/MachineLearning/myelin/src/compiler/optimizer/const_ppg.cpp:2815: void myelin::ir::copy_slice_data(myelinType_t, void*, const void*, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const int_const_shape_t&): Assertion `0' failed.
/usr/local/bin/deploy_all_models: line 21:    98 Aborted                 riva-deploy $FORCE `find $rmir_path -name *.rmir -printf "%p:${MODEL_DEPLOY_KEY} "` $output_path

  • '[' 134 -ne 0 ']'
  • echo 'Error in deploying RMIR models.'
    Error in deploying RMIR models.
  • exit 1

I get this error when running riva_init.sh from the Riva quick-start folder, version 2.7.0. The model is an stt_en_conformer_ctc_large model fine-tuned using the training notebook in the Riva tutorials, with NeMo version 1.15.0; the only config changes were the data, learning rate, and vocab.
My nemo2riva and riva-build commands complete without any errors.

My riva-build command, run before riva_init.sh:

riva-build speech_recognition \
    /data/rmir/stt_en_conformer_ctc_large.rmir \
    /servicemaker-dev/Conformer-CTC-BPE.riva \
    --name=conformer-en-US-asr-streaming \
    --featurizer.use_utterance_norm_params=False \
    --featurizer.precalc_norm_time_steps=0 \
    --featurizer.precalc_norm_params=False \
    --ms_per_timestep=40 \
    --endpointing.start_history=200 \
    --nn.fp16_needs_obey_precision_pass \
    --endpointing.residue_blanks_at_start=-2 \
    --chunk_size=0.16 \
    --left_padding_size=1.92 \
    --right_padding_size=1.92 \
    --decoder_type=greedy \
    --language_code=en-US
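As a side note, the streaming window implied by those flags works out with simple arithmetic (a sketch; it assumes ms_per_timestep applies to the acoustic model's output frames, which is how the flag is commonly read):

```shell
# Sketch of the streaming geometry implied by the riva-build flags above:
# chunk_size=0.16 s at ms_per_timestep=40 means 160/40 = 4 output timesteps
# per chunk, and each step sees left + chunk + right = 4.0 s of audio.
chunk_timesteps=$(( 160 / 40 ))
window_s=$(awk 'BEGIN { print 1.92 + 0.16 + 1.92 }')
echo "timesteps per chunk: $chunk_timesteps"
echo "audio window per step: ${window_s}s"
```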

Any help on this would be much appreciated.

Hi @adjohnson

Thanks for your interest in Riva

It should be enough to mount the /data folder when starting the server, instead of the /data/models and /data/rmir mounts that were used.

Please verify that the /data folder to be mounted contains the models folder and the required models inside it,

i.e. after your deploy:

riva-deploy -f <rmir_filename>:<encryption_key> /data/models
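Before starting the server, the mounted layout can be sanity-checked with a small script. This is only a sketch: check_models_dir is a hypothetical helper, and the mock directory at the end (created with mktemp) just demonstrates the check; point it at your real /data instead.

```shell
# Sketch: confirm a folder has the non-empty models/ directory Riva expects
# (Triton-style layout: models/<model_name>/<version>/).
check_models_dir() {
    local data_dir="$1"
    if [ -d "$data_dir/models" ] && [ -n "$(ls -A "$data_dir/models" 2>/dev/null)" ]; then
        echo "OK: $data_dir/models exists and is non-empty"
    else
        echo "MISSING: $data_dir/models is absent or empty"
        return 1
    fi
}

# Demonstration against a mock repository layout (replace with your real /data):
mock=$(mktemp -d)
mkdir -p "$mock/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1"
check_models_dir "$mock"
```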

Then start the server with:

docker run -d --gpus 1 --init --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 \
        -v /data:/data             \
        -p 50051:50051                \
        -e "CUDA_VISIBLE_DEVICES=0"   \
        --name riva-speech                \
        nvcr.io/nvidia/riva/riva-speech:2.9.0 \
        start-riva  --riva-uri=0.0.0.0:50051 --nlp_service=true --asr_service=true --tts_service=true

Please see the documentation reference:
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/model-overview.html#custom-models

Thanks

To update:
I ran the same deployment process with Riva 2.9.0 and the model deployed and ran successfully. My initial guess is that the documentation is wrong about which NeMo versions' models are supported by each Riva release.
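That observation can be written down as a small version check. To be clear, the minimum version below is only an assumption drawn from this thread (a NeMo 1.15 export aborted on Riva 2.7.0 but deployed on 2.9.0), not an official compatibility matrix, and riva_at_least is a made-up helper name:

```shell
# Illustrative only, based on this thread's observation; sort -V does the
# actual version comparison.
riva_at_least() {  # usage: riva_at_least <riva_version> <minimum>
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if riva_at_least "2.9.0" "2.9.0"; then echo "2.9.0: deployed the NeMo 1.15 model"; fi
if ! riva_at_least "2.7.0" "2.9.0"; then echo "2.7.0: aborted on the NeMo 1.15 model"; fi
```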

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.