Error in Riva deployment: deployment aborted

Hardware - GPU: RTX 3080 (driver version 515.86.01)
Hardware - CPU: Intel® Xeon® W-2255 CPU @ 3.70GHz × 20
Operating System: Ubuntu 22.04.1 LTS
Riva Version: 2.7.0
NeMo Version: 1.15

Note: This may take some time, depending on the speed of your Internet connection.

Pulling Riva Speech Server images.
Image nvcr.io/nvidia/riva/riva-speech:2.7.0 exists. Skipping.
Image nvcr.io/nvidia/riva/riva-speech:2.7.0-servicemaker exists. Skipping.

  • [[ non-tegra != \t\e\g\r\a ]]
  • [[ non-tegra == \t\e\g\r\a ]]
  • echo 'Converting RMIRs at /home/user/riva/byom/270model_repository/rmir to Riva Model repository.'
    Converting RMIRs at /home/user/riva/byom/270model_repository/rmir to Riva Model repository.
  • docker run --init -it --rm --gpus '"device=0"' -v /home/user/riva/byom/270model_repository:/data -e MODEL_DEPLOY_KEY=tlt_encode --name riva-service-maker nvcr.io/nvidia/riva/riva-speech:2.7.0-servicemaker deploy_all_models /data/rmir /data/models
    /bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

==========================
=== Riva Speech Skills ===

NVIDIA Release (build 46434655)
Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:

To install Python sample dependencies, run /opt/tensorrt/python/python_setup.sh

To install the open-source samples corresponding to this TensorRT release version
run /opt/tensorrt/install_opensource.sh. To build the open source parsers,
plugins, and samples for current top-of-tree on master or a different branch,
run /opt/tensorrt/install_opensource.sh -b
See https://github.com/NVIDIA/TensorRT for more information.

/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
2023-02-24 17:22:13,738 [INFO] Writing Riva model repository to '/data/models'...
2023-02-24 17:22:13,738 [INFO] The riva model repo target directory is /data/models
2023-02-24 17:22:23,173 [INFO] Using obey-precision pass with fp16 TRT
2023-02-24 17:22:23,174 [INFO] Extract_binaries for nn -> /data/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1
2023-02-24 17:22:23,174 [INFO] extracting {'onnx': ('nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE', 'model_graph.onnx')} -> /data/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1
2023-02-24 17:22:23,781 [INFO] Printing copied artifacts:
2023-02-24 17:22:23,781 [INFO] {'onnx': '/data/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1/model_graph.onnx'}
2023-02-24 17:22:23,781 [INFO] Building TRT engine from ONNX file
[02/24/2023-17:22:26] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[02/24/2023-17:22:26] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
(warning repeated 35 more times)
[02/24/2023-17:27:42] [TRT] [W] Output type must be INT32 for shape outputs
2023-02-24 17:27:43,097 [INFO] Mixed-precision net: 7115 layers, 7115 tensors, 1 outputs…
2023-02-24 17:27:43,273 [INFO] Mixed-precision net: 1592 layers / 1592 outputs fixed
[02/24/2023-17:30:14] [TRT] [W] Myelin graph with multiple dynamic values may have poor performance if they differ. Dynamic values are:
[02/24/2023-17:30:14] [TRT] [W] (# 0 (SHAPE audio_signal))
[02/24/2023-17:30:14] [TRT] [W] (# 0 (SHAPE length))
python3: /root/gpgpu/MachineLearning/myelin/src/compiler/optimizer/const_ppg.cpp:2815: void myelin::ir::copy_slice_data(myelinType_t, void*, const void*, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const symbolic_shape_t&, const int_const_shape_t&): Assertion `0' failed.
/usr/local/bin/deploy_all_models: line 21:    98 Aborted                 riva-deploy $FORCE `find $rmir_path -name *.rmir -printf "%p:${MODEL_DEPLOY_KEY} "` $output_path

  • '[' 134 -ne 0 ']'
  • echo 'Error in deploying RMIR models.'
    Error in deploying RMIR models.
  • exit 1

I get this error when running riva_init.sh from the Riva quick-start folder, version 2.7.0. The model is an stt_en_conformer_ctc_large model fine-tuned using the training notebook in the Riva tutorials, with NeMo version 1.15.0; the only config changes were the data, learning rate, and vocab.
My nemo2riva and riva-build commands complete without any errors.

My riva-build command, run before riva_init.sh:

riva-build speech_recognition \
    /data/rmir/stt_en_conformer_ctc_large.rmir \
    /servicemaker-dev/Conformer-CTC-BPE.riva \
    --name=conformer-en-US-asr-streaming \
    --featurizer.use_utterance_norm_params=False \
    --featurizer.precalc_norm_time_steps=0 \
    --featurizer.precalc_norm_params=False \
    --ms_per_timestep=40 \
    --endpointing.start_history=200 \
    --nn.fp16_needs_obey_precision_pass \
    --endpointing.residue_blanks_at_start=-2 \
    --chunk_size=0.16 \
    --left_padding_size=1.92 \
    --right_padding_size=1.92 \
    --decoder_type=greedy \
    --language_code=en-US
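As a side note, the streaming window implied by those flags works out with simple arithmetic (a sketch; it assumes ms_per_timestep applies to the acoustic model's output frames, which is how the flag is commonly read):

```shell
# Sketch of the streaming geometry implied by the riva-build flags above:
# chunk_size=0.16 s at ms_per_timestep=40 means 160/40 = 4 output timesteps
# per chunk, and each step sees left + chunk + right = 4.0 s of audio.
chunk_timesteps=$(( 160 / 40 ))
window_s=$(awk 'BEGIN { print 1.92 + 0.16 + 1.92 }')
echo "timesteps per chunk: $chunk_timesteps"
echo "audio window per step: ${window_s}s"
```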

Any help on this would be much appreciated.

Hi @adjohnson

Thanks for your interest in Riva

It should be enough to mount the /data folder when starting the server, instead of the /data/models and /data/rmir mounts that were used.

Please verify that the /data folder to be mounted contains the models folder and the required models inside it,

i.e. after your deploy:

riva-deploy -f <rmir_filename>:<encryption_key> /data/models
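Before starting the server, the mounted layout can be sanity-checked with a small script. This is only a sketch: check_models_dir is a hypothetical helper, and the mock directory at the end (created with mktemp) just demonstrates the check; point it at your real /data instead.

```shell
# Sketch: confirm a folder has the non-empty models/ directory Riva expects
# (Triton-style layout: models/<model_name>/<version>/).
check_models_dir() {
    local data_dir="$1"
    if [ -d "$data_dir/models" ] && [ -n "$(ls -A "$data_dir/models" 2>/dev/null)" ]; then
        echo "OK: $data_dir/models exists and is non-empty"
    else
        echo "MISSING: $data_dir/models is absent or empty"
        return 1
    fi
}

# Demonstration against a mock repository layout (replace with your real /data):
mock=$(mktemp -d)
mkdir -p "$mock/models/riva-trt-conformer-en-US-asr-streaming-am-streaming/1"
check_models_dir "$mock"
```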

Then start the server with:

docker run -d --gpus 1 --init --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 \
        -v /data:/data             \
        -p 50051:50051                \
        -e "CUDA_VISIBLE_DEVICES=0"   \
        --name riva-speech                \
        nvcr.io/nvidia/riva/riva-speech:2.9.0 \
        start-riva  --riva-uri=0.0.0.0:50051 --nlp_service=true --asr_service=true --tts_service=true

Please see the documentation reference:
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/model-overview.html#custom-models

Thanks

To update:
I ran the same deployment process with Riva 2.9.0 and the model deployed and ran successfully. My initial guess is that the documentation is wrong about which NeMo versions' models are supported by each Riva release.
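That observation can be written down as a small version check. To be clear, the minimum version below is only an assumption drawn from this thread (a NeMo 1.15 export aborted on Riva 2.7.0 but deployed on 2.9.0), not an official compatibility matrix, and riva_at_least is a made-up helper name:

```shell
# Illustrative only, based on this thread's observation; sort -V does the
# actual version comparison.
riva_at_least() {  # usage: riva_at_least <riva_version> <minimum>
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if riva_at_least "2.9.0" "2.9.0"; then echo "2.9.0: deployed the NeMo 1.15 model"; fi
if ! riva_at_least "2.7.0" "2.9.0"; then echo "2.7.0: aborted on the NeMo 1.15 model"; fi
```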

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.