RIVA error, when deploying official Conformer ASR network

Please provide the following information when requesting support.

Hardware - GPU (A100/A30/T4/V100): NVIDIA RTX A6000
Hardware - CPU: AMD Ryzen 9 5900X 12-Core Processor
Operating System: Ubuntu 20.04
Riva Version: 2.2.0
TLT Version (if relevant)
How to reproduce the issue ? (This is for errors. Please share the command and the detailed log here)
I tried to build and deploy the STT En Conformer-CTC XLarge model from NGC.
Build command (based on the documentation):

riva-build speech_recognition \
    /riva/stt_en_conformer_ctc_xlarge.rmir \
    /nemo/stt_en_conformer_ctc_xlarge.riva \
    --name=conformer-en-US-asr-offline \
    --featurizer.use_utterance_norm_params=False \
    --featurizer.precalc_norm_time_steps=0 \
    --featurizer.precalc_norm_params=False \
    --ms_per_timestep=40 \
    --nn.fp16_needs_obey_precision_pass \
    --vad.vad_start_history=200 \
    --chunk_size=4.8 \
    --left_padding_size=1.6 \
    --right_padding_size=1.6 \
    --max_batch_size=16 \
    --decoder_type=greedy \
    --language_code=en-US
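For reference, the timing flags in this command are related to each other; a quick sketch in plain Python (values taken from the command above, assuming --ms_per_timestep declares the duration of one CTC timestep, as for the Conformer's 4x-subsampled 10 ms frames):

```python
# Sketch of how the riva-build timing flags relate (values from the command
# above; assumes --ms_per_timestep is the duration of one CTC timestep).
ms_per_timestep = 40      # --ms_per_timestep
chunk_size_s = 4.8        # --chunk_size (seconds of audio per chunk)
padding_s = 1.6           # --left_padding_size / --right_padding_size

# One CTC timestep per 40 ms means a 4.8 s chunk yields 120 timesteps,
# with 40 timesteps of context padding on each side.
timesteps_per_chunk = int(chunk_size_s * 1000 / ms_per_timestep)
padding_timesteps = int(padding_s * 1000 / ms_per_timestep)
print(timesteps_per_chunk, padding_timesteps)  # 120 40
```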

deploy command:

riva-deploy -f /data/rmir/stt_en_conformer_ctc_xlarge.rmir /data/models/

error log:

==========================
=== Riva Speech Skills ===

NVIDIA Release 22.05 (build 38626400)
Riva Speech Server Version 2.2.0
Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

https://developer.nvidia.com/tensorrt

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

To install Python sample dependencies, run /opt/tensorrt/python/python_setup.sh

To install the open-source samples corresponding to this TensorRT release version
run /opt/tensorrt/install_opensource.sh. To build the open source parsers,
plugins, and samples for current top-of-tree on master or a different branch,
run /opt/tensorrt/install_opensource.sh -b
See https://github.com/NVIDIA/TensorRT for more information.

[NeMo W 2022-06-07 12:56:58 optimizers:55] Apex was not found. Using the lamb or fused_adam optimizer will error out.
2022-06-07 12:56:58,172 [INFO] Writing Riva model repository to '/data/models/'...
2022-06-07 12:56:58,172 [INFO] The riva model repo target directory is /data/models/
2022-06-07 12:57:37,489 [INFO] Using obey-precision pass with fp16 TRT
2022-06-07 12:57:37,489 [INFO] Extract_binaries for nn -> /data/models/riva-trt-conformer-en-US-asr-offline-am-streaming/1
2022-06-07 12:57:37,489 [INFO] extracting {'onnx': ('nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE', 'model_graph.onnx')} -> /data/models/riva-trt-conformer-en-US-asr-offline-am-streaming/1
2022-06-07 12:57:39,193 [INFO] Printing copied artifacts:
2022-06-07 12:57:39,193 [INFO] {'onnx': '/data/models/riva-trt-conformer-en-US-asr-offline-am-streaming/1/model_graph.onnx'}
2022-06-07 12:57:39,193 [INFO] Building TRT engine from ONNX file
[libprotobuf WARNING /home/jenkins/agent/workspace/OSS/OSS_L0_MergeRequest/oss/build/third_party.protobuf/src/third_party.protobuf/src/google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING /home/jenkins/agent/workspace/OSS/OSS_L0_MergeRequest/oss/build/third_party.protobuf/src/third_party.protobuf/src/google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1682042606
[libprotobuf WARNING /home/jenkins/agent/workspace/OSS/OSS_L0_MergeRequest/oss/build/third_party.protobuf/src/third_party.protobuf/src/google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING /home/jenkins/agent/workspace/OSS/OSS_L0_MergeRequest/oss/build/third_party.protobuf/src/third_party.protobuf/src/google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1682042606
[06/07/2022-12:57:47] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[06/07/2022-12:57:47] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
(the warning above is repeated 23 more times)
[06/07/2022-12:57:47] [TRT] [E] parsers/onnx/ModelImporter.cpp:780: While parsing node number 203 [Where -> "1377"]:
[06/07/2022-12:57:47] [TRT] [E] parsers/onnx/ModelImporter.cpp:781: --- Begin node ---
[06/07/2022-12:57:47] [TRT] [E] parsers/onnx/ModelImporter.cpp:782: input: "1375"
input: "1376"
input: "1374"
output: "1377"
name: "Where_301"
op_type: "Where"

[06/07/2022-12:57:47] [TRT] [E] parsers/onnx/ModelImporter.cpp:783: --- End node ---
[06/07/2022-12:57:47] [TRT] [E] parsers/onnx/ModelImporter.cpp:785: ERROR: parsers/onnx/builtin_op_importers.cpp:4705 In function importWhere:
[8] Assertion failed: (x->getType() == y->getType() && x->getType() != nvinfer1::DataType::kBOOL) && "This version of TensorRT requires input x and y to have the same data type. BOOL is unsupported."
2022-06-07 12:57:47,486 [INFO] Mixed-precision net: 482 layers, 482 tensors, 0 outputs…
2022-06-07 12:57:47,492 [INFO] Mixed-precision net: 0 layers / 0 outputs fixed
[06/07/2022-12:57:47] [TRT] [E] 4: [network.cpp::validate::2633] Error Code 4: Internal Error (Network must have at least one output)
[06/07/2022-12:57:47] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
2022-06-07 12:57:47,701 [INFO] Extract_binaries for featurizer -> /data/models/conformer-en-US-asr-offline-feature-extractor-streaming/1
2022-06-07 12:57:47,702 [INFO] Extract_binaries for vad -> /data/models/conformer-en-US-asr-offline-voice-activity-detector-ctc-streaming/1
2022-06-07 12:57:47,702 [INFO] extracting {'vocab_file': '/tmp/tmpbid6pnbc/riva_decoder_vocabulary.txt'} -> /data/models/conformer-en-US-asr-offline-voice-activity-detector-ctc-streaming/1
2022-06-07 12:57:47,703 [INFO] Extract_binaries for lm_decoder -> /data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming/1
2022-06-07 12:57:47,703 [INFO] extracting {'vocab_file': '/tmp/tmpbid6pnbc/riva_decoder_vocabulary.txt', 'tokenizer_model': ('nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE', '19196a05d50f48f68648bfd65f3fb6b0_tokenizer.model')} -> /data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming/1
2022-06-07 12:57:47,704 [INFO] {'vocab_file': '/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt', 'tokenizer_model': '/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming/1/19196a05d50f48f68648bfd65f3fb6b0_tokenizer.model'}
2022-06-07 12:57:47,705 [INFO] Extract_binaries for conformer-en-US-asr-offline -> /data/models/conformer-en-US-asr-offline/1
2022-06-07 12:57:47,705 [ERROR] Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/servicemaker/cli/deploy.py", line 100, in deploy_from_rmir
generator.serialize_to_disk(
File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 427, in serialize_to_disk
RivaConfigGenerator.serialize_to_disk(self, repo_dir, rmir, config_only, verbose, overwrite)
File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 306, in serialize_to_disk
self.generate_config(version_dir, rmir)
File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/asr.py", line 838, in generate_config
'output_map': {nn._outputs[0].name: ctc_inp_key},
IndexError: list index out of range

[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] 'Shape tensor cast elision' routine failed with: None

Hi @balintcenturio

Thanks for your interest in Riva,

Apologies for the delay

Thanks for sharing the logs. I am checking with the team regarding this issue, and I will reply as soon as I have an update.

Thanks for your patience

Hi @balintcenturio

Thanks for your interest in Riva,

Apologies for the delay,

There were some issues in earlier versions of Riva specific to the xlarge models.

Please retry the above conversion using the latest version, 2.3.0; with that version the conversion should succeed.

Please use 2.3.0 for the entire pipeline above, including the nemo2riva conversion.

If you face any issues or need additional information, please let us know.

Thanks

Hi @rvinobha,

Thank you for your reply!

I have the same issue with the more recent version as well:

  1. I used the 22.01 NeMo container and version 2.3.0 of nemo2riva to create the riva file from the nemo file.
  2. For the further steps I used Riva version 2.3.0, as advised.

The error is very similar: it mentions the same 'Where' node assertion failure. Could it be the GPU I use? On paper the RTX A6000 is strong enough for this kind of task, but as I understand it, it is not among the recommended hardware.

Hi @balintcenturio

Thanks for your interest in Riva,

Apologies that you are facing this issue; I will check further with the team.

Could you kindly share the command that produced the error, and attach the complete log of that command as a txt file in this thread?

Please also let us know your NVIDIA driver version.

Thanks

error.txt (8.0 KB)

Hi @rvinobha

Thank you for the quick response!

The command for which the error occurred:

docker run --rm --gpus 1 \
      -v /home/user/riva_quickstart_v2.3.0/models_repo/:/data \
      nvcr.io/nvidia/riva/riva-speech:2.3.0-servicemaker -- \
      riva-deploy -f  /data/rmir/stt_en_conformer_ctc_large.rmir /data/models/

Nvidia driver version: 510.54

I also attached the complete error message as a txt file.

Did you manage to resolve the issue? Is the A6000 compatible with Riva?

Hi @rvinobha!

Any update on this topic?

Hi @balintcenturio

Apologies for the delay,

Does this issue still happen with our latest version, 2.6.0?

Thanks