Hardware - GPU A10
Question: How do I run only one model using the Riva quickstart? I want to run the en-GB model without the en-US model.
Details:
I can successfully start Riva ASR with the en-US model. I have been trying for a couple of days to start Riva with the en-GB model and have not been able to accomplish this.
I am using the riva quickstart repo 2.12.1 (I have also tried 2.11.0).
Steps I am taking:
- download the quickstart repo
- edit config.sh to include:
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false
service_enabled_nmt=false
language_code=("en-GB")
I do not edit anything else.
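As a sanity check that my edit is at least syntactically what config.sh expects (language_code is a bash array there), here is a standalone sketch of the same mechanism, separate from the real config.sh:

```shell
# Standalone sketch of the bash array syntax that config.sh uses for
# language selection; not the real config file, just the same mechanism.
language_code=("en-GB")
echo "count=${#language_code[@]} first=${language_code[0]}"
# prints: count=1 first=en-GB
```

So a one-element array selecting only en-GB should be valid syntax; the question is why the en-US models still get deployed.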
bash riva_init.sh
→ This completes successfully.
bash riva_start.sh
→ This fails.
The command runs for many minutes. Here are the interesting parts from the docker logs riva-speech output:
I0802 13:31:12.401523 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-GB-asr-offline-ctc-decoder-cpu-streaming-offline' version 1
I0802 13:31:12.502004 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-GB-asr-streaming-endpointing-streaming' version 1
I0802 13:31:13.412781 102 endpointing_library.cc:20] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-endpointing-streaming-offline (version 1)
I0802 13:31:13.516893 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-GB-asr-offline-endpointing-streaming-offline' version 1
I0802 13:31:20.525963 102 pipeline_library.cc:28] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-GB_0 (device 0)
I0802 13:31:20.526448 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I0802 13:31:20.540559 102 pipeline_library.cc:28] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I0802 13:31:20.540669 102 model_lifecycle.cc:693] successfully loaded 'riva-punctuation-en-GB' version 1
I0802 13:31:20.554729 102 feature-extractor.cc:417] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming_0 (device 0)
I0802 13:31:20.554864 102 model_lifecycle.cc:693] successfully loaded 'riva-punctuation-en-US' version 1
I0802 13:31:20.561134 102 feature-extractor.cc:417] TRITONBACKEND_ModelInstanceInitialize: conformer-en-GB-asr-offline-feature-extractor-streaming-offline_0 (device 0)
I0802 13:31:20.562079 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-US-asr-streaming-feature-extractor-streaming' version 1
I0802 13:31:20.592794 102 feature-extractor.cc:417] TRITONBACKEND_ModelInstanceInitialize: conformer-en-GB-asr-streaming-feature-extractor-streaming_0 (device 0)
I0802 13:31:20.593456 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-GB-asr-offline-feature-extractor-streaming-offline' version 1
I0802 13:31:20.599374 102 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-GB-asr-offline-am-streaming-offline_0 (GPU device 0)
I0802 13:31:20.599982 102 model_lifecycle.cc:693] successfully loaded 'conformer-en-GB-asr-streaming-feature-extractor-streaming' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I0802 13:31:21.243123 102 logging.cc:49] Loaded engine size: 353 MiB
I0802 13:31:21.507173 102 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5000, GPU 18976 (MiB)
I0802 13:31:21.508489 102 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +331, now: CPU 0, GPU 331 (MiB)
I0802 13:31:21.541547 102 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 4293, GPU 18976 (MiB)
> Riva waiting for Triton server to load all models...retrying in 1 second
I0802 13:31:22.909362 102 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +333, now: CPU 0, GPU 664 (MiB)
W0802 13:31:22.909394 102 logging.cc:46] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
> Riva waiting for Triton server to load all models...retrying in 1 second
I0802 13:31:23.103711 102 tensorrt.cc:1547] Created instance riva-trt-conformer-en-GB-asr-offline-am-streaming-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0802 13:31:25.986923 102 tensorrt.cc:1547] Created instance riva-trt-conformer-en-GB-asr-streaming-am-streaming_0 on GPU 0 with stream priority 0 and optimization profile default[0];
E0802 13:31:27.654518 102 logging.cc:43] 1: [graphContext.h::MyelinGraphContext::27] Error Code 1: Myelin (CUDA error 2 failed to create CUDA stream )
E0802 13:31:27.654751 102 model_lifecycle.cc:596] failed to load 'riva-trt-conformer-en-US-asr-offline-am-streaming-offline' version 1: Internal: unable to create TensorRT context
W0802 13:31:29.059793 102 logging.cc:46] Requested amount of GPU memory (707424256 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
Ah, now that I have properly read the log, I can see I am running out of memory.
I see it is loading the en-US model as well; how can I prevent that?
The en-US model by itself fits in my 24GB of memory, but I only have 16GB free because I need to run other processes on the GPU in parallel.
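In case others hit this: my current suspicion is that the en-US engines are leftovers in the model repository from my earlier en-US run, since riva_init.sh adds models for the configured languages but, as far as I can tell, does not remove previously deployed ones. The sequence I plan to try next (riva_clean.sh ships with the quickstart; read its prompts carefully, as it deletes the downloaded models and the model repo volume):

```shell
cd riva_quickstart_v2.12.1

# Wipe the old deployment, including the en-US engines left over
# from the earlier run (riva_clean.sh asks for confirmation first).
bash riva_clean.sh

# Re-deploy with only the en-GB settings in config.sh,
# then start the server again.
bash riva_init.sh
bash riva_start.sh
```

Separately, the warning in the log about CUDA lazy loading (CUDA_MODULE_LOADING) might reduce device memory usage somewhat, but it would not stop the en-US models from being loaded in the first place.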