Segfault and GPU memory overflow after activating all languages in RIVA for ASR

Hi,

With only one language activated, everything was working fine. I then re-initialized the Riva Docker deployment with riva_init.sh after activating all the languages in config.sh (I only need ASR):

service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false
language_code=(en-US en-GB de-DE es-US ru-RU zh-CN hi-IN fr-FR ko-KR pt-BR)

The init finishes fine after a few hours of processing, but when I start the server with riva_start.sh, I get this error after about one minute:

cudaError_t 700 : "an illegal memory access was encountered" returned from 'cudaMalloc(&data, bytes)' in fileriva/utils/matrix/cu_vector.cc line 179'
cudaError_t 700 : "an illegal memory access was encountered" returned from 'cudaMalloc(&data, row_bytes * rows)' in fileexternal/cu-feat-extr/src/cudamatrix/cu-matrix.cu line 204'
/opt/riva/bin/start-riva: line 4:   103 Segmentation fault      (core dumped) ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000

It had time to load some of the models, but there was not enough memory to load all of them. I monitored the GPU memory with nvidia-smi and can confirm that memory usage climbs close to the 16 GB limit of the card before the crash.
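
For reference, this is roughly the loop I used to watch the usage (a simple nvidia-smi poll; the one-second interval is just what I happened to use):

# poll GPU memory usage once per second while riva_start.sh loads the models
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1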

Is there a way to limit the memory used by tritonserver? It looks like that should be the job of --cuda-memory-pool-byte-size, but it doesn't seem to have any effect.

Is there a way to load only a subset of the models? I don't need all the languages on every Docker container. I only see a parameter for the model repos, and I don't know which models are needed for each language.
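
To make the second question concrete, what I would like is for each container to deploy only the languages it serves, something like the sketch below in config.sh (the two-language list is just an example; I have not verified that this works around the memory issue):

# config.sh: hypothetical per-container deployment with only the needed languages
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false
language_code=(en-US fr-FR)   # example subset, one container per language group

followed by re-running riva_init.sh and riva_start.sh for that container.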

Best regards,

Hardware - GPU A2
Hardware - CPU Intel(R) Xeon(R) E-2388G CPU @ 3.20GHz
Operating System Ubuntu 22.04
Riva Version 2.7
TLT Version (if relevant)
How to reproduce the issue? Described above

Hi @paentraygues

Thanks for your interest in Riva

I am not sure about limiting the memory; I will need to check with my team and let you know.

Could you kindly share the config.sh you used? That will help answer the second part of your query.

Thanks

Hi @rvinobha and thank you for your answer,

I only modified these 4 lines of the original config.sh file:

service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false
language_code=(en-US en-GB de-DE es-US ru-RU zh-CN hi-IN fr-FR ko-KR pt-BR)

Here is the full config.sh, shared on Pastebin (link title: "# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. ## NVIDIA COR - Pastebin.com")

Thanks

Hello,

I have a similar problem:

/opt/riva/bin/start-riva: line 4: 110 Segmentation fault ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000

Triton server died before reaching ready state. Terminating Riva startup.

I am using WSL 2 with Windows NVIDIA driver 471.68.

Attached are the Docker log file and config file.

seg_log.txt.txt (150.2 KB)

config.sh (9.8 KB)

It looks like the problem is solved by reinstalling the latest device driver.
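
For anyone checking the same thing under WSL 2, a quick way to confirm which driver version the guest actually sees (assuming nvidia-smi is available inside the distribution):

# print the driver version exposed to the WSL 2 environment
nvidia-smi --query-gpu=driver_version --format=csv,noheader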