Parakeet_1.1b + streaming diarisation VRAM leakage

serhii-artemuk · April 4, 2025, 5:19pm

Hardware - GPU (A10G 24 GB)
Hardware - CPU (32 vCPUs)
Operating System - Linux/Ubuntu
Riva Version - 2.19

Hi. I’m currently testing streaming diarization with the Riva Quick Start and its default configuration. I’m running performance tests, loading Riva with different numbers of parallel requests. I noticed that when the Riva server is up and idle, it takes up 11.6 GB of VRAM. If I increase the number of parallel requests to 85, VRAM usage rises to 13.35 GB. I hadn’t noticed such a leak before; previously Riva did not exceed its reserved memory limit. Can anyone tell me whether this is expected behaviour or whether it can be avoided somehow?

To deploy the service, I use the original Riva Quick Start scripts: Riva Skills Quick Start | NVIDIA NGC

In config.sh I changed the following lines:

asr_acoustic_model=("parakeet_1.1b")
asr_language_code=("en-US")
asr_accessory_model=("diarizer")
use_asr_greedy_decoder=false

I run the tests according to this guide: Performance — NVIDIA Riva

sophwats · April 10, 2025, 11:55am

Hi @serhii-artemuk, memory usage is expected to increase with a higher number of parallel requests.

Best,

Sophie
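
As a rough check on whether this growth is a leak or per-request allocation, the numbers from the question can be worked through: 13.35 GB - 11.6 GB = 1.75 GB across 85 parallel requests, which is about 20.6 MB per request. The Python sketch below shows one way to test this, assuming VRAM is sampled with nvidia-smi's memory.used query. The helper names (growth_per_stream, looks_like_leak) and the 100 MiB tolerance are hypothetical choices for illustration, not part of Riva.

```python
import subprocess


def parse_memory_used_mib(smi_output: str) -> list[int]:
    """Parse nvidia-smi CSV output (noheader, nounits): one MiB value per GPU."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def read_memory_used_mib() -> list[int]:
    """Sample current VRAM usage per GPU. Requires nvidia-smi on PATH."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used_mib(result.stdout)


def growth_per_stream(idle: float, loaded: float, streams: int) -> float:
    """Average extra memory per parallel request, in the same unit as the inputs."""
    return (loaded - idle) / streams


def looks_like_leak(peaks_mib: list[int], tolerance_mib: int = 100) -> bool:
    """Hypothetical heuristic: True if the peak keeps rising across repeated
    runs at the SAME concurrency. The tolerance is an assumed value."""
    return peaks_mib[-1] - peaks_mib[0] > tolerance_mib


if __name__ == "__main__":
    # Figures from the question: 11.6 GB idle, 13.35 GB with 85 requests.
    per_stream_mb = growth_per_stream(11.6, 13.35, 85) * 1000
    print(f"approx. {per_stream_mb:.1f} MB per parallel request")
```

To use it, record the peak from read_memory_used_mib() during several consecutive runs at the same concurrency and pass the peaks to looks_like_leak(). A peak that stays flat from run to run is consistent with memory simply scaling with the number of parallel requests; a peak that keeps climbing run after run would be worth reporting with the measurements attached.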
