Riva and Triton thread leak and consequent memory leak

Hardware - GPU:
T4

Hardware - CPU:
x86_64
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
4 CPU

Operating System:
Ubuntu
Linux 5.10.213-201.855.amzn2.x86_64

Riva Version:

$ ./bin/riva_server --version
riva_server version 2.15.0

Triton Version:
server_version | 2.40.0

We have Riva deployed in EKS, and we observe high memory usage during and after our load test…

the count of threads before the test run:

root@riva-api-en-primary-5d94766b8f-tczqs:/opt/riva# ps auxwwH | grep riva_server | wc -l
25
root@riva-api-en-primary-5d94766b8f-tczqs:/opt/riva# ps auxwwH | grep tritonserver | wc -l
70
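A note on the counting method: the `ps auxwwH | grep <name> | wc -l` pipeline can also match the `grep` process itself, so the totals may be off by one. As a cross-check, here is a minimal sketch that reads the thread count the kernel reports in `/proc/<pid>/status`. The helper names `thread_count` and `threads_by_name` are hypothetical, and the sketch assumes a Linux `/proc` filesystem and that `pgrep` is available in the container.

```shell
# Print the kernel-reported thread count for one PID,
# taken from the "Threads:" line of /proc/<pid>/status.
thread_count() {
    awk '/^Threads:/ {print $2}' "/proc/$1/status"
}

# Print "pid threads" for every process whose name exactly matches $1.
# pgrep -x matches the process name exactly, so grep is never counted.
threads_by_name() {
    for pid in $(pgrep -x "$1"); do
        printf '%s %s\n' "$pid" "$(thread_count "$pid")"
    done
}
```

For example, `threads_by_name riva_server` and `threads_by_name tritonserver` would print one line per matching process.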

mem:

    PID     TID  MINFLT  MAJFLT VSTEXT  VSLIBS   VDATA VSTACK  LOCKSZ   VSIZE  RSIZE   PSIZE   VGROW  RGROW  SWAPSZ  RUID     EUID       MEM  CMD        1/1
     21       -  334320     180   8.8M    1.8G    1.9G 364.0K    0.0K   41.5G   1.4G      0B   41.5G   1.4G      0B  root     root        9%  tritonserver
     95       -    4036       0  11.6M   12.7M  349.5M 172.0K    0.0K  388.2M  28.0M      0B  388.2M  28.0M      0B  root     root        0%  riva_server

the count of threads after the test run:

root@riva-api-en-primary-6749f8678f-ng6zs:/opt/riva/bin# ps auxwwH | grep riva_server |  wc -l
224
root@riva-api-en-primary-6749f8678f-ng6zs:/opt/riva/bin# ps auxwwH | grep tritonserver |  wc -l
79

mem:

    PID       TID   MINFLT   MAJFLT   VSTEXT    VSLIBS    VDATA   VSTACK   LOCKSZ     VSIZE    RSIZE    PSIZE    VGROW    RGROW    SWAPSZ   RUID       EUID        MEM    CMD        1/1
     95         -        0        0    11.6M     12.7M     5.5G   172.0K     0.0K      5.5G     3.3G       0B       0B       0B        0B   root       root        21%    riva_server
     21         -        0        0     8.8M      1.8G     2.3G   364.0K     0.0K     41.9G     2.0G       0B       0B       0B        0B   root       root        13%    tritonserver

the load test is:
# riva_streaming_asr_client --chunk_duration_ms=20 --simulate_realtime=true --automatic_punctuation=true --num_parallel_requests=160 --word_time_offsets=true --print_transcripts=false --interim_results=false --simulate_realtime=true --num_iterations=340 --audio_file=/sample.wav --output_filename=/tmp/output.json --riva_uri=10.18.12.209:50051

the test log:

I0424 16:06:28.589784  5120 riva_streaming_asr_client.cc:150] Using Insecure Server Credentials
Loading eval dataset...
filename: /sample.wav
Done loading 1 files
Not printing latency statistics because the client is run without the --simulate_realtime option and/or the number of requests sent is not equal to number of requests received. To get latency statistics, run with --simulate_realtime and set the --chunk_duration_ms to be the same as the server chunk duration
Run time: 516.921 sec.
Total audio processed: 58565.3 sec.
Throughput: 113.297 RTFX

after second test run:

# riva_streaming_asr_client --chunk_duration_ms=20 --simulate_realtime=true --automatic_punctuation=true --num_parallel_requests=160 --word_time_offsets=true --print_transcripts=false --interim_results=false --simulate_realtime=true --num_iterations=340 --audio_file=/sample.wav    --output_filename=/tmp/output.json --riva_uri=10.18.12.209:50051
I0424 16:20:04.118997  5767 riva_streaming_asr_client.cc:150] Using Insecure Server Credentials
Loading eval dataset...
filename: /sample.wav
Done loading 1 files
Not printing latency statistics because the client is run without the --simulate_realtime option and/or the number of requests sent is not equal to number of requests received. To get latency statistics, run with --simulate_realtime and set the --chunk_duration_ms to be the same as the server chunk duration
Run time: 516.895 sec.
Total audio processed: 58565.3 sec.
Throughput: 113.302 RTFX

the count of threads:

root@riva-api-en-primary-6749f8678f-ng6zs:/opt/riva/bin# ps auxwwH | grep riva_server |  wc -l
286
root@riva-api-en-primary-6749f8678f-ng6zs:/opt/riva/bin# ps auxwwH | grep tritonserver |  wc -l
79

mem:

    PID       TID   MINFLT   MAJFLT   VSTEXT    VSLIBS    VDATA   VSTACK   LOCKSZ     VSIZE    RSIZE    PSIZE    VGROW    RGROW    SWAPSZ   RUID       EUID        MEM    CMD        1/1
     95         -   1703e3        0    11.6M     12.7M     9.3G   172.0K     0.0K      9.3G     6.4G       0B     9.3G     6.4G        0B   root       root        42%    riva_server
     21         -   617533      188     8.8M      1.8G     2.8G   364.0K     0.0K     42.4G     2.4G       0B    42.4G     2.4G        0B   root       root        16%    tritonserver
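To show whether the growth is gradual or happens in steps, the thread count and resident memory could be sampled at a fixed interval while the load test runs. The following is a minimal sketch, not part of the original measurements: the function names `sample_proc` and `watch_proc` are hypothetical, and it assumes a Linux `/proc` filesystem.

```shell
# Print one line "epoch_seconds threads rss_kb" for the given PID,
# using the Threads and VmRSS fields of /proc/<pid>/status.
sample_proc() {
    awk -v ts="$(date +%s)" '
        /^Threads:/ {threads = $2}
        /^VmRSS:/   {rss = $2}
        END         {print ts, threads, rss}
    ' "/proc/$1/status"
}

# Sample PID $1 every $2 seconds, $3 times in total.
watch_proc() {
    i=0
    while [ "$i" -lt "$3" ]; do
        sample_proc "$1"
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        if [ "$i" -lt "$3" ]; then
            sleep "$2"
        fi
    done
}
```

For example, `watch_proc 95 10 60 > /tmp/riva_server_samples.log` would record ten minutes of samples for PID 95 (riva_server in the tables above), which can then be compared against the start and end times of each test run.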

Hi @alexander.petrovsky

Thanks for your interest in Riva.

Please report this to the Triton team by filing an issue at the link below.

Thanks