Hardware - GPU (T4)
Hardware - CPU
Operating System: Ubuntu 20.04
Riva Version: 2.7.0
I’m trying to run inference on a list of audio files. After processing several files, it fails with this error:
error: unable to run model inferencing: Stream has been closed.
error: unable to run model inferencing: Stream has been closed.
/opt/riva/bin/start-riva: line 55: 89 Killed ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.
Thanks
Hi @mehadi.hasan
Thanks for your interest in Riva
Apologies that you are facing this issue.
This error seems indicative of a CUDA out-of-memory issue.
Could you please monitor the nvidia-smi output, stop any other services using the GPU, and let us know whether that helps?
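For example, GPU memory can be logged over time with a small script like the one below (a minimal sketch; it assumes nvidia-smi is on the PATH and simply polls it once a minute):

import subprocess
import time

# Poll nvidia-smi once a minute and print used/total GPU memory in MiB,
# so any steady growth toward the 16 GB limit becomes visible.
while True:
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    print(time.strftime("%H:%M:%S"), out.strip())
    time.sleep(60)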
Thanks
Hi @rvinobha Thanks for your reply.
I could not find any CUDA out-of-memory issue. I logged the GPU memory usage, and it never takes more than 50% of the 16 GB.
But I found that the CPU RAM usage increases over time. Here is the code I’m using:
import logging
import wave
from pathlib import Path

import riva.client
import riva.client.proto.riva_asr_pb2 as rasr
import riva.client.proto.riva_asr_pb2_grpc as rasr_srv
import riva.client.proto.riva_audio_pb2 as ra

logging.getLogger().setLevel(logging.INFO)

# init riva recognition stub
riva_server = "0.0.0.0:50051"
auth = riva.client.Auth(uri=riva_server)
stub = rasr_srv.RivaSpeechRecognitionStub(auth.channel)

# init riva recognition configuration
config = rasr.RecognitionConfig(
    encoding=ra.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,
    language_code="<lang-code>",
    max_alternatives=1,
    enable_automatic_punctuation=False,
    audio_channel_count=1,
    enable_word_time_offsets=True,
)

def perform_infer(data_dir):
    # gather the audio files (stand-in for my utils.get_audio_files helper)
    audio_files = sorted(Path(data_dir).glob("*.wav"))
    for audio_file in audio_files:
        audio_file = str(audio_file)
        with wave.open(audio_file, "rb") as fp:
            wav_data = fp.readframes(-1)
            duration_minutes = (fp.getnframes() / fp.getframerate()) / 60
        logging.info(f"Taking inference from Riva - audio duration: {duration_minutes:.2f} minutes.")
        # perform inference and read the response
        request = rasr.RecognizeRequest(config=config, audio=wav_data)
        response = stub.Recognize(request)
        if len(response.results) > 0 and len(response.results[0].alternatives) > 0:
            outputs = response.results[0].alternatives[0]
            print(outputs)

if __name__ == "__main__":
    perform_infer("path/to/my/data_dir")
Is there anything I’ve messed up in the above code that could cause a RAM leak?
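To check whether the leak is in the client process at all, one option would be to log its resident memory after each request; if that stays flat while system RAM grows, the leak is elsewhere. A rough sketch (psutil is an extra dependency, and log_client_rss is a hypothetical helper name):

import logging
import psutil

# Hypothetical helper: log this process's resident set size (RSS).
# Called once per loop iteration, a client-side leak would show up
# as steady growth of this number across requests.
def log_client_rss():
    rss_mb = psutil.Process().memory_info().rss / (1024 * 1024)
    logging.info(f"client RSS: {rss_mb:.1f} MiB")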
Thanks
@rvinobha Here is the RAM monitoring screenshot
Initially it was using 23% of RAM; after running for ~13 hours it was taking 90% of my RAM.

Thanks
Hi @mehadi.hasan
Thanks for your report.
What is the RAM capacity installed in your system?
Thanks
@rvinobha The system has 20 GB of RAM.
@rvinobha I was thinking the RAM was being taken by the client script I gave above, but it seems the memory is held by the Riva container: when I kill the client script, the memory usage stays the same.
Maybe there is a memory leak issue in the Riva container.
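Assuming the standard Docker deployment, the container's footprint can be tracked directly with docker stats; a sketch (the container name riva-speech is a guess, check docker ps for the real one):

import subprocess
import time

# Poll `docker stats` for the Riva container and print its memory usage.
# "riva-speech" is a guessed container name; adjust to match `docker ps`.
while True:
    out = subprocess.check_output(
        ["docker", "stats", "--no-stream",
         "--format", "{{.Name}}: {{.MemUsage}}", "riva-speech"],
        text=True,
    )
    print(time.strftime("%H:%M:%S"), out.strip())
    time.sleep(60)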
Thanks
Hi @mehadi.hasan
Did you ever find out anything more about this?