One of the processes has exited unexpectedly. Stopping container | CPU memory Leak

Hardware - GPU (T4)
Hardware - CPU
Operating System: Ubuntu 20.04
Riva Version 2.7.0

I’m trying to run inference on a list of audio files. After inferring several files, it fails with this error:

error: unable to run model inferencing: Stream has been closed.
error: unable to run model inferencing: Stream has been closed.
/opt/riva/bin/start-riva: line 55:    89 Killed ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.

Thanks

Hi @mehadi.hasan

Thanks for your interest in Riva

Apologies that you are facing this issue.

This error seems to be indicative of a CUDA out-of-memory issue.
Could you please monitor the nvidia-smi output, stop any other services using the GPU, and let us know whether that helps?
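For reference, here is a minimal sketch (the function names are my own, not part of Riva or the NVIDIA tooling) that parses the CSV output of `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader`; the hard-coded sample line below stands in for a real query:

```python
import subprocess

def parse_gpu_memory(csv_line: str) -> tuple[int, int]:
    """Parse one CSV line like '8192 MiB, 16384 MiB' into (used, total) in MiB."""
    used, total = (field.strip() for field in csv_line.split(","))
    # Each field looks like "8192 MiB"; keep only the number.
    return int(used.split()[0]), int(total.split()[0])

def query_gpu_memory() -> tuple[int, int]:
    """Run nvidia-smi once (requires the NVIDIA driver to be installed)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out.splitlines()[0])

# Example with a captured line (a real run would call query_gpu_memory()):
used_mib, total_mib = parse_gpu_memory("8192 MiB, 16384 MiB")
print(f"GPU memory: {used_mib}/{total_mib} MiB")
```

Logging this pair every few seconds while your batch runs will show whether GPU memory actually grows toward the 16 GB limit of the T4.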

Thanks

Hi @rvinobha Thanks for your reply.

I could not find any CUDA out-of-memory issue. I logged the GPU memory usage, and it never takes more than 50% of the 16 GB.

However, I found that the CPU RAM usage increases over time. Here is the code that I’m using:

import logging
import os
import wave
import utils  # local helper module that lists the audio files in a directory

import riva.client
import riva.client.proto.riva_asr_pb2 as rasr
import riva.client.proto.riva_asr_pb2_grpc as rasr_srv
import riva.client.proto.riva_audio_pb2 as ra

logging.getLogger().setLevel(logging.INFO)

# init riva recognition stub
riva_server = "0.0.0.0:50051"
auth = riva.client.Auth(uri=riva_server)
stub = rasr_srv.RivaSpeechRecognitionStub(auth.channel)

# init riva recognition configuration
config = rasr.RecognitionConfig(
    encoding=ra.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,
    language_code="<lang-code>",
    max_alternatives=1,
    enable_automatic_punctuation=False,
    audio_channel_count=1,
    enable_word_time_offsets=True,
)

def perform_infer(data_dir):
    audio_files = utils.get_audio_files(data_dir)
    for i, audio_file in enumerate(audio_files):
        audio_file = str(audio_file)
        with wave.open(audio_file, "rb") as fp:
            wav_data = fp.readframes(-1)
            duration_minutes = (fp.getnframes() / fp.getframerate()) / 60
        logging.info(f"Taking inference from Riva - audio duration: {duration_minutes} minutes.")
        request = rasr.RecognizeRequest(config=config, audio=wav_data)

        # perform inference and print the top alternative
        response = stub.Recognize(request)
        if len(response.results) > 0 and len(response.results[0].alternatives) > 0:
            outputs = response.results[0].alternatives[0]
            print(outputs)

if __name__ == "__main__":
    perform_infer("path/to/my/data_dir")
  

Is there anything that I messed up in the above code? Could it cause a RAM leak?
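One way to rule the client script in or out is to log its own resident set size between requests. A minimal Unix-only sketch using only the standard library (the loop body is a placeholder, not the actual Riva `Recognize` call):

```python
import resource
import sys

def client_rss_mib() -> float:
    """Peak resident set size of this process, in MiB.

    ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss //= 1024
    return rss / 1024

baseline = client_rss_mib()
for i in range(100):
    _ = [0] * 10_000  # placeholder for one Recognize request
    if i % 10 == 0:
        print(f"request {i}: +{client_rss_mib() - baseline:.1f} MiB over baseline")
```

If this number stays flat while the system RAM climbs, the growth is happening in another process rather than in the client loop.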

Thanks

@rvinobha Here is the RAM monitoring screenshot

Initially it was using 23% of RAM; after running for around ~13 hours it was taking 90% of my RAM.
Screenshot from 2022-12-17 08-55-07

Thanks

Hi @mehadi.hasan

Thanks for your report,

What is the RAM capacity installed in your system?

Thanks

@rvinobha 20 GB RAM capacity

@rvinobha I was thinking the RAM was being taken by the client script I posted above, but it seems the memory is held by the Riva container: when I kill the client script, the memory usage stays the same.

Maybe there is a memory leak issue in the Riva container.
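To confirm which process actually holds the memory, the per-process RSS can be read from `/proc/<pid>/status` on Linux (for example, the `tritonserver` pid inside the container). A small sketch with my own helper names, using a captured line in place of a live read:

```python
def vmrss_kib(status_text: str) -> int:
    """Extract VmRSS (in KiB) from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("VmRSS not found")

def process_rss_kib(pid: int) -> int:
    """Resident set size of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fp:
        return vmrss_kib(fp.read())

# Example with a captured snippet (a real check would call
# process_rss_kib(<tritonserver pid>) repeatedly while inference runs):
sample = "Name:\ttritonserver\nVmRSS:\t  18874368 kB\n"
print(vmrss_kib(sample), "KiB")
```

Sampling this value for the tritonserver process over a few hours would show directly whether the server, rather than the client, is the one growing.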

Thanks