Hardware - GPU (T4)
Hardware - CPU
Operating System: Ubuntu 20.04
Riva Version: 2.7.0
I’m trying to run inference on a list of audio files. After processing several files, it fails with this error:
error: unable to run model inferencing: Stream has been closed.
error: unable to run model inferencing: Stream has been closed.
/opt/riva/bin/start-riva: line 55: 89 Killed ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.
Thanks
Hi @mehadi.hasan
Thanks for your interest in Riva
Apologies that you are facing this issue.
This error seems indicative of a CUDA out-of-memory issue.
Could you please monitor the nvidia-smi output, stop any other services using the GPU, and let us know whether that helps?
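For example, GPU memory can be logged over time with a small script like the one below (a minimal sketch; it assumes nvidia-smi is on the PATH and simply polls it once a minute):

import subprocess
import time

# Poll nvidia-smi once a minute and print used/total GPU memory in MiB,
# so any steady growth toward the 16 GB limit becomes visible.
while True:
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    print(time.strftime("%H:%M:%S"), out.strip())
    time.sleep(60)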
Thanks
Hi @rvinobha Thanks for your reply.
I could not find any CUDA out-of-memory issue. I logged the GPU memory usage, and it never takes more than 50% of the 16 GB.
But I found that the CPU RAM usage increases over time. Here is the code I’m using:
import logging
import wave
from pathlib import Path

import riva.client
import riva.client.proto.riva_asr_pb2 as rasr
import riva.client.proto.riva_asr_pb2_grpc as rasr_srv
import riva.client.proto.riva_audio_pb2 as ra

logging.getLogger().setLevel(logging.INFO)

# init riva recognition stub
riva_server = "0.0.0.0:50051"
auth = riva.client.Auth(uri=riva_server)
stub = rasr_srv.RivaSpeechRecognitionStub(auth.channel)

# init riva recognition configuration
config = rasr.RecognitionConfig(
    encoding=ra.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,
    language_code="<lang-code>",
    max_alternatives=1,
    enable_automatic_punctuation=False,
    audio_channel_count=1,
    enable_word_time_offsets=True,
)

def perform_infer(data_dir):
    # gather the audio files (stand-in for my utils.get_audio_files helper)
    audio_files = sorted(Path(data_dir).glob("*.wav"))
    for audio_file in audio_files:
        audio_file = str(audio_file)
        with wave.open(audio_file, "rb") as fp:
            wav_data = fp.readframes(-1)
            duration_minutes = (fp.getnframes() / fp.getframerate()) / 60
        logging.info(f"Taking inference from Riva - audio duration: {duration_minutes:.2f} minutes.")
        # perform inference and read the response
        request = rasr.RecognizeRequest(config=config, audio=wav_data)
        response = stub.Recognize(request)
        if len(response.results) > 0 and len(response.results[0].alternatives) > 0:
            outputs = response.results[0].alternatives[0]
            print(outputs)

if __name__ == "__main__":
    perform_infer("path/to/my/data_dir")
Is there anything I’ve messed up in the above code that could cause a RAM leak?
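To check whether the leak is in the client process at all, one option would be to log its resident memory after each request; if that stays flat while system RAM grows, the leak is elsewhere. A rough sketch (psutil is an extra dependency, and log_client_rss is a hypothetical helper name):

import logging
import psutil

# Hypothetical helper: log this process's resident set size (RSS).
# Called once per loop iteration, a client-side leak would show up
# as steady growth of this number across requests.
def log_client_rss():
    rss_mb = psutil.Process().memory_info().rss / (1024 * 1024)
    logging.info(f"client RSS: {rss_mb:.1f} MiB")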
Thanks
@rvinobha Here is the RAM monitoring screenshot
Initially it was using 23% of RAM; after running for ~13 hours it was taking 90% of my RAM.

Thanks
Hi @mehadi.hasan
Thanks for your report.
What is the RAM capacity installed in your system?
Thanks
@rvinobha The system has 20 GB of RAM.
@rvinobha I was thinking the RAM was being taken by the client script I gave above, but it seems the memory is held by the Riva container: when I kill the client script, the memory usage stays the same.
Maybe there is a memory leak issue in the Riva container.
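Assuming the standard Docker deployment, the container's footprint can be tracked directly with docker stats; a sketch (the container name riva-speech is a guess, check docker ps for the real one):

import subprocess
import time

# Poll `docker stats` for the Riva container and print its memory usage.
# "riva-speech" is a guessed container name; adjust to match `docker ps`.
while True:
    out = subprocess.check_output(
        ["docker", "stats", "--no-stream",
         "--format", "{{.Name}}: {{.MemUsage}}", "riva-speech"],
        text=True,
    )
    print(time.strftime("%H:%M:%S"), out.strip())
    time.sleep(60)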
Thanks
Hi @mehadi.hasan
Did you ever find out anything more about this?