Riva ASR quickstart throws cudaError: "an illegal memory access was encountered"

Environment:

  • Hardware - GPU A100
  • Hardware - CPU Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz
  • Operating System - Ubuntu 20.04.5 LTS
  • Riva Version - 2.10.0
  • NVIDIA driver version: 525.60.13
  • CUDA version: 12.0
  • Docker version: 23.0.1

Steps to reproduce:

bash riva_init.sh
bash riva_start.sh

Then send any streaming ASR request from a client (nvidia-riva-client==2.10.0).

Results:
The Riva server cannot process any ASR requests and throws many errors like the following:

cudaError_t 700 : "an illegal memory access was encountered" returned from 'cudaMemset2DAsync( data_.get(), stride_ * sizeof(Real), 0, num_cols_ * sizeof(Real), num_rows_, cudaStreamPerThread)' in file riva/utils/matrix/cu_matrix.cc line 122'
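
Because the log contains many such lines, it can help to summarize them by error code and source location. The following is a minimal sketch, not part of Riva: the function name summarize_cuda_errors is hypothetical, and the regular expression is an assumption derived only from the single error line quoted above, so other log lines may need a different pattern.

```python
import re
from collections import Counter

# Pattern derived from the single error line quoted above (an assumption:
# other Riva log lines may be formatted differently).
CUDA_ERROR_RE = re.compile(
    r'cudaError_t\s+(?P<code>\d+)\s*:\s*"(?P<message>[^"]*)"'
    r'.*?in file\s*(?P<file>\S+)\s+line\s+(?P<line>\d+)'
)


def summarize_cuda_errors(log_text):
    """Count CUDA errors per (code, message, file, line) in a log's text."""
    counts = Counter()
    for log_line in log_text.splitlines():
        match = CUDA_ERROR_RE.search(log_line)
        if match:
            key = (
                int(match.group("code")),
                match.group("message"),
                match.group("file"),
                int(match.group("line")),
            )
            counts[key] += 1
    return counts
```

Passing the text of riva_speech.log to this function returns a Counter whose keys show whether all failures come from the same call site or from several.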

I also tried setting gpus_to_use="all", but nothing changed.
Note that this problem occurs with Riva versions starting from 2.8.0, while 2.7.0 works without any issues with the same configuration.

I hope these files help with the investigation.

config.sh (12.7 KB)
nvidia-smi.txt (2.3 KB)
riva_init.log (150.2 KB)
riva_speech.log (2.0 MB)
riva_start.log (265 Bytes)

Any help on this would be much appreciated.
Thanks in advance!

Hi @vbilous

Thanks for your interest in Riva.

Apologies that you are facing this issue.
Thanks for sharing the logs. I will check with the Riva team and provide updates.

Thanks

We also tried running Riva on a different VM, but the result is still the same.

Environment:

  • Hardware - GPU A40
  • Hardware - CPU Intel(R) Xeon(R) Gold 6354 CPU @ 3.00GHz
  • Operating System - Ubuntu 20.04.5 LTS
  • Riva Version - 2.11.0
  • NVIDIA driver version: 525.105.17
  • CUDA version: 12.0
  • Docker version: 23.0.1

config.sh (13.3 KB)
nvidia-smi.log (1.5 KB)
riva_init.log (171.5 KB)
riva_speech.log (365.4 KB)
riva_start.log (333 Bytes)

Thanks in advance

Hi @vbilous

Sincere apologies for the delay.

I have not yet completed the triage or found a solution,

but I have one thing for you to test.

The CUDA version on your end is currently 12.0.

Could you kindly downgrade to CUDA 11.8 and check again?
Riva works with CUDA 11.8.

Please try this and let us know while I gather more information.

Thanks
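
To confirm which driver and CUDA versions a host reports before and after such a change, the header of the nvidia-smi output can be parsed. This is a minimal sketch, assuming the usual header text of the form "Driver Version: X   CUDA Version: Y"; the helper name parse_nvidia_smi_versions is hypothetical. Keep in mind that the CUDA version printed by nvidia-smi is the highest version the installed driver supports, which is not necessarily the toolkit version used inside a container.

```python
import re


def parse_nvidia_smi_versions(smi_text):
    """Return (driver_version, cuda_version) from nvidia-smi header text.

    Either element is None if the corresponding field is not found.
    The header format is an assumption based on typical nvidia-smi output.
    """
    driver = re.search(r"Driver Version:\s*([\d.]+)", smi_text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", smi_text)
    return (
        driver.group(1) if driver else None,
        cuda.group(1) if cuda else None,
    )
```

The input can be the contents of a saved file such as nvidia-smi.txt, or the captured standard output of running nvidia-smi on the host.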