RIVA ASR not working in ESXi8 environment

mklee1 · April 21, 2023, 2:02am

Please provide the following information when requesting support.

Hardware - GPU (A100/A30/T4/V100) - grid_a100-20c
Hardware - CPU - AMD EPYC 7352 24-Core Processor
Operating System - VMware ESXi 8.0.0, 21203435 - Virtual Machine (Ubuntu 20.04)
Riva Version - riva_quickstart:2.10.0
TLT Version (if relevant)
How to reproduce the issue ? (This is for errors. Please share the command and the detailed log here)

Hi all.

I’m trying to test ASR, TTS, and other features by following the NVIDIA Riva Quick Start guide.

I downloaded the data center version of riva_quickstart:2.10.0 and, as the guide says, set only the ASR and NMT services to true in config.sh.

For NMT I enabled only the German translation model; everything else is left at the defaults.

Running riva_start.sh works fine.

Then I wrote and ran the guide’s ASR test code, but I get the following error.

------------------------------------ Error ----------------------------------------------

name: _InactiveRpcError

message:
<_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "in ensemble 'conformer-en-US-asr-offline', audio_signal: failed to perform CUDA copy: an illegal memory access was encountered"
    debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"in ensemble \'conformer-en-US-asr-offline\', audio_signal: failed to perform CUDA copy: an illegal memory access was encountered", grpc_status:2, created_time:"2023-04-21T01:03:01.91677218+00:00"}"
>

stack:
---------------------------------------------------------------------------
_InactiveRpcError Traceback (most recent call last)
Cell In[13], line 1
----> 1 response = riva_asr.offline_recognize(content, config)
      2 asr_best_transcript = response.results[0].alternatives[0].transcript
      3 print("ASR Transcript:", asr_best_transcript)

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\riva\client\asr.py:362, in ASRService.offline_recognize(self, audio_bytes, config, future)
    360 request = rasr.RecognizeRequest(config=config, audio=audio_bytes)
    361 func = self.stub.Recognize.future if future else self.stub.Recognize
--> 362 return func(request, metadata=self.auth.get_auth_metadata())

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\grpc\_channel.py:1030, in _UnaryUnaryMultiCallable.__call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
   1021 def __call__(self,
   1022     request: Any,
   1023     timeout: Optional[float] = None,
   (…)
   1026     wait_for_ready: Optional[bool] = None,
   1027     compression: Optional[grpc.Compression] = None) -> Any:
   1028     state, call, = self._blocking(request, timeout, metadata, credentials,
   1029         wait_for_ready, compression)
-> 1030     return _end_unary_response_blocking(state, call, False, None)

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\grpc\_channel.py:910, in _end_unary_response_blocking(state, call, with_call, deadline)
    908     return state.response
    909 else:
--> 910     raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "in ensemble 'conformer-en-US-asr-offline', audio_signal: failed to perform CUDA copy: an illegal memory access was encountered"
    debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"in ensemble \'conformer-en-US-asr-offline\', audio_signal: failed to perform CUDA copy: an illegal memory access was encountered", grpc_status:2, created_time:"2023-04-21T01:03:01.91677218+00:00"}"
>
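For context, the call that fails in the traceback (riva_asr.offline_recognize(content, config)) is usually set up along the following lines. This is only a minimal sketch, not the exact code from the post: the server address localhost:50051, the recognition settings, and the helper names best_transcript and transcribe are assumptions made for illustration. The riva.client names (Auth, ASRService, RecognitionConfig, offline_recognize) are those of the Riva Python client that appears in the traceback.

```python
def best_transcript(response):
    """Return the top transcript, or None if the server returned no results.

    Indexing response.results[0] directly, as in the traceback, raises
    IndexError when the result list is empty, so it is checked first.
    """
    if not response.results or not response.results[0].alternatives:
        return None
    return response.results[0].alternatives[0].transcript


def transcribe(path, uri="localhost:50051"):
    """Send one audio file to a running Riva server for offline recognition.

    `uri` is an assumed default; point it at the machine running riva-speech.
    """
    # Imported lazily so best_transcript() works without the client installed.
    import riva.client

    auth = riva.client.Auth(uri=uri)
    riva_asr = riva.client.ASRService(auth)

    with open(path, "rb") as fh:
        content = fh.read()

    # Assumed settings for a mono en-US recording.
    config = riva.client.RecognitionConfig()
    config.language_code = "en-US"
    config.max_alternatives = 1
    config.enable_automatic_punctuation = True
    config.audio_channel_count = 1

    response = riva_asr.offline_recognize(content, config)
    return best_transcript(response)
```

Calling transcribe with the path to a WAV file returns the best transcript or None. With the server in the state described above, the offline_recognize call would still raise the same _InactiveRpcError, since the failure is reported by the server rather than caused by the client code.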

Can you tell me if this is a CUDA version related issue?

If it is a CUDA version issue, how can I resolve it?

After executing riva_start_client.sh, the NMT guide code works normally:

python3 /opt/riva/examples/nmt.py --model-name=en_de_24x6 --src-language=en --tgt-language=de --text="I love you."
→ Ich liebe dich.

Thanks for reading and have a great day.

rvinobha · April 24, 2023, 6:16pm

Hi @mklee1

Thanks for your interest in Riva

Could you share the complete output of docker logs riva-speech in this thread?

Quick question: are there multiple GPUs in your machine? If so, can you try running with only a single GPU?

Thanks

mklee1 · April 26, 2023, 1:28am

Hi @rvinobha

First of all, thank you for your reply.

I am attaching the output of docker logs riva-speech as a text file.

riva-speech-logs.txt (2.2 KB)

The VM uses a single GPU.

I am attaching the nvidia-smi screenshot.

Thanks a lot.

Have a nice day.

mklee1 · April 26, 2023, 7:49am

I saw a similar inquiry on the forum.

I downgraded Riva to version 2.7.0, and ASR seems to be working fine.

As for the log file, now that I look at it, it seems to be a miscommunication.

Thanks.

rvinobha · April 27, 2023, 8:13am

Hi @mklee1

Thanks for proactively trying 2.7 and finding that it works. We will try to find out why it didn’t work in 2.10.

Thanks for sharing the logs.

Apologies, but from the logs captured, I can see that riva-start has failed.

Can you share the complete log output of riva-init so we can look for clues about the failure? The docker logs riva-speech output shared so far does not contain any details.

Also, while riva-start is running, can you run docker logs riva-speech in another window at the same time and share the output again?
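One possible way to capture the complete container log while the server starts is a small Python wrapper like the one below. It is only a sketch: the helper name capture_output and the log file name used afterwards are hypothetical. It merges stdout and stderr because docker logs replays each of the container's streams separately.

```python
import subprocess
import sys


def capture_output(command, log_path):
    """Run `command`, echo its combined stdout/stderr, and save it to `log_path`.

    Returns the exit code of the command.
    """
    with open(log_path, "w", encoding="utf-8") as log_file:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
        )
        # Stream line by line so output is visible while the command runs.
        for line in proc.stdout:
            sys.stdout.write(line)
            log_file.write(line)
        return proc.wait()
```

For example, running capture_output(["docker", "logs", "--follow", "riva-speech"], "riva-speech-full.log") in a second window follows the container log until the container exits or the script is interrupted with Ctrl+C, and the resulting file can be attached to the thread.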

Thanks

system · May 11, 2023, 8:13am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
