Getting an error while initializing Riva

I get the error below while running “sudo bash riva_init.sh”:

KeyError: 'wfst_model_dir'

+ [[ amd64 == \a\r\m\6\4 ]]

After that, I get the output below while running “sudo bash riva_start.sh”:

Waiting for Riva server to load all models…retrying in 10 seconds
.
.
.
Waiting for Riva server to load all models…retrying in 10 seconds
Health ready check failed.
Check Riva logs with: docker logs riva-speech

Please help me solve this issue!

Hardware - GPU - NVIDIA Corporation GA102 [GeForce RTX 3090]
Hardware - CPU - 11th Gen Intel® Core™ i7-11700K @ 3.60GHz × 16
Operating System - Ubuntu 21.04, 64-bit
Riva Version - 2.1.0

Hi @rasel

Thanks for your interest in Riva.

I have a quick suggestion. Can you try the following:

  1. First run bash riva_clean.sh and then bash riva_init.sh, and check whether the issue persists.
  2. If the issue still persists, please send me the complete log output of bash riva_init.sh as a file in this forum thread, along with the config.sh used. A minimal way to capture the log is sketched below.
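
For example (a sketch; the quickstart directory name is an assumption, adjust to your install path):

cd riva_quickstart_v2.1.0                             # assumed quickstart directory
bash riva_clean.sh                                    # remove any partially initialized models
bash riva_init.sh 2>&1 | tee output_riva_init.log     # tee saves the complete output to a file you can attach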

Thank you for the reply.

Tried " 1. First bash riva_clean.sh and then bash riva_init.sh but not solved yet!

Here, I have attached the complete log output of bash riva_init.sh (output_riva_init.log).
output_riva_init.log (6.8 MB)

config.sh

# Copyright (c) 2022, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# Architecture of target platform. Supported architectures: amd64, arm64
riva_target_arch="amd64"

# Enable or Disable Riva Services
service_enabled_asr=true
service_enabled_nlp=true
service_enabled_tts=true

# Enable Riva Enterprise
# If enrolled in Enterprise, enable Riva Enterprise by setting configuration
# here. You must explicitly acknowledge you have read and agree to the EULA.
# RIVA_API_KEY=<ngc api key>
# RIVA_API_NGC_ORG=<ngc organization>
# RIVA_EULA=accept

# Language code to fetch models of a specific language
# Currently only ASR supports languages other than English
# Supported language codes: en-US, de-DE, es-US, ru-RU, zh-CN
# for any language other than English, set service_enabled_nlp and service_enabled_tts to False
# for multiple languages enter space separated language codes.
language_code=("en-US")

# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"

# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode"

# Locations to use for storing models artifacts
#
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# riva_init.sh will create a `rmir` and `models` directory in the volume or
# path specified.
#
# RMIR ($riva_model_loc/rmir)
# Riva uses an intermediate representation (RMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $riva_model_loc/rmir by `riva_init.sh`
#
# Custom models produced by NeMo or TLT and prepared using riva-build
# may also be copied manually to this location $riva_model_loc/rmir.
#
# Models ($riva_model_loc/models)
# During the riva_init process, the RMIR files in $riva_model_loc/rmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $riva_model_loc/models. The riva server exclusively uses these
# optimized versions.
riva_model_loc="riva-model-repo"

if [[ $riva_target_arch == "arm64" ]]; then
    riva_model_loc="`pwd`/model_repository"
fi

# The default RMIRs are downloaded from NGC into the $riva_model_loc/rmir directory described above.
# If you'd like to skip the download from NGC and use the existing RMIRs in $riva_model_loc/rmir,
# then set the $use_existing_rmirs flag below to true. You can also deploy your own set of custom
# RMIRs by keeping them in that directory and using this quickstart script with the
# flag below to deploy them all together.
use_existing_rmirs=false

# Ports to expose for Riva services
riva_speech_api_port="50051"

# NGC orgs
riva_ngc_org="nvidia"
riva_ngc_team="riva"
riva_ngc_image_version="2.1.0"
riva_ngc_model_version="2.1.0"

# Pre-built models listed below will be downloaded from NGC. If models already exist in $riva_model_loc/rmir,
# then they can be commented out to skip the download from NGC.

########## ASR MODELS ##########

models_asr=()

### Citrinet-1024 models
for lang_code in ${language_code[@]}; do
    modified_lang_code="${lang_code/-/_}"
    modified_lang_code=${modified_lang_code,,}
    models_asr+=(
    ### Citrinet-1024 Streaming w/ CPU decoder, best latency configuration
        "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str:${riva_ngc_model_version}"

    ### Citrinet-1024 Streaming w/ CPU decoder, best throughput configuration
    #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str_thr:${riva_ngc_model_version}"

    ### Citrinet-1024 Offline w/ CPU decoder,
        "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_ofl:${riva_ngc_model_version}"
    )

    ### Punctuation model
    if [[ "${lang_code}"  == "en-US" || "${lang_code}" == "de-DE" || "${lang_code}" == "es-US"  ]]; then
      models_asr+=(
          "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_punctuation_bert_base_${modified_lang_code}:${riva_ngc_model_version}"
      )
    fi

done

# Other ASR models
models_asr+=(

### Conformer acoustic model, CPU decoder, streaming best latency configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_str:${riva_ngc_model_version}"

### Conformer acoustic model, CPU decoder, streaming best throughput configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_str_thr:${riva_ngc_model_version}"

### Conformer acoustic model, CPU decoder, offline configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_ofl:${riva_ngc_model_version}"

### German Conformer acoustic model, CPU decoder, streaming best latency configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_str:${riva_ngc_model_version}"

### German Conformer acoustic model, CPU decoder, streaming best throughput configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_str_thr:${riva_ngc_model_version}"

### German Conformer acoustic model, CPU decoder, offline configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_ofl:${riva_ngc_model_version}"

### Spanish Conformer acoustic model, CPU decoder, streaming best latency configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_str:${riva_ngc_model_version}"

### Spanish Conformer acoustic model, CPU decoder, streaming best throughput configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_str_thr:${riva_ngc_model_version}"

### Spanish Conformer acoustic model, CPU decoder, offline configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_ofl:${riva_ngc_model_version}"

### Jasper Streaming w/ CPU decoder, best latency configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str:${riva_ngc_model_version}"

### Jasper Streaming w/ CPU decoder, best throughput configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_thr:${riva_ngc_model_version}"

###  Jasper Offline w/ CPU decoder
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_ofl:${riva_ngc_model_version}"

### QuartzNet Streaming w/ CPU decoder, best latency configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_str:${riva_ngc_model_version}"

### QuartzNet Streaming w/ CPU decoder, best throughput configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_str_thr:${riva_ngc_model_version}"

### QuartzNet Offline w/ CPU decoder
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_ofl:${riva_ngc_model_version}"

### Jasper Streaming w/ GPU decoder, best latency configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_gpu_decoder:${riva_ngc_model_version}"

### Jasper Streaming w/ GPU decoder, best throughput configuration
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_thr_gpu_decoder:${riva_ngc_model_version}"

### Jasper Offline w/ GPU decoder
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_ofl_gpu_decoder:${riva_ngc_model_version}"
)

########## NLP MODELS ##########

models_nlp=(
### Bert base Punctuation model
    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_punctuation_bert_base_en_us:${riva_ngc_model_version}"

### BERT base Named Entity Recognition model fine-tuned on GMB dataset with class labels LOC, PER, ORG etc.
    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_named_entity_recognition_bert_base:${riva_ngc_model_version}"

### BERT Base Intent Slot model fine-tuned on weather dataset.
    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_intent_slot_bert_base:${riva_ngc_model_version}"

### BERT Base Question Answering model fine-tuned on Squad v2.
    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_bert_base:${riva_ngc_model_version}"

### Megatron345M Question Answering model fine-tuned on Squad v2.
#    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_megatron:${riva_ngc_model_version}"

### Bert base Text Classification model fine-tuned on 4class (weather, meteorology, personality, nomatch) domain model.
    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_text_classification_bert_base:${riva_ngc_model_version}"
)

########## TTS MODELS ##########

models_tts=(
   "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_female_1:${riva_ngc_model_version}"
#   "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_male_1:${riva_ngc_model_version}"
)

############# Models to use for arm64 platform ##########

if [[ $riva_target_arch == "arm64" ]]; then
  models_asr=(
  ### Citrinet-256 Streaming w/ CPU decoder
      "${riva_ngc_org}/${riva_ngc_team}/models_asr_citrinet_256_en_us_streaming:${riva_ngc_model_version}-$riva_target_arch"
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_en_us_str:${riva_ngc_model_version}"
  )

  models_nlp=(
  ### Bert base Punctuation model
    "${riva_ngc_org}/${riva_ngc_team}/models_nlp_punctuation_bert_base_en_us:${riva_ngc_model_version}-$riva_target_arch"

  ### DistilBERT based misty domain (weather, smalltalk/personality, poi/map)
      "${riva_ngc_org}/${riva_ngc_team}/models_nlp_intent_slot_misty_distilbert:${riva_ngc_model_version}-$riva_target_arch"
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_intent_slot_bert_base:${riva_ngc_model_version}"
  )

  models_tts=(
  ### Fastpitch + HiFiGAN (English-US Female voice)
      "${riva_ngc_org}/${riva_ngc_team}/models_tts_fastpitch_hifigan_en_us_female_1:${riva_ngc_model_version}-$riva_target_arch"
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_male_1:${riva_ngc_model_version}"
  )
fi

NGC_TARGET=${riva_ngc_org}
if [[ ! -z ${riva_ngc_team} ]]; then
  NGC_TARGET="${NGC_TARGET}/${riva_ngc_team}"
else
  team="\"\""
fi

# Specify paths to SSL Key and Certificate files to use TLS/SSL Credentials for a secured connection.
# If either are empty, an insecure connection will be used.
# Stored within container at /ssl/server.crt and /ssl/server.key
# Optional, one can also specify a root certificate, stored within container at /ssl/root_server.crt
ssl_server_cert=""
ssl_server_key=""
ssl_root_cert=""

# define docker images required to run Riva
image_client="nvcr.io/${NGC_TARGET}/riva-speech-client:${riva_ngc_image_version}"
image_speech_api="nvcr.io/${NGC_TARGET}/riva-speech:${riva_ngc_image_version}-server"

# define docker images required to setup Riva
image_init_speech="nvcr.io/${NGC_TARGET}/riva-speech:${riva_ngc_image_version}-servicemaker"

# daemon names
riva_daemon_speech="riva-speech"
if [[ $riva_target_arch != "arm64" ]]; then
    riva_daemon_client="riva-client"
fi
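
As a side note, since riva_model_loc above defaults to a Docker volume, its contents can be listed with something like this (a sketch; alpine is just a convenient throwaway image):

docker volume inspect riva-model-repo                                          # confirm the volume exists
docker run --rm -v riva-model-repo:/data alpine ls /data/rmir /data/models     # list downloaded RMIRs and deployed models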

The output of riva_start.sh is shown below:

Starting Riva Speech Services. This may take several minutes depending on the number of models deployed.
7b1c103a42bc5ec196171b1b3b5cd583d92012a89a54536441008d4768be99b9
Waiting for Riva server to load all models...retrying in 10 seconds
.
.
.
Waiting for Riva server to load all models...retrying in 10 seconds
Health ready check failed.
Check Riva logs with: docker logs riva-speech
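
Following that suggestion, the logs can be inspected with something like this (a sketch; the grep pattern is just an example):

docker logs riva-speech 2>&1 | tail -n 100                       # show the last lines of the server log
docker logs riva-speech 2>&1 | grep -iE "error|out of memory"    # scan for model load failures or OOM messages
nvidia-smi                                                       # check current GPU memory usage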

What should be the next step?

I’m also having the same error. I’m using an RTX 3090 GPU. Does Riva support 3090 GPUs?


@rvinobha could you help us, please?

Hi @rasel

Thanks for your interest in Riva.

Apologies that you are facing this issue.

Thanks for sharing the logs and the config.sh file.

I have a quick suggestion to try out: can we deploy just one model and test whether it works fine?
For example, if you are planning to use Riva for ASR (speech recognition), then in config.sh you can make the change shown below:

# Enable or Disable Riva Services
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false

Set the service you need to true and the others to false.

Second, we can try running with only a single model, i.e. keep one model enabled and comment out the others with #, as in the example below:

########## ASR MODELS ##########

models_asr=()

### Citrinet-1024 models
for lang_code in ${language_code[@]}; do
    modified_lang_code="${lang_code/-/_}"
    modified_lang_code=${modified_lang_code,,}
    models_asr+=(
    ### Citrinet-1024 Streaming w/ CPU decoder, best latency configuration
        "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str:${riva_ngc_model_version}"

    ### Citrinet-1024 Streaming w/ CPU decoder, best throughput configuration
    #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str_thr:${riva_ngc_model_version}"

    ### Citrinet-1024 Offline w/ CPU decoder,
    #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_ofl:${riva_ngc_model_version}"
    )
done

Can you please try out the above suggestions and let me know if they work? I see some traces of GPU out-of-memory errors in the init logs, so you can also watch GPU memory while the server loads models, as sketched below.
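
A minimal monitoring sketch (the 2-second refresh interval is arbitrary):

# Print used/total GPU memory every 2 seconds while models load
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 2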

Thanks