Error creating GRPC channel: Unable to establish connection to server

NSDB · August 18, 2022, 11:28pm

ISSUE:

When testing the riva_asr_client sample:

riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

We see the console reporting the following issue:

“Error creating GRPC channel: Unable to establish connection to server”


Hardware - GPU A100
Nvidia Driver Version : 510.73.08
CUDA Version : 11.6

Hardware - CPU: Intel Core Processor (Broadwell, no TSX, IBRS)
Operating System: Ubuntu 22.04.1 LTS
Riva Version: “riva_quickstart:2.3.0”

We simply followed the quickstart instructions, using this configuration:

config.sh

# Copyright (c) 2022, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# Architecture of target platform. Supported architectures: amd64, arm64
riva_target_arch="amd64"

# Legacy arm64 platform to be enabled. Supported legacy platforms: xavier
riva_arm64_legacy_platform=""

# Enable or Disable Riva Services
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false

# Enable Riva Enterprise
# If enrolled in Enterprise, enable Riva Enterprise by setting configuration
# here. You must explicitly acknowledge you have read and agree to the EULA.
# RIVA_API_KEY=<ngc api key>
# RIVA_API_NGC_ORG=<ngc organization>
# RIVA_EULA=accept

# Language code to fetch models of a specify language
# Currently only ASR supports languages other than English
# Supported language codes: en-US, de-DE, es-US, ru-RU, zh-CN, hi-IN
# for any language other than English, set service_enabled_nlp and service_enabled_tts to False
# for multiple languages enter space separated language codes.
language_code=("en-US")

# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"

# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode"

# Locations to use for storing models artifacts
#
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# riva_init.sh will create a `rmir` and `models` directory in the volume or
# path specified.
#
# RMIR ($riva_model_loc/rmir)
# Riva uses an intermediate representation (RMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $riva_model_loc/rmir by `riva_init.sh`
#
# Custom models produced by NeMo or TLT and prepared using riva-build
# may also be copied manually to this location $(riva_model_loc/rmir).
#
# Models ($riva_model_loc/models)
# During the riva_init process, the RMIR files in $riva_model_loc/rmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $riva_model_loc/models. The riva server exclusively uses these
# optimized versions.
riva_model_loc="riva-model-repo"

if [[ $riva_target_arch == "arm64" ]]; then
    riva_model_loc="`pwd`/model_repository"
fi

# The default RMIRs are downloaded from NGC by default in the above $riva_rmir_loc directory
# If you'd like to skip the download from NGC and use the existing RMIRs in the $riva_rmir_loc
# then set the below $use_existing_rmirs flag to true. You can also deploy your set of custom
# RMIRs by keeping them in the riva_rmir_loc dir and use this quickstart script with the
# below flag to deploy them all together.
use_existing_rmirs=false

# Ports to expose for Riva services
riva_speech_api_port="50051"

# NGC orgs
riva_ngc_org="nvidia"
riva_ngc_team="riva"
riva_ngc_image_version="2.3.0"
riva_ngc_model_version="2.3.0"

# Pre-built models listed below will be downloaded from NGC. If models already exist in $riva-rmir
# then models can be commented out to skip download from NGC

########## ASR MODELS ##########

models_asr=()

### Citrinet-1024 models
for lang_code in ${language_code[@]}; do
    modified_lang_code="${lang_code/-/_}"
    modified_lang_code=${modified_lang_code,,}
    if [[ $riva_target_arch == "arm64" ]]; then
      models_asr+=(
      ### Citrinet-1024 Streaming w/ CPU decoder, best latency configuration
          "${riva_ngc_org}/${riva_ngc_team}/models_asr_citrinet_1024_${modified_lang_code}_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
      )
    else
      models_asr+=(
      ### Citrinet-1024 Streaming w/ CPU decoder, best latency configuration
          "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str:${riva_ngc_model_version}"

      ### Citrinet-1024 Streaming w/ CPU decoder, best throughput configuration
      #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str_thr:${riva_ngc_model_version}"

      ### Citrinet-1024 Offline w/ CPU decoder,
          "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_ofl:${riva_ngc_model_version}"
      )
    fi

    ### Punctuation model
    if [[ "${lang_code}"  == "en-US" || "${lang_code}" == "de-DE" || "${lang_code}" == "es-US" || "${lang_code}" == "zh-CN" ]]; then
      if [[ $riva_target_arch == "arm64" ]]; then
        models_asr+=(
            "${riva_ngc_org}/${riva_ngc_team}/models_nlp_punctuation_bert_base_${modified_lang_code}:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
        )
      else
        models_asr+=(
            "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_punctuation_bert_base_${modified_lang_code}:${riva_ngc_model_version}"
        )
      fi
    fi

done

#Other ASR models
if [[ $riva_target_arch == "arm64" ]]; then
  models_asr+=(
  ### Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_en_us_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### German Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_de_de_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Spanish Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_es_us_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Hindi Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_hi_in_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Russian Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_ru_ru_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Citrinet-256 Streaming w/ CPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_citrinet_256_en_us_streaming:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  )
else
  models_asr+=(
  ### Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_str:${riva_ngc_model_version}"

  ### Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_str_thr:${riva_ngc_model_version}"

  ### Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_ofl:${riva_ngc_model_version}"

  ### German Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_str:${riva_ngc_model_version}"

  ### German Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_str_thr:${riva_ngc_model_version}"

  ### German Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_ofl:${riva_ngc_model_version}"

  ### Spanish Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_str:${riva_ngc_model_version}"

  ### Spanish Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_str_thr:${riva_ngc_model_version}"

  ### Spanish Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_ofl:${riva_ngc_model_version}"

  ### Hindi Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_hi_in_str:${riva_ngc_model_version}"

  ### Hindi Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_hi_in_str_thr:${riva_ngc_model_version}"

  ### Hindi Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_hi_in_ofl:${riva_ngc_model_version}"
  
  ### Russian Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_ru_ru_str:${riva_ngc_model_version}"

  ### Russian Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_ru_ru_str_thr:${riva_ngc_model_version}"

  ### Russian Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_ru_ru_ofl:${riva_ngc_model_version}"

  ### Jasper Streaming w/ CPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str:${riva_ngc_model_version}"

  ### Jasper Streaming w/ CPU decoder, best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_thr:${riva_ngc_model_version}"

  ###  Jasper Offline w/ CPU decoder
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_ofl:${riva_ngc_model_version}"

  ### Quarztnet Streaming w/ CPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_str:${riva_ngc_model_version}"

  ### Quarztnet Streaming w/ CPU decoder, best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_str_thr:${riva_ngc_model_version}"

  ### Quarztnet Offline w/ CPU decoder
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_ofl:${riva_ngc_model_version}"

  ### Jasper Streaming w/ GPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_gpu_decoder:${riva_ngc_model_version}"

  ### Jasper Streaming w/ GPU decoder, best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_thr_gpu_decoder:${riva_ngc_model_version}"

  ### Jasper Offline w/ GPU decoder
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_ofl_gpu_decoder:${riva_ngc_model_version}"
  )
fi

########## NLP MODELS ##########

if [[ $riva_target_arch == "arm64" ]]; then
  models_nlp=(
  ### BERT Base Intent Slot model for misty domain fine-tuned on weather, smalltalk/personality, poi/map datasets.
      "${riva_ngc_org}/${riva_ngc_team}/models_nlp_intent_slot_misty_bert_base:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### DistilBERT Intent Slot model for misty domain fine-tuned on weather, smalltalk/personality, poi/map datasets.
  #    "${riva_ngc_org}/${riva_ngc_team}/models_nlp_intent_slot_misty_distilbert:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  )
else
  models_nlp=(
  ### Bert base Punctuation model
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_punctuation_bert_base_en_us:${riva_ngc_model_version}"

  ### BERT base Named Entity Recognition model fine-tuned on GMB dataset with class labels LOC, PER, ORG etc.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_named_entity_recognition_bert_base:${riva_ngc_model_version}"

  ### BERT Base Intent Slot model fine-tuned on weather dataset.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_intent_slot_bert_base:${riva_ngc_model_version}"

  ### BERT Base Question Answering model fine-tuned on Squad v2.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_bert_base:${riva_ngc_model_version}"

  ### Megatron345M Question Answering model fine-tuned on Squad v2.
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_megatron:${riva_ngc_model_version}"

  ### Bert base Text Classification model fine-tuned on 4class (weather, meteorology, personality, nomatch) domain model.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_text_classification_bert_base:${riva_ngc_model_version}"
  )
fi

########## TTS MODELS ##########

if [[ $riva_target_arch == "arm64" ]]; then
  models_tts=(
     "${riva_ngc_org}/${riva_ngc_team}/models_tts_fastpitch_hifigan_en_us_female_1:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  #   "${riva_ngc_org}/${riva_ngc_team}/models_tts_fastpitch_hifigan_en_us_male_1:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  )
else
  models_tts=(
     "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_female_1:${riva_ngc_model_version}"
  #   "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_male_1:${riva_ngc_model_version}"
  )
fi

NGC_TARGET=${riva_ngc_org}
if [[ ! -z ${riva_ngc_team} ]]; then
  NGC_TARGET="${NGC_TARGET}/${riva_ngc_team}"
else
  team="\"\""
fi

# Specify paths to SSL Key and Certificate files to use TLS/SSL Credentials for a secured connection.
# If either are empty, an insecure connection will be used.
# Stored within container at /ssl/servert.crt and /ssl/server.key
# Optional, one can also specify a root certificate, stored within container at /ssl/root_server.crt
ssl_server_cert=""
ssl_server_key=""
ssl_root_cert=""

# define docker images required to run Riva
image_client="nvcr.io/${NGC_TARGET}/riva-speech-client:${riva_ngc_image_version}"
image_speech_api="nvcr.io/${NGC_TARGET}/riva-speech:${riva_ngc_image_version}-server"

# define docker images required to setup Riva
image_init_speech="nvcr.io/${NGC_TARGET}/riva-speech:${riva_ngc_image_version}-servicemaker"

# daemon names
riva_daemon_speech="riva-speech"
if [[ $riva_target_arch != "arm64" ]]; then
    riva_daemon_client="riva-client"
fi
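
To make the model list in this config.sh concrete, here is a small sketch of how the NGC model names are derived for the amd64, en-US case. The helper name asr_model_name is hypothetical and only for illustration; the variable values and the two string operations are copied from the config above. It needs bash 4 or newer for the `,,` lowercase expansion.

```shell
#!/usr/bin/env bash
# Values copied from the config.sh shown above.
riva_ngc_org="nvidia"
riva_ngc_team="riva"
riva_ngc_model_version="2.3.0"
language_code=("en-US")

# Hypothetical helper: builds one Citrinet-1024 RMIR name the same way
# the amd64 branch of config.sh does.
asr_model_name() {
  local lang_code="$1" suffix="$2"
  local modified_lang_code="${lang_code/-/_}"    # en-US -> en_US
  modified_lang_code="${modified_lang_code,,}"   # en_US -> en_us
  echo "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_${suffix}:${riva_ngc_model_version}"
}

for lang_code in "${language_code[@]}"; do
  asr_model_name "$lang_code" str   # streaming, best latency
  asr_model_name "$lang_code" ofl   # offline
done
```

With the values above this prints the streaming and offline Citrinet entries, which are the two uncommented ASR models in the amd64 branch.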

We are able to run these without error:

riva_clean.sh
bash riva_init.sh
bash riva_start.sh
bash riva_stop.sh
bash riva_start_client.sh

But when we attempt the suggested ASR sample:

riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

The console reports as follows:

Error creating GRPC channel: Unable to establish connection to server. Current state: 3

Exiting.
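
The trailing number in that message can be decoded. This is a minimal sketch, assuming the client prints the numeric value of gRPC's grpc_connectivity_state enum (the thread does not confirm this). grpc_state_name is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Maps a gRPC connectivity state number to its name
# (grpc_connectivity_state: IDLE=0 ... SHUTDOWN=4).
grpc_state_name() {
  case "$1" in
    0) echo "IDLE" ;;
    1) echo "CONNECTING" ;;
    2) echo "READY" ;;
    3) echo "TRANSIENT_FAILURE" ;;
    4) echo "SHUTDOWN" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Pull the state number out of the message reported above.
msg='Error creating GRPC channel: Unable to establish connection to server. Current state: 3'
if [[ $msg =~ Current\ state:\ ([0-9]+) ]]; then
  grpc_state_name "${BASH_REMATCH[1]}"
fi
```

Under that assumption, state 3 is TRANSIENT_FAILURE, which gRPC reports for failed connection attempts such as a refused or timed-out TCP connection. That would point at whether the server is reachable rather than at the audio file.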

There appears to be some expectation of a file within the container, I believe:

/ssl/server.crt

…which doesn’t exist? So it defaults to “Using Insecure Server Credentials” - but fails, as noted above.

Any thoughts, guidance, or requests?

Thank you very much.

PS. I see a similar issue reported here:

CreateChannel(): Could not get default pem root certs · Issue #26106 · grpc/grpc (github.com)

…but we are not sure how (if at all) to apply that insight to our current issue. Any suggestions are greatly appreciated. Thank you again.

UPDATE: (2022.08.19)

1. We changed the name of the quickstart directory:

Note: an error is issued if the path to the certificate contains a period (".").

mv /_your_path_/riva_quickstart_v2.3.0 /_your_path_/riva_quickstart_v2-3-0

2. Generated some self-signed certificates:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /_your_path_/riva_quickstart_v2-3-0/server.key -out /_your_path_/riva_quickstart_v2-3-0/server.crt

3. Updated the config.sh:

# Specify paths to SSL Key and Certificate files to use TLS/SSL Credentials for a secured connection.
# If either are empty, an insecure connection will be used.
# Stored within container at /ssl/servert.crt and /ssl/server.key
# Optional, one can also specify a root certificate, stored within container at /ssl/root_server.crt
ssl_server_cert="/_your_path_/riva_quickstart_v2-3-0/server.crt"
ssl_server_key="/_your_path_/riva_quickstart_v2-3-0/server.key"
ssl_root_cert=""
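
The certificate and key from step 2 can be sanity-checked before they go into config.sh. This is a minimal sketch, assuming the openssl CLI is installed. make_self_signed and cert_matches_key are hypothetical helpers, and the -subj value is an assumption added only so the command runs without prompts. It checks the files themselves; it does not cover how the client is told to use TLS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes server.key and server.crt into the given
# directory, using the same openssl req options as step 2 plus -subj.
make_self_signed() {
  local dir="$1"
  openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -subj "/CN=localhost" \
    -keyout "${dir}/server.key" -out "${dir}/server.crt"
}

# Hypothetical helper: succeeds only if the certificate's public key is
# the public half of the given private key.
cert_matches_key() {
  local cert="$1" key="$2"
  local cert_pub key_pub
  cert_pub="$(openssl x509 -noout -pubkey -in "$cert" 2>/dev/null)" || return 1
  key_pub="$(openssl pkey -pubout -in "$key" 2>/dev/null)" || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example:
#   make_self_signed /_your_path_/riva_quickstart_v2-3-0
#   cert_matches_key /_your_path_/riva_quickstart_v2-3-0/server.crt \
#                    /_your_path_/riva_quickstart_v2-3-0/server.key && echo "pair OK"
```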

Restart:

bash riva_init.sh
bash riva_start.sh

Riva server is ready..

Unfortunately, that did not seem to help…

bash riva_start_client.sh
riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

The response is unchanged:

root@tengine:/work# riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

I0819 22:12:50.504925 10 riva_asr_client.cc:434] Using Insecure Server Credentials

Error creating GRPC channel: Unable to establish connection to server. Current state: 3

Exiting.

:/
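
Since the error is about establishing a connection, one quick check is whether anything accepts TCP connections on the Riva port at all. This is a minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command. check_endpoint is a hypothetical helper, 50051 is riva_speech_api_port from config.sh, and localhost as the client's target address is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tries to open a TCP connection to host:port and
# reports whether it succeeded within two seconds.
check_endpoint() {
  local host="$1" port="$2"
  if timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "${host}:${port} reachable"
  else
    echo "${host}:${port} unreachable"
    return 1
  fi
}

# Example (run from the same shell where riva_asr_client is started):
#   check_endpoint localhost 50051
```

An "unreachable" result would suggest looking at whether the riva-speech container is still running and how its port is exposed. A "reachable" result would move the question to the gRPC or TLS layer.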

rvinobha · August 23, 2022, 8:35am

Hi @NSDB

Thanks for your interest in Riva,

SSL is not required, so we can try without it.

Please leave ssl_server_cert and ssl_server_key as empty strings, then run bash riva_clean.sh, bash riva_init.sh, and bash riva_start.sh.

After that, can you please share the complete output of docker logs riva-speech as a file in this thread?

Thanks
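
The requested log can be captured to a file and summarized before attaching it. This is a minimal sketch. save_container_logs and summarize_triton_log are hypothetical helpers, and the summary assumes Triton-style lines such as "] loading: <model>:1" and "successfully loaded '<model>' version 1", as seen in the log posted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a container's full log to a file.
# docker logs replays the container's stdout and stderr on the matching
# streams, so both are redirected.
save_container_logs() {
  local container="$1" outfile="$2"
  docker logs "$container" > "$outfile" 2>&1
}

# Hypothetical helper: counts models that started loading versus models
# that finished, then prints any glog error lines (prefix E + MMDD).
summarize_triton_log() {
  local logfile="$1"
  local loading loaded
  loading="$(grep -cF '] loading: ' "$logfile")"
  loaded="$(grep -cF "successfully loaded '" "$logfile")"
  echo "loading=${loading} loaded=${loaded}"
  grep -E '^E[0-9]{4} [0-9]{2}:' "$logfile" || true
}

# Example:
#   save_container_logs riva-speech riva-speech.log
#   summarize_triton_log riva-speech.log
```

If the loaded count stays below the loading count, that would be one hint that the server never finished starting.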

NSDB · August 23, 2022, 9:04pm

Thank you for the prompt follow-up.

Per your instructions and request:

root@tengine:~/riva_quickstart_v2-3-0# docker logs riva-speech

==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release 22.06 (build 40051835)
Riva Speech Server Version 2.3.0

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for Riva Speech Server.  NVIDIA recommends the use of the following flags:
   docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:16.643175 105 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0823 19:16:16.643367 105 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0823 19:16:16.643552 105 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0823 19:16:16.643624 105 onnxruntime.cc:2365] backend configuration:
{}
I0823 19:16:17.183750 105 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x10020000000' with size 268435456
I0823 19:16:17.184628 105 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0823 19:16:17.193064 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline:1
I0823 19:16:17.293410 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-feature-extractor-offline:1
I0823 19:16:17.393706 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline:1
I0823 19:16:17.399716 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0823 19:16:17.402658   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0823 19:16:17.402776   111 parameter_parser.cc:121] Default value will be used
W0823 19:16:17.402923   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0823 19:16:17.402971   111 parameter_parser.cc:121] Default value will be used
W0823 19:16:17.403033   111 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0823 19:16:17.403072   111 parameter_parser.cc:121] Default value will be used
I0823 19:16:17.404049 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",
	"platform": "",
	"backend": "riva_asr_decoder",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 128,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "END_FLAG",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "CUSTOM_CONFIGURATION",
			"data_type": "TYPE_STRING",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "FINAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_TRANSCRIPTS_SCORE",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS_STABILITY",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 128,
			"preferred_batch_size": [
				32,
				64
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"forerunner_beam_size_token": {
			"string_value": "8"
		},
		"forerunner_beam_threshold": {
			"string_value": "10.0"
		},
		"decoder_num_worker_threads": {
			"string_value": "-1"
		},
		"asr_model_delay": {
			"string_value": "-1"
		},
		"word_insertion_score": {
			"string_value": "0.2"
		},
		"left_padding_size": {
			"string_value": "0.0"
		},
		"decoder_type": {
			"string_value": "flashlight"
		},
		"forerunner_beam_size": {
			"string_value": "8"
		},
		"chunk_size": {
			"string_value": "300.0"
		},
		"max_supported_transcripts": {
			"string_value": "1"
		},
		"lexicon_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/lexicon.txt"
		},
		"smearing_mode": {
			"string_value": "max"
		},
		"use_vad": {
			"string_value": "True"
		},
		"blank_token": {
			"string_value": "#"
		},
		"lm_weight": {
			"string_value": "0.2"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/riva_decoder_vocabulary.txt"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "False"
		},
		"use_subword": {
			"string_value": "True"
		},
		"beam_size": {
			"string_value": "16"
		},
		"right_padding_size": {
			"string_value": "0.0"
		},
		"beam_size_token": {
			"string_value": "16"
		},
		"sil_token": {
			"string_value": "▁"
		},
		"num_tokenization": {
			"string_value": "1"
		},
		"beam_threshold": {
			"string_value": "20.0"
		},
		"tokenizer_model": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
		},
		"language_model_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
		},
		"max_execution_batch_size": {
			"string_value": "1024"
		},
		"forerunner_use_lm": {
			"string_value": "true"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0823 19:16:17.406562 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0823 19:16:17.407653   112 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0823 19:16:17.407781   112 parameter_parser.cc:121] Default value will be used
W0823 19:16:17.407819   112 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0823 19:16:17.407871   112 parameter_parser.cc:121] Default value will be used
W0823 19:16:17.407905   112 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0823 19:16:17.407959   112 parameter_parser.cc:121] Default value will be used
W0823 19:16:17.407996   112 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0823 19:16:17.408047   112 parameter_parser.cc:121] Default value will be used
I0823 19:16:17.494029 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I0823 19:16:17.559773 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline",
	"platform": "",
	"backend": "riva_asr_features",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 1,
	"input": [
		{
			"name": "AUDIO_SIGNAL",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SAMPLE_RATE",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "AUDIO_FEATURES",
			"data_type": "TYPE_FP32",
			"dims": [
				80,
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "AUDIO_PROCESSED",
			"data_type": "TYPE_FP32",
			"dims": [
				1
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 1,
			"preferred_batch_size": [
				1
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline_0",
			"kind": "KIND_GPU",
			"count": 1,
			"gpus": [
				0
			],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"norm_per_feature": {
			"string_value": "True"
		},
		"mean": {
			"string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
		},
		"stddev": {
			"string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
		},
		"chunk_size": {
			"string_value": "300.0"
		},
		"max_execution_batch_size": {
			"string_value": "1"
		},
		"sample_rate": {
			"string_value": "16000"
		},
		"num_features": {
			"string_value": "80"
		},
		"window_size": {
			"string_value": "0.025"
		},
		"window_stride": {
			"string_value": "0.01"
		},
		"streaming": {
			"string_value": "False"
		},
		"left_padding_size": {
			"string_value": "0.0"
		},
		"stddev_floor": {
			"string_value": "1e-05"
		},
		"transpose": {
			"string_value": "False"
		},
		"right_padding_size": {
			"string_value": "0.0"
		},
		"gain": {
			"string_value": "1.0"
		},
		"precalc_norm_time_steps": {
			"string_value": "0"
		},
		"use_utterance_norm_params": {
			"string_value": "False"
		},
		"dither": {
			"string_value": "0.0"
		},
		"precalc_norm_params": {
			"string_value": "False"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0823 19:16:17.560576 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0 (device 0)
I0823 19:16:17.594363 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming:1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:17.694672 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming:1
I0823 19:16:17.795074 105 model_repository_manager.cc:994] loading: riva-punctuation-en-US:1
I0823 19:16:17.895508 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-offline-am-offline:1
I0823 19:16:17.995966 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming:1
I0823 19:16:18.096361 105 model_repository_manager.cc:994] loading: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1
I0823 19:16:18.504459   111 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0823 19:16:18.505657 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline' version 1
I0823 19:16:18.507313 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0823 19:16:18.508060   114 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0823 19:16:18.508173   114 parameter_parser.cc:121] Default value will be used
W0823 19:16:18.508235   114 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0823 19:16:18.508282   114 parameter_parser.cc:121] Default value will be used
I0823 19:16:18.508990 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline",
	"platform": "",
	"backend": "riva_asr_vad",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 2048,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"vad_stop_history": {
			"string_value": "800"
		},
		"vad_start_history": {
			"string_value": "300"
		},
		"chunk_size": {
			"string_value": "300.0"
		},
		"vad_start_th": {
			"string_value": "0.2"
		},
		"vad_stop_th": {
			"string_value": "0.98"
		},
		"vad_type": {
			"string_value": "ctc-vad"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline/1/riva_decoder_vocabulary.txt"
		},
		"residue_blanks_at_start": {
			"string_value": "0"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "False"
		},
		"use_subword": {
			"string_value": "True"
		},
		"residue_blanks_at_end": {
			"string_value": "0"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0823 19:16:18.509540 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline_0 (device 0)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:21.944181 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1)
W0823 19:16:21.945042   117 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0823 19:16:21.945108   117 parameter_parser.cc:121] Default value will be used
W0823 19:16:21.945235   117 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0823 19:16:21.945286   117 parameter_parser.cc:121] Default value will be used
W0823 19:16:21.945322   117 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0823 19:16:21.945379   117 parameter_parser.cc:121] Default value will be used
I0823 19:16:21.945872 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",
	"platform": "",
	"backend": "riva_asr_decoder",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 1024,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "END_FLAG",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "CUSTOM_CONFIGURATION",
			"data_type": "TYPE_STRING",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "FINAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_TRANSCRIPTS_SCORE",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS_STABILITY",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 1024,
			"preferred_batch_size": [
				32,
				64
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"beam_size_token": {
			"string_value": "16"
		},
		"sil_token": {
			"string_value": "▁"
		},
		"num_tokenization": {
			"string_value": "1"
		},
		"beam_threshold": {
			"string_value": "20.0"
		},
		"tokenizer_model": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
		},
		"language_model_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
		},
		"max_execution_batch_size": {
			"string_value": "1024"
		},
		"forerunner_use_lm": {
			"string_value": "true"
		},
		"forerunner_beam_size_token": {
			"string_value": "8"
		},
		"forerunner_beam_threshold": {
			"string_value": "10.0"
		},
		"asr_model_delay": {
			"string_value": "-1"
		},
		"decoder_num_worker_threads": {
			"string_value": "-1"
		},
		"word_insertion_score": {
			"string_value": "0.2"
		},
		"left_padding_size": {
			"string_value": "1.92"
		},
		"decoder_type": {
			"string_value": "flashlight"
		},
		"forerunner_beam_size": {
			"string_value": "8"
		},
		"max_supported_transcripts": {
			"string_value": "1"
		},
		"chunk_size": {
			"string_value": "0.16"
		},
		"lexicon_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt"
		},
		"smearing_mode": {
			"string_value": "max"
		},
		"use_vad": {
			"string_value": "True"
		},
		"blank_token": {
			"string_value": "#"
		},
		"lm_weight": {
			"string_value": "0.2"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "True"
		},
		"use_subword": {
			"string_value": "True"
		},
		"beam_size": {
			"string_value": "16"
		},
		"right_padding_size": {
			"string_value": "1.92"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0823 19:16:21.946807 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming (version 1)
W0823 19:16:21.947309   121 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0823 19:16:21.947361   121 parameter_parser.cc:121] Default value will be used
W0823 19:16:21.947436   121 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0823 19:16:21.947482   121 parameter_parser.cc:121] Default value will be used
I0823 19:16:21.947918 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming",
	"platform": "",
	"backend": "riva_asr_vad",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 2048,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"chunk_size": {
			"string_value": "0.16"
		},
		"vad_stop_th": {
			"string_value": "0.98"
		},
		"vad_start_th": {
			"string_value": "0.2"
		},
		"vad_type": {
			"string_value": "ctc-vad"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming/1/riva_decoder_vocabulary.txt"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"residue_blanks_at_start": {
			"string_value": "-2"
		},
		"use_subword": {
			"string_value": "True"
		},
		"streaming": {
			"string_value": "True"
		},
		"residue_blanks_at_end": {
			"string_value": "0"
		},
		"vad_start_history": {
			"string_value": "300"
		},
		"vad_stop_history": {
			"string_value": "800"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0823 19:16:21.949043 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-feature-extractor-offline' version 1
I0823 19:16:21.980473 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming (version 1)
W0823 19:16:21.981178   120 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0823 19:16:21.981242   120 parameter_parser.cc:121] Default value will be used
W0823 19:16:21.981294   120 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0823 19:16:21.981338   120 parameter_parser.cc:121] Default value will be used
W0823 19:16:21.981371   120 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0823 19:16:21.981429   120 parameter_parser.cc:121] Default value will be used
W0823 19:16:21.981467   120 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0823 19:16:21.981518   120 parameter_parser.cc:121] Default value will be used
I0823 19:16:21.997781 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",
	"platform": "",
	"backend": "riva_asr_features",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 1024,
	"input": [
		{
			"name": "AUDIO_SIGNAL",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SAMPLE_RATE",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "AUDIO_FEATURES",
			"data_type": "TYPE_FP32",
			"dims": [
				80,
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "AUDIO_PROCESSED",
			"data_type": "TYPE_FP32",
			"dims": [
				1
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 1024,
			"preferred_batch_size": [
				256,
				512
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0",
			"kind": "KIND_GPU",
			"count": 1,
			"gpus": [
				0
			],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"chunk_size": {
			"string_value": "0.16"
		},
		"max_execution_batch_size": {
			"string_value": "1024"
		},
		"sample_rate": {
			"string_value": "16000"
		},
		"num_features": {
			"string_value": "80"
		},
		"window_size": {
			"string_value": "0.025"
		},
		"window_stride": {
			"string_value": "0.01"
		},
		"streaming": {
			"string_value": "True"
		},
		"transpose": {
			"string_value": "False"
		},
		"left_padding_size": {
			"string_value": "1.92"
		},
		"stddev_floor": {
			"string_value": "1e-05"
		},
		"right_padding_size": {
			"string_value": "1.92"
		},
		"gain": {
			"string_value": "1.0"
		},
		"use_utterance_norm_params": {
			"string_value": "False"
		},
		"precalc_norm_time_steps": {
			"string_value": "0"
		},
		"precalc_norm_params": {
			"string_value": "False"
		},
		"dither": {
			"string_value": "1e-05"
		},
		"norm_per_feature": {
			"string_value": "True"
		},
		"mean": {
			"string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
		},
		"stddev": {
			"string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0823 19:16:21.998558 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0 (device 0)
I0823 19:16:22.112938 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0 (device 0)
I0823 19:16:22.120742 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline' version 1
I0823 19:16:22.228394 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0)
I0823 19:16:22.236720 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:23.120062   117 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0823 19:16:23.120782 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I0823 19:16:23.168083 105 pipeline_library.cc:19] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0823 19:16:23.168710   122 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0823 19:16:23.168828   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.168874   122 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0823 19:16:23.168905   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.168957   122 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0823 19:16:23.168992   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169049   122 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0823 19:16:23.169083   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169164   122 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0823 19:16:23.169209   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169242   122 parameter_parser.cc:120] Parameter doc_stride could not be set from parameters
W0823 19:16:23.169293   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169327   122 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0823 19:16:23.169380   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169423   122 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0823 19:16:23.169474   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169509   122 parameter_parser.cc:120] Parameter margin could not be set from parameters
W0823 19:16:23.169562   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169597   122 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0823 19:16:23.169656   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169689   122 parameter_parser.cc:120] Parameter max_query_length could not be set from parameters
W0823 19:16:23.169741   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169776   122 parameter_parser.cc:120] Parameter max_seq_length could not be set from parameters
W0823 19:16:23.169829   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169867   122 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0823 19:16:23.169917   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.169951   122 parameter_parser.cc:120] Parameter step could not be set from parameters
W0823 19:16:23.170004   122 parameter_parser.cc:121] Default value will be used
W0823 19:16:23.170038   122 parameter_parser.cc:120] Parameter task could not be set from parameters
W0823 19:16:23.170091   122 parameter_parser.cc:121] Default value will be used
I0823 19:16:23.170268 105 backend_model.cc:255] model configuration:
{
	"name": "riva-punctuation-en-US",
	"platform": "",
	"backend": "riva_nlp_pipeline",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 8,
	"input": [
		{
			"name": "PIPELINE_INPUT",
			"data_type": "TYPE_STRING",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "PIPELINE_OUTPUT",
			"data_type": "TYPE_STRING",
			"dims": [
				1
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"instance_group": [
		{
			"name": "riva-punctuation-en-US_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"tokenizer_to_lower": {
			"string_value": "true"
		},
		"model_family": {
			"string_value": "riva"
		},
		"unk_token": {
			"string_value": "[UNK]"
		},
		"vocab": {
			"string_value": "/data/models/riva-punctuation-en-US/1/tokenizer.vocab_file"
		},
		"bos_token": {
			"string_value": "[CLS]"
		},
		"capit_logits_tensor_name": {
			"string_value": "capit_token_logits"
		},
		"punctuation_mapping_path": {
			"string_value": "/data/models/riva-punctuation-en-US/1/punct_label_ids.csv"
		},
		"model_api": {
			"string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText"
		},
		"to_lower": {
			"string_value": "true"
		},
		"pipeline_type": {
			"string_value": "punctuation"
		},
		"capitalization_mapping_path": {
			"string_value": "/data/models/riva-punctuation-en-US/1/capit_label_ids.csv"
		},
		"eos_token": {
			"string_value": "[SEP]"
		},
		"load_model": {
			"string_value": "false"
		},
		"attn_mask_tensor_name": {
			"string_value": "input_mask"
		},
		"token_type_tensor_name": {
			"string_value": "segment_ids"
		},
		"punct_logits_tensor_name": {
			"string_value": "punct_token_logits"
		},
		"language_code": {
			"string_value": "en-US"
		},
		"tokenizer": {
			"string_value": "wordpiece"
		},
		"delimiter": {
			"string_value": " "
		},
		"input_ids_tensor_name": {
			"string_value": "input_ids"
		},
		"model_name": {
			"string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased"
		},
		"pad_chars_with_spaces": {
			"string_value": "False"
		},
		"remove_spaces": {
			"string_value": "False"
		}
	},
	"model_warmup": []
}
I0823 19:16:23.171017 105 pipeline_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I0823 19:16:23.181905 105 model_repository_manager.cc:1149] successfully loaded 'riva-punctuation-en-US' version 1
I0823 19:16:23.182585 105 tensorrt.cc:5145] TRITONBACKEND_Initialize: tensorrt
I0823 19:16:23.182692 105 tensorrt.cc:5155] Triton TRITONBACKEND API version: 1.8
I0823 19:16:23.182766 105 tensorrt.cc:5161] 'tensorrt' TRITONBACKEND API version: 1.8
I0823 19:16:23.183298 105 tensorrt.cc:5204] backend configuration:
{}
I0823 19:16:23.183410 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0 (device 0)
I0823 19:16:23.467402 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming (version 1)
I0823 19:16:23.467947 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-feature-extractor-streaming' version 1
I0823 19:16:23.468509 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 (GPU device 0)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:25.338805 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +417, GPU +0, now: CPU 2067, GPU 3606 (MiB)
I0823 19:16:25.670245 105 logging.cc:49] Loaded engine size: 278 MiB
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:25.961551 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 2633, GPU 3890 (MiB)
I0823 19:16:26.193929 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +125, GPU +58, now: CPU 2758, GPU 3948 (MiB)
I0823 19:16:26.197953 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +275, now: CPU 0, GPU 275 (MiB)
I0823 19:16:26.228579 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2202, GPU 3940 (MiB)
I0823 19:16:26.229435 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2202, GPU 3948 (MiB)
I0823 19:16:26.262124 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +533, now: CPU 0, GPU 808 (MiB)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:28.985120 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0823 19:16:28.985276 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline (version 1)
I0823 19:16:28.985982 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 (GPU device 0)
I0823 19:16:28.986571 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3080, GPU 5097 (MiB)
I0823 19:16:28.995817 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming' version 1
I0823 19:16:29.447082 105 logging.cc:49] Loaded engine size: 283 MiB
I0823 19:16:29.734565 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3656, GPU 5389 (MiB)
I0823 19:16:29.735582 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 3657, GPU 5397 (MiB)
I0823 19:16:29.738110 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +281, now: CPU 0, GPU 1089 (MiB)
I0823 19:16:29.764556 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3089, GPU 5389 (MiB)
I0823 19:16:29.765387 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3089, GPU 5397 (MiB)
I0823 19:16:29.772827 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +565, now: CPU 1, GPU 1654 (MiB)
I0823 19:16:29.773907 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0823 19:16:29.774026 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1)
I0823 19:16:29.774672 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0)
I0823 19:16:29.775148 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3090, GPU 6015 (MiB)
I0823 19:16:29.783753 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-offline-am-offline' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0823 19:16:30.030774 105 logging.cc:49] Loaded engine size: 208 MiB
I0823 19:16:30.171583 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3621, GPU 6359 (MiB)
I0823 19:16:30.172529 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3621, GPU 6367 (MiB)
I0823 19:16:30.176997 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +94, now: CPU 1, GPU 1748 (MiB)
I0823 19:16:30.197097 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3205, GPU 6359 (MiB)
I0823 19:16:30.197859 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3205, GPU 6367 (MiB)
I0823 19:16:30.286596 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +108, now: CPU 1, GPU 1856 (MiB)
I0823 19:16:30.286971 105 tensorrt.cc:1409] Created instance riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0823 19:16:30.287213 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1
I0823 19:16:30.287909 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline:1
I0823 19:16:30.388194 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming:1
I0823 19:16:30.488459 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline' version 1
I0823 19:16:30.488743 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming' version 1
I0823 19:16:30.488885 105 server.cc:522] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0823 19:16:30.489027 105 server.cc:549] 
+-------------------+-----------------------------------------------------------------------------+--------+
| Backend           | Path                                                                        | Config |
+-------------------+-----------------------------------------------------------------------------+--------+
| onnxruntime       | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so             | {}     |
| riva_asr_decoder  | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so   | {}     |
| tensorrt          | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so                   | {}     |
| riva_asr_vad      | /opt/tritonserver/backends/riva_asr_vad/libtriton_riva_asr_vad.so           | {}     |
| riva_asr_features | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so | {}     |
| riva_nlp_pipeline | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so | {}     |
+-------------------+-----------------------------------------------------------------------------+--------+

I0823 19:16:30.489243 105 server.cc:592] 
+-------------------------------------------------------------------------+---------+--------+
| Model                                                                   | Version | Status |
+-------------------------------------------------------------------------+---------+--------+
| citrinet-1024-en-US-asr-offline                                         | 1       | READY  |
| citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline                 | 1       | READY  |
| citrinet-1024-en-US-asr-offline-feature-extractor-offline               | 1       | READY  |
| citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline     | 1       | READY  |
| citrinet-1024-en-US-asr-streaming                                       | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming             | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-feature-extractor-streaming           | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming | 1       | READY  |
| riva-punctuation-en-US                                                  | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-offline-am-offline                     | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming                 | 1       | READY  |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased                    | 1       | READY  |
+-------------------------------------------------------------------------+---------+--------+

I0823 19:16:30.509922 105 metrics.cc:623] Collecting metrics for GPU 0: GRID A100D-8C
I0823 19:16:30.510725 105 tritonserver.cc:1932] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.19.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /data/models                                                                                                                                                                                 |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                                   |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0823 19:16:30.528944 105 grpc_server.cc:4375] Started GRPCInferenceService at 0.0.0.0:8001
I0823 19:16:30.530417 105 http_server.cc:3075] Started HTTPService at 0.0.0.0:8000
I0823 19:16:30.571541 105 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
  > Triton server is ready...
I0823 19:16:30.920017   269 riva_server.cc:118] Using Insecure Server Credentials
I0823 19:16:30.929585   269 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-offline for ASR
I0823 19:16:30.933429   269 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
I0823 19:16:30.954844   269 model_registry.cc:112] Successfully registered: riva-punctuation-en-US for NLP
I0823 19:16:31.025720   269 riva_server.cc:158] Riva Conversational AI Server listening on 0.0.0.0:50051
W0823 19:16:31.025817   269 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
W0823 19:16:31.511423 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0823 19:16:31.511568 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0823 19:16:31.511646 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0823 19:16:32.511844 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0823 19:16:32.512048 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0823 19:16:32.512122 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0823 19:16:33.513288 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0823 19:16:33.513508 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0823 19:16:33.513590 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
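
For reference, the log above shows four listening endpoints: Triton on ports 8000, 8001 and 8002, and the Riva server on 0.0.0.0:50051. A minimal Python sketch for pulling those endpoints out of saved `docker logs riva-speech` output follows. The function name `find_endpoints` and the regular expression are illustrative assumptions, written only against the two line shapes seen in this log ("Started ... at" and "... listening on"); they are not part of Riva or Triton.

```python
import re

# Matches the two "listening" line shapes seen in the log above:
#   "... ] Started GRPCInferenceService at 0.0.0.0:8001"
#   "... ] Riva Conversational AI Server listening on 0.0.0.0:50051"
ENDPOINT_RE = re.compile(
    r"\] (?:Started (.+?) at|(.+?) listening on) (\d+\.\d+\.\d+\.\d+):(\d+)"
)


def find_endpoints(log_text):
    """Return {service name: (address, port)} for every listening line found."""
    endpoints = {}
    for line in log_text.splitlines():
        match = ENDPOINT_RE.search(line)
        if match:
            name = match.group(1) or match.group(2)
            endpoints[name] = (match.group(3), int(match.group(4)))
    return endpoints


if __name__ == "__main__":
    import sys

    # Usage (assumed): docker logs riva-speech 2>&1 | python find_endpoints.py
    for service, (address, port) in find_endpoints(sys.stdin.read()).items():
        print(f"{service}: {address}:{port}")
```

The port reported for the Riva server is the one the client has to reach, so it is worth comparing it against the address the client is actually using.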

Looking forward to any further guidance.

Thank you again.

NSDB · August 24, 2022, 10:56pm

P.S. Once the server has started (see the log above), I run the following example:

# bash riva_start_client.sh
# riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

… the response is as follows:

#:~/riva_quickstart_v2-3-0# bash riva_start_client.sh

> Image nvcr.io/nvidia/riva/riva-speech-client:2.3.0 exists. Skipping pull.
>
>#::/work# riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav
>
>I0824 22:44:52.200430 10 riva_asr_client.cc:434] Using Insecure Server Credentials
>
>Error creating GRPC channel: Unable to establish connection to server. Current state: 3
>
>Exiting.
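
A note on reading that error: in gRPC's connectivity-state enum, value 3 is TRANSIENT_FAILURE, meaning the channel tried to connect and failed. The sketch below (Python, standard library only) decodes the state number and checks whether a TCP port accepts connections from wherever the client runs. The helper names `describe_state` and `port_is_open` are hypothetical, and `localhost:50051` is an assumption based on the port in the server log, not a confirmed client default.

```python
import socket

# gRPC core connectivity states, in enum order.
GRPC_STATES = {
    0: "IDLE",
    1: "CONNECTING",
    2: "READY",
    3: "TRANSIENT_FAILURE",
    4: "SHUTDOWN",
}


def describe_state(code):
    """Translate the number after 'Current state:' into its gRPC state name."""
    return GRPC_STATES.get(code, "UNKNOWN")


def port_is_open(host, port, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("state 3 =", describe_state(3))
    # Assumed address: the server log reports it is listening on port 50051.
    print("localhost:50051 reachable:", port_is_open("localhost", 50051))
```

Running this inside the client container and again on the host should show whether the failure is a network reachability problem between the two or something on the server side.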

…at which point the logs are as follows:

:~/riva_quickstart_v2-3-0#  docker logs riva-speech

==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release 22.06 (build 40051835)
Riva Speech Server Version 2.3.0

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for Riva Speech Server.  NVIDIA recommends the use of the following flags:
   docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:35.653505 105 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0824 20:43:35.653694 105 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0824 20:43:35.653799 105 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0824 20:43:35.653877 105 onnxruntime.cc:2365] backend configuration:
{}
I0824 20:43:36.116755 105 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x10020000000' with size 268435456
I0824 20:43:36.117490 105 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0824 20:43:36.125940 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline:1
I0824 20:43:36.226316 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-feature-extractor-offline:1
I0824 20:43:36.326576 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline:1
I0824 20:43:36.330282 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0824 20:43:36.332993   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0824 20:43:36.333122   111 parameter_parser.cc:121] Default value will be used
W0824 20:43:36.333246   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0824 20:43:36.333292   111 parameter_parser.cc:121] Default value will be used
W0824 20:43:36.333326   111 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0824 20:43:36.333382   111 parameter_parser.cc:121] Default value will be used
I0824 20:43:36.334349 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",
    "platform": "",
    "backend": "riva_asr_decoder",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 128,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "END_FLAG",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "CUSTOM_CONFIGURATION",
            "data_type": "TYPE_STRING",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "FINAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_TRANSCRIPTS_SCORE",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS_STABILITY",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 128,
            "preferred_batch_size": [
                32,
                64
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "decoder_type": {
            "string_value": "flashlight"
        },
        "forerunner_beam_size": {
            "string_value": "8"
        },
        "max_supported_transcripts": {
            "string_value": "1"
        },
        "chunk_size": {
            "string_value": "300.0"
        },
        "lexicon_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/lexicon.txt"
        },
        "smearing_mode": {
            "string_value": "max"
        },
        "use_vad": {
            "string_value": "True"
        },
        "blank_token": {
            "string_value": "#"
        },
        "lm_weight": {
            "string_value": "0.2"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/riva_decoder_vocabulary.txt"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "use_subword": {
            "string_value": "True"
        },
        "streaming": {
            "string_value": "False"
        },
        "beam_size": {
            "string_value": "16"
        },
        "right_padding_size": {
            "string_value": "0.0"
        },
        "beam_size_token": {
            "string_value": "16"
        },
        "sil_token": {
            "string_value": "▁"
        },
        "num_tokenization": {
            "string_value": "1"
        },
        "beam_threshold": {
            "string_value": "20.0"
        },
        "language_model_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
        },
        "tokenizer_model": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
        },
        "max_execution_batch_size": {
            "string_value": "1024"
        },
        "forerunner_use_lm": {
            "string_value": "true"
        },
        "forerunner_beam_size_token": {
            "string_value": "8"
        },
        "forerunner_beam_threshold": {
            "string_value": "10.0"
        },
        "decoder_num_worker_threads": {
            "string_value": "-1"
        },
        "asr_model_delay": {
            "string_value": "-1"
        },
        "word_insertion_score": {
            "string_value": "0.2"
        },
        "left_padding_size": {
            "string_value": "0.0"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0824 20:43:36.337936 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0 (device 0)
I0824 20:43:36.426915 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I0824 20:43:36.527275 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming:1
I0824 20:43:36.627636 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming:1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:36.728011 105 model_repository_manager.cc:994] loading: riva-punctuation-en-US:1
I0824 20:43:36.828379 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-offline-am-offline:1
I0824 20:43:36.928752 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming:1
I0824 20:43:37.029122 105 model_repository_manager.cc:994] loading: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1
I0824 20:43:37.267638   111 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0824 20:43:37.267820 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0824 20:43:37.269563   112 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0824 20:43:37.269744   112 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.269779   112 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0824 20:43:37.269932   112 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.269964   112 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0824 20:43:37.270023   112 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.270084   112 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0824 20:43:37.270128   112 parameter_parser.cc:121] Default value will be used
I0824 20:43:37.269880 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline' version 1
I0824 20:43:37.386834 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline",
    "platform": "",
    "backend": "riva_asr_features",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1,
    "input": [
        {
            "name": "AUDIO_SIGNAL",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SAMPLE_RATE",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "AUDIO_FEATURES",
            "data_type": "TYPE_FP32",
            "dims": [
                80,
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "AUDIO_PROCESSED",
            "data_type": "TYPE_FP32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 1,
            "preferred_batch_size": [
                1
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "streaming": {
            "string_value": "False"
        },
        "stddev_floor": {
            "string_value": "1e-05"
        },
        "transpose": {
            "string_value": "False"
        },
        "left_padding_size": {
            "string_value": "0.0"
        },
        "right_padding_size": {
            "string_value": "0.0"
        },
        "gain": {
            "string_value": "1.0"
        },
        "precalc_norm_time_steps": {
            "string_value": "0"
        },
        "use_utterance_norm_params": {
            "string_value": "False"
        },
        "precalc_norm_params": {
            "string_value": "False"
        },
        "dither": {
            "string_value": "0.0"
        },
        "norm_per_feature": {
            "string_value": "True"
        },
        "mean": {
            "string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
        },
        "stddev": {
            "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
        },
        "chunk_size": {
            "string_value": "300.0"
        },
        "max_execution_batch_size": {
            "string_value": "1"
        },
        "sample_rate": {
            "string_value": "16000"
        },
        "num_features": {
            "string_value": "80"
        },
        "window_size": {
            "string_value": "0.025"
        },
        "window_stride": {
            "string_value": "0.01"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0824 20:43:37.389216 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0824 20:43:37.389927   113 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0824 20:43:37.390015   113 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.390098   113 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0824 20:43:37.390144   113 parameter_parser.cc:121] Default value will be used
I0824 20:43:37.390689 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline",
    "platform": "",
    "backend": "riva_asr_vad",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 2048,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "residue_blanks_at_end": {
            "string_value": "0"
        },
        "vad_stop_history": {
            "string_value": "800"
        },
        "vad_start_history": {
            "string_value": "300"
        },
        "chunk_size": {
            "string_value": "300.0"
        },
        "vad_start_th": {
            "string_value": "0.2"
        },
        "vad_stop_th": {
            "string_value": "0.98"
        },
        "vad_type": {
            "string_value": "ctc-vad"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline/1/riva_decoder_vocabulary.txt"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "residue_blanks_at_start": {
            "string_value": "0"
        },
        "streaming": {
            "string_value": "False"
        },
        "use_subword": {
            "string_value": "True"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0824 20:43:37.391235 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming (version 1)
W0824 20:43:37.391907   117 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0824 20:43:37.391966   117 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.392016   117 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0824 20:43:37.392068   117 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.392100   117 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0824 20:43:37.392153   117 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.392189   117 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0824 20:43:37.392241   117 parameter_parser.cc:121] Default value will be used
I0824 20:43:37.392430 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",
    "platform": "",
    "backend": "riva_asr_features",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1024,
    "input": [
        {
            "name": "AUDIO_SIGNAL",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SAMPLE_RATE",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "AUDIO_FEATURES",
            "data_type": "TYPE_FP32",
            "dims": [
                80,
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "AUDIO_PROCESSED",
            "data_type": "TYPE_FP32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 1024,
            "preferred_batch_size": [
                256,
                512
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "max_execution_batch_size": {
            "string_value": "1024"
        },
        "sample_rate": {
            "string_value": "16000"
        },
        "window_stride": {
            "string_value": "0.01"
        },
        "window_size": {
            "string_value": "0.025"
        },
        "num_features": {
            "string_value": "80"
        },
        "streaming": {
            "string_value": "True"
        },
        "transpose": {
            "string_value": "False"
        },
        "stddev_floor": {
            "string_value": "1e-05"
        },
        "left_padding_size": {
            "string_value": "1.92"
        },
        "right_padding_size": {
            "string_value": "1.92"
        },
        "gain": {
            "string_value": "1.0"
        },
        "precalc_norm_time_steps": {
            "string_value": "0"
        },
        "use_utterance_norm_params": {
            "string_value": "False"
        },
        "dither": {
            "string_value": "1e-05"
        },
        "precalc_norm_params": {
            "string_value": "False"
        },
        "norm_per_feature": {
            "string_value": "True"
        },
        "mean": {
            "string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
        },
        "stddev": {
            "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
        },
        "chunk_size": {
            "string_value": "0.16"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0824 20:43:37.393238 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming (version 1)
W0824 20:43:37.393733   120 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0824 20:43:37.393787   120 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.393852   120 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0824 20:43:37.393901   120 parameter_parser.cc:121] Default value will be used
I0824 20:43:37.394317 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming",
    "platform": "",
    "backend": "riva_asr_vad",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 2048,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "use_subword": {
            "string_value": "True"
        },
        "streaming": {
            "string_value": "True"
        },
        "residue_blanks_at_end": {
            "string_value": "0"
        },
        "vad_stop_history": {
            "string_value": "800"
        },
        "vad_start_history": {
            "string_value": "300"
        },
        "chunk_size": {
            "string_value": "0.16"
        },
        "vad_stop_th": {
            "string_value": "0.98"
        },
        "vad_start_th": {
            "string_value": "0.2"
        },
        "vad_type": {
            "string_value": "ctc-vad"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming/1/riva_decoder_vocabulary.txt"
        },
        "residue_blanks_at_start": {
            "string_value": "-2"
        },
        "ms_per_timestep": {
            "string_value": "80"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0824 20:43:37.418230 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1)
W0824 20:43:37.418959   115 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0824 20:43:37.419021   115 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.419170   115 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0824 20:43:37.419217   115 parameter_parser.cc:121] Default value will be used
W0824 20:43:37.419251   115 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0824 20:43:37.419306   115 parameter_parser.cc:121] Default value will be used
I0824 20:43:37.419782 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",
    "platform": "",
    "backend": "riva_asr_decoder",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1024,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "END_FLAG",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "CUSTOM_CONFIGURATION",
            "data_type": "TYPE_STRING",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "FINAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_TRANSCRIPTS_SCORE",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS_STABILITY",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 1024,
            "preferred_batch_size": [
                32,
                64
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "lexicon_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt"
        },
        "smearing_mode": {
            "string_value": "max"
        },
        "use_vad": {
            "string_value": "True"
        },
        "lm_weight": {
            "string_value": "0.2"
        },
        "blank_token": {
            "string_value": "#"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "streaming": {
            "string_value": "True"
        },
        "use_subword": {
            "string_value": "True"
        },
        "beam_size": {
            "string_value": "16"
        },
        "right_padding_size": {
            "string_value": "1.92"
        },
        "beam_size_token": {
            "string_value": "16"
        },
        "sil_token": {
            "string_value": "▁"
        },
        "num_tokenization": {
            "string_value": "1"
        },
        "beam_threshold": {
            "string_value": "20.0"
        },
        "tokenizer_model": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
        },
        "language_model_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
        },
        "max_execution_batch_size": {
            "string_value": "1024"
        },
        "forerunner_use_lm": {
            "string_value": "true"
        },
        "forerunner_beam_size_token": {
            "string_value": "8"
        },
        "forerunner_beam_threshold": {
            "string_value": "10.0"
        },
        "decoder_num_worker_threads": {
            "string_value": "-1"
        },
        "asr_model_delay": {
            "string_value": "-1"
        },
        "word_insertion_score": {
            "string_value": "0.2"
        },
        "left_padding_size": {
            "string_value": "1.92"
        },
        "decoder_type": {
            "string_value": "flashlight"
        },
        "forerunner_beam_size": {
            "string_value": "8"
        },
        "chunk_size": {
            "string_value": "0.16"
        },
        "max_supported_transcripts": {
            "string_value": "1"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0824 20:43:37.420774 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline_0 (device 0)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:40.333774 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0 (device 0)
I0824 20:43:40.339803 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-feature-extractor-offline' version 1
I0824 20:43:40.352678 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0 (device 0)
I0824 20:43:40.364144 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-feature-extractor-streaming' version 1
I0824 20:43:40.468771 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0 (device 0)
I0824 20:43:40.476321 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming' version 1
I0824 20:43:40.585404 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0)
I0824 20:43:40.592618 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:41.459676   115 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0824 20:43:41.460318 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I0824 20:43:41.518741 105 pipeline_library.cc:19] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0824 20:43:41.519357   121 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0824 20:43:41.519454   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.519512   121 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0824 20:43:41.519543   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.519609   121 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0824 20:43:41.519645   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.519690   121 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0824 20:43:41.519729   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.519809   121 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0824 20:43:41.519850   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.519884   121 parameter_parser.cc:120] Parameter doc_stride could not be set from parameters
W0824 20:43:41.519932   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.519975   121 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0824 20:43:41.520018   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520066   121 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0824 20:43:41.520097   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520144   121 parameter_parser.cc:120] Parameter margin could not be set from parameters
W0824 20:43:41.520179   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520221   121 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0824 20:43:41.520260   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520290   121 parameter_parser.cc:120] Parameter max_query_length could not be set from parameters
W0824 20:43:41.520337   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520370   121 parameter_parser.cc:120] Parameter max_seq_length could not be set from parameters
W0824 20:43:41.520418   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520460   121 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0824 20:43:41.520491   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520538   121 parameter_parser.cc:120] Parameter step could not be set from parameters
W0824 20:43:41.520572   121 parameter_parser.cc:121] Default value will be used
W0824 20:43:41.520617   121 parameter_parser.cc:120] Parameter task could not be set from parameters
W0824 20:43:41.520654   121 parameter_parser.cc:121] Default value will be used
I0824 20:43:41.520752 105 backend_model.cc:255] model configuration:
{
    "name": "riva-punctuation-en-US",
    "platform": "",
    "backend": "riva_nlp_pipeline",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 8,
    "input": [
        {
            "name": "PIPELINE_INPUT",
            "data_type": "TYPE_STRING",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "PIPELINE_OUTPUT",
            "data_type": "TYPE_STRING",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "instance_group": [
        {
            "name": "riva-punctuation-en-US_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "remove_spaces": {
            "string_value": "False"
        },
        "tokenizer_to_lower": {
            "string_value": "true"
        },
        "unk_token": {
            "string_value": "[UNK]"
        },
        "model_family": {
            "string_value": "riva"
        },
        "vocab": {
            "string_value": "/data/models/riva-punctuation-en-US/1/tokenizer.vocab_file"
        },
        "capit_logits_tensor_name": {
            "string_value": "capit_token_logits"
        },
        "bos_token": {
            "string_value": "[CLS]"
        },
        "punctuation_mapping_path": {
            "string_value": "/data/models/riva-punctuation-en-US/1/punct_label_ids.csv"
        },
        "model_api": {
            "string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText"
        },
        "to_lower": {
            "string_value": "true"
        },
        "pipeline_type": {
            "string_value": "punctuation"
        },
        "capitalization_mapping_path": {
            "string_value": "/data/models/riva-punctuation-en-US/1/capit_label_ids.csv"
        },
        "eos_token": {
            "string_value": "[SEP]"
        },
        "load_model": {
            "string_value": "false"
        },
        "attn_mask_tensor_name": {
            "string_value": "input_mask"
        },
        "token_type_tensor_name": {
            "string_value": "segment_ids"
        },
        "punct_logits_tensor_name": {
            "string_value": "punct_token_logits"
        },
        "language_code": {
            "string_value": "en-US"
        },
        "tokenizer": {
            "string_value": "wordpiece"
        },
        "delimiter": {
            "string_value": " "
        },
        "input_ids_tensor_name": {
            "string_value": "input_ids"
        },
        "model_name": {
            "string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased"
        },
        "pad_chars_with_spaces": {
            "string_value": "False"
        }
    },
    "model_warmup": []
}
I0824 20:43:41.521419 105 pipeline_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I0824 20:43:41.531929 105 model_repository_manager.cc:1149] successfully loaded 'riva-punctuation-en-US' version 1
I0824 20:43:41.532389 105 tensorrt.cc:5145] TRITONBACKEND_Initialize: tensorrt
I0824 20:43:41.532470 105 tensorrt.cc:5155] Triton TRITONBACKEND API version: 1.8
I0824 20:43:41.532553 105 tensorrt.cc:5161] 'tensorrt' TRITONBACKEND API version: 1.8
I0824 20:43:41.533089 105 tensorrt.cc:5204] backend configuration:
{}
I0824 20:43:41.533247 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming (version 1)
I0824 20:43:41.533987 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 (GPU device 0)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:43.039787 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +417, GPU +0, now: CPU 2052, GPU 3599 (MiB)
I0824 20:43:43.365179 105 logging.cc:49] Loaded engine size: 278 MiB
I0824 20:43:43.592837 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2619, GPU 3885 (MiB)
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:43.859839 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +126, GPU +58, now: CPU 2745, GPU 3943 (MiB)
I0824 20:43:43.862610 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +276, now: CPU 0, GPU 276 (MiB)
I0824 20:43:43.894337 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2187, GPU 3935 (MiB)
I0824 20:43:43.895219 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2187, GPU 3943 (MiB)
I0824 20:43:44.003910 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +532, now: CPU 0, GPU 808 (MiB)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:46.173660 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0824 20:43:46.173863 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline (version 1)
I0824 20:43:46.174502 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 (GPU device 0)
I0824 20:43:46.175065 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3066, GPU 5090 (MiB)
I0824 20:43:46.179752 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming' version 1
I0824 20:43:46.700815 105 logging.cc:49] Loaded engine size: 283 MiB
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:46.924879 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 3642, GPU 5384 (MiB)
I0824 20:43:46.925878 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3642, GPU 5392 (MiB)
I0824 20:43:46.927453 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +282, now: CPU 0, GPU 1090 (MiB)
I0824 20:43:46.952941 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3074, GPU 5384 (MiB)
I0824 20:43:46.953826 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3074, GPU 5392 (MiB)
I0824 20:43:46.960739 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +564, now: CPU 1, GPU 1654 (MiB)
I0824 20:43:46.961876 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0824 20:43:46.961995 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1)
I0824 20:43:46.962606 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0)
I0824 20:43:46.963082 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3076, GPU 6010 (MiB)
I0824 20:43:46.971646 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-offline-am-offline' version 1
I0824 20:43:47.204674 105 logging.cc:49] Loaded engine size: 208 MiB
I0824 20:43:47.544107 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3621, GPU 6360 (MiB)
I0824 20:43:47.545068 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3621, GPU 6368 (MiB)
I0824 20:43:47.545928 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +95, now: CPU 1, GPU 1749 (MiB)
I0824 20:43:47.565143 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3205, GPU 6360 (MiB)
I0824 20:43:47.565913 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3205, GPU 6368 (MiB)
I0824 20:43:47.649768 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +107, now: CPU 1, GPU 1856 (MiB)
I0824 20:43:47.650121 105 tensorrt.cc:1409] Created instance riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0824 20:43:47.650376 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1
I0824 20:43:47.651060 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline:1
I0824 20:43:47.751354 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming:1
I0824 20:43:47.851532 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline' version 1
I0824 20:43:47.851705 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming' version 1
I0824 20:43:47.851754 105 server.cc:522] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0824 20:43:47.851794 105 server.cc:549] 
+-------------------+-----------------------------------------------------------------------------+--------+
| Backend           | Path                                                                        | Config |
+-------------------+-----------------------------------------------------------------------------+--------+
| onnxruntime       | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so             | {}     |
| riva_asr_decoder  | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so   | {}     |
| tensorrt          | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so                   | {}     |
| riva_asr_vad      | /opt/tritonserver/backends/riva_asr_vad/libtriton_riva_asr_vad.so           | {}     |
| riva_asr_features | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so | {}     |
| riva_nlp_pipeline | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so | {}     |
+-------------------+-----------------------------------------------------------------------------+--------+

I0824 20:43:47.851852 105 server.cc:592] 
+-------------------------------------------------------------------------+---------+--------+
| Model                                                                   | Version | Status |
+-------------------------------------------------------------------------+---------+--------+
| citrinet-1024-en-US-asr-offline                                         | 1       | READY  |
| citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline                 | 1       | READY  |
| citrinet-1024-en-US-asr-offline-feature-extractor-offline               | 1       | READY  |
| citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline     | 1       | READY  |
| citrinet-1024-en-US-asr-streaming                                       | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming             | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-feature-extractor-streaming           | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming | 1       | READY  |
| riva-punctuation-en-US                                                  | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-offline-am-offline                     | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming                 | 1       | READY  |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased                    | 1       | READY  |
+-------------------------------------------------------------------------+---------+--------+

  > Riva waiting for Triton server to load all models...retrying in 1 second
I0824 20:43:47.878017 105 metrics.cc:623] Collecting metrics for GPU 0: GRID A100D-8C
I0824 20:43:47.878825 105 tritonserver.cc:1932] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.19.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /data/models                                                                                                                                                                                 |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                                   |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0824 20:43:47.885654 105 grpc_server.cc:4375] Started GRPCInferenceService at 0.0.0.0:8001
I0824 20:43:47.886959 105 http_server.cc:3075] Started HTTPService at 0.0.0.0:8000
I0824 20:43:47.928470 105 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0824 20:43:48.879994 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0824 20:43:48.880129 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0824 20:43:48.880204 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
  > Triton server is ready...
I0824 20:43:48.911239   265 riva_server.cc:115] Using SSL Credentials
I0824 20:43:48.920466   265 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-offline for ASR
I0824 20:43:48.924527   265 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
I0824 20:43:48.945958   265 model_registry.cc:112] Successfully registered: riva-punctuation-en-US for NLP
I0824 20:43:49.022421   265 riva_server.cc:158] Riva Conversational AI Server listening on 0.0.0.0:50051
W0824 20:43:49.022506   265 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
W0824 20:43:49.880385 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0824 20:43:49.880554 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0824 20:43:49.880638 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0824 20:43:50.881815 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0824 20:43:50.882095 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0824 20:43:50.882224 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
E0824 20:56:08.798478457     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:08.799455437     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:09.794694026     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:09.798609973     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:11.315879173     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:11.638850783     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:14.097094392     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:14.323821403     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:18.362038304     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 20:56:18.382665533     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:52.202366841     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:52.203141377     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:53.201508354     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:53.203266876     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:54.626532527     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:54.691309101     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:57.297528210     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:44:57.335285301     292 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:45:01.886540017     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.
E0824 22:45:01.887226251     272 ssl_transport_security.cc:1246] Handshake failed with fatal error SSL_ERROR_SSL: error:1408F10B:SSL routines:ssl3_get_record:wrong version number.

Thank you again for your time and attention.

rvinobha · August 29, 2022, 5:44pm

Hi @NSDB

Thanks for sharing the requested logs.

The last log shows SSL-related errors. Does this still happen after removing the SSL configuration, running bash riva_clean.sh, and doing a fresh install?

The last logs you shared still contain the line I0824 20:43:48.911239 265 riva_server.cc:115] Using SSL Credentials, which suggests SSL is still enabled on the server.
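
One way to confirm this from a saved server log is sketched below. The OpenSSL "wrong version number" handshake error commonly appears when a client sends plaintext to a listener that expects TLS, so the sketch simply checks whether the log contains both the "Using SSL Credentials" line and failed handshakes. The function name check_tls_mismatch and the log file argument are hypothetical, chosen only for illustration. This is a rough diagnostic aid, not an official Riva tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize TLS-related lines in a saved Riva server log.
# Usage: check_tls_mismatch <logfile>
check_tls_mismatch() {
    log="$1"
    ssl_on=no

    # Logged by riva_server.cc when the server starts with SSL credentials.
    if grep -q "Using SSL Credentials" "$log"; then
        ssl_on=yes
    fi

    # Count failed TLS handshakes; grep -c prints 0 (and exits non-zero) on no match.
    failures=$(grep -c "Handshake failed with fatal error SSL_ERROR_SSL" "$log" || true)

    echo "server_ssl=${ssl_on} handshake_failures=${failures}"

    # Both markers together point at a plaintext client talking to a TLS server.
    if [ "$ssl_on" = yes ] && [ "$failures" -gt 0 ]; then
        echo "likely cause: a client connecting without TLS to a TLS-enabled server"
    fi
}
```

To use it, save the server output to a file first (for example with docker logs riva-speech > riva.log 2>&1, assuming the container has the quickstart's default name) and run check_tls_mismatch riva.log. If it reports server_ssl=yes with a non-zero failure count, either the server needs to be restarted without SSL or the client needs to connect with SSL enabled.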

Thanks

NSDB · August 29, 2022, 7:21pm

PART 1/2:

Hi rvinobha, thank you for the follow-up.

Confirming the config:

# Copyright (c) 2022, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# Architecture of target platform. Supported architectures: amd64, arm64
riva_target_arch="amd64"

# Legacy arm64 platform to be enabled. Supported legacy platforms: xavier
riva_arm64_legacy_platform=""

# Enable or Disable Riva Services
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false

# Enable Riva Enterprise
# If enrolled in Enterprise, enable Riva Enterprise by setting configuration
# here. You must explicitly acknowledge you have read and agree to the EULA.
# RIVA_API_KEY=<ngc api key>
# RIVA_API_NGC_ORG=<ngc organization>
# RIVA_EULA=accept

# Language code to fetch models of a specify language
# Currently only ASR supports languages other than English
# Supported language codes: en-US, de-DE, es-US, ru-RU, zh-CN, hi-IN
# for any language other than English, set service_enabled_nlp and service_enabled_tts to False
# for multiple languages enter space separated language codes.
language_code=("en-US")

# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"

# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode"

# Locations to use for storing models artifacts
#
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# riva_init.sh will create a `rmir` and `models` directory in the volume or
# path specified.
#
# RMIR ($riva_model_loc/rmir)
# Riva uses an intermediate representation (RMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $riva_model_loc/rmir by `riva_init.sh`
#
# Custom models produced by NeMo or TLT and prepared using riva-build
# may also be copied manually to this location $(riva_model_loc/rmir).
#
# Models ($riva_model_loc/models)
# During the riva_init process, the RMIR files in $riva_model_loc/rmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $riva_model_loc/models. The riva server exclusively uses these
# optimized versions.
riva_model_loc="riva-model-repo"

if [[ $riva_target_arch == "arm64" ]]; then
    riva_model_loc="`pwd`/model_repository"
fi

# The default RMIRs are downloaded from NGC by default in the above $riva_rmir_loc directory
# If you'd like to skip the download from NGC and use the existing RMIRs in the $riva_rmir_loc
# then set the below $use_existing_rmirs flag to true. You can also deploy your set of custom
# RMIRs by keeping them in the riva_rmir_loc dir and use this quickstart script with the
# below flag to deploy them all together.
use_existing_rmirs=false

# Ports to expose for Riva services
riva_speech_api_port="50051"

# NGC orgs
riva_ngc_org="nvidia"
riva_ngc_team="riva"
riva_ngc_image_version="2.3.0"
riva_ngc_model_version="2.3.0"

# Pre-built models listed below will be downloaded from NGC. If models already exist in $riva-rmir
# then models can be commented out to skip download from NGC

########## ASR MODELS ##########

models_asr=()

### Citrinet-1024 models
for lang_code in ${language_code[@]}; do
    modified_lang_code="${lang_code/-/_}"
    modified_lang_code=${modified_lang_code,,}
    if [[ $riva_target_arch == "arm64" ]]; then
      models_asr+=(
      ### Citrinet-1024 Streaming w/ CPU decoder, best latency configuration
          "${riva_ngc_org}/${riva_ngc_team}/models_asr_citrinet_1024_${modified_lang_code}_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
      )
    else
      models_asr+=(
      ### Citrinet-1024 Streaming w/ CPU decoder, best latency configuration
          "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str:${riva_ngc_model_version}"

      ### Citrinet-1024 Streaming w/ CPU decoder, best throughput configuration
      #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_str_thr:${riva_ngc_model_version}"

      ### Citrinet-1024 Offline w/ CPU decoder,
          "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_citrinet_1024_${modified_lang_code}_ofl:${riva_ngc_model_version}"
      )
    fi

    ### Punctuation model
    if [[ "${lang_code}"  == "en-US" || "${lang_code}" == "de-DE" || "${lang_code}" == "es-US" || "${lang_code}" == "zh-CN" ]]; then
      if [[ $riva_target_arch == "arm64" ]]; then
        models_asr+=(
            "${riva_ngc_org}/${riva_ngc_team}/models_nlp_punctuation_bert_base_${modified_lang_code}:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
        )
      else
        models_asr+=(
            "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_punctuation_bert_base_${modified_lang_code}:${riva_ngc_model_version}"
        )
      fi
    fi

done

#Other ASR models
if [[ $riva_target_arch == "arm64" ]]; then
  models_asr+=(
  ### Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_en_us_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### German Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_de_de_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Spanish Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_es_us_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Hindi Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_hi_in_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Russian Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_conformer_ru_ru_str:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### Citrinet-256 Streaming w/ CPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/models_asr_citrinet_256_en_us_streaming:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  )
else
  models_asr+=(
  ### Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_str:${riva_ngc_model_version}"

  ### Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_str_thr:${riva_ngc_model_version}"

  ### Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_en_us_ofl:${riva_ngc_model_version}"

  ### German Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_str:${riva_ngc_model_version}"

  ### German Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_str_thr:${riva_ngc_model_version}"

  ### German Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_de_de_ofl:${riva_ngc_model_version}"

  ### Spanish Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_str:${riva_ngc_model_version}"

  ### Spanish Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_str_thr:${riva_ngc_model_version}"

  ### Spanish Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_es_us_ofl:${riva_ngc_model_version}"

  ### Hindi Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_hi_in_str:${riva_ngc_model_version}"

  ### Hindi Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_hi_in_str_thr:${riva_ngc_model_version}"

  ### Hindi Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_hi_in_ofl:${riva_ngc_model_version}"
  
  ### Russian Conformer acoustic model, CPU decoder, streaming best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_ru_ru_str:${riva_ngc_model_version}"

  ### Russian Conformer acoustic model, CPU decoder, streaming best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_ru_ru_str_thr:${riva_ngc_model_version}"

  ### Russian Conformer acoustic model, CPU decoder, offline configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_conformer_ru_ru_ofl:${riva_ngc_model_version}"

  ### Jasper Streaming w/ CPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str:${riva_ngc_model_version}"

  ### Jasper Streaming w/ CPU decoder, best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_thr:${riva_ngc_model_version}"

  ###  Jasper Offline w/ CPU decoder
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_ofl:${riva_ngc_model_version}"

  ### QuartzNet Streaming w/ CPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_str:${riva_ngc_model_version}"

  ### QuartzNet Streaming w/ CPU decoder, best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_str_thr:${riva_ngc_model_version}"

  ### QuartzNet Offline w/ CPU decoder
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_quartznet_en_us_ofl:${riva_ngc_model_version}"

  ### Jasper Streaming w/ GPU decoder, best latency configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_gpu_decoder:${riva_ngc_model_version}"

  ### Jasper Streaming w/ GPU decoder, best throughput configuration
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_str_thr_gpu_decoder:${riva_ngc_model_version}"

  ### Jasper Offline w/ GPU decoder
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_asr_jasper_en_us_ofl_gpu_decoder:${riva_ngc_model_version}"
  )
fi

########## NLP MODELS ##########

if [[ $riva_target_arch == "arm64" ]]; then
  models_nlp=(
  ### BERT Base Intent Slot model for misty domain fine-tuned on weather, smalltalk/personality, poi/map datasets.
      "${riva_ngc_org}/${riva_ngc_team}/models_nlp_intent_slot_misty_bert_base:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"

  ### DistilBERT Intent Slot model for misty domain fine-tuned on weather, smalltalk/personality, poi/map datasets.
  #    "${riva_ngc_org}/${riva_ngc_team}/models_nlp_intent_slot_misty_distilbert:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  )
else
  models_nlp=(
  ### Bert base Punctuation model
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_punctuation_bert_base_en_us:${riva_ngc_model_version}"

  ### BERT base Named Entity Recognition model fine-tuned on GMB dataset with class labels LOC, PER, ORG etc.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_named_entity_recognition_bert_base:${riva_ngc_model_version}"

  ### BERT Base Intent Slot model fine-tuned on weather dataset.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_intent_slot_bert_base:${riva_ngc_model_version}"

  ### BERT Base Question Answering model fine-tuned on Squad v2.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_bert_base:${riva_ngc_model_version}"

  ### Megatron345M Question Answering model fine-tuned on Squad v2.
  #    "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_megatron:${riva_ngc_model_version}"

  ### Bert base Text Classification model fine-tuned on 4class (weather, meteorology, personality, nomatch) domain model.
      "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_text_classification_bert_base:${riva_ngc_model_version}"
  )
fi

########## TTS MODELS ##########

if [[ $riva_target_arch == "arm64" ]]; then
  models_tts=(
     "${riva_ngc_org}/${riva_ngc_team}/models_tts_fastpitch_hifigan_en_us_female_1:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  #   "${riva_ngc_org}/${riva_ngc_team}/models_tts_fastpitch_hifigan_en_us_male_1:${riva_ngc_model_version}-${riva_target_arch}${riva_arm64_legacy_platform}"
  )
else
  models_tts=(
     "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_female_1:${riva_ngc_model_version}"
  #   "${riva_ngc_org}/${riva_ngc_team}/rmir_tts_fastpitch_hifigan_en_us_male_1:${riva_ngc_model_version}"
  )
fi

NGC_TARGET=${riva_ngc_org}
if [[ ! -z ${riva_ngc_team} ]]; then
  NGC_TARGET="${NGC_TARGET}/${riva_ngc_team}"
else
  team="\"\""
fi

# Specify paths to SSL Key and Certificate files to use TLS/SSL Credentials for a secured connection.
# If either is empty, an insecure connection will be used.
# Stored within container at /ssl/server.crt and /ssl/server.key
# Optional, one can also specify a root certificate, stored within container at /ssl/root_server.crt
ssl_server_cert=""
ssl_server_key=""
ssl_root_cert=""

# define docker images required to run Riva
image_client="nvcr.io/${NGC_TARGET}/riva-speech-client:${riva_ngc_image_version}"
image_speech_api="nvcr.io/${NGC_TARGET}/riva-speech:${riva_ngc_image_version}-server"

# define docker images required to setup Riva
image_init_speech="nvcr.io/${NGC_TARGET}/riva-speech:${riva_ngc_image_version}-servicemaker"

# daemon names
riva_daemon_speech="riva-speech"
if [[ $riva_target_arch != "arm64" ]]; then
    riva_daemon_client="riva-client"
fi
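
For reference, this is what the Citrinet loop in the config above resolves to on amd64 for a single language code. It is a minimal sketch, assuming `language_code` contains only `en-US`; `asr_model_path` is a hypothetical helper name and is not part of config.sh.

```bash
# Hypothetical helper (not part of config.sh): rebuild the amd64 model path
# that the Citrinet loop appends to models_asr for one language code.
riva_ngc_org="nvidia"
riva_ngc_team="riva"
riva_ngc_model_version="2.3.0"

asr_model_path() {
  lang_code=$1   # e.g. en-US
  suffix=$2      # str (streaming) or ofl (offline)
  # config.sh uses the bash expansions ${lang_code/-/_} and ${var,,};
  # sed (first hyphen only) and tr give the same result in plain sh.
  modified=$(printf '%s' "$lang_code" | sed 's/-/_/' | tr '[:upper:]' '[:lower:]')
  printf '%s/%s/rmir_asr_citrinet_1024_%s_%s:%s\n' \
    "$riva_ngc_org" "$riva_ngc_team" "$modified" "$suffix" "$riva_ngc_model_version"
}

asr_model_path en-US str   # nvidia/riva/rmir_asr_citrinet_1024_en_us_str:2.3.0
asr_model_path en-US ofl   # nvidia/riva/rmir_asr_citrinet_1024_en_us_ofl:2.3.0
```

These are the two uncommented Citrinet entries in the amd64 branch, so riva_init.sh should be fetching one streaming and one offline RMIR per language.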

With the configuration confirmed, we ran:

# bash riva_clean.sh
# bash riva_init.sh
# bash riva_start.sh
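
The server log in this post shows Riva waiting for Triton to load all models, so one thing worth checking is whether any model started loading and never finished. The sketch below assumes the `loading: NAME:VERSION` and `successfully loaded 'NAME'` line formats that appear in that log; `pending_models` is a hypothetical helper name, not something shipped with the Riva quickstart.

```bash
# pending_models: read Triton log text on stdin and print every model that
# logged "loading: NAME:VERSION" without a later "successfully loaded 'NAME'".
pending_models() {
  awk -v q="'" '
    / loading: / {
      name = $NF                  # last field looks like NAME:VERSION
      sub(/:[0-9]+$/, "", name)   # drop the trailing :VERSION
      pending[name] = 1
    }
    / successfully loaded / {
      # the model name is the text between the two single quotes
      if (split($0, parts, q) >= 3) delete pending[parts[2]]
    }
    END { for (name in pending) print name }
  ' | sort
}
```

Usage, assuming the container name `riva-speech` from config.sh: `docker logs riva-speech 2>&1 | pending_models`. An empty result would suggest every model finished loading, in which case the connection problem is more likely on the client or port side.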

…and the resulting logs:

0# docker logs riva-speech

==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release 22.06 (build 40051835)
Riva Speech Server Version 2.3.0

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for Riva Speech Server.  NVIDIA recommends the use of the following flags:
   docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:20.996861 105 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0829 20:04:20.997597 105 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0829 20:04:20.997681 105 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0829 20:04:20.997749 105 onnxruntime.cc:2365] backend configuration:
{}
I0829 20:04:21.511777 105 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x10020000000' with size 268435456
I0829 20:04:21.512156 105 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0829 20:04:21.517792 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline:1
I0829 20:04:21.618142 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-feature-extractor-offline:1
I0829 20:04:21.655977 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:21.657289   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:21.657415   111 parameter_parser.cc:121] Default value will be used
W0829 20:04:21.657531   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:21.657573   111 parameter_parser.cc:121] Default value will be used
W0829 20:04:21.657608   111 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0829 20:04:21.657655   111 parameter_parser.cc:121] Default value will be used
I0829 20:04:21.658139 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",
    "platform": "",
    "backend": "riva_asr_decoder",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 128,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "END_FLAG",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "CUSTOM_CONFIGURATION",
            "data_type": "TYPE_STRING",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "FINAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_TRANSCRIPTS_SCORE",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS_STABILITY",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 128,
            "preferred_batch_size": [
                32,
                64
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "forerunner_beam_size_token": {
            "string_value": "8"
        },
        "forerunner_beam_threshold": {
            "string_value": "10.0"
        },
        "decoder_num_worker_threads": {
            "string_value": "-1"
        },
        "asr_model_delay": {
            "string_value": "-1"
        },
        "word_insertion_score": {
            "string_value": "0.2"
        },
        "left_padding_size": {
            "string_value": "0.0"
        },
        "decoder_type": {
            "string_value": "flashlight"
        },
        "forerunner_beam_size": {
            "string_value": "8"
        },
        "max_supported_transcripts": {
            "string_value": "1"
        },
        "chunk_size": {
            "string_value": "300.0"
        },
        "lexicon_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/lexicon.txt"
        },
        "smearing_mode": {
            "string_value": "max"
        },
        "use_vad": {
            "string_value": "True"
        },
        "lm_weight": {
            "string_value": "0.2"
        },
        "blank_token": {
            "string_value": "#"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/riva_decoder_vocabulary.txt"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "streaming": {
            "string_value": "False"
        },
        "use_subword": {
            "string_value": "True"
        },
        "beam_size": {
            "string_value": "16"
        },
        "right_padding_size": {
            "string_value": "0.0"
        },
        "beam_size_token": {
            "string_value": "16"
        },
        "sil_token": {
            "string_value": "▁"
        },
        "num_tokenization": {
            "string_value": "1"
        },
        "beam_threshold": {
            "string_value": "20.0"
        },
        "language_model_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
        },
        "tokenizer_model": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
        },
        "max_execution_batch_size": {
            "string_value": "1024"
        },
        "forerunner_use_lm": {
            "string_value": "true"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0829 20:04:21.659291 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0 (device 0)
I0829 20:04:21.718440 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline:1
I0829 20:04:21.818839 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I0829 20:04:21.921363 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming:1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:22.021788 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming:1
I0829 20:04:22.122154 105 model_repository_manager.cc:994] loading: riva-punctuation-en-US:1
I0829 20:04:22.222552 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-offline-am-offline:1
I0829 20:04:22.322946 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming:1
I0829 20:04:22.423352 105 model_repository_manager.cc:994] loading: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1
I0829 20:04:22.539752   111 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0829 20:04:22.540524 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:22.541316   112 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0829 20:04:22.541436   112 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.541481   112 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0829 20:04:22.541538   112 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.541584   112 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0829 20:04:22.541615   112 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.541668   112 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0829 20:04:22.541703   112 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.547923 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline' version 1
I0829 20:04:22.691245 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline",
    "platform": "",
    "backend": "riva_asr_features",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1,
    "input": [
        {
            "name": "AUDIO_SIGNAL",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SAMPLE_RATE",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "AUDIO_FEATURES",
            "data_type": "TYPE_FP32",
            "dims": [
                80,
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "AUDIO_PROCESSED",
            "data_type": "TYPE_FP32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 1,
            "preferred_batch_size": [
                1
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "gain": {
            "string_value": "1.0"
        },
        "use_utterance_norm_params": {
            "string_value": "False"
        },
        "precalc_norm_time_steps": {
            "string_value": "0"
        },
        "precalc_norm_params": {
            "string_value": "False"
        },
        "dither": {
            "string_value": "0.0"
        },
        "norm_per_feature": {
            "string_value": "True"
        },
        "mean": {
            "string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
        },
        "stddev": {
            "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
        },
        "chunk_size": {
            "string_value": "300.0"
        },
        "max_execution_batch_size": {
            "string_value": "1"
        },
        "sample_rate": {
            "string_value": "16000"
        },
        "window_stride": {
            "string_value": "0.01"
        },
        "window_size": {
            "string_value": "0.025"
        },
        "num_features": {
            "string_value": "80"
        },
        "streaming": {
            "string_value": "False"
        },
        "left_padding_size": {
            "string_value": "0.0"
        },
        "stddev_floor": {
            "string_value": "1e-05"
        },
        "transpose": {
            "string_value": "False"
        },
        "right_padding_size": {
            "string_value": "0.0"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0829 20:04:22.692584 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:22.693289   113 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.693377   113 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.693456   113 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.693511   113 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.693940 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline",
    "platform": "",
    "backend": "riva_asr_vad",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 2048,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "chunk_size": {
            "string_value": "300.0"
        },
        "vad_start_th": {
            "string_value": "0.2"
        },
        "vad_stop_th": {
            "string_value": "0.98"
        },
        "vad_type": {
            "string_value": "ctc-vad"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline/1/riva_decoder_vocabulary.txt"
        },
        "residue_blanks_at_start": {
            "string_value": "0"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "streaming": {
            "string_value": "False"
        },
        "use_subword": {
            "string_value": "True"
        },
        "residue_blanks_at_end": {
            "string_value": "0"
        },
        "vad_stop_history": {
            "string_value": "800"
        },
        "vad_start_history": {
            "string_value": "300"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0829 20:04:22.694467 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1)
W0829 20:04:22.695189   114 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:22.695247   114 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.695338   114 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:22.695382   114 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.695416   114 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0829 20:04:22.695467   114 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.695963 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",
    "platform": "",
    "backend": "riva_asr_decoder",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1024,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "END_FLAG",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "CUSTOM_CONFIGURATION",
            "data_type": "TYPE_STRING",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                2
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "FINAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_TRANSCRIPTS_SCORE",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "FINAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS",
            "data_type": "TYPE_STRING",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_TRANSCRIPTS_STABILITY",
            "data_type": "TYPE_FP32",
            "dims": [
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "PARTIAL_WORDS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 1024,
            "preferred_batch_size": [
                32,
                64
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "forerunner_beam_size_token": {
            "string_value": "8"
        },
        "forerunner_beam_threshold": {
            "string_value": "10.0"
        },
        "asr_model_delay": {
            "string_value": "-1"
        },
        "decoder_num_worker_threads": {
            "string_value": "-1"
        },
        "word_insertion_score": {
            "string_value": "0.2"
        },
        "left_padding_size": {
            "string_value": "1.92"
        },
        "decoder_type": {
            "string_value": "flashlight"
        },
        "forerunner_beam_size": {
            "string_value": "8"
        },
        "chunk_size": {
            "string_value": "0.16"
        },
        "max_supported_transcripts": {
            "string_value": "1"
        },
        "lexicon_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt"
        },
        "smearing_mode": {
            "string_value": "max"
        },
        "use_vad": {
            "string_value": "True"
        },
        "lm_weight": {
            "string_value": "0.2"
        },
        "blank_token": {
            "string_value": "#"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "streaming": {
            "string_value": "True"
        },
        "use_subword": {
            "string_value": "True"
        },
        "beam_size": {
            "string_value": "16"
        },
        "right_padding_size": {
            "string_value": "1.92"
        },
        "beam_size_token": {
            "string_value": "16"
        },
        "sil_token": {
            "string_value": "▁"
        },
        "num_tokenization": {
            "string_value": "1"
        },
        "beam_threshold": {
            "string_value": "20.0"
        },
        "language_model_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
        },
        "tokenizer_model": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
        },
        "max_execution_batch_size": {
            "string_value": "1024"
        },
        "forerunner_use_lm": {
            "string_value": "true"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0829 20:04:22.696992 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming (version 1)
W0829 20:04:22.697497   119 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.697556   119 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.697623   119 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.697667   119 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.698071 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming",
    "platform": "",
    "backend": "riva_asr_vad",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 2048,
    "input": [
        {
            "name": "CLASS_LOGITS",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1,
                1025
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "SEGMENTS_START_END",
            "data_type": "TYPE_INT32",
            "dims": [
                -1,
                2
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "chunk_size": {
            "string_value": "0.16"
        },
        "vad_start_th": {
            "string_value": "0.2"
        },
        "vad_stop_th": {
            "string_value": "0.98"
        },
        "vad_type": {
            "string_value": "ctc-vad"
        },
        "vocab_file": {
            "string_value": "/data/models/citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming/1/riva_decoder_vocabulary.txt"
        },
        "ms_per_timestep": {
            "string_value": "80"
        },
        "residue_blanks_at_start": {
            "string_value": "-2"
        },
        "streaming": {
            "string_value": "True"
        },
        "use_subword": {
            "string_value": "True"
        },
        "residue_blanks_at_end": {
            "string_value": "0"
        },
        "vad_stop_history": {
            "string_value": "800"
        },
        "vad_start_history": {
            "string_value": "300"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0829 20:04:22.705331 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming (version 1)
W0829 20:04:22.706038   118 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0829 20:04:22.706099   118 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.706149   118 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0829 20:04:22.706193   118 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.706225   118 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0829 20:04:22.706276   118 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.706313   118 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0829 20:04:22.706364   118 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.721384 105 backend_model.cc:255] model configuration:
{
    "name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",
    "platform": "",
    "backend": "riva_asr_features",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1024,
    "input": [
        {
            "name": "AUDIO_SIGNAL",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                -1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        },
        {
            "name": "SAMPLE_RATE",
            "data_type": "TYPE_UINT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "AUDIO_FEATURES",
            "data_type": "TYPE_FP32",
            "dims": [
                80,
                -1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "AUDIO_PROCESSED",
            "data_type": "TYPE_FP32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "cuda": {
            "graphs": false,
            "busy_wait_events": false,
            "graph_spec": [],
            "output_copy_stream": true
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 1024,
            "preferred_batch_size": [
                256,
                512
            ],
            "max_queue_delay_microseconds": 1000
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "bool_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ],
        "state": []
    },
    "instance_group": [
        {
            "name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "streaming": {
            "string_value": "True"
        },
        "stddev_floor": {
            "string_value": "1e-05"
        },
        "transpose": {
            "string_value": "False"
        },
        "left_padding_size": {
            "string_value": "1.92"
        },
        "right_padding_size": {
            "string_value": "1.92"
        },
        "gain": {
            "string_value": "1.0"
        },
        "use_utterance_norm_params": {
            "string_value": "False"
        },
        "precalc_norm_time_steps": {
            "string_value": "0"
        },
        "dither": {
            "string_value": "1e-05"
        },
        "precalc_norm_params": {
            "string_value": "False"
        },
        "norm_per_feature": {
            "string_value": "True"
        },
        "mean": {
            "string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
        },
        "stddev": {
            "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
        },
        "chunk_size": {
            "string_value": "0.16"
        },
        "max_execution_batch_size": {
            "string_value": "1024"
        },
        "sample_rate": {
            "string_value": "16000"
        },
        "window_stride": {
            "string_value": "0.01"
        },
        "window_size": {
            "string_value": "0.025"
        },
        "num_features": {
            "string_value": "80"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": false
    }
}
I0829 20:04:22.722161 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0 (device 0)
I0829 20:04:22.836742 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline_0 (device 0)
I0829 20:04:22.844713 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:26.193126 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0)
I0829 20:04:26.199879 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-feature-extractor-offline' version 1
I0829 20:04:27.056668   114 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0829 20:04:27.056878 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0 (device 0)
I0829 20:04:27.064029 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:27.188417 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline' version 1
I0829 20:04:27.212304 105 pipeline_library.cc:19] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:27.212878   120 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0829 20:04:27.212966   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213021   120 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0829 20:04:27.213052   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213101   120 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0829 20:04:27.213138   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213184   120 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0829 20:04:27.213222   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213304   120 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0829 20:04:27.213346   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213378   120 parameter_parser.cc:120] Parameter doc_stride could not be set from parameters
W0829 20:04:27.213426   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213460   120 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0829 20:04:27.213513   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213562   120 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0829 20:04:27.213593   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213649   120 parameter_parser.cc:120] Parameter margin could not be set from parameters
W0829 20:04:27.213680   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213728   120 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0829 20:04:27.213762   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213804   120 parameter_parser.cc:120] Parameter max_query_length could not be set from parameters
W0829 20:04:27.213842   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213873   120 parameter_parser.cc:120] Parameter max_seq_length could not be set from parameters
W0829 20:04:27.213920   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213956   120 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0829 20:04:27.213999   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.214037   120 parameter_parser.cc:120] Parameter step could not be set from parameters
W0829 20:04:27.214068   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.214116   120 parameter_parser.cc:120] Parameter task could not be set from parameters
W0829 20:04:27.214149   120 parameter_parser.cc:121] Default value will be used
I0829 20:04:27.214242 105 backend_model.cc:255] model configuration:
{
    "name": "riva-punctuation-en-US",
    "platform": "",
    "backend": "riva_nlp_pipeline",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 8,
    "input": [
        {
            "name": "PIPELINE_INPUT",
            "data_type": "TYPE_STRING",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false,
            "optional": false
        }
    ],
    "output": [
        {
            "name": "PIPELINE_OUTPUT",
            "data_type": "TYPE_STRING",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "instance_group": [
        {
            "name": "riva-punctuation-en-US_0",
            "kind": "KIND_CPU",
            "count": 1,
            "gpus": [],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "punct_logits_tensor_name": {
            "string_value": "punct_token_logits"
        },
        "language_code": {
            "string_value": "en-US"
        },
        "tokenizer": {
            "string_value": "wordpiece"
        },
        "delimiter": {
            "string_value": " "
        },
        "input_ids_tensor_name": {
            "string_value": "input_ids"
        },
        "model_name": {
            "string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased"
        },
        "pad_chars_with_spaces": {
            "string_value": "False"
        },
        "remove_spaces": {
            "string_value": "False"
        },
        "tokenizer_to_lower": {
            "string_value": "true"
        },
        "model_family": {
            "string_value": "riva"
        },
        "unk_token": {
            "string_value": "[UNK]"
        },
        "vocab": {
            "string_value": "/data/models/riva-punctuation-en-US/1/tokenizer.vocab_file"
        },
        "bos_token": {
            "string_value": "[CLS]"
        },
        "capit_logits_tensor_name": {
            "string_value": "capit_token_logits"
        },
        "punctuation_mapping_path": {
            "string_value": "/data/models/riva-punctuation-en-US/1/punct_label_ids.csv"
        },
        "model_api": {
            "string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText"
        },
        "pipeline_type": {
            "string_value": "punctuation"
        },
        "to_lower": {
            "string_value": "true"
        },
        "eos_token": {
            "string_value": "[SEP]"
        },
        "capitalization_mapping_path": {
            "string_value": "/data/models/riva-punctuation-en-US/1/capit_label_ids.csv"
        },
        "load_model": {
            "string_value": "false"
        },
        "attn_mask_tensor_name": {
            "string_value": "input_mask"
        },
        "token_type_tensor_name": {
            "string_value": "segment_ids"
        }
    },
    "model_warmup": []
}
I0829 20:04:27.214900 105 pipeline_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I0829 20:04:27.225177 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0 (device 0)
I0829 20:04:27.231758 105 model_repository_manager.cc:1149] successfully loaded 'riva-punctuation-en-US' version 1
I0829 20:04:27.509611 105 tensorrt.cc:5145] TRITONBACKEND_Initialize: tensorrt
I0829 20:04:27.509821 105 tensorrt.cc:5155] Triton TRITONBACKEND API version: 1.8
I0829 20:04:27.509900 105 tensorrt.cc:5161] 'tensorrt' TRITONBACKEND API version: 1.8
I0829 20:04:27.510119 105 tensorrt.cc:5204] backend configuration:
{}
I0829 20:04:27.510242 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming (version 1)
I0829 20:04:27.510878 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1)
I0829 20:04:27.511426 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0)
I0829 20:04:27.520126 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-feature-extractor-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:29.346582 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +417, GPU +0, now: CPU 2067, GPU 3606 (MiB)
I0829 20:04:29.597220 105 logging.cc:49] Loaded engine size: 208 MiB
I0829 20:04:29.764864 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2598, GPU 3948 (MiB)
I0829 20:04:30.001373 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +126, GPU +58, now: CPU 2724, GPU 4006 (MiB)
I0829 20:04:30.005638 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +94, now: CPU 0, GPU 94 (MiB)
I0829 20:04:30.028823 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2308, GPU 3998 (MiB)
I0829 20:04:30.029769 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2308, GPU 4006 (MiB)
I0829 20:04:30.119911 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +108, now: CPU 0, GPU 202 (MiB)
I0829 20:04:30.120289 105 tensorrt.cc:1409] Created instance riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 20:04:30.120401 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline (version 1)
I0829 20:04:30.121026 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 (GPU device 0)
I0829 20:04:30.121565 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2421, GPU 4258 (MiB)
I0829 20:04:30.127830 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:30.461346 105 logging.cc:49] Loaded engine size: 283 MiB
I0829 20:04:30.744502 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2997, GPU 4550 (MiB)
I0829 20:04:30.745890 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 2997, GPU 4560 (MiB)
I0829 20:04:30.748526 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +281, now: CPU 0, GPU 483 (MiB)
I0829 20:04:30.779644 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2430, GPU 4552 (MiB)
I0829 20:04:30.780572 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2430, GPU 4560 (MiB)
I0829 20:04:30.816233 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +565, now: CPU 0, GPU 1048 (MiB)
I0829 20:04:30.817876 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 20:04:30.818022 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 (GPU device 0)
I0829 20:04:30.818513 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2439, GPU 5178 (MiB)
I0829 20:04:30.823808 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-offline-am-offline' version 1
I0829 20:04:31.156220 105 logging.cc:49] Loaded engine size: 277 MiB
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:31.447480 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3003, GPU 5464 (MiB)
I0829 20:04:31.448846 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +1, GPU +10, now: CPU 3004, GPU 5474 (MiB)
I0829 20:04:31.451456 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +1, GPU +275, now: CPU 1, GPU 1323 (MiB)
I0829 20:04:31.482080 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2448, GPU 5466 (MiB)
I0829 20:04:31.483059 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2448, GPU 5474 (MiB)
I0829 20:04:31.490937 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +532, now: CPU 1, GPU 1855 (MiB)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:34.159140 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 20:04:34.159613 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming' version 1
I0829 20:04:34.160377 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline:1
I0829 20:04:34.260745 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming:1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:34.360959 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline' version 1
I0829 20:04:34.361246 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming' version 1
I0829 20:04:34.361409 105 server.cc:522] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0829 20:04:34.361562 105 server.cc:549] 
+-------------------+-----------------------------------------------------------------------------+--------+
| Backend           | Path                                                                        | Config |
+-------------------+-----------------------------------------------------------------------------+--------+
| onnxruntime       | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so             | {}     |
| riva_asr_decoder  | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so   | {}     |
| tensorrt          | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so                   | {}     |
| riva_asr_vad      | /opt/tritonserver/backends/riva_asr_vad/libtriton_riva_asr_vad.so           | {}     |
| riva_asr_features | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so | {}     |
| riva_nlp_pipeline | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so | {}     |
+-------------------+-----------------------------------------------------------------------------+--------+

I0829 20:04:34.361768 105 server.cc:592] 
+-------------------------------------------------------------------------+---------+--------+
| Model                                                                   | Version | Status |
+-------------------------------------------------------------------------+---------+--------+
| citrinet-1024-en-US-asr-offline                                         | 1       | READY  |
| citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline                 | 1       | READY  |
| citrinet-1024-en-US-asr-offline-feature-extractor-offline               | 1       | READY  |
| citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline     | 1       | READY  |
| citrinet-1024-en-US-asr-streaming                                       | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming             | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-feature-extractor-streaming           | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming | 1       | READY  |
| riva-punctuation-en-US                                                  | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-offline-am-offline                     | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming                 | 1       | READY  |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased                    | 1       | READY  |
+-------------------------------------------------------------------------+---------+--------+

I0829 20:04:34.375394 105 metrics.cc:623] Collecting metrics for GPU 0: GRID A100D-8C
I0829 20:04:34.375790 105 tritonserver.cc:1932] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.19.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /data/models                                                                                                                                                                                 |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                                   |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0829 20:04:34.376740 105 grpc_server.cc:4375] Started GRPCInferenceService at 0.0.0.0:8001
I0829 20:04:34.377056 105 http_server.cc:3075] Started HTTPService at 0.0.0.0:8000
I0829 20:04:34.418180 105 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
  > Triton server is ready...
I0829 20:04:35.298982   267 riva_server.cc:118] Using Insecure Server Credentials
I0829 20:04:35.302084   267 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-offline for ASR
I0829 20:04:35.305709   267 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
I0829 20:04:35.327199   267 model_registry.cc:112] Successfully registered: riva-punctuation-en-US for NLP
W0829 20:04:35.377905 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0829 20:04:35.378040 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0829 20:04:35.378113 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
I0829 20:04:35.397414   267 riva_server.cc:158] Riva Conversational AI Server listening on 0.0.0.0:50051
W0829 20:04:35.397500   267 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
W0829 20:04:36.378317 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0829 20:04:36.378600 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0829 20:04:36.378641 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0829 20:04:37.380072 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0829 20:04:37.380270 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0829 20:04:37.380347 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
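
The log above ends with the server reporting that it is listening, which is the key fact when debugging a client that cannot connect. A minimal sketch for pulling that line out of a saved log, assuming the text was captured with `docker logs riva-speech` and follows the glog-style prefix seen here; the helper names `parse_glog_line` and `find_listen_address` are illustrative and not part of Riva or Triton:

```python
import re

# glog-style prefix: severity letter, MMDD, HH:MM:SS.micros, thread id,
# source file:line, closing bracket, then the message.
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[^\]\s]+:\d+)\]\s?(?P<msg>.*)$"
)


def parse_glog_line(line):
    """Return the fields of one glog-style line, or None if it does not match."""
    match = GLOG_RE.match(line.strip())
    return match.groupdict() if match else None


def find_listen_address(log_text):
    """Return the host:port from the Riva 'listening on' line, or None."""
    marker = "Riva Conversational AI Server listening on"
    for line in log_text.splitlines():
        record = parse_glog_line(line)
        if record and marker in record["msg"]:
            # The address is the last whitespace-separated token of the message.
            return record["msg"].rsplit(" ", 1)[-1]
    return None
```

Run against the log in this post, `find_listen_address` should pick up the `riva_server.cc:158` line and return its address. A result of `None` would suggest the server never finished starting.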

NSDB · August 29, 2022, 8:18pm

PART 2/2:

…and then, further:

# bash riva_stop.sh
# bash riva_start_client.sh
# > riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

# bash riva_start_client.sh

Image nvcr.io/nvidia/riva/riva-speech-client:2.3.0 exists.
Skipping pull.

root@tengine:/work# riva_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav

I0829 20:13:56.632402 10 riva_asr_client.cc:434] Using Insecure Server Credentials

Error creating GRPC channel: Unable to establish connection to server. Current state: 3

Exiting.

root@tengine:/work#
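
If the number in "Current state: 3" is the gRPC core connectivity state (an assumption, since the client does not label it), then 3 is TRANSIENT_FAILURE: the channel tried to connect and failed. Note that the command list above includes `riva_stop.sh`; if the server container was not running when the client started, nothing would be listening on the port. A minimal sketch for checking plain TCP reachability from wherever the client runs, assuming the server address is `localhost:50051` (the port comes from the server log line "listening on 0.0.0.0:50051"; the host and the function names are illustrative):

```python
import socket

# Values of the gRPC core connectivity-state enum.
GRPC_CONNECTIVITY_STATES = {
    0: "IDLE",
    1: "CONNECTING",
    2: "READY",
    3: "TRANSIENT_FAILURE",
    4: "SHUTDOWN",
}


def describe_state(code):
    """Map a numeric connectivity state to its name."""
    return GRPC_CONNECTIVITY_STATES.get(code, "UNKNOWN")


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Calling `port_is_open("localhost", 50051)` inside the client container separates the two cases: `False` points at the server not running or the port not being reachable from that container, while `True` means the TCP path works and the problem lies above it.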

…and resulting logs:

# docker logs riva-speech

==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release 22.06 (build 40051835)
Riva Speech Server Version 2.3.0

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for Riva Speech Server.  NVIDIA recommends the use of the following flags:
   docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:20.996861 105 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0829 20:04:20.997597 105 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0829 20:04:20.997681 105 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0829 20:04:20.997749 105 onnxruntime.cc:2365] backend configuration:
{}
I0829 20:04:21.511777 105 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x10020000000' with size 268435456
I0829 20:04:21.512156 105 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0829 20:04:21.517792 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline:1
I0829 20:04:21.618142 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-feature-extractor-offline:1
I0829 20:04:21.655977 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:21.657289   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:21.657415   111 parameter_parser.cc:121] Default value will be used
W0829 20:04:21.657531   111 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:21.657573   111 parameter_parser.cc:121] Default value will be used
W0829 20:04:21.657608   111 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0829 20:04:21.657655   111 parameter_parser.cc:121] Default value will be used
I0829 20:04:21.658139 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",
	"platform": "",
	"backend": "riva_asr_decoder",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 128,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "END_FLAG",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "CUSTOM_CONFIGURATION",
			"data_type": "TYPE_STRING",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "FINAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_TRANSCRIPTS_SCORE",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS_STABILITY",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 128,
			"preferred_batch_size": [
				32,
				64
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"forerunner_beam_size_token": {
			"string_value": "8"
		},
		"forerunner_beam_threshold": {
			"string_value": "10.0"
		},
		"decoder_num_worker_threads": {
			"string_value": "-1"
		},
		"asr_model_delay": {
			"string_value": "-1"
		},
		"word_insertion_score": {
			"string_value": "0.2"
		},
		"left_padding_size": {
			"string_value": "0.0"
		},
		"decoder_type": {
			"string_value": "flashlight"
		},
		"forerunner_beam_size": {
			"string_value": "8"
		},
		"max_supported_transcripts": {
			"string_value": "1"
		},
		"chunk_size": {
			"string_value": "300.0"
		},
		"lexicon_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/lexicon.txt"
		},
		"smearing_mode": {
			"string_value": "max"
		},
		"use_vad": {
			"string_value": "True"
		},
		"lm_weight": {
			"string_value": "0.2"
		},
		"blank_token": {
			"string_value": "#"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/riva_decoder_vocabulary.txt"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "False"
		},
		"use_subword": {
			"string_value": "True"
		},
		"beam_size": {
			"string_value": "16"
		},
		"right_padding_size": {
			"string_value": "0.0"
		},
		"beam_size_token": {
			"string_value": "16"
		},
		"sil_token": {
			"string_value": "▁"
		},
		"num_tokenization": {
			"string_value": "1"
		},
		"beam_threshold": {
			"string_value": "20.0"
		},
		"language_model_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
		},
		"tokenizer_model": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
		},
		"max_execution_batch_size": {
			"string_value": "1024"
		},
		"forerunner_use_lm": {
			"string_value": "true"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0829 20:04:21.659291 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline_0 (device 0)
I0829 20:04:21.718440 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline:1
I0829 20:04:21.818839 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I0829 20:04:21.921363 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming:1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:22.021788 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming:1
I0829 20:04:22.122154 105 model_repository_manager.cc:994] loading: riva-punctuation-en-US:1
I0829 20:04:22.222552 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-offline-am-offline:1
I0829 20:04:22.322946 105 model_repository_manager.cc:994] loading: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming:1
I0829 20:04:22.423352 105 model_repository_manager.cc:994] loading: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1
I0829 20:04:22.539752   111 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0829 20:04:22.540524 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:22.541316   112 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0829 20:04:22.541436   112 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.541481   112 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0829 20:04:22.541538   112 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.541584   112 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0829 20:04:22.541615   112 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.541668   112 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0829 20:04:22.541703   112 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.547923 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline' version 1
I0829 20:04:22.691245 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline",
	"platform": "",
	"backend": "riva_asr_features",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 1,
	"input": [
		{
			"name": "AUDIO_SIGNAL",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SAMPLE_RATE",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "AUDIO_FEATURES",
			"data_type": "TYPE_FP32",
			"dims": [
				80,
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "AUDIO_PROCESSED",
			"data_type": "TYPE_FP32",
			"dims": [
				1
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 1,
			"preferred_batch_size": [
				1
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-offline-feature-extractor-offline_0",
			"kind": "KIND_GPU",
			"count": 1,
			"gpus": [
				0
			],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"gain": {
			"string_value": "1.0"
		},
		"use_utterance_norm_params": {
			"string_value": "False"
		},
		"precalc_norm_time_steps": {
			"string_value": "0"
		},
		"precalc_norm_params": {
			"string_value": "False"
		},
		"dither": {
			"string_value": "0.0"
		},
		"norm_per_feature": {
			"string_value": "True"
		},
		"mean": {
			"string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
		},
		"stddev": {
			"string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
		},
		"chunk_size": {
			"string_value": "300.0"
		},
		"max_execution_batch_size": {
			"string_value": "1"
		},
		"sample_rate": {
			"string_value": "16000"
		},
		"window_stride": {
			"string_value": "0.01"
		},
		"window_size": {
			"string_value": "0.025"
		},
		"num_features": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "False"
		},
		"left_padding_size": {
			"string_value": "0.0"
		},
		"stddev_floor": {
			"string_value": "1e-05"
		},
		"transpose": {
			"string_value": "False"
		},
		"right_padding_size": {
			"string_value": "0.0"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0829 20:04:22.692584 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:22.693289   113 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.693377   113 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.693456   113 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.693511   113 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.693940 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline",
	"platform": "",
	"backend": "riva_asr_vad",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 2048,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"chunk_size": {
			"string_value": "300.0"
		},
		"vad_start_th": {
			"string_value": "0.2"
		},
		"vad_stop_th": {
			"string_value": "0.98"
		},
		"vad_type": {
			"string_value": "ctc-vad"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline/1/riva_decoder_vocabulary.txt"
		},
		"residue_blanks_at_start": {
			"string_value": "0"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "False"
		},
		"use_subword": {
			"string_value": "True"
		},
		"residue_blanks_at_end": {
			"string_value": "0"
		},
		"vad_stop_history": {
			"string_value": "800"
		},
		"vad_start_history": {
			"string_value": "300"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0829 20:04:22.694467 105 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1)
W0829 20:04:22.695189   114 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:22.695247   114 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.695338   114 parameter_parser.cc:120] Parameter forerunner_start_offset_ms could not be set from parameters
W0829 20:04:22.695382   114 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.695416   114 parameter_parser.cc:120] Parameter max_num_slots could not be set from parameters
W0829 20:04:22.695467   114 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.695963 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",
	"platform": "",
	"backend": "riva_asr_decoder",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 1024,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "END_FLAG",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "CUSTOM_CONFIGURATION",
			"data_type": "TYPE_STRING",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				2
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "FINAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_TRANSCRIPTS_SCORE",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "FINAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS",
			"data_type": "TYPE_STRING",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_TRANSCRIPTS_STABILITY",
			"data_type": "TYPE_FP32",
			"dims": [
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "PARTIAL_WORDS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 1024,
			"preferred_batch_size": [
				32,
				64
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"forerunner_beam_size_token": {
			"string_value": "8"
		},
		"forerunner_beam_threshold": {
			"string_value": "10.0"
		},
		"asr_model_delay": {
			"string_value": "-1"
		},
		"decoder_num_worker_threads": {
			"string_value": "-1"
		},
		"word_insertion_score": {
			"string_value": "0.2"
		},
		"left_padding_size": {
			"string_value": "1.92"
		},
		"decoder_type": {
			"string_value": "flashlight"
		},
		"forerunner_beam_size": {
			"string_value": "8"
		},
		"chunk_size": {
			"string_value": "0.16"
		},
		"max_supported_transcripts": {
			"string_value": "1"
		},
		"lexicon_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt"
		},
		"smearing_mode": {
			"string_value": "max"
		},
		"use_vad": {
			"string_value": "True"
		},
		"lm_weight": {
			"string_value": "0.2"
		},
		"blank_token": {
			"string_value": "#"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"streaming": {
			"string_value": "True"
		},
		"use_subword": {
			"string_value": "True"
		},
		"beam_size": {
			"string_value": "16"
		},
		"right_padding_size": {
			"string_value": "1.92"
		},
		"beam_size_token": {
			"string_value": "16"
		},
		"sil_token": {
			"string_value": "▁"
		},
		"num_tokenization": {
			"string_value": "1"
		},
		"beam_threshold": {
			"string_value": "20.0"
		},
		"language_model_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/jarvis_asr_train_datasets_noSpgi_noLS_gt_3gram.binary"
		},
		"tokenizer_model": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/498056ba420d4bb3831ad557fba06032_tokenizer.model"
		},
		"max_execution_batch_size": {
			"string_value": "1024"
		},
		"forerunner_use_lm": {
			"string_value": "true"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0829 20:04:22.696992 105 vad_library.cc:18] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming (version 1)
W0829 20:04:22.697497   119 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.697556   119 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.697623   119 parameter_parser.cc:120] Parameter max_execution_batch_size could not be set from parameters
W0829 20:04:22.697667   119 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.698071 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming",
	"platform": "",
	"backend": "riva_asr_vad",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 2048,
	"input": [
		{
			"name": "CLASS_LOGITS",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1,
				1025
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "SEGMENTS_START_END",
			"data_type": "TYPE_INT32",
			"dims": [
				-1,
				2
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"chunk_size": {
			"string_value": "0.16"
		},
		"vad_start_th": {
			"string_value": "0.2"
		},
		"vad_stop_th": {
			"string_value": "0.98"
		},
		"vad_type": {
			"string_value": "ctc-vad"
		},
		"vocab_file": {
			"string_value": "/data/models/citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming/1/riva_decoder_vocabulary.txt"
		},
		"ms_per_timestep": {
			"string_value": "80"
		},
		"residue_blanks_at_start": {
			"string_value": "-2"
		},
		"streaming": {
			"string_value": "True"
		},
		"use_subword": {
			"string_value": "True"
		},
		"residue_blanks_at_end": {
			"string_value": "0"
		},
		"vad_stop_history": {
			"string_value": "800"
		},
		"vad_start_history": {
			"string_value": "300"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0829 20:04:22.705331 105 feature-extractor.cc:407] TRITONBACKEND_ModelInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming (version 1)
W0829 20:04:22.706038   118 parameter_parser.cc:120] Parameter is_dither_seed_random could not be set from parameters
W0829 20:04:22.706099   118 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.706149   118 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0829 20:04:22.706193   118 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.706225   118 parameter_parser.cc:120] Parameter max_sequence_idle_microseconds could not be set from parameters
W0829 20:04:22.706276   118 parameter_parser.cc:121] Default value will be used
W0829 20:04:22.706313   118 parameter_parser.cc:120] Parameter preemph_coeff could not be set from parameters
W0829 20:04:22.706364   118 parameter_parser.cc:121] Default value will be used
I0829 20:04:22.721384 105 backend_model.cc:255] model configuration:
{
	"name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",
	"platform": "",
	"backend": "riva_asr_features",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 1024,
	"input": [
		{
			"name": "AUDIO_SIGNAL",
			"data_type": "TYPE_FP32",
			"format": "FORMAT_NONE",
			"dims": [
				-1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		},
		{
			"name": "SAMPLE_RATE",
			"data_type": "TYPE_UINT32",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "AUDIO_FEATURES",
			"data_type": "TYPE_FP32",
			"dims": [
				80,
				-1
			],
			"label_filename": "",
			"is_shape_tensor": false
		},
		{
			"name": "AUDIO_PROCESSED",
			"data_type": "TYPE_FP32",
			"dims": [
				1
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"cuda": {
			"graphs": false,
			"busy_wait_events": false,
			"graph_spec": [],
			"output_copy_stream": true
		},
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"sequence_batching": {
		"oldest": {
			"max_candidate_sequences": 1024,
			"preferred_batch_size": [
				256,
				512
			],
			"max_queue_delay_microseconds": 1000
		},
		"max_sequence_idle_microseconds": 60000000,
		"control_input": [
			{
				"name": "START",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_START",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "READY",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_READY",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "END",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_END",
						"int32_false_true": [
							0,
							1
						],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_INVALID"
					}
				]
			},
			{
				"name": "CORRID",
				"control": [
					{
						"kind": "CONTROL_SEQUENCE_CORRID",
						"int32_false_true": [],
						"fp32_false_true": [],
						"bool_false_true": [],
						"data_type": "TYPE_UINT64"
					}
				]
			}
		],
		"state": []
	},
	"instance_group": [
		{
			"name": "citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0",
			"kind": "KIND_GPU",
			"count": 1,
			"gpus": [
				0
			],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"streaming": {
			"string_value": "True"
		},
		"stddev_floor": {
			"string_value": "1e-05"
		},
		"transpose": {
			"string_value": "False"
		},
		"left_padding_size": {
			"string_value": "1.92"
		},
		"right_padding_size": {
			"string_value": "1.92"
		},
		"gain": {
			"string_value": "1.0"
		},
		"use_utterance_norm_params": {
			"string_value": "False"
		},
		"precalc_norm_time_steps": {
			"string_value": "0"
		},
		"dither": {
			"string_value": "1e-05"
		},
		"precalc_norm_params": {
			"string_value": "False"
		},
		"norm_per_feature": {
			"string_value": "True"
		},
		"mean": {
			"string_value": "-11.4412,  -9.9334,  -9.1292,  -9.0365,  -9.2804,  -9.5643,  -9.7342, -9.6925,  -9.6333,  -9.2808,  -9.1887,  -9.1422,  -9.1397,  -9.2028, -9.2749,  -9.4776,  -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734,  -9.9257,  -9.6557,  -9.1761, -9.6653,  -9.7876,  -9.7230,  -9.7792,  -9.7056,  -9.2702,  -9.4650, -9.2755,  -9.1369,  -9.1174,  -8.9197,  -8.5394,  -8.2614,  -8.1353, -8.1422,  -8.3430,  -8.6655"
		},
		"stddev": {
			"string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181"
		},
		"chunk_size": {
			"string_value": "0.16"
		},
		"max_execution_batch_size": {
			"string_value": "1024"
		},
		"sample_rate": {
			"string_value": "16000"
		},
		"window_stride": {
			"string_value": "0.01"
		},
		"window_size": {
			"string_value": "0.025"
		},
		"num_features": {
			"string_value": "80"
		}
	},
	"model_warmup": [],
	"model_transaction_policy": {
		"decoupled": false
	}
}
I0829 20:04:22.722161 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming_0 (device 0)
I0829 20:04:22.836742 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-feature-extractor-offline_0 (device 0)
I0829 20:04:22.844713 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:26.193126 105 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0)
I0829 20:04:26.199879 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-feature-extractor-offline' version 1
I0829 20:04:27.056668   114 ctc-decoder.cc:171] Beam Decoder initialized successfully!
I0829 20:04:27.056878 105 vad_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline_0 (device 0)
I0829 20:04:27.064029 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:27.188417 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline' version 1
I0829 20:04:27.212304 105 pipeline_library.cc:19] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0829 20:04:27.212878   120 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0829 20:04:27.212966   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213021   120 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0829 20:04:27.213052   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213101   120 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0829 20:04:27.213138   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213184   120 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0829 20:04:27.213222   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213304   120 parameter_parser.cc:120] Parameter bos could not be set from parameters
W0829 20:04:27.213346   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213378   120 parameter_parser.cc:120] Parameter doc_stride could not be set from parameters
W0829 20:04:27.213426   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213460   120 parameter_parser.cc:120] Parameter dropout_prob could not be set from parameters
W0829 20:04:27.213513   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213562   120 parameter_parser.cc:120] Parameter eos could not be set from parameters
W0829 20:04:27.213593   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213649   120 parameter_parser.cc:120] Parameter margin could not be set from parameters
W0829 20:04:27.213680   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213728   120 parameter_parser.cc:120] Parameter max_batch_size could not be set from parameters
W0829 20:04:27.213762   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213804   120 parameter_parser.cc:120] Parameter max_query_length could not be set from parameters
W0829 20:04:27.213842   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213873   120 parameter_parser.cc:120] Parameter max_seq_length could not be set from parameters
W0829 20:04:27.213920   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.213956   120 parameter_parser.cc:120] Parameter reverse could not be set from parameters
W0829 20:04:27.213999   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.214037   120 parameter_parser.cc:120] Parameter step could not be set from parameters
W0829 20:04:27.214068   120 parameter_parser.cc:121] Default value will be used
W0829 20:04:27.214116   120 parameter_parser.cc:120] Parameter task could not be set from parameters
W0829 20:04:27.214149   120 parameter_parser.cc:121] Default value will be used
I0829 20:04:27.214242 105 backend_model.cc:255] model configuration:
{
	"name": "riva-punctuation-en-US",
	"platform": "",
	"backend": "riva_nlp_pipeline",
	"version_policy": {
		"latest": {
			"num_versions": 1
		}
	},
	"max_batch_size": 8,
	"input": [
		{
			"name": "PIPELINE_INPUT",
			"data_type": "TYPE_STRING",
			"format": "FORMAT_NONE",
			"dims": [
				1
			],
			"is_shape_tensor": false,
			"allow_ragged_batch": false,
			"optional": false
		}
	],
	"output": [
		{
			"name": "PIPELINE_OUTPUT",
			"data_type": "TYPE_STRING",
			"dims": [
				1
			],
			"label_filename": "",
			"is_shape_tensor": false
		}
	],
	"batch_input": [],
	"batch_output": [],
	"optimization": {
		"priority": "PRIORITY_DEFAULT",
		"input_pinned_memory": {
			"enable": true
		},
		"output_pinned_memory": {
			"enable": true
		},
		"gather_kernel_buffer_threshold": 0,
		"eager_batching": false
	},
	"instance_group": [
		{
			"name": "riva-punctuation-en-US_0",
			"kind": "KIND_CPU",
			"count": 1,
			"gpus": [],
			"secondary_devices": [],
			"profile": [],
			"passive": false,
			"host_policy": ""
		}
	],
	"default_model_filename": "",
	"cc_model_filenames": {},
	"metric_tags": {},
	"parameters": {
		"punct_logits_tensor_name": {
			"string_value": "punct_token_logits"
		},
		"language_code": {
			"string_value": "en-US"
		},
		"tokenizer": {
			"string_value": "wordpiece"
		},
		"delimiter": {
			"string_value": " "
		},
		"input_ids_tensor_name": {
			"string_value": "input_ids"
		},
		"model_name": {
			"string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased"
		},
		"pad_chars_with_spaces": {
			"string_value": "False"
		},
		"remove_spaces": {
			"string_value": "False"
		},
		"tokenizer_to_lower": {
			"string_value": "true"
		},
		"model_family": {
			"string_value": "riva"
		},
		"unk_token": {
			"string_value": "[UNK]"
		},
		"vocab": {
			"string_value": "/data/models/riva-punctuation-en-US/1/tokenizer.vocab_file"
		},
		"bos_token": {
			"string_value": "[CLS]"
		},
		"capit_logits_tensor_name": {
			"string_value": "capit_token_logits"
		},
		"punctuation_mapping_path": {
			"string_value": "/data/models/riva-punctuation-en-US/1/punct_label_ids.csv"
		},
		"model_api": {
			"string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText"
		},
		"pipeline_type": {
			"string_value": "punctuation"
		},
		"to_lower": {
			"string_value": "true"
		},
		"eos_token": {
			"string_value": "[SEP]"
		},
		"capitalization_mapping_path": {
			"string_value": "/data/models/riva-punctuation-en-US/1/capit_label_ids.csv"
		},
		"load_model": {
			"string_value": "false"
		},
		"attn_mask_tensor_name": {
			"string_value": "input_mask"
		},
		"token_type_tensor_name": {
			"string_value": "segment_ids"
		}
	},
	"model_warmup": []
}
I0829 20:04:27.214900 105 pipeline_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I0829 20:04:27.225177 105 feature-extractor.cc:409] TRITONBACKEND_ModelInstanceInitialize: citrinet-1024-en-US-asr-streaming-feature-extractor-streaming_0 (device 0)
I0829 20:04:27.231758 105 model_repository_manager.cc:1149] successfully loaded 'riva-punctuation-en-US' version 1
I0829 20:04:27.509611 105 tensorrt.cc:5145] TRITONBACKEND_Initialize: tensorrt
I0829 20:04:27.509821 105 tensorrt.cc:5155] Triton TRITONBACKEND API version: 1.8
I0829 20:04:27.509900 105 tensorrt.cc:5161] 'tensorrt' TRITONBACKEND API version: 1.8
I0829 20:04:27.510119 105 tensorrt.cc:5204] backend configuration:
{}
I0829 20:04:27.510242 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming (version 1)
I0829 20:04:27.510878 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1)
I0829 20:04:27.511426 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0)
I0829 20:04:27.520126 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming-feature-extractor-streaming' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:29.346582 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +417, GPU +0, now: CPU 2067, GPU 3606 (MiB)
I0829 20:04:29.597220 105 logging.cc:49] Loaded engine size: 208 MiB
I0829 20:04:29.764864 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2598, GPU 3948 (MiB)
I0829 20:04:30.001373 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +126, GPU +58, now: CPU 2724, GPU 4006 (MiB)
I0829 20:04:30.005638 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +94, now: CPU 0, GPU 94 (MiB)
I0829 20:04:30.028823 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2308, GPU 3998 (MiB)
I0829 20:04:30.029769 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2308, GPU 4006 (MiB)
I0829 20:04:30.119911 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +108, now: CPU 0, GPU 202 (MiB)
I0829 20:04:30.120289 105 tensorrt.cc:1409] Created instance riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 20:04:30.120401 105 tensorrt.cc:5256] TRITONBACKEND_ModelInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline (version 1)
I0829 20:04:30.121026 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 (GPU device 0)
I0829 20:04:30.121565 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2421, GPU 4258 (MiB)
I0829 20:04:30.127830 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:30.461346 105 logging.cc:49] Loaded engine size: 283 MiB
I0829 20:04:30.744502 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2997, GPU 4550 (MiB)
I0829 20:04:30.745890 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 2997, GPU 4560 (MiB)
I0829 20:04:30.748526 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +281, now: CPU 0, GPU 483 (MiB)
I0829 20:04:30.779644 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2430, GPU 4552 (MiB)
I0829 20:04:30.780572 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2430, GPU 4560 (MiB)
I0829 20:04:30.816233 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +565, now: CPU 0, GPU 1048 (MiB)
I0829 20:04:30.817876 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-offline-am-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 20:04:30.818022 105 tensorrt.cc:5305] TRITONBACKEND_ModelInstanceInitialize: riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 (GPU device 0)
I0829 20:04:30.818513 105 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2439, GPU 5178 (MiB)
I0829 20:04:30.823808 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-offline-am-offline' version 1
I0829 20:04:31.156220 105 logging.cc:49] Loaded engine size: 277 MiB
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:31.447480 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3003, GPU 5464 (MiB)
I0829 20:04:31.448846 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +1, GPU +10, now: CPU 3004, GPU 5474 (MiB)
I0829 20:04:31.451456 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +1, GPU +275, now: CPU 1, GPU 1323 (MiB)
I0829 20:04:31.482080 105 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2448, GPU 5466 (MiB)
I0829 20:04:31.483059 105 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2448, GPU 5474 (MiB)
I0829 20:04:31.490937 105 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +532, now: CPU 1, GPU 1855 (MiB)
  > Riva waiting for Triton server to load all models...retrying in 1 second
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:34.159140 105 tensorrt.cc:1409] Created instance riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 20:04:34.159613 105 model_repository_manager.cc:1149] successfully loaded 'riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming' version 1
I0829 20:04:34.160377 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-offline:1
I0829 20:04:34.260745 105 model_repository_manager.cc:994] loading: citrinet-1024-en-US-asr-streaming:1
  > Riva waiting for Triton server to load all models...retrying in 1 second
I0829 20:04:34.360959 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-offline' version 1
I0829 20:04:34.361246 105 model_repository_manager.cc:1149] successfully loaded 'citrinet-1024-en-US-asr-streaming' version 1
I0829 20:04:34.361409 105 server.cc:522] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0829 20:04:34.361562 105 server.cc:549] 
+-------------------+-----------------------------------------------------------------------------+--------+
| Backend           | Path                                                                        | Config |
+-------------------+-----------------------------------------------------------------------------+--------+
| onnxruntime       | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so             | {}     |
| riva_asr_decoder  | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so   | {}     |
| tensorrt          | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so                   | {}     |
| riva_asr_vad      | /opt/tritonserver/backends/riva_asr_vad/libtriton_riva_asr_vad.so           | {}     |
| riva_asr_features | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so | {}     |
| riva_nlp_pipeline | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so | {}     |
+-------------------+-----------------------------------------------------------------------------+--------+

I0829 20:04:34.361768 105 server.cc:592] 
+-------------------------------------------------------------------------+---------+--------+
| Model                                                                   | Version | Status |
+-------------------------------------------------------------------------+---------+--------+
| citrinet-1024-en-US-asr-offline                                         | 1       | READY  |
| citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline                 | 1       | READY  |
| citrinet-1024-en-US-asr-offline-feature-extractor-offline               | 1       | READY  |
| citrinet-1024-en-US-asr-offline-voice-activity-detector-ctc-offline     | 1       | READY  |
| citrinet-1024-en-US-asr-streaming                                       | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming             | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-feature-extractor-streaming           | 1       | READY  |
| citrinet-1024-en-US-asr-streaming-voice-activity-detector-ctc-streaming | 1       | READY  |
| riva-punctuation-en-US                                                  | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-offline-am-offline                     | 1       | READY  |
| riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming                 | 1       | READY  |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased                    | 1       | READY  |
+-------------------------------------------------------------------------+---------+--------+

I0829 20:04:34.375394 105 metrics.cc:623] Collecting metrics for GPU 0: GRID A100D-8C
I0829 20:04:34.375790 105 tritonserver.cc:1932] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.19.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /data/models                                                                                                                                                                                 |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                                   |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0829 20:04:34.376740 105 grpc_server.cc:4375] Started GRPCInferenceService at 0.0.0.0:8001
I0829 20:04:34.377056 105 http_server.cc:3075] Started HTTPService at 0.0.0.0:8000
I0829 20:04:34.418180 105 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
  > Triton server is ready...
I0829 20:04:35.298982   267 riva_server.cc:118] Using Insecure Server Credentials
I0829 20:04:35.302084   267 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-offline for ASR
I0829 20:04:35.305709   267 model_registry.cc:112] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
I0829 20:04:35.327199   267 model_registry.cc:112] Successfully registered: riva-punctuation-en-US for NLP
W0829 20:04:35.377905 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0829 20:04:35.378040 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0829 20:04:35.378113 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
I0829 20:04:35.397414   267 riva_server.cc:158] Riva Conversational AI Server listening on 0.0.0.0:50051
W0829 20:04:35.397500   267 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
W0829 20:04:36.378317 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0829 20:04:36.378600 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0829 20:04:36.378641 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0829 20:04:37.380072 105 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0829 20:04:37.380270 105 metrics.cc:419] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0829 20:04:37.380347 105 metrics.cc:443] Unable to get energy consumption for GPU 0. Status:Success, value:0
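
Read against the log above, the server side appears to have started: Triton lists every model as READY, and riva_server.cc reports that it is listening on 0.0.0.0:50051. The following is a minimal Python sketch, not part of the Riva tooling, for checking a saved copy of the startup log for those two markers. The function name `summarize_startup` and the idea of saving the log to a file are assumptions made for illustration; the marker strings are copied from the log lines above.

```python
import re

# Marker strings copied from the log above; the helper itself is hypothetical.
TRITON_READY = "Triton server is ready"
LISTEN_RE = re.compile(r"Riva Conversational AI Server listening on (\S+):(\d+)")


def summarize_startup(log_text: str) -> dict:
    """Report whether Triton became ready and which endpoint Riva bound to."""
    summary = {"triton_ready": False, "riva_endpoint": None}
    for line in log_text.splitlines():
        if TRITON_READY in line:
            summary["triton_ready"] = True
        match = LISTEN_RE.search(line)
        if match:
            # Store the endpoint as (host, port) with the port as an integer.
            summary["riva_endpoint"] = (match.group(1), int(match.group(2)))
    return summary
```

If both markers are present, as they are in this log, the "Unable to establish connection to server" error is more likely a reachability problem between the client and port 50051 than a failed server startup, although the log alone cannot confirm that.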

rvinobha · September 6, 2022, 5:25pm

Hi @NSDB

Apologies for the delay.

Quick check:

Are the client and server running on the same machine or on different machines?
If they are running on different machines, can each one ping the other?
Is port 50051 open and reachable from the client, i.e. not blocked by a firewall?
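
The port check can be sketched in a few lines of Python using only the standard library. This is a rough illustration rather than an official Riva diagnostic: the helper name `can_connect` is hypothetical, and `localhost` is an assumed host that should be replaced with the server's address when the client runs on a different machine. Port 50051 is the one the server log reports. A successful TCP connection only shows that the port is reachable; it does not prove that the gRPC service behind it is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "localhost"  # assumption: replace with the Riva server's address
    port = 50051        # port from the "listening on 0.0.0.0:50051" log line
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

Running it from the same environment as riva_asr_client (for example, inside the client container if one is used) gives the most relevant answer, since that is where the gRPC channel is created.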

Thanks

NSDB · September 11, 2022, 10:25pm

Unable to test before the server was repurposed.

Sorry - will have to try again some other time.

0to1000subs1.0 · May 11, 2024, 12:49pm

I still have the same issue.
