Enable CPU metrics

When using Riva, what is the recommended way to also enable CPU metrics? I noticed they aren’t enabled. Or are they exposed on another port/path?

I was reading the page below and tried adding the arguments it describes to the riva_start script.
Metrics — NVIDIA Triton Inference Server

That didn’t work though.

Hi @ryein

Thanks for your interest in Riva.

I will check with the internal team and get back to you with an update.

Thanks

Thanks. Really curious to know how to do it.

Hi @ryein

I heard from the internal team that CPU metrics are supposed to work.

Can you share further details on how you are using Riva, including the complete command you tried for enabling CPU metrics?
Also, does it show an error? If so, please share those details with us too.

Thanks

Right now I am just using the riva start script. Do I need to add a config option to the config.sh file? I tried adding the arguments to the docker start command in the riva_start script, but they do not work. The arguments I tried are the ones in the documentation.
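For anyone who wants to confirm the same thing on their side, here is a minimal sketch that lists the metric families in a scrape and reports which CPU metrics are missing. The three `nv_cpu_*` names are an assumption taken from the Triton metrics documentation and may not match the Triton version bundled with a given Riva release; `metric_families`, `missing_cpu_metrics`, and `SAMPLE` are hypothetical names used only for illustration.

```python
import re

# Metric family names that the Triton docs list for CPU metrics.
# ASSUMPTION: verify these against the Triton version shipped with your Riva release.
ASSUMED_CPU_METRICS = {
    "nv_cpu_utilization",
    "nv_cpu_memory_total_bytes",
    "nv_cpu_memory_used_bytes",
}

# Matches Prometheus text-format lines such as "# TYPE nv_cache_util gauge".
_TYPE_LINE = re.compile(r"^#\s*TYPE\s+(\S+)\s+(\S+)")


def metric_families(exposition_text):
    """Return {family_name: metric_type} parsed from the '# TYPE' lines."""
    families = {}
    for line in exposition_text.splitlines():
        match = _TYPE_LINE.match(line.strip())
        if match:
            families[match.group(1)] = match.group(2)
    return families


def missing_cpu_metrics(exposition_text, expected=ASSUMED_CPU_METRICS):
    """Return the expected CPU metric families that are absent from the scrape."""
    return sorted(set(expected) - set(metric_families(exposition_text)))


# A few lines in the same shape as the metrics dump, used as sample input.
SAMPLE = """\
# HELP nv_inference_request_success Number of successful inference requests, all batch sizes
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_cache_util Cache utilization [0.0 - 1.0]
# TYPE nv_cache_util gauge
"""

if __name__ == "__main__":
    print(metric_families(SAMPLE))
    print(missing_cpu_metrics(SAMPLE))
```

To run it against a live server, save the text returned by the metrics endpoint to a file and pass its contents to `missing_cpu_metrics`. An empty list would mean all three assumed CPU families are present in the scrape.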

Here is a complete dump of the metrics.

# HELP nv_inference_request_success Number of successful inference requests, all batch sizes
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="citrinet-1024-en-US-asr-streaming",version="1"} 190062.000000
nv_inference_request_success{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_request_success{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 190062.000000
nv_inference_request_success{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_request_success{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_request_success{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_request_success{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_request_success{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_request_success{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_request_failure Number of failed inference requests, all batch sizes
# TYPE nv_inference_request_failure counter
nv_inference_request_failure{model="citrinet-1024-en-US-asr-streaming",version="1"} 801.000000
nv_inference_request_failure{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_request_failure{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_request_failure{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_request_failure{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_request_failure{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_request_failure{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_request_failure{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_request_failure{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_count Number of inferences performed (does not include cached requests)
# TYPE nv_inference_count counter
nv_inference_count{model="citrinet-1024-en-US-asr-streaming",version="1"} 190062.000000
nv_inference_count{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_count{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 190062.000000
nv_inference_count{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_count{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_count{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_count{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_count{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_count{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_exec_count Number of model executions performed (does not include cached requests)
# TYPE nv_inference_exec_count counter
nv_inference_exec_count{model="citrinet-1024-en-US-asr-streaming",version="1"} 190062.000000
nv_inference_exec_count{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_exec_count{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 190062.000000
nv_inference_exec_count{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_exec_count{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_exec_count{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_exec_count{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_exec_count{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_exec_count{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_request_duration_us Cumulative inference request duration in microseconds (includes cached requests)
# TYPE nv_inference_request_duration_us counter
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-streaming",version="1"} 1890270635.000000
nv_inference_request_duration_us{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 1227529421.000000
nv_inference_request_duration_us{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_request_duration_us{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_queue_duration_us Cumulative inference queuing duration in microseconds (includes cached requests)
# TYPE nv_inference_queue_duration_us counter
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-streaming",version="1"} 308670.000000
nv_inference_queue_duration_us{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 224883003.000000
nv_inference_queue_duration_us{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_queue_duration_us{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_compute_input_duration_us Cumulative compute input duration in microseconds (does not include cached requests)
# TYPE nv_inference_compute_input_duration_us counter
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-streaming",version="1"} 29441679.000000
nv_inference_compute_input_duration_us{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 29441679.000000
nv_inference_compute_input_duration_us{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_compute_input_duration_us{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_compute_infer_duration_us Cumulative compute inference duration in microseconds (does not include cached requests)
# TYPE nv_inference_compute_infer_duration_us counter
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-streaming",version="1"} 962982552.000000
nv_inference_compute_infer_duration_us{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 962982552.000000
nv_inference_compute_infer_duration_us{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_compute_infer_duration_us{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_inference_compute_output_duration_us Cumulative inference compute output duration in microseconds (does not include cached requests)
# TYPE nv_inference_compute_output_duration_us counter
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-streaming",version="1"} 4869094.000000
nv_inference_compute_output_duration_us{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 4869094.000000
nv_inference_compute_output_duration_us{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_inference_compute_output_duration_us{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_cache_num_entries Number of responses stored in response cache
# TYPE nv_cache_num_entries gauge
# HELP nv_cache_num_lookups Number of cache lookups in response cache
# TYPE nv_cache_num_lookups gauge
# HELP nv_cache_num_hits Number of cache hits in response cache
# TYPE nv_cache_num_hits gauge
# HELP nv_cache_num_misses Number of cache misses in response cache
# TYPE nv_cache_num_misses gauge
# HELP nv_cache_num_evictions Number of cache evictions in response cache
# TYPE nv_cache_num_evictions gauge
# HELP nv_cache_lookup_duration Total cache lookup duration (hit and miss), in microseconds
# TYPE nv_cache_lookup_duration gauge
# HELP nv_cache_insertion_duration Total cache insertion duration, in microseconds
# TYPE nv_cache_insertion_duration gauge
# HELP nv_cache_util Cache utilization [0.0 - 1.0]
# TYPE nv_cache_util gauge
# HELP nv_cache_num_hits_per_model Number of cache hits per model
# TYPE nv_cache_num_hits_per_model counter
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_num_hits_per_model{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_cache_hit_lookup_duration_per_model Total cache hit lookup duration per model, in microseconds
# TYPE nv_cache_hit_lookup_duration_per_model counter
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_hit_lookup_duration_per_model{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_cache_num_misses_per_model Number of cache misses per model
# TYPE nv_cache_num_misses_per_model counter
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_num_misses_per_model{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_cache_miss_lookup_duration_per_model Total cache miss lookup duration per model, in microseconds
# TYPE nv_cache_miss_lookup_duration_per_model counter
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_miss_lookup_duration_per_model{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_cache_miss_insertion_duration_per_model Total cache miss insertion duration per model, in microseconds
# TYPE nv_cache_miss_insertion_duration_per_model counter
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="riva-trt-conformer-en-US-asr-offline-am-streaming-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="riva-trt-citrinet-1024-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="riva-trt-citrinet-1024-en-US-asr-offline-am-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-offline-feature-extractor-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="riva-trt-conformer-en-US-asr-streaming-am-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-streaming-feature-extractor-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-offline-endpointing-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-offline-feature-extractor-streaming-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="citrinet-1024-en-US-asr-offline-ctc-decoder-cpu-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="riva-trt-riva-punctuation-en-US-nn-bert-base-uncased",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-offline-endpointing-streaming-offline",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="conformer-en-US-asr-streaming-endpointing-streaming",version="1"} 0.000000
nv_cache_miss_insertion_duration_per_model{model="riva-punctuation-en-US",version="1"} 0.000000
# HELP nv_gpu_utilization GPU utilization rate [0.0 - 1.0)
# TYPE nv_gpu_utilization gauge
nv_gpu_utilization{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 0.000000
# HELP nv_gpu_memory_total_bytes GPU total memory, in bytes
# TYPE nv_gpu_memory_total_bytes gauge
nv_gpu_memory_total_bytes{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 25769803776.000000
# HELP nv_gpu_memory_used_bytes GPU used memory, in bytes
# TYPE nv_gpu_memory_used_bytes gauge
nv_gpu_memory_used_bytes{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 12369002496.000000
# HELP nv_gpu_power_usage GPU power usage in watts
# TYPE nv_gpu_power_usage gauge
nv_gpu_power_usage{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 17.664000
# HELP nv_gpu_power_limit GPU power management limit in watts
# TYPE nv_gpu_power_limit gauge
nv_gpu_power_limit{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 350.000000
# HELP nv_energy_consumption GPU energy consumption in joules since the Triton Server started
# TYPE nv_energy_consumption counter
nv_energy_consumption{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 25086026.825000
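
For anyone checking a dump like the one above, here is a small sketch that lists which metric families are actually present and whether any CPU ones show up. It assumes CPU metrics would appear under an `nv_cpu_` prefix (for example `nv_cpu_utilization`), as described in the Triton metrics documentation linked earlier. The helper names `metric_families` and `has_cpu_metrics` are made up for illustration, and the sample text is just a few lines copied from the dump.

```python
import re

# Matches a Prometheus text-format sample line: name, optional {labels}, value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')


def metric_families(text):
    """Return the set of metric names that have at least one sample line."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def has_cpu_metrics(text):
    """True if any metric name starts with the assumed 'nv_cpu_' prefix."""
    return any(name.startswith("nv_cpu_") for name in metric_families(text))


# A few lines copied from the dump in this thread.
sample = """\
# HELP nv_gpu_utilization GPU utilization rate [0.0 - 1.0)
# TYPE nv_gpu_utilization gauge
nv_gpu_utilization{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 0.000000
# HELP nv_gpu_power_usage GPU power usage in watts
# TYPE nv_gpu_power_usage gauge
nv_gpu_power_usage{gpu_uuid="GPU-1e77b23b-e713-6cda-1b5d-b7e0a09f2b33"} 17.664000
"""

print(sorted(metric_families(sample)))
print(has_cpu_metrics(sample))
```

Running the same check against the full text saved from the metrics endpoint gives a quick yes/no on whether the CPU families are being exported at all, which is easier than scanning several hundred lines by eye.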