Performance Analyzer cannot collect metrics on Jetson Xavier

Hi.

I have deployed Triton on Jetson Xavier and used Performance Analyzer to measure model performance during inference. Latency and throughput are measured correctly, but when I try to collect metrics using the --collect-metrics option, the following warnings appear:

WARNING: Unable to parse 'nv_gpu_utilization' metric.
WARNING: Unable to parse 'nv_gpu_power_usage' metric.
WARNING: Unable to parse 'nv_gpu_memory_used_bytes' metric.
WARNING: Unable to parse 'nv_gpu_memory_total_bytes' metric.

The command I am using to launch the inferences is:
/usr/local/bin/perf_analyzer --collect-metrics -m 3D_fp32_05_batchd -b 1 --concurrency-range 1

I do not know if some additional features must be activated on Jetson Xavier to obtain those metrics.

Best regards,
Nicolas

Hi,

Some profiling functions might not be supported on Jetson due to the iGPU environment.
Do you still get latency and throughput when using the --collect-metrics option?

Thanks.

Hi,

Yes, I can get latency and throughput values. Just to make it clear, here is a short excerpt of the command's output:

perf_analyzer -m 2D_best_15_batchd --collect-metrics -f output.csv --verbose-csv

*** Measurement Settings ***
Batch size: 1
Service Kind: Triton
Using "time_windows" mode for stabilization
Measurement window: 5000 msec
Using synchronous calls for inference
Stabilizing using average latency

Request concurrency: 1
WARNING: Unable to parse 'nv_gpu_utilization' metric.
WARNING: Unable to parse 'nv_gpu_power_usage' metric.
WARNING: Unable to parse 'nv_gpu_memory_used_bytes' metric.
WARNING: Unable to parse 'nv_gpu_memory_total_bytes' metric.
Client:
Request count: 10465
Throughput: 581.241 infer/sec
Avg latency: 1718 usec (standard deviation 94 usec)
p50 latency: 1696 usec
p90 latency: 1846 usec
p95 latency: 1874 usec
p99 latency: 1925 usec
Avg HTTP time: 1711 usec (send/recv 144 usec + response wait 1567 usec)
················

As you can see, warnings appear at the beginning, but latency and throughput values are still correctly calculated.

Best regards,

Hi,

Since the iGPU uses a shared memory system (physical memory shared by the CPU and GPU), memory-related profiling might not be supported.
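
One way to narrow this down is to check whether the server's metrics endpoint reports these four metrics at all. The following is a minimal Python sketch, not an official tool. It assumes Triton's Prometheus-format metrics are served at the default http://localhost:8002/metrics (adjust the URL if your server is configured differently). The helper names fetch_metrics, reported_metric_names, and missing_gpu_metrics are hypothetical and only for illustration.

```python
import re
import urllib.request

# The four GPU metrics named in the perf_analyzer warnings in this thread.
EXPECTED_GPU_METRICS = (
    "nv_gpu_utilization",
    "nv_gpu_power_usage",
    "nv_gpu_memory_used_bytes",
    "nv_gpu_memory_total_bytes",
)

# A Prometheus metric name at the start of a sample line.
_METRIC_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:]*")


def fetch_metrics(url="http://localhost:8002/metrics", timeout=5.0):
    """Download the metrics text. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def reported_metric_names(metrics_text):
    """Return the names of all metrics that have at least one sample line."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(0))
    return names


def missing_gpu_metrics(metrics_text):
    """List the expected GPU metrics that have no sample in the text."""
    reported = reported_metric_names(metrics_text)
    return [name for name in EXPECTED_GPU_METRICS if name not in reported]


# Example usage against a running server (not executed here):
#   print(missing_gpu_metrics(fetch_metrics()))
```

If all four names are listed as missing, the warnings most likely reflect what the server reports on this platform rather than a problem in the perf_analyzer command itself.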

Thanks.

Hi,

Thanks for your answer.