I am currently running inference performance tests with TensorRT, and I have also applied MPS (Multi-Process Service) for further optimization.
Before converting the model with TensorRT, enabling MPS improved FPS by about 20-30%. After converting with TensorRT, however, enabling MPS yields almost the same performance as without it.
I am curious whether there are any similar cases or known reasons for this.
I also wonder whether nvidia-smi is a reliable way to measure GPU utilization. Additionally, I would like to know if NVIDIA provides any command to obtain information about the CUDA cores or Tensor Cores on a GPU.
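For reference, `nvidia-smi -q` and the deviceQuery CUDA sample report the number of multiprocessors (SMs) and the compute capability, and the total CUDA core count can then be derived from a per-SM table. A minimal sketch, assuming the cores-per-SM values from the deviceQuery sample source (the table below covers only a few recent architectures):

```python
# Derive total CUDA core count from SM count and compute capability,
# as reported by the deviceQuery CUDA sample or `nvidia-smi -q`.
# The cores-per-SM mapping is assumed from the deviceQuery sample source
# and covers only a few architectures.
CORES_PER_SM = {
    (7, 0): 64,   # Volta
    (7, 5): 64,   # Turing
    (8, 0): 64,   # Ampere GA100
    (8, 6): 128,  # Ampere GA10x
    (8, 9): 128,  # Ada Lovelace
}

def cuda_cores(sm_count: int, major: int, minor: int) -> int:
    """Total CUDA cores = SM count * FP32 cores per SM for this architecture."""
    return sm_count * CORES_PER_SM[(major, minor)]

# Example: an RTX 3090 reports 82 SMs at compute capability 8.6.
print(cuda_cores(82, 8, 6))  # 10496
```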
Hi,
Could you try running your model with the trtexec command and share the "--verbose" log in case the issue persists?
You can refer to the link below for the list of supported operators; if an operator is not supported, you will need to create a custom plugin to support that operation.
Also, please share your model and script (if not shared already) so that we can help you better.
Meanwhile, for some common errors and queries, please refer to the link below:
What I am curious about is whether it is correct to treat "GPU utilization" and the utilization reported by nvidia-smi as the same thing. I have doubts because even when the GPU's CUDA cores are not fully utilized, nvidia-smi still reports 100% utilization.
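On the utilization question: per the nvidia-smi documentation, `utilization.gpu` reports the percentage of time over the sampling period during which at least one kernel was executing, not how many CUDA cores were busy, so even a small kernel occupying a single SM can show 100%. For per-kernel occupancy you would need a profiler such as Nsight Compute. A minimal sketch of reading that counter (the parsing assumes the CSV format produced by `nvidia-smi --query-gpu`; the sample string is hard-coded for illustration):

```python
import subprocess

# Real nvidia-smi query flags; run on a machine with an NVIDIA driver.
QUERY = ["nvidia-smi",
         "--query-gpu=utilization.gpu,utilization.memory",
         "--format=csv,noheader,nounits"]

def parse_utilization(csv_text: str) -> list[tuple[int, int]]:
    """Parse `nvidia-smi --query-gpu` CSV output into
    (gpu_util %, memory_util %) tuples, one row per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        gpu, mem = (int(field.strip()) for field in line.split(","))
        rows.append((gpu, mem))
    return rows

# Hypothetical output for a 2-GPU machine, hard-coded for illustration:
sample = "100, 37\n4, 1\n"
print(parse_utilization(sample))  # [(100, 37), (4, 1)]

# On a real machine you would instead run:
#   out = subprocess.run(QUERY, capture_output=True, text=True).stdout
#   print(parse_utilization(out))
```

Note that 100% here only means a kernel was resident during each sample window, which is consistent with the behavior you observed.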