TensorRT explicit batch with DLA

I am implementing my own GStreamer inference plugin with TensorRT. While looking at the DeepStream gst-nvinfer source code, I found that when the DLA is used, a network with implicit dimensions is created in nvdsinfer_model_builder.cpp. However, in nvdsinfer_backend.cpp there is a backend context for an engine that has both explicit batch and DLA support:

    if (!(*engine)->hasImplicitBatchDimension())
    {
        /* Engine built with fulldims support */
        assert((*engine)->getNbOptimizationProfiles() > 0);

        if (engine->hasDla())
        {
            backend = std::make_unique<DlaFullDimTrtBackendContext>(
                std::move(cudaCtx), engine, DEFAULT_CONTEXT_PROFILE_IDX);
        }
        else
        {
            backend = std::make_unique<FullDimTrtBackendContext>(
                std::move(cudaCtx), engine, DEFAULT_CONTEXT_PROFILE_IDX);
        }
    }

Is it possible then to build a TRT engine with explicit batch that can run on DLA?
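For context, this is roughly what I would like to do: build an explicit-batch engine that targets a DLA core. This is only a sketch assuming the TensorRT 8.x C++ API; the model file `model.onnx`, the input tensor name `input`, and its dimensions are placeholders for my actual network.

```cpp
#include <cstdint>
#include <memory>
#include "NvInfer.h"
#include "NvOnnxParser.h"

// Sketch: build an explicit-batch engine on DLA core 0.
// "model.onnx" and the tensor name "input" are hypothetical.
void buildExplicitBatchDlaEngine(nvinfer1::ILogger& logger)
{
    using namespace nvinfer1;

    auto builder = std::unique_ptr<IBuilder>(createInferBuilder(logger));

    // Explicit batch: all dimensions, including batch, come from the network.
    auto network = std::unique_ptr<INetworkDefinition>(builder->createNetworkV2(
        1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH)));

    auto parser = std::unique_ptr<nvonnxparser::IParser>(
        nvonnxparser::createParser(*network, logger));
    parser->parseFromFile("model.onnx", /*verbosity=*/0);

    auto config = std::unique_ptr<IBuilderConfig>(builder->createBuilderConfig());
    config->setFlag(BuilderFlag::kFP16);          // DLA requires FP16 or INT8
    config->setFlag(BuilderFlag::kGPU_FALLBACK);  // unsupported layers fall back to GPU
    config->setDefaultDeviceType(DeviceType::kDLA);
    config->setDLACore(0);

    // Explicit batch requires at least one optimization profile.
    auto* profile = builder->createOptimizationProfile();
    profile->setDimensions("input", OptProfileSelector::kMIN, Dims4{1, 3, 224, 224});
    profile->setDimensions("input", OptProfileSelector::kOPT, Dims4{1, 3, 224, 224});
    profile->setDimensions("input", OptProfileSelector::kMAX, Dims4{1, 3, 224, 224});
    config->addOptimizationProfile(profile);

    auto serialized = std::unique_ptr<IHostMemory>(
        builder->buildSerializedNetwork(*network, *config));
    // serialized->data() / serialized->size() can then be written to disk.
}
```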

Hi,
Could you try running your model with the trtexec command and share the "--verbose" log if the issue persists?
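For example, something like the following (the model path is hypothetical; the DLA-related flags select DLA core 0 and allow GPU fallback for unsupported layers):

```shell
# Hypothetical model path; builds and times the model on DLA core 0
# with verbose logging enabled.
trtexec --onnx=model.onnx \
        --useDLACore=0 \
        --allowGPUFallback \
        --fp16 \
        --verbose
```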

You can refer to the link below for the list of supported operators. If any operator is not supported, you need to create a custom plugin to support that operation.

Also, please share your model and script, if you have not already, so that we can help you better.

Meanwhile, for some common errors and queries, please refer to the link below:

Thanks!

Hi,

Currently there is no issue; it will take me some time to finish the plugin and test it. However, this is something that the engineering team should be aware of and clearly state in the documentation. Is it possible to forward this question to them?

Hi,

We hope the following doc helps you. If you need further assistance, we can move this post to the DeepStream forum so you can get better help.

Thank you.

No, it does not help. I have already read the DLA section of the docs. Forget about DeepStream; my question is TRT-related:

Can one create a network with an explicit batch if a DLA core is used?

Hi,

Explicit batch is always allowed for DLA.
DLA also allows the user to use "implicit batch" mode, but it can then only run at the max batch size.
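To illustrate the implicit-batch path: the batch size is fixed at build time via setMaxBatchSize, and on DLA the engine can then only execute at exactly that batch size. A rough sketch assuming the TensorRT 8.x C++ API (where implicit batch is deprecated but still available); the batch size of 4 is just an example.

```cpp
#include <memory>
#include "NvInfer.h"

// Sketch of the implicit-batch path (deprecated in TensorRT 8.x).
// On DLA the resulting engine can only run at this max batch size.
void buildImplicitBatchDlaEngine(nvinfer1::ILogger& logger)
{
    using namespace nvinfer1;

    auto builder = std::unique_ptr<IBuilder>(createInferBuilder(logger));

    // Flags = 0 -> implicit batch: the network describes a single sample.
    auto network = std::unique_ptr<INetworkDefinition>(builder->createNetworkV2(0U));

    // ... populate the network here (ONNX parsing requires explicit batch,
    // so layers would have to be added via the network definition API) ...

    builder->setMaxBatchSize(4);  // DLA will only run batches of exactly 4

    auto config = std::unique_ptr<IBuilderConfig>(builder->createBuilderConfig());
    config->setFlag(BuilderFlag::kFP16);  // DLA requires FP16 or INT8
    config->setDefaultDeviceType(DeviceType::kDLA);
    config->setDLACore(0);

    auto engine = std::unique_ptr<ICudaEngine>(
        builder->buildEngineWithConfig(*network, *config));
}
```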

Thank you.

Then why is the documentation using an implicit batch in the example?

Hi,

Sorry, I missed conveying another point.
Please refer to my edited response.

Thank you.

Alright, thank you for the clarification!