DetectNet from TAO not running on DLA via DeepStream on Jetson NX

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson Xavier NX Dev Kit)
• DeepStream Version 6.2
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.5.2.2

• Issue Type (questions)

I run the default deepstream-app using:
'deepstream-app -c my_config_file.txt'

The model I am attempting to run with nvinfer is a DetectNet_v2 (ResNet-34 backbone) trained in TAO (unpruned).

Upon running it with DLA enabled (use-dla-core=0),
I get the following error for every layer.

0:00:02.106876928 3243 0xaaaad751f600 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1170> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:02.109492864 3243 0xaaaad751f600 INFO nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
WARNING: [TRT]: The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
WARNING: DLA does not support FP32 precision type, using FP16 mode.
WARNING: [TRT]: Layer 'output_bbox/bias' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.
WARNING: [TRT]: Layer 'conv1/kernel' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.
WARNING: [TRT]: Layer 'conv1/bias' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.
WARNING: [TRT]: Layer 'bn_conv1/moving_variance' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.
WARNING: [TRT]: Layer 'bn_conv1/Reshape_1/shape' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.
WARNING: [TRT]: Layer 'bn_conv1/batchnorm/add/y' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.
WARNING: [TRT]: Layer 'bn_conv1/gamma' (CONSTANT): Unsupported on DLA. Switching this layer's device type to GPU.

I intend to run an object detection pipeline on the onboard DLAs. I would appreciate input on how to fix this, or on which TAO-based object detection model will work with the DLAs.

Thank you in advance

  1. Could you share the nvinfer configuration file? Did you set enable-dla and use-dla-core? Please refer to nvinfer for more explanation.
  2. Noticing these are warning logs: how do you know every layer gets an error?

Hi. Thank you for your reply.

  1. nvinfer config file attached. config_infer_primary (copy).txt (967 Bytes)

did you set enable-dla and use-dla-core? → I used enable-dla
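
For reference, a minimal sketch of both keys as they would sit in the nvinfer config, assuming DLA core 0 is the target:

enable-dla=1
use-dla-core=0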

  1. Error on my part to call it an error! :)
    However, I want to push this workload onto the DLA so as to free up the GPU for other functionality. Here the DLA is failing and it's falling back onto the GPU.

Look forward to your response.

From the logs, DLA does not support the FP32 precision type. Is that "output_bbox/bias" layer FP32 precision? Are all layers FP32 precision?
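
One way to check per-layer placement is to build the engine with trtexec and read the verbose log. A sketch, assuming a trtexec-loadable copy of the model such as an ONNX export (the file name is a placeholder; trtexec cannot parse the .etlt directly):

# Build for DLA core 0 in FP16 with GPU fallback allowed; the verbose log
# reports which layers land on the DLA and which fall back to the GPU.
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 \
  --useDLACore=0 --allowGPUFallback --verbose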

In the config file I specified it for FP16.

0=FP32, 1=INT8, 2=FP16 mode

network-mode=2
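
For context, the earlier "DLA does not support FP32" warning appears to be what nvinfer prints when FP32 is requested while DLA is enabled; with network-mode=2 the engine is built in FP16, which DLA supports. A sketch of the precision line as it sits in the [property] group:

[property]
# 0=FP32, 1=INT8, 2=FP16 (DLA supports only FP16 and INT8)
network-mode=2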

Is this something I would need to do before exporting the .etlt from TAO?
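
If export-side precision is the concern, a sketch of an FP16 export per the TAO 3.x docs (a hypothetical invocation; paths and $KEY are placeholders, and the flags should be verified against your TAO version):

# Hypothetical export command; verify flags against your TAO release
tao detectnet_v2 export -m model.tlt -k $KEY -o model.etlt --data_type fp16

Note that with a .etlt the runtime precision is ultimately chosen at engine-build time (network-mode in nvinfer); --data_type mainly matters for INT8 calibration.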

Please refer to Transfer Learning Toolkit DLA warning messages - #7 by Morganh
Actually, you can ignore the warning info which mentions that the layer does not support DLA.
Those layers have been fused into other layers which DLA supports.

Hi Morgan,

I set the following relevant settings in the config file:

enable-dla=1
use-dla-core=0
infer-dims=3;384;1248 (do I need to reduce this size?)

The DLA core seems to have been activated when I used jtop to visualize what is happening; however, there is still significant GPU usage.
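
For anyone reproducing this, a sketch of the monitoring setup, assuming the jetson-stats package that provides jtop:

# jtop ships with the jetson-stats package
sudo pip3 install -U jetson-stats
jtop    # the DLA engines are listed alongside the GPU in the status pages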

I’ve attached the output log here:
log.txt (64.0 KB)

The model I am trying to run is a custom-trained DetectNet_v2 (ResNet-34 backbone). I have done NO pruning, INT8 calibration, or QAT on this model. Do I need to do any of those?
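
For reference, none of those should be required just to run on DLA: pruning is a size/speed optimization, and INT8 calibration or QAT only matter if you build an INT8 engine. A sketch of pruning per the TAO 3.x docs (hypothetical paths and threshold; verify flags against your TAO version, and note a pruned model must be retrained before export):

# Hypothetical prune command; -pth is the pruning threshold
tao detectnet_v2 prune -m unpruned.tlt -o pruned.tlt -k $KEY -pth 0.1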

There is no update from you for a period, assuming this is not an issue any more. Hence we are closing this topic. If need further support, please open a new one. Thanks.

You can run the official deepstream_tao_apps (GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream) along with the PeopleNet etlt model:
PeopleNet | NVIDIA NGC
Set infer-dims=3;544;960
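
A minimal sketch of the nvinfer keys for that PeopleNet test (the model path and key are placeholders; PeopleNet's actual load key is listed on its NGC page):

[property]
tlt-encoded-model=resnet34_peoplenet.etlt
tlt-model-key=<key from the NGC page>
infer-dims=3;544;960
network-mode=2
enable-dla=1
use-dla-core=0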

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.