I wonder: is there a way to implement custom layers on DLA for operations it does not support?
There are several documents showing how to write custom TensorRT layers that run on CUDA/Tensor Cores, e.g. Extending TensorRT with Custom Layers, and there are plugin examples in TensorRT Plugins.
Hi,
Sorry for the late update.
Which layer do you need?
Since DLA is a hardware engine, please check the below document to see if your layer can be supported or not first:
# Supported ONNX Operators & Functions on Orin DLA
DLA operator functionality is exposed through the TensorRT builder, which internally links to DLA SW libraries (see [DLA Workflow](https://developer.nvidia.com/deep-learning-accelerator)). While some ONNX operators or functions may already be available in DLA SW, TensorRT may not expose them yet.
See below for the support matrix of ONNX operators & functions on Orin DLA. If you are interested in a specific DLA operator that is not supported through TensorRT yet, feel free to raise a [GitHub Issue](https://github.com/NVIDIA/Deep-Learning-Accelerator-SW/issues) and/or inform your NVIDIA representative (in particular for NVIDIA DRIVE customers).
See [General Restrictions](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#dla-lay-supp-rest) that apply to all operations below. Many of those ops are supported on Xavier DLA as well, see [Layer Support and Restrictions](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#dla-lay-supp-rest).
TensorRT 8.6 supports operators up to Opset 17. The latest information on ONNX operators can be found [here](https://github.com/onnx/onnx/blob/master/docs/Operators.md).
Note that the scripts in `op_reconstruction/` are intended as a recipe for how ops currently not supported on DLA can be decomposed into supported ops. Depending on your setup, you may choose to perform such op reconstructions in the ONNX domain post-training (as done here) or during the training process (for example in TensorFlow or PyTorch). The case of "Native" in the DLA SW support column and "Reconstruction" in the TensorRT support column indicates that an op can be supported through TensorRT by decomposing it into other DLA ops already supported by TensorRT.
The operator support matrix below requires the following minimum system configuration (each OS ships by default with the DLA SW and TensorRT versions listed to its right):
| **Hardware platform** | **OS** | **DLA SW version** | **TensorRT version** |
| ----------------- | ---------------- | -------------- | ---------------- |
| DRIVE Orin (Automotive) | DRIVE OS 6.0.6.0 | DLA 3.12.0 | TensorRT 8.5.10 |
| Jetson Orin (Embedded) | JetPack 5.1.1 | DLA 3.12.1 | TensorRT 8.5.2 |
| DRIVE Orin (Automotive) | DRIVE OS 6.0.7.0 | DLA 3.13.0 | TensorRT 8.6.10 |
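As a rough illustration of the "Reconstruction" path described in the file above, here is a minimal sketch of rewriting an op in the ONNX domain. It assumes the onnx-graphsurgeon package is installed; the Neg-to-Mul rewrite and the file names are stand-ins for illustration, not taken from the repository's `op_reconstruction/` scripts.

```python
import numpy as np
import onnx
import onnx_graphsurgeon as gs  # assumption: onnx-graphsurgeon is installed

graph = gs.import_onnx(onnx.load("model.onnx"))  # illustrative path

# Stand-in example: decompose Neg(x) into Mul(x, -1), i.e. replace an
# "unsupported" op with an op that the DLA/TensorRT path already handles.
for i, node in enumerate(graph.nodes):
    if node.op == "Neg":
        node.op = "Mul"
        node.attrs.clear()
        node.inputs.append(
            gs.Constant(f"neg_one_{i}", values=np.array([-1.0], dtype=np.float32)))

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model_reconstructed.onnx")
```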
Thanks.
Hi,
Sorry for the late reply.
I want to deploy a UNet-like neural network on DLA, e.g. the model from the official UNet repo:
class UNet(nn.Module):
    # class wrapper restored for readability; signature follows the official UNet repo
    def __init__(self, n_channels, n_classes, bilinear=False):
        super().__init__()
        self.inc = (DoubleConv(n_channels, 64))
        self.down1 = (Down(64, 128))
        self.down2 = (Down(128, 256))
        self.down3 = (Down(256, 512))
        factor = 2 if bilinear else 1
        self.down4 = (Down(512, 1024 // factor))
        self.up1 = (Up(1024, 512 // factor, bilinear))
        self.up2 = (Up(512, 256 // factor, bilinear))
        self.up3 = (Up(256, 128 // factor, bilinear))
        self.up4 = (Up(128, 64, bilinear))
        self.outc = (OutConv(64, n_classes))

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        logits = self.outc(x)
        return logits
The network offers two upscaling options: one uses a resize layer (nn.Upsample) for bilinear upscaling, and the other uses a deconvolution layer (nn.ConvTranspose2d).
class Up(nn.Module):
    """Upscaling then double conv"""

    def __init__(self, in_channels, out_channels, bilinear=True):
        super().__init__()

        # if bilinear, use the normal convolutions to reduce the number of channels
        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
            self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)
        else:
            self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
            self.conv = DoubleConv(in_channels, out_channels)

    def forward(self, x1, x2):
        x1 = self.up(x1)
        # input is CHW
        diffY = x2.size()[2] - x1.size()[2]
        diffX = x2.size()[3] - x1.size()[3]
        x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                        diffY // 2, diffY - diffY // 2])
        x = torch.cat([x2, x1], dim=1)
        return self.conv(x)
I have tested this several times, and the performance difference between these two options is significant. DLA does not support the resize layer used in UNet, but it does support deconvolution.
So I wonder: is there a way to implement custom, unsupported layers on DLA? Or is DLA fixed-function hardware that accelerates only specific deep-learning layers, so unsupported layers cannot be added?
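For reference, here is a minimal sketch of how either upscaling variant can be exported to ONNX for a DLA build; the input shape, file name, opset, and the trtexec flags in the comment are illustrative assumptions.

```python
import torch

# assumes the UNet class quoted above is importable
model = UNet(n_channels=3, n_classes=2, bilinear=True).eval()  # or bilinear=False
dummy = torch.randn(1, 3, 512, 512)  # illustrative input resolution

torch.onnx.export(model, dummy, "unet_bilinear.onnx",
                  opset_version=13,
                  input_names=["input"], output_names=["output"])

# The resulting ONNX file can then be built for DLA, e.g. with
#   trtexec --onnx=unet_bilinear.onnx --useDLACore=0 --allowGPUFallback --fp16 --verbose
# where the verbose log shows which layers fall back to the GPU.
```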
Thank you.
Hi,
You can find the details below. For a resize layer, DLA only supports integer scaling.
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#dla-lay-supp-rest
The last two elements in scales, representing the scale values along height and width dimensions, respectively, must be integer values in the range of [1, 32] in nearest-neighbor mode and [1, 4] in bilinear mode.
Does your model meet the requirements?
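One quick way to check is to read the scale factors of each Resize node from the exported ONNX model. A hedged sketch (the file name is illustrative, and it assumes the scales are stored as an initializer rather than computed by an upstream node):

```python
import onnx
from onnx import numpy_helper

model = onnx.load("unet_bilinear.onnx")  # illustrative path
inits = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type != "Resize":
        continue
    # Resize inputs are (X, roi, scales, sizes); scales is the third input.
    scales_name = node.input[2] if len(node.input) > 2 else ""
    if scales_name in inits:
        scales = numpy_helper.to_array(inits[scales_name])
        h, w = float(scales[-2]), float(scales[-1])
        # DLA needs integer H/W scales: [1, 32] for nearest, [1, 4] for bilinear.
        print(node.name, "H scale:", h, "W scale:", w,
              "integer:", h.is_integer() and w.is_integer())
```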
Thanks.
The official UNet model does not meet these requirements, so I will use deconvolution instead of bilinear interpolation.
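For completeness, the switch is just the constructor flag from the code quoted above (channel counts are illustrative):

```python
# bilinear=False makes Up use nn.ConvTranspose2d instead of nn.Upsample,
# avoiding the DLA resize restriction discussed above.
model = UNet(n_channels=3, n_classes=2, bilinear=False).eval()
```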
Thanks.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.