Hi,

I am converting a model from PyTorch → ONNX → TensorRT.

The model consists of simple conv layers followed by ReLU. I am using trtexec to build the TensorRT engine, and the log shows that all layers are placed on the GPU:
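
For reference, the trtexec invocation is along these lines (file names are placeholders for my actual paths):

```
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```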

[TRT] ---------- Layers Running on DLA ----------

[03/07/2023-09:53:32] [I] [TRT] ---------- Layers Running on GPU ----------

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: Reformatting CopyNode for Network Input input

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_0 + Relu_1

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_2 + Relu_3

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_4 + Relu_5

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_7 + Relu_8

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 9) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_10

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_11 + Relu_12

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_13 + Relu_14

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_15 + Relu_16

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_18 + Relu_19

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 20) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_21

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_22 + Relu_23

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_24 + Relu_25

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_26 + Relu_27

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_29 + Relu_30

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 31) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_32

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_33 + Relu_34

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_35 + Relu_36

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_37 + Relu_38

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_40 + Relu_41

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 42) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_43

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_44 + Relu_45

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_46 + Relu_47

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_48 + Relu_49

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_51 + Relu_52

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] DECONVOLUTION: ConvTranspose_53 + BatchNormalization_54 + Relu_55

Why is this the case? How can I control layer placement so that some of the computation runs on the DLA?
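
From the TensorRT docs, I believe the DLA has to be requested explicitly, e.g. with something like the following (assuming the standard `--useDLACore` and `--allowGPUFallback` flags):

```
trtexec --onnx=model.onnx --saveEngine=model.engine \
        --useDLACore=0 --allowGPUFallback --fp16
```

Is that all that is needed, or do the layers themselves also have to be DLA-compatible?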

The input shape is [1, 1, 48, 512], i.e. batch size = 1.

The network in PyTorch:

```
from typing import Tuple

import torch.nn as nn


class NSKBlock(nn.Module):
    def __init__(self,
                 in_channels: int,
                 out_channels: int,
                 kernel_long: Tuple = (7, 3),
                 kernel_square: int = 3) -> None:
        super(NSKBlock, self).__init__()
        # BasicConv2d is a user-defined helper (definition not shown)
        self.branch1 = BasicConv2d(in_channels, out_channels,
                                   kernel_size=kernel_long, padding=(2, 0))
        self.branch2 = BasicConv2d(in_channels, out_channels,
                                   kernel_size=kernel_square)
        self.branch3 = BasicConv2d(in_channels, out_channels,
                                   kernel_size=kernel_long[::-1], padding=(0, 2))
        self.conv = BasicConv2d(out_channels * 3, out_channels,
                                kernel_size=1, padding=1)
```