Hi,

I am converting a model from PyTorch → ONNX → TensorRT.

The model consists of simple conv layers followed by ReLU. I am using trtexec to build the TensorRT engine, and the log shows that all layers are placed on the GPU:
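
For reference, the trtexec invocation is along these lines (file names are placeholders for my actual paths):

```
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```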

[TRT] ---------- Layers Running on DLA ----------

[03/07/2023-09:53:32] [I] [TRT] ---------- Layers Running on GPU ----------

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: Reformatting CopyNode for Network Input input

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_0 + Relu_1

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_2 + Relu_3

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_4 + Relu_5

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_7 + Relu_8

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 9) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_10

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_11 + Relu_12

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_13 + Relu_14

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_15 + Relu_16

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_18 + Relu_19

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 20) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_21

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_22 + Relu_23

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_24 + Relu_25

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_26 + Relu_27

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_29 + Relu_30

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 31) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_32

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_33 + Relu_34

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_35 + Relu_36

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_37 + Relu_38

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_40 + Relu_41

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] COPY: (Unnamed Layer* 42) [Identity]

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] POOLING: AveragePool_43

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_44 + Relu_45

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_46 + Relu_47

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_48 + Relu_49

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] CONVOLUTION: Conv_51 + Relu_52

[03/07/2023-09:53:32] [I] [TRT] [GpuLayer] DECONVOLUTION: ConvTranspose_53 + BatchNormalization_54 + Relu_55

Why is this the case? How can I control layer placement so that some of the computation runs on the DLA?
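
From the TensorRT docs, I believe the DLA has to be requested explicitly, e.g. with something like the following (assuming the standard `--useDLACore` and `--allowGPUFallback` flags):

```
trtexec --onnx=model.onnx --saveEngine=model.engine \
        --useDLACore=0 --allowGPUFallback --fp16
```

Is that all that is needed, or do the layers themselves also have to be DLA-compatible?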

The input shape is [1, 1, 48, 512], i.e. batch size = 1.

The network in PyTorch:

```
from typing import Tuple

import torch.nn as nn


class NSKBlock(nn.Module):
    def __init__(self,
                 in_channels: int,
                 out_channels: int,
                 kernel_long: Tuple = (7, 3),
                 kernel_square: int = 3) -> None:
        super(NSKBlock, self).__init__()
        # BasicConv2d is a user-defined helper (definition not shown)
        self.branch1 = BasicConv2d(in_channels, out_channels,
                                   kernel_size=kernel_long, padding=(2, 0))
        self.branch2 = BasicConv2d(in_channels, out_channels,
                                   kernel_size=kernel_square)
        self.branch3 = BasicConv2d(in_channels, out_channels,
                                   kernel_size=kernel_long[::-1], padding=(0, 2))
        self.conv = BasicConv2d(out_channels * 3, out_channels,
                                kernel_size=1, padding=1)
```