Ubuntu 16.04 LTS
GPU type: 1050Ti
NVIDIA driver version: 390.87
CUDA version: 9.0
cuDNN version: 7.13
Python version: 3.5
TensorRT version: 5.0.2.6
I want to add a 2D depthwise convolution layer to my network. I tried it like this:
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit
import scipy.stats as st
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def _normalize(t):
    return t / t.sum()

def postprocess(network):
    input_tensor = network.add_input(name='data', dtype=trt.float32, shape=(1, 368, 432))
    ksize = 25
    nsig = 3.0
    interval = (2 * nsig + 1.) / ksize
    x = np.linspace(-nsig - interval / 2., nsig + interval / 2., ksize + 1)
    y = np.diff(st.norm.cdf(x))
    gk = _normalize(np.sqrt(np.outer(y, y)))
    filters = np.outer(gk, np.ones([19])).T.reshape((19, 1, ksize, ksize))
    conv1 = network.add_convolution(input=input_tensor, num_output_maps=1,
                                    kernel_shape=(25, 25), kernel=filters,
                                    num_groups=19, bias=trt.Weights())
    conv1.stride = (1, 1)
    conv1.padding = (12, 12)

def build_engine():
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network:
        builder.max_workspace_size = 1 << 20
        postprocess(network)
        return builder.build_cuda_engine(network)

build_engine()
But it didn't work; the call to add_convolution fails with:
Invoked with: <tensorrt.tensorrt.INetworkDefinition object at 0x7f1c17173bc8>; kwargs: input=<tensorrt.tensorrt.ITensor object at 0x7f1c17173c38>, kernel_shape=(25, 25), kernel=array([[[[…]]]]), num_output_maps=1,num_groups=19,bias=<tensorrt.tensorrt.Weights object as
…>
Here are my questions:
1. Why can't the numpy.array be used as the kernel this way? I tried the same idea in the official sample network_api_pytorch_mnist and got the same error.
2. I want to use num_groups to do the depthwise convolution. Does this idea work, and did I set the parameters and the filter shape correctly? I read https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/Graph/Layers.html about IConvolutionLayer but am still not sure about my shape setting.
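For what it's worth, the kernel construction itself seems fine when run in isolation (pure NumPy/SciPy, no TensorRT): each of the 19 single-channel 25x25 kernels sums to 1, and the total weight count matches 19 groups x 1 output map per group. The explicit float32 cast at the end is my own addition, since the network input is declared as trt.float32 while np.diff(st.norm.cdf(x)) produces float64:

```python
import numpy as np
import scipy.stats as st

def gaussian_depthwise_filters(ksize=25, nsig=3.0, channels=19):
    # Build a normalized 2-D Gaussian kernel from differences of the normal CDF.
    interval = (2 * nsig + 1.) / ksize
    x = np.linspace(-nsig - interval / 2., nsig + interval / 2., ksize + 1)
    y = np.diff(st.norm.cdf(x))
    gk = np.sqrt(np.outer(y, y))
    gk /= gk.sum()
    # Replicate the kernel once per channel: shape (channels, 1, ksize, ksize),
    # i.e. one single-input-channel kernel per group, as I understand the
    # grouped/depthwise weight layout.
    filters = np.outer(gk, np.ones([channels])).T.reshape((channels, 1, ksize, ksize))
    # Cast to float32 explicitly (my assumption: the layer weights should match
    # the trt.float32 input, and the CDF differences above are float64).
    return np.ascontiguousarray(filters, dtype=np.float32)

f = gaussian_depthwise_filters()
print(f.shape)   # (19, 1, 25, 25)
print(f.size)    # 11875 == 19 * 1 * 25 * 25
print(np.allclose(f.sum(axis=(1, 2, 3)), 1.0, atol=1e-4))  # each channel sums to 1
```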