Convolution question

Hi,
I’m trying to run a network with a simple 3x3 IConvolutionLayer on a grayscale image I’ve read using OpenCV.
The kernel is 0,0,0,0,1,0,0,0,0 (the identity kernel), so I should get the same image back (only one image). However, the output comes out smeared… I’ve seen it might be because the input image has HxW dimensions while the convolution output is CxHxW, so maybe I should remove the C dimension?

// Code is something like this:

// Read the input image as a single-channel grayscale image
m_input = cv::imread(m_configuration.m_image_file_name, CV_LOAD_IMAGE_GRAYSCALE);

// Declare the network input as a 1 x H x W tensor
m_input_tensor = m_network->addInput(NETWORK_INPUT_NAME, dt, DimsCHW{1, m_input.rows, m_input.cols});

// Kernel sizes and kernel weights are 3x3; one output feature map, no bias
IConvolutionLayer *conv1 = m_network->addConvolution(*m_input_tensor, 1, m_kernel_sizes, m_kernel_weights, m_empty_weights);

// Mark the convolution output as the network output
m_network->markOutput(*conv1->getOutput(0));

// After running inference: copy the output tensor back to the host
cudaMemcpy(host_output_image, m_device_output_image, image_size * sizeof(TENSOR_DATA), cudaMemcpyDeviceToHost);

Any tips would be greatly appreciated… If I use an IScaleLayer instead of the IConvolutionLayer, it works just fine.

thanks
Eyal

Hi @eyalhir74,

What do you mean by “smeared output”? Can you share images so we can see what you are getting?

Since you are using a Jetson Xavier, you could also try NVIDIA’s new VPI library.

You can review the convolution example here: https://docs.nvidia.com/vpi/sample_conv2d.html
I have tested it with images loaded through OpenCV and with different kernel sizes (3x3, 5x5, 7x7), and it works like a charm.

Regards,
Fabian
www.ridgerun.com

Hi Fabian,
Thanks for the response. I’d rather stay with the TensorRT API for now.
The left image is the source; the right image is what I get after the convolution with the code in the previous post.

thanks
Eyal

Oh yes, I have seen this behaviour before, though not while working with TensorRT. You may want to check the docs for the IConvolutionLayer here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_convolution_layer.html

I suspect it might be related to a mismatch in the stride or the padding.

Stride is the number of pixels the filter shifts over the input. For example, when the stride is 1, the filter moves 1 pixel at a time; when the stride is 2, it moves 2 pixels at a time, and so on.

Padding is sometimes needed when the filter does not fit the input perfectly. The problem can be attacked either by padding the picture with zeros (zero-padding) so that it fits, or by dropping the part of the image where the filter does not fit. The small sketch below shows how stride and padding determine the output size.
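To make the size effect concrete, here is a minimal sketch of the standard convolution output-size formula (the function name and the 512-pixel example are my own illustration, not from the original thread):

// Standard convolution output size (integer division, so positions
// where the filter does not fully fit are dropped):
//   out = (in + 2*pad - kernel) / stride + 1
int convOutSize(int in, int kernel, int stride, int pad) {
    return (in + 2 * pad - kernel) / stride + 1;
}

// For a 3x3 kernel with stride 1:
// convOutSize(512, 3, 1, 0) == 510  -> no padding: output shrinks by 2 per dimension
// convOutSize(512, 3, 1, 1) == 512  -> (1,1) zero-padding keeps the input size

If a (H-2)x(W-2) result is then copied into and displayed as an HxW buffer, every row lands two pixels off from the previous one, which could explain the smeared look you describe.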

The IConvolutionLayer has methods to set the stride and the padding.
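For instance, here is a minimal sketch of what that could look like with the layer from the first post (the (1,1) values are an assumption for a 3x3 kernel, not something confirmed in your code):

// Explicitly set stride and padding on the convolution layer
conv1->setStride(DimsHW{1, 1});   // move the filter one pixel at a time
conv1->setPadding(DimsHW{1, 1});  // one pixel of zero-padding per border, so a 3x3 kernel preserves HxW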

Regards,
Fabian
www.ridgerun.com


Hi Fabian,
Cool!! It seems this was indeed related to padding. I’ve added a (1,1) padding and the convolution now works :)
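For anyone finding this later, the fix would be a single call along these lines (a sketch, assuming the conv1 layer from the first post):

conv1->setPadding(DimsHW{1, 1}); // keep the 3x3 convolution output at the input's HxW size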

Many thanks
Eyal