Conv Output Dimensions confusing

Can't seem to understand the output of this convolution.

Hi, I'm coding the beginnings of a convolution C++ class using cuDNN. I already wrote one
in pure CPU code, and I'm trying to get my head around some of the calls.

I have two 3x3 feature maps as input, so in NCHW format I = 1,2,3,3.

I have a single 2x2 filter, again in NCHW format: F = 1,1,2,2.

From my CPU code, I expect the output to be O = 1,2,2,2.

Now, to configure the filter for the convolution correctly, the N and C terms become
(from the API reference):

K represents the number of output feature maps; C is the number of input feature maps.

Now… C is the C from the previous layer, right? The input has two channels, so C=2.

But what is K, and how does one compute it? I would expect K to be 2.

So, with that in mind, we redo the dimensions of the filter to be (KCHW) F = 2,2,2,2.

When I run my code, instead of getting two 2x2 channels on the output, I get back 8
values: the first 4 look like the sum of the two output channels I expect,
and the last 4 are zero.

So, even though I ran cudnnGetConvolutionNdForwardOutputDim and it returned O = 1,2,2,2, and
even though I did get 8 values out, the last 4 are zeros and the first 4 seem to be the
element-wise sum of the two expected output channels…

Any idea where I’m messing this up?

Convolution is set up as CUDNN_CROSS_CORRELATION.
Forward algo selected is: CUDNN_CONVOLUTION_FWD_ALGO_GEMM


input values (dimensions are 1,2,3,3), from 1 to 18, with the two channels side by side:
1 2 3 10 11 12
4 5 6 13 14 15
7 8 9 16 17 18

filter values:
7 11
13 17

expected output:
166 214 598 646
310 358 742 790

actual output:
764 860 0 0
1052 1148 0 0
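
For reference, both sets of numbers above can be reproduced on the CPU. What follows is only a minimal sketch, not cuDNN code: `crossCorrelateNCHW` is a hypothetical helper, and it assumes stride 1, no padding and no dilation. It does what a standard multi-channel convolution does, which is to sum over all C input channels for each of the K filters. With the 2x2 filter copied into both input channels of a single K=1, C=2 filter, it yields 764 860 1052 1148, i.e. the sum of the two expected channels. If a second filter (k=1) is present but left all zeros, the extra four outputs are zero, which would be consistent with the actual output shown, although the post does not say how the filter buffer was filled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain CPU cross-correlation (stride 1, no padding, no dilation).
//   input : N x C x H x W   (NCHW)
//   filter: K x C x R x S   (K filters, each spanning all C input channels)
//   output: N x K x (H-R+1) x (W-S+1)
std::vector<float> crossCorrelateNCHW(const std::vector<float>& input,
                                      int N, int C, int H, int W,
                                      const std::vector<float>& filter,
                                      int K, int R, int S) {
    assert(input.size() == static_cast<std::size_t>(N) * C * H * W);
    assert(filter.size() == static_cast<std::size_t>(K) * C * R * S);
    const int P = H - R + 1;  // output height
    const int Q = W - S + 1;  // output width
    std::vector<float> output(static_cast<std::size_t>(N) * K * P * Q, 0.0f);

    for (int n = 0; n < N; ++n)
        for (int k = 0; k < K; ++k)
            for (int p = 0; p < P; ++p)
                for (int q = 0; q < Q; ++q) {
                    float acc = 0.0f;
                    // Each output value accumulates over ALL input channels.
                    for (int c = 0; c < C; ++c)
                        for (int r = 0; r < R; ++r)
                            for (int s = 0; s < S; ++s)
                                acc += input[((n * C + c) * H + (p + r)) * W + (q + s)] *
                                       filter[((k * C + c) * R + r) * S + s];
                    output[((n * K + k) * P + p) * Q + q] = acc;
                }
    return output;
}
```

Calling it with the same 18 input values described as N=2, C=1 and a single K=1, C=1 filter gives the per-map results 166 214 310 358 and 598 646 742 790 instead, because no summation across the two maps takes place.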

Thank you to anyone helping out.

So I ran a few more experiments with data of different dimensions, and I got this to work with
the following parameters:

Input Dimensions: NCHW = 2,1,3,3
(before, I was setting the input dimension as 1,2,3,3)

Filter Dimensions: KCHW = 2,1,2,2
(from what I gather from the manuals and APIs, K is the number of filters we have, and C is the previous layer's C,
so here C = input.c = 1)

The calculated output is: NCHW = 2,2,2,2
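
As a sanity check on that shape, here is a small sketch of the per-dimension output formula as I understand it from the cuDNN documentation for the forward output dimension queries. `convOutputDim` is a hypothetical helper name, and the defaults (no padding, stride 1, dilation 1) are my assumption about the setup described in this thread.

```cpp
#include <cassert>

// Size of one spatial output dimension of a convolution:
//   out = 1 + (in + 2*pad - ((filter - 1) * dilation + 1)) / stride
// The division is integer division, so it rounds down.
int convOutputDim(int inputDim, int filterDim,
                  int pad = 0, int stride = 1, int dilation = 1) {
    assert(stride > 0 && dilation > 0);
    const int effectiveFilter = (filterDim - 1) * dilation + 1;
    assert(inputDim + 2 * pad >= effectiveFilter);
    return 1 + (inputDim + 2 * pad - effectiveFilter) / stride;
}
```

With a 3x3 input and a 2x2 filter, `convOutputDim(3, 2)` gives 2 for both H and W. The output N is the input N (2 here) and the output C is the filter's K (2 here), which matches NCHW = 2,2,2,2.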

My question now becomes: as I pass this on to another conv layer, should I flatten the output of this
layer from NCHW = 2,2,2,2 to NCHW = 4,1,2,2?

I don't care much whether the computed results are correct right now (of course I do in the long run),
but I want the dimensions to flow across the layers without needing transformations because I'm doing
something wrong somewhere.

As you can see, I'm decently lost here.

Thank you for any tips.

Hi @prozak,
I don’t think you have to flatten the tensor. You can just set the input shape of the next conv layer to be NCHW = 2,2,2,2.
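
As a rough illustration of why no flattening is needed, here is a sketch of how shapes could be chained between layers. It is plain C++ rather than cuDNN calls. `Shape4`, `FilterShape` and `nextShape` are hypothetical names, and stride 1 with no padding or dilation is assumed. The only constraint between consecutive layers is that the next filter's C equals the previous output's C (which is the previous filter's K).

```cpp
#include <cassert>

struct Shape4 { int n, c, h, w; };       // activation tensor, NCHW
struct FilterShape { int k, c, r, s; };  // filter tensor, K x C x R x S

// Output shape of one conv layer (stride 1, no padding, no dilation).
Shape4 nextShape(const Shape4& in, const FilterShape& f) {
    // The filter must span exactly the channels of its input.
    assert(f.c == in.c && "filter C must equal input C");
    assert(in.h >= f.r && in.w >= f.s);
    // N passes through unchanged; the new channel count is K.
    return Shape4{in.n, f.k, in.h - f.r + 1, in.w - f.s + 1};
}
```

For example, `nextShape({2,1,3,3}, {2,1,2,2})` gives 2,2,2,2. Feeding that straight into a second layer whose filter is, say, K=3, C=2, 2x2 (the 3 is an arbitrary choice for illustration) gives 2,3,1,1, with no reshaping in between.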

Please let me know if you still face the issue.