I am trying to implement a pooling layer that outputs two tensors: the pooled tensor and a pooling mask tensor. I need the pooling mask (containing the max indices) as an input to an upsample layer, which is also a custom layer. I am using the sample plugins as a reference, in particular the Face-Recognition example from https://github.com/AastaNV/Face-Recognition. My issue concerns casting the inputs in the enqueue function (IPlugin). In the example from AastaNV we can find:
int DataRoiLayer::enqueue(int batchSize, const void*const *inputs, void** outputs, void*, cudaStream_t stream)
{
    float* bbox = (float*)inputs[1];
    int srcSize[] {dimsData.c(), dimsData.h(), dimsData.w()};
    int dstSize[] {dimsRoi.c(), dimsRoi.h(), dimsRoi.w()};
    int roi[] = { int(bbox[0]+0.5), int(bbox[1]+0.5), int(bbox[2]+0.5), int(bbox[3]+0.5)}; //rounding
    convertROI((float*)inputs[0], (float*)outputs[0], nullptr, srcSize, dstSize, roi, stream);
    return 0;
}
So the author casts inputs[1] to a float* and then simply indexes bbox like an array.
In my code I do the same:
int enqueue(int batchSize, const void*const *inputs, void** outputs, void*, cudaStream_t stream)
{
    CHECK(cudaThreadSynchronize());
    float* bottom_data = (float*)inputs[0];
    for (int idc=0; idc<3*360*480; ++idc){
        cout<<"bottom_data_"<<idc<<" "<<(bottom_data[idc]) <<endl;
    }
    float* top_data = new float[out_h*out_w*channels_];
    outputs[0] = &top_data[0];
    float* top_mask = new float[out_h*out_w*channels_];
    outputs[1] = &top_mask[0];
    etc...
But although the input of the pooling layer is an image (I have simplified the model just to understand how the plugins work), I get what look like random float values:
bottom_data_518262 1.34542e-37
bottom_data_518263 2.43194e+17
bottom_data_518264 -5.31794e+37
bottom_data_518265 2.91721e-39
bottom_data_518266 2.15447e-36
bottom_data_518267 4.61451e+15
bottom_data_518268 4.90864e-37
bottom_data_518269 2.4432e+17
bottom_data_518270 1.58035e-37
bottom_data_518271 2.43194e+17
bottom_data_518272 -5.31793e+37
bottom_data_518273 2.91587e-39
So it looks like I am not reading the image itself but whatever happens to be in the memory the input pointer refers to.
Any suggestions on how to proceed?
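For reference, the two outputs described at the top (the pooled values plus a mask of max indices) can be sketched on the CPU as below. This is only a minimal sketch under assumptions that are not stated in the post: a CHW float input, a square window with no padding, an output size rounded down, and a mask that stores, as a float, the flat index (y * width + x) of the maximum within its channel plane. The names maxPoolWithMask and PoolResult are hypothetical and are not part of TensorRT or the AastaNV sample.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type: both tensors have shape channels x out_h x out_w.
struct PoolResult {
    int out_h;
    int out_w;
    std::vector<float> pooled;  // maximum value of each window
    std::vector<float> mask;    // flat index (y * width + x) of that maximum
};

// CPU reference for max pooling that also records where each maximum was found.
// Assumes CHW layout, no padding, and height/width >= kernel.
PoolResult maxPoolWithMask(const std::vector<float>& input, int channels,
                           int height, int width, int kernel, int stride)
{
    PoolResult r;
    r.out_h = (height - kernel) / stride + 1;  // rounded down (assumption)
    r.out_w = (width - kernel) / stride + 1;
    const std::size_t count =
        static_cast<std::size_t>(channels) * r.out_h * r.out_w;
    r.pooled.assign(count, 0.0f);
    r.mask.assign(count, 0.0f);

    for (int c = 0; c < channels; ++c) {
        // Start of channel c in the flattened CHW input.
        const float* plane =
            input.data() + static_cast<std::size_t>(c) * height * width;
        for (int oy = 0; oy < r.out_h; ++oy) {
            for (int ox = 0; ox < r.out_w; ++ox) {
                float best = -std::numeric_limits<float>::infinity();
                int best_idx = -1;
                // Scan the kernel x kernel window for its maximum.
                for (int ky = 0; ky < kernel; ++ky) {
                    for (int kx = 0; kx < kernel; ++kx) {
                        const int idx =
                            (oy * stride + ky) * width + (ox * stride + kx);
                        if (plane[idx] > best) {
                            best = plane[idx];
                            best_idx = idx;
                        }
                    }
                }
                const std::size_t o =
                    (static_cast<std::size_t>(c) * r.out_h + oy) * r.out_w + ox;
                r.pooled[o] = best;
                r.mask[o] = static_cast<float>(best_idx);
            }
        }
    }
    return r;
}
```

A reference like this could be compared element by element against the plugin's two outputs once they are available on the host, which makes it easier to tell a wrong pooling result from a problem in how the buffers are read.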