Hello,
I’m using the TensorRT 4 Python API to run a Caffe FCN for semantic segmentation. The FCN takes an image and produces N (in my case 3) segmentation maps of the same size as the input.
In my case I fix the input image to 512x512. So in my prototxt I have:
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    shape { dim: 1 dim: 3 dim: 512 dim: 512 }
  }
}
In my Python TensorRT code I preprocess the image to have dimensions CxHxW and BGR-ordered color channels:
im = im.resize((512, 512), Image.ANTIALIAS)
im = np.array(im, dtype=np.float32)
im = im[:, :, ::-1]           # RGB -> BGR
im = im.transpose((2, 0, 1))  # HxWxC -> CxHxW
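For reference, this is what the channel reversal and transpose do on a tiny synthetic array standing in for the resized image (a sketch, not the real 512x512 data):

```python
import numpy as np

# Tiny stand-in for a 2x2 RGB image in (H, W, C) layout.
im = np.arange(12, dtype=np.float32).reshape(2, 2, 3)

bgr = im[:, :, ::-1]            # reverse the channel axis: RGB -> BGR
chw = bgr.transpose((2, 0, 1))  # (H, W, C) -> (C, H, W)

print(chw.shape)  # (3, 2, 2)
# chw[0] is now the original B channel, i.e. im[:, :, 2]
```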
I create the input to the net as:
d_input = cuda.mem_alloc(1 * im.size * im.dtype.itemsize)
cuda.memcpy_htod_async(d_input, np.ascontiguousarray(im), stream)
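The `np.ascontiguousarray` call matters here: after `transpose` the array is only a strided view of the original buffer, so it must be materialized in C order before the flat device copy. A numpy-only sketch (the pycuda calls are omitted; same 3x512x512 float32 layout assumed):

```python
import numpy as np

im = np.zeros((512, 512, 3), dtype=np.float32)
chw = im.transpose((2, 0, 1))       # a strided view, no data copied

print(chw.flags['C_CONTIGUOUS'])    # False: transpose only changes strides
contig = np.ascontiguousarray(chw)  # materialize a C-ordered copy
print(contig.flags['C_CONTIGUOUS'])  # True: safe to memcpy as a flat buffer
print(contig.nbytes)                # 3 * 512 * 512 * 4 bytes to allocate
```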
And get the output as:
d_output = cuda.mem_alloc(1 * output.size * output.dtype.itemsize)
cuda.memcpy_dtoh_async(output, d_output, stream)
To create the segmentation maps from the resulting 1d array I have tried:
output_3d = output.reshape((3,512,512)).transpose((1,2,0))
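As far as I understand, TensorRT returns the output buffer flattened in row-major (C) order with the Caffe NxCxHxW layout, so for one image the flat array is channel-major: all of map 0, then map 1, then map 2. A small sketch with hypothetical 3x4x4 maps (standing in for 3x512x512) showing that `reshape((C, H, W))` recovers a channel-major buffer exactly, while interpreting the same bytes as HxWxC interleaves the channels, which would look like structured noise:

```python
import numpy as np

C, H, W = 3, 4, 4
maps = np.arange(C * H * W, dtype=np.float32).reshape(C, H, W)

flat = maps.ravel()                  # row-major flattening, as in the output buffer
recovered = flat.reshape((C, H, W))  # correct: channel-major interpretation
wrong = flat.reshape((H, W, C))      # wrong layout: channels get interleaved

print(np.array_equal(recovered, maps))                   # True
print(np.array_equal(wrong.transpose((2, 0, 1)), maps))  # False
```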
But I cannot get the correct segmentation maps; I only get noise with a pattern I cannot understand.
How is the output of TensorRT flattened when the outputs of the Caffe model are 3D?
Thanks