Hello. I have converted the SSD Inception v2 network using the C++ example and am now testing how the network performs. The results are worse than those of the plain TensorFlow model.
My preprocessing steps:
- Cut the image into slices with some overlap
- Normalize pixel values to [-1, 1]
- Build the batch by concatenating all slices into one large input vector (a minimal sketch of these steps follows below)
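Roughly, my preprocessing looks like this (a minimal sketch, not my exact code; the 300x300 input is the SSD Inception v2 default, the tile size and overlap are illustrative, and I assume RGB channel order and NCHW layout as in the sampleUffSSD example):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Slice an image into overlapping tiles, normalize each to [-1, 1],
// and pack everything into one contiguous NCHW batch vector.
std::vector<float> buildBatch(const cv::Mat& image,
                              int tile = 300, int overlap = 60)
{
    const int inputW = 300, inputH = 300;  // assumed network input size
    std::vector<float> batch;              // all tiles, concatenated NCHW

    for (int y = 0; y + tile <= image.rows; y += tile - overlap) {
        for (int x = 0; x + tile <= image.cols; x += tile - overlap) {
            cv::Mat slice = image(cv::Rect(x, y, tile, tile)).clone();
            cv::resize(slice, slice, cv::Size(inputW, inputH));
            cv::cvtColor(slice, slice, cv::COLOR_BGR2RGB);  // OpenCV loads BGR

            // Normalize to [-1, 1]: pixel / 127.5 - 1
            slice.convertTo(slice, CV_32FC3, 1.0 / 127.5, -1.0);

            // Repack HWC -> CHW and append to the batch vector
            std::vector<cv::Mat> channels(3);
            cv::split(slice, channels);
            for (const cv::Mat& ch : channels)
                batch.insert(batch.end(),
                             reinterpret_cast<const float*>(ch.datastart),
                             reinterpret_cast<const float*>(ch.dataend));
        }
    }
    return batch;
}
```

(Tiles that do not fit at the right/bottom edges are simply skipped here; in my real code they are handled too.)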
I have also tried adding a slight blur, since the images I am testing on are fairly compressed and the edges show stair-step (aliasing) artifacts, but that did not help much.
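For reference, the blur I tried was just a small Gaussian kernel applied before slicing (kernel size picked by eye):

```cpp
// Slight Gaussian blur to soften compression/aliasing artifacts;
// 3x3 kernel, sigma derived automatically from the kernel size.
cv::GaussianBlur(image, image, cv::Size(3, 3), 0);
```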
I am mainly interested in detecting people, and the converted network does a poor job at detecting small people. Any idea why?