I have been attempting to get 16b images to work with the current Maxine SDK. My starting point was the VideoEffectsApp example code that ships with the SDK.
I am able to compile and run the example, and upresing 8b images works just fine for PNG, JPG, etc. formats. I must say the results for those are great, as they were with the NGK examples.
However, for our needs we require higher bit depth than the 8b the app currently seems limited to. When I tried to upres 16b or 32b images, it did not work and often crashed with errors such as…
Error: invalid pitch argument
or
Error: The specified pixel format is not accommodated
The first thing I did was modify the app to accept TIFF and EXR image formats, as those are the common formats for 32b and 16b images and both are supported by the OpenCV libraries: specifically float and half-float 3- or 4-channel images, respectively.
I also modified the cv::imread as follows…
_srcImg = cv::imread(inFile, cv::IMREAD_ANYCOLOR | cv::IMREAD_ANYDEPTH);
Otherwise the OpenCV libraries will convert 16b images down to 8b, etc. I confirmed that at this point the images are coming in correctly (i.e. as true 32b and 16b formats) and that all the formats were set up correctly; if I display those images after loading them into the app, they display fine.
However, when the code then gets into the NvCVImage* routines to set up the images, it gets into trouble, immediately giving the errors listed above.
From what I can see, I am fairly certain the code in routines like NvCVImage_Transfer and NvCVImage_Alloc is not handling 32b or 16b formats correctly, despite the documentation saying it can.
The first thing I noticed was that the pitch seems to be incorrect after the allocation routines run during setup. The pitch of the resulting NvCVImage appears to be the number of pixels across a row, not the byte stride/pitch count the docs say it should be.
It also appears to me that, despite the input images coming in as 8-bit and the final format supposedly being NVCV_F32 and NVCV_BGR, the algorithms are instead treating the 3/4 components of an 8u pixel as a single floating-point number and building the model off that. Admittedly it is a bit of a black box past that point for me, so this is just an inference from things like the incorrect pitch being set, and the fact that it genuinely does not seem able to handle true floating-point 3- or 4-channel images.
I am at a bit of a loss as to how to debug this any further. It is a real shame, as we had planned to incorporate these tools heavily into our pipelines.
I would really love to see how 16b or 32b images are supposed to work, if indeed they are supported as the documentation suggests. As far as I can tell, even the 8b path is not building models correctly and may only be working by fluke, with 3/4 8u pixel components being treated as a single float by the model instead of, say, 3/4 floats scaled by 1/255.0, etc. If that is truly what is happening, the model could work better than it currently does…
Thanks for any help; it is definitely appreciated. Sorry for the long message and the amount of detail, but I honestly believe there is something broken in the current Maxine setup, and the only way I could think of to convey that was to be detailed about what I found. So, apologies for the long post.
Clint Hanson
Vancouver, BC