I’ve started playing with TensorRT only recently, and I find some initial results baffling; I’m probably missing something obvious.
For testing purposes I’ve written a primitive model (just one conv layer) in TensorFlow and converted it to UFF, which I then parse using the C++ API. The input has three channels, which I explicitly rearrange into NCHW format before feeding (see the sketch below). The inference result matches what I get from running the network in TensorFlow.
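For reference, this is roughly how I repack the interleaved HWC input into planar CHW before copying it to the device (a minimal sketch; `hwcToChw` is just my own hypothetical helper, not a TensorRT function):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (my own code, not part of TensorRT): repack an
// interleaved HWC image into the planar CHW layout that I feed the engine.
std::vector<float> hwcToChw(const float* src, int h, int w, int c)
{
    std::vector<float> dst(static_cast<std::size_t>(h) * w * c);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            for (int k = 0; k < c; ++k)
                // HWC index: (y*w + x)*c + k  ->  CHW index: (k*h + y)*w + x
                dst[(static_cast<std::size_t>(k) * h + y) * w + x] =
                    src[(static_cast<std::size_t>(y) * w + x) * c + k];
    return dst;
}
```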
But I’ve noticed that neither the optimized model nor the inference result is affected by the ‘inputOrder’ parameter of IUffParser::registerInput. Whether I set it to UffInputOrder::kNCHW or UffInputOrder::kNHWC, the output stays the same.
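Here is roughly the build path I’m using (a minimal sketch; the tensor names, dimensions, and file name are placeholders for my actual one-conv model):

```cpp
#include <cstdio>
#include "NvInfer.h"
#include "NvUffParser.h"

using namespace nvinfer1;
using namespace nvuffparser;

// Minimal logger required by the builder.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity <= Severity::kWARNING)
            std::printf("%s\n", msg);
    }
} gLogger;

int main()
{
    IUffParser* parser = createUffParser();

    // "input"/"output", the 3x224x224 dims, and "model.uff" are placeholders.
    // Flipping kNCHW to kNHWC here has no visible effect on the engine or
    // on the inference output.
    parser->registerInput("input", Dims3(3, 224, 224), UffInputOrder::kNCHW);
    parser->registerOutput("output");

    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    parser->parse("model.uff", *network, DataType::kFLOAT);

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 26);
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    // ... run inference with the engine, then destroy the objects ...
    return 0;
}
```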
I thought this parameter determined whether an automatic transposition is inserted into the optimized model (as with ‘conv1’ here: https://devblogs.nvidia.com/tensorrt-integration-speeds-tensorflow-inference), but apparently that’s not the case.
Could somebody clarify this for me?