tlt-converter -i flag purpose

Hi,

I was wondering about the -i flag of tlt-converter. The help states:

-i input dimension ordering -- nchw, nhwc, nc (default nchw)

However, setting nchw vs. nhwc does not seem to make any difference. If the converted model (e.g. TrafficCamNet) was trained with nchw, the resulting engine still expects that ordering as input, even if nhwc was passed to the -i flag.

I wondered whether the flag is meant to transpose the input image automatically, essentially as a convenience function. Right now one has to transpose the image by hand before feeding it to the engine.
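For reference, the manual step looks roughly like this on our side (a minimal numpy sketch; the shapes match TrafficCamNet's 3x544x960 input, and I leave out the exact normalization the model expects):

import numpy as np

# Image as loaded by e.g. OpenCV: HWC layout, shape (544, 960, 3)
image_hwc = np.random.randint(0, 255, (544, 960, 3), dtype=np.uint8)

# Transpose to CHW, shape (3, 544, 960), which is what the engine expects,
# and make the buffer contiguous before copying it into the input binding.
image_chw = np.ascontiguousarray(image_hwc.transpose(2, 0, 1)).astype(np.float32)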

As an extension to the question: is there a way to retrofit layers onto engines, i.e. add a shuffle layer in front of the network that transposes actual nhwc input into the expected nchw order?
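To make it concrete, this is the kind of thing I have in mind, expressed with the TensorRT Python API (purely hypothetical, since the .etlt path never exposes the network definition to the user; prepend_hwc_to_chw is just my own name for the idea):

import tensorrt as trt

def prepend_hwc_to_chw(network, nhwc_input):
    # Hypothetical: this would only work with access to the INetworkDefinition,
    # which the .etlt workflow of tlt-converter does not expose.
    shuffle = network.add_shuffle(nhwc_input)
    shuffle.first_transpose = trt.Permutation([2, 0, 1])  # HWC -> CHW
    return shuffle.get_output(0)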

Best
Tobias

No, the tlt-converter will not do that.

Please refer to How to run tlt-converter.
For example, when you train a TrafficCamNet model with the following settings in the training spec,

output_image_width: 960
output_image_height: 544
output_image_channel: 3

then please set -d 3,544,960 -i nchw in the command line of tlt-converter.

Right, that’s also what I observed. I was just wondering whether the value passed after -i is essentially a no-op. Let’s consider one command from the post you linked.

In the original version, the ordering is nchw:

$ ./tlt-converter resnet18_trafficcamnet_pruned.etlt -k tlt_encode -c trafficnet_int8.txt -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,544,960 -i nchw -e trafficnet_int8.engine -m $MAX_BATCH_SIZE -t $INFERENCE_PRECISION -b $BATCH_SIZE

If I modify the command to use nhwc, the produced engine file expects the same input ordering as before:

$ ./tlt-converter resnet18_trafficcamnet_pruned.etlt -k tlt_encode -c trafficnet_int8.txt -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,544,960 -i nhwc -e trafficnet_int8.engine -m $MAX_BATCH_SIZE -t $INFERENCE_PRECISION -b $BATCH_SIZE

Since both engines can be consumed in exactly the same way, my conclusion was that -i has no effect. As far as I can tell there is no long-form online documentation for tlt-converter, so I asked here.
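In case it is useful, this is roughly how I compared the two builds (a small sketch against the TensorRT Python API, assuming the two engines were written to separate files; the file names below are placeholders):

import tensorrt as trt

def print_bindings(engine_path):
    # Deserialize the engine and print each binding's name and shape.
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    for i in range(engine.num_bindings):
        kind = "input" if engine.binding_is_input(i) else "output"
        print(kind, engine.get_binding_name(i), engine.get_binding_shape(i))

# If -i changed the engine's input layout, the input shapes would differ here.
print_bindings("trafficnet_int8_nchw.engine")
print_bindings("trafficnet_int8_nhwc.engine")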

More details are in DetectNet_v2 — Transfer Learning Toolkit 3.0 documentation

Thank you for the reference, that is indeed useful :) My google-fu is weak these days…

After reading the document I still wonder what behavior the -i argument is supposed to change. Is it correct that it has no effect at all? Could you outline a case where setting this flag correctly is crucial?

Not to be pedantic; TLT works very well for us. I would just like to illuminate the dark corners in my knowledge to avoid struggling later.

For -i: in detectnet_v2, you can omit this argument.

We will modify the document to avoid confusion.

The document has now been modified.
See DetectNet_v2 — Transfer Learning Toolkit 3.0 documentation