NGC Container Networks preprocessing

Is there any comprehensive documentation on preprocessing for the NGC models? E.g. the TrafficCamNet model page simply states:

RGB Image 960 X 544 X 3 (W x H x C)

I think this can be rather misleading for several reasons:

For example, there is an official post by @Morganh about using tlt-converter to get engines from the .etlt files on NGC. That post uses CHW format, which contradicts the NGC page. One could argue that the NGC docs use column-major ordering, but that is not stated either.

Furthermore, the input is not simply a raw RGB image: it needs to be normalized according to the ImageNet pixel statistics. There are several conventions, and the correct one is only mentioned in a post by @Morganh buried in the forums. However, that post only references third-party libraries and gives no direct information about the intended procedure.

Since I could not find explicit documentation, I am now reverse-engineering the jetson-inference GitHub repo by @dusty_nv, where he implemented CUDA kernels to preprocess the input correctly. Those kernels seem to target an earlier version of DetectNet, though, so I am not completely confident in them.

As you know, deep learning models won't throw any errors if the inference image statistics do not match the training statistics. They will simply cease to work or, even worse, perform rather badly. Thus I believe it is paramount to have documentation that states the preprocessing steps exactly.

For reference, I am implementing object detection as part of a larger C++ computer vision pipeline via TensorRT, and I do not have access to third-party libraries.

Hi @tobias.fischer1
Firstly, the topic (Inferring resnet18 classification etlt model with python - #41 by Morganh) is about a classification model, so it is not related to TrafficCamNet.
TrafficCamNet is based on TLT detectnet_v2,
so its preprocessing is the same as detectnet_v2's.
See the reference: Run PeopleNet with tensorrt - #6 by Morganh

Thank you, that is the information I needed :) So the input pixel values for detectnet_v2 must be normalized to the range 0 to 1. You are right, the other post I linked does not apply here. I still believe it would be beneficial to state this in the model documentation; there seem to be a lot of people struggling with this.