Tensor reshape error when evaluating a DetectNet_v2 model

Thanks so much for your continuing help with this, Morganh.

Using the script I attached above, I validated my training dataset, which I had assumed contained only 1024x768 images (the testing dataset had already validated cleanly, with every image at 1024x768). Surprisingly, it turned out that multiple images in the training dataset were not at 1024x768. For example:

Images found in /home/james/nvidia/tlt/experiments/tfrecords/training/trainval-fold-001-of-002-shard-00001-of-00010 with unexpected dimensions:
Image ID: image_2/016ae9bb1b4cc4be
    Width: 1024
    Height: 758
Image ID: image_2/45b2d5d14b97d6f5
    Width: 1024
    Height: 683
Image ID: image_2/00000901
    Width: 375
    Height: 281
Image ID: image_2/armas_1147
    Width: 620
    Height: 350
Image ID: image_2/armas_1671
    Width: 500
    Height: 375
Image ID: image_2/armas_2876
    Width: 400
    Height: 282
Image ID: image_2/armas_2169
    Width: 160
    Height: 120
 ...
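
A check along these lines can be sketched as follows. This is not the script attached above, only a minimal illustration that assumes the image sizes have already been collected into a dict of image ID to (width, height), for example from the TFRecords or from the image files themselves. The function names and the `image_2/example_ok` ID are made up for the example; the limits come from the DetectNet_v2 documentation quoted further down.

```python
from typing import Dict, List, Tuple

EXPECTED_SIZE = (1024, 768)  # (width, height) used in this thread


def meets_detectnet_v2_limits(width: int, height: int) -> bool:
    """Check the documented limits: W >= 480, H >= 272, both multiples of 16."""
    return (width >= 480 and height >= 272
            and width % 16 == 0 and height % 16 == 0)


def find_unexpected(sizes: Dict[str, Tuple[int, int]],
                    expected: Tuple[int, int] = EXPECTED_SIZE) -> List[str]:
    """Return the IDs whose (width, height) differs from the expected size."""
    return [image_id for image_id, size in sizes.items() if size != expected]


if __name__ == "__main__":
    # Two of the sizes reported above, plus one hypothetical compliant image.
    sizes = {
        "image_2/016ae9bb1b4cc4be": (1024, 758),
        "image_2/armas_2169": (160, 120),
        "image_2/example_ok": (1024, 768),  # hypothetical ID for illustration
    }
    for image_id in find_unexpected(sizes):
        w, h = sizes[image_id]
        ok = meets_detectnet_v2_limits(w, h)
        print(f"{image_id}: {w}x{h}, within documented limits: {ok}")
```

Keeping the "matches the expected size" test separate from the "meets the documented limits" test makes it easier to tell whether an image is merely a different size or is outright unsupported.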

So it looks like at some point I used unresized images and their corresponding KITTI files to create my input TFRecords. I guess this escaped my attention because my understanding was that the model won’t train when 1) the inputs are non-uniform in size or 2) the width and height are not both multiples of 16.

I have regenerated the training dataset using images and KITTI files correctly sized to 1024x768. After training the model using this dataset I can now evaluate the model using tlt-evaluate and the reshape issue I was seeing has disappeared.

Can anyone comment on why the model seems to have initially trained OK with input images at resolutions other than what is specified in the documentation:


DetectNet_v2

Input size: C * W * H (where C = 1 or 3, W >= 480, H >= 272, and W, H are multiples of 16)
Image format: JPG, JPEG, PNG
Label format: KITTI detection

Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
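
The bounding-box scaling that the note describes can be sketched as below. This is an illustration only, not a TLT tool: `scale_kitti_line` is a hypothetical helper, and it assumes the standard KITTI detection layout in which fields 5 to 8 of each label line are xmin, ymin, xmax, ymax in pixels. The image itself would still need to be resized separately to the same target size.

```python
from typing import Tuple


def scale_kitti_line(line: str,
                     src_size: Tuple[int, int],
                     dst_size: Tuple[int, int]) -> str:
    """Scale the bbox of one KITTI label line from src_size to dst_size.

    Sizes are (width, height). Only the four bbox fields are changed;
    every other field is passed through untouched.
    """
    fields = line.split()
    sx = dst_size[0] / src_size[0]  # horizontal scale factor
    sy = dst_size[1] / src_size[1]  # vertical scale factor
    xmin, ymin, xmax, ymax = (float(v) for v in fields[4:8])
    fields[4:8] = [
        f"{xmin * sx:.2f}",
        f"{ymin * sy:.2f}",
        f"{xmax * sx:.2f}",
        f"{ymax * sy:.2f}",
    ]
    return " ".join(fields)


if __name__ == "__main__":
    # Hypothetical label for a 500x375 image, rescaled to 1024x768.
    label = "car 0.00 0 0.00 100.00 50.00 200.00 150.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
    print(scale_kitti_line(label, (500, 375), (1024, 768)))
```

Note that the horizontal and vertical factors are computed independently, so an image whose aspect ratio differs from the target is stretched, and its boxes are stretched with it.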

In any event, this issue is resolved. It’s just not yet clear why it happened in the first place, assuming the non-uniform, non-compliant sizing of the input images was in fact the root cause of the error.