Thanks so much for your continuing help with this, Morganh.
Using the script I attached above, I validated my training dataset, which I had assumed contained only 1024x768 images (the testing dataset had already validated cleanly, with every image at 1024x768). Surprisingly, it turned out that multiple images in the training dataset were not at 1024x768 resolution. For example:
Images found in /home/james/nvidia/tlt/experiments/tfrecords/training/trainval-fold-001-of-002-shard-00001-of-00010 with unexpected dimensions:
Image ID: image_2/016ae9bb1b4cc4be
Width: 1024
Height: 758
Image ID: image_2/45b2d5d14b97d6f5
Width: 1024
Height: 683
Image ID: image_2/00000901
Width: 375
Height: 281
Image ID: image_2/armas_1147
Width: 620
Height: 350
Image ID: image_2/armas_1671
Width: 500
Height: 375
Image ID: image_2/armas_2876
Width: 400
Height: 282
Image ID: image_2/armas_2169
Width: 160
Height: 120
...
So it looks like at some point I used unresized images and their corresponding KITTI files to create the TFRecords for input. This escaped my attention, I guess, because my understanding was that the model won’t train if the inputs are 1) non-uniform in size, or 2) at a resolution whose width and height are not both multiples of 16.
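For anyone wanting to run the same kind of sanity check before building TFRecords, here is a minimal sketch of the size rules as I understand them from the DetectNet_v2 documentation quoted later in this post (W >= 480, H >= 272, both multiples of 16), plus a check against the size I expected. The function name `size_problems` and the constant `EXPECTED_SIZE` are my own and not part of TLT.

```python
# Hypothetical helper, not part of TLT: checks one image's dimensions
# against the expected dataset size and the documented DetectNet_v2 limits.
EXPECTED_SIZE = (1024, 768)  # (width, height) I intended for the whole dataset


def size_problems(width, height, expected=EXPECTED_SIZE):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if (width, height) != expected:
        problems.append("differs from expected %dx%d" % expected)
    if width < 480:
        problems.append("width below 480")
    if height < 272:
        problems.append("height below 272")
    if width % 16 != 0 or height % 16 != 0:
        problems.append("width/height not both multiples of 16")
    return problems


if __name__ == "__main__":
    # A few of the sizes reported by my validation script above.
    for w, h in [(1024, 768), (1024, 758), (375, 281), (160, 120)]:
        print(w, h, size_problems(w, h) or "OK")
```

Running this over the dimensions of the source images (before TFRecord conversion) would have flagged the unresized files straight away.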
I have regenerated the training dataset using images and KITTI files correctly sized to 1024x768. After training the model using this dataset I can now evaluate the model using tlt-evaluate and the reshape issue I was seeing has disappeared.
Can anyone comment as to why the model seems to have initially trained OK with input images at a resolution other than what is specified in the documentation:
DetectNet_v2
Input size: C * W * H (where C = 1 or 3, W >= 480, H >= 272, and W, H are multiples of 16)
Image format: JPG, JPEG, PNG
Label format: KITTI detection
Note: The tlt-train tool does not support training on images of multiple resolutions, or resizing images during training. All of the images must be resized offline to the final training size and the corresponding bounding boxes must be scaled accordingly.
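The "bounding boxes must be scaled accordingly" part of that note is the step that is easy to miss, so here is a rough sketch of how one KITTI label line could be rescaled when its image is resized offline. It assumes the standard KITTI detection layout, where fields 5 to 8 (indices 4 to 7) are the box left, top, right and bottom in pixels. The function name `scale_kitti_line` is hypothetical, and the image itself would still need to be resized separately with an image library.

```python
# Hypothetical helper, not part of TLT: rescales the 2D bounding box of a
# single KITTI label line to match an image resized from src_size to dst_size.
def scale_kitti_line(line, src_size, dst_size):
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / float(src_w)  # horizontal scale factor
    sy = dst_h / float(src_h)  # vertical scale factor

    fields = line.split()
    # Indices 4..7 hold bbox left, top, right, bottom in pixel coordinates.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    fields[4:8] = [
        "%.2f" % (left * sx),
        "%.2f" % (top * sy),
        "%.2f" % (right * sx),
        "%.2f" % (bottom * sy),
    ]
    return " ".join(fields)


if __name__ == "__main__":
    label = "car 0.00 0 0.00 100.00 50.00 200.00 150.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
    # Example: a 500x375 image resized to 1024x768.
    print(scale_kitti_line(label, (500, 375), (1024, 768)))
```

Note that a plain resize like this stretches the image if the source aspect ratio differs from the target, and the boxes are stretched with it.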
In any event, this issue is resolved. It’s just not clear yet why it happened in the first place, assuming the non-uniform/non-compliant sizing of the input images was in fact the root cause of the error.