DetectNet_v2 Training Configuration (NVIDIA TLT-NGC)

Hi guys!

I am currently training a custom object detection model using ResNet-18 and DetectNet_v2 for character detection, with images of size 842 x 320 pixels.

I noticed that the original DetectNet_v2 spec file has this preprocessing configuration:

preprocessing {
	output_image_width: 1248
	output_image_height: 384
	min_bbox_width: 1.0
	min_bbox_height: 1.0
	output_image_channel: 3
}

Then I blindly changed that preprocessing configuration to the following:

preprocessing {
	output_image_width: 832
	output_image_height: 320
	min_bbox_width: 1.0
	min_bbox_height: 1.0
	output_image_channel: 3
}

so that it fits our custom dataset.

Is it allowed to train DetectNet_v2 this way, by changing only the preprocessing part?

Or should I keep 1248 x 384 pixels and instead pad our images and adjust their bounding boxes to match 1248 x 384?
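
In case it clarifies the question, here is roughly what I had in mind for the padding approach. This is just a sketch, not something from the TLT tooling: the paths are placeholders, and I am assuming our labels stay in KITTI format (bbox left, top, right, bottom in columns 4-7).

from PIL import Image

TARGET_W, TARGET_H = 1248, 384

def pad_image_and_labels(img_path, label_path, out_img_path, out_label_path):
    # Center the original 842 x 320 image on a black 1248 x 384 canvas.
    img = Image.open(img_path)
    off_x = (TARGET_W - img.width) // 2
    off_y = (TARGET_H - img.height) // 2
    canvas = Image.new("RGB", (TARGET_W, TARGET_H))
    canvas.paste(img, (off_x, off_y))
    canvas.save(out_img_path)

    # Shift the KITTI bbox columns (4..7 = left, top, right, bottom) by the same offsets.
    with open(label_path) as src, open(out_label_path, "w") as dst:
        for line in src:
            cols = line.split()
            cols[4] = str(float(cols[4]) + off_x)
            cols[5] = str(float(cols[5]) + off_y)
            cols[6] = str(float(cols[6]) + off_x)
            cols[7] = str(float(cols[7]) + off_y)
            dst.write(" ".join(cols) + "\n")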

Thank you. I would appreciate any ideas.

Best regards,
Jeff

Hi jefflgaol,
You do not need to use 1248 x 384 pixels.
Yes, you can change (output_image_width, output_image_height) to (832, 320) or (848, 320), since the width and height need to be multiples of 16.
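
Both of those widths are simply 842 rounded down or up to a multiple of 16. A quick sanity check in plain Python (nothing TLT-specific, just arithmetic):

# Nearest multiples of 16 around the native image size (842 x 320).
def nearest_multiples_of_16(value):
    lower = (value // 16) * 16
    upper = lower if value % 16 == 0 else lower + 16
    return lower, upper

print(nearest_multiples_of_16(842))  # -> (832, 848)
print(nearest_multiples_of_16(320))  # -> (320, 320)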

Thank you very much.

Best regards,
Jeff