Thank you for your reply.
However, I faced another error when reshaping the tensor.
Can you share your training spec file?
You want to train a 736x416 detectnet_v2 model.
According to the Transfer Learning Toolkit 3.0 documentation,
please check whether your images/labels are resized to 736x416 offline.
Reference: Tensor reshape error when evaluating a Detectnet_v2 model
Since you set the facedetectIR tlt model as the pretrained model, please resize your images/labels to 384x240 offline.
And set 384x240 in your training spec file.
https://ngc.nvidia.com/catalog/models/nvidia:tlt_facedetectir
Input
Gray Image whose values in RGB channels are the same. 384 X 240 X 3 (W x H x C) Channel Ordering of the Input: NCHW, where N = Batch Size, C = number of channels (3), H = Height of images (240), W = Width of the images (384)
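The offline resize above can be sketched roughly like this. It uses Pillow for the images and rescales the bounding-box columns of KITTI-format labels; the paths, filenames, and the assumption that your labels follow the standard KITTI layout (bbox in columns 4–7) are mine, not from the thread:

```python
# Sketch: resize images to 384x240 and rescale KITTI bbox labels to match.
# Assumes standard KITTI label format: class, truncated, occluded, alpha,
# then xmin, ymin, xmax, ymax at indices 4..7. Paths are examples.

TARGET_W, TARGET_H = 384, 240

def resize_image(src, dst):
    # Pillow is imported here so label rescaling works even without it installed.
    from PIL import Image
    img = Image.open(src)
    sx, sy = TARGET_W / img.width, TARGET_H / img.height
    img.resize((TARGET_W, TARGET_H), Image.BILINEAR).save(dst)
    return sx, sy  # scale factors to apply to the matching label file

def rescale_kitti_labels(src, dst, sx, sy):
    out_lines = []
    with open(src) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 8:
                continue  # skip malformed lines
            # Scale the bbox corners by the same factors used on the image.
            parts[4] = f"{float(parts[4]) * sx:.2f}"
            parts[5] = f"{float(parts[5]) * sy:.2f}"
            parts[6] = f"{float(parts[6]) * sx:.2f}"
            parts[7] = f"{float(parts[7]) * sy:.2f}"
            out_lines.append(" ".join(parts))
    with open(dst, "w") as f:
        f.write("\n".join(out_lines) + "\n")
```

Run `resize_image` per image, then pass its returned scale factors to `rescale_kitti_labels` for the corresponding label file.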
Thank you Morganh for your support.
Concerning the load_key: I set it to tlt_encode. However, I am still having issues, even though I used the facedetect model, not the IR one.
What do you mean by “I used the facedetect model not the IR” ?
I am working on the facenet model (face detect) provided by the transfer learning toolkit.
For the pretrained model, there are two versions on NGC: one with IR and one without.
Using the pretrained model without IR gives me the OSError: Unable to open file, and using the pretrained model with IR gives me the dimensions error. I will resize the images offline to 384x240 as you said and see whether it works.
However, in the meantime, I am trying the pretrained model without IR.
For https://ngc.nvidia.com/catalog/models/nvidia:tlt_facenet, according to its overview,
its key is
nvidia_tlt
Input
Grayscale Image whose values in RGB channels are the same. 736 X 416 X 3
For https://ngc.nvidia.com/catalog/models/nvidia:tlt_facedetectir, according to its overview,
its key is
tlt_encode
Input
Gray Image whose values in RGB channels are the same. 384 X 240 X 3 (W x H x C) Channel Ordering of the Input: NCHW, where N = Batch Size, C = number of channels (3), H = Height of images (240), W = Width of the images (384)
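In the training spec, the input size lives in the preprocessing block of the augmentation config. A minimal fragment for the 384x240 facedetectIR case might look like this (field names follow the detectnet_v2 spec format; treat this as a sketch, not your full spec):

```
augmentation_config {
  preprocessing {
    output_image_width: 384
    output_image_height: 240
    output_image_channel: 3
  }
}
```

For the facenet (non-IR) model the same fields would be 736 and 416 instead.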
Ok thank you for your reply.
I just want to confirm whether the FaceDetect dataset is the same as the FaceDetect-IR dataset, only resized.
Thank you so much !
It shouldn’t be, since the labels are different.
“This model accepts 384x240x3
dimension input tensors and outputs 24x15x4
bbox coordinate tensor and 24x15x1
class confidence tensor.”
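The 24x15 output grid quoted above follows directly from the input size: detectnet_v2 downsamples by a fixed stride, and 384x240 divided by a stride of 16 gives 24x15. A one-liner check (the stride value is my assumption based on those numbers):

```python
# Why a 384x240 input yields a 24x15 output grid:
# the network downsamples by a fixed stride (16 here).
STRIDE = 16
w, h = 384, 240
grid_w, grid_h = w // STRIDE, h // STRIDE
print(grid_w, grid_h)  # 24 15
```

This is why the input dimensions must match the spec exactly; a mismatched size produces a differently shaped grid and the tensor reshape error above.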
Any remaining issue?
I converted the dataset into 384x240 images. However, when I trained the model, another error occurred:
Please remove result folder and retry.
Reference: DetectNet v2 training error - "ValueError: The zipfile extracted was corrupt. Please check your key "
I don’t think I am following you.
I can’t remove the zip.python file.
Could you give me more details, please?
I mean you can remove the experiment_dir_unpruned folder and retry.
Please back it up first.
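Something like the following would cover both steps; `experiment_dir_unpruned` is an example path, so substitute your own results directory:

```shell
# Sketch: back up the training results folder, then remove it and retry training.
# "experiment_dir_unpruned" is an example; use your own -r output directory.
if [ -d experiment_dir_unpruned ]; then
  cp -r experiment_dir_unpruned experiment_dir_unpruned.bak  # keep a backup
  rm -rf experiment_dir_unpruned                             # then remove it
fi
```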
Thank you so much for your continuous help; it is working now.