“ValueError: Invalid model: /workspace/tao-experiments/detectnet_v2/pretrained_resnet50/pretrained_detectnet_v2_vresnet50/resnet50.hdf5, please check the key used to load the model”
Okay, I will do that and get back, but this raises one other question:
“!tao detectnet_v2 train” is the first command in this project that needs the “-k” argument, so on what basis is it saying that the model is invalid? The model was downloaded from the NVIDIA registry, and that may have required my NGC API key. Does that mean that resnet50.hdf5 is key-encrypted and therefore dictates the key (in effect, my NGC API key) that has to be used for everything else that uses it?
“ValueError: Invalid model: /workspace/tao-experiments/detectnet_v2/pretrained_resnet50/pretrained_detectnet_v2_vresnet50/resnet50.hdf5, please check the key used to load the model”
What does “check the key” mean? I have used “123”, as you suggested above.
Please advise.
I carried out the step laid out in the Jupyter notebook:
“!ngc registry model download-version nvidia/tao/pretrained_detectnet_v2:resnet50 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet50”
As a sanity check, to be clear, you would like me to:
1. sudo rm -r the existing pretrained_detectnet_v2_vresnet50 folder in my LOCAL_EXPERIMENT_DIR
2. Run the cell “!ngc registry model download-version nvidia/tao/pretrained_detectnet_v2:resnet50 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet50”
3. Run the 1 epoch experiment that you describe above
The new model you asked me to download is giving the same result:
“Invalid model: /workspace/tao-experiments/detectnet_v2/pretrained_resnet50/pretrained_detectnet_v2_vresnet50/resnet50.hdf5, please check the key used to load the model”
I now have resnet50 working in training, using 123 as -k.
The HDF5 file downloads into a folder called “pretrained resnet_50”, which itself ended up inside an existing “pretrained resnet_50” folder. Because of that extra level of nesting, I also had to make a small change to the “pretrained_model_file” path in the train kitti.txt.
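For anyone else who hits the “Invalid model” error, a quick check along the following lines may help separate a path problem from a key problem. This is only a minimal sketch: the helper names `find_hdf5_files` and `looks_like_hdf5` are made up for illustration, and it assumes the downloaded weights are a plain HDF5 file, which normally begins with the standard 8-byte HDF5 signature. It says nothing about how TAO itself validates the file or the key.

```python
from pathlib import Path

# Standard HDF5 format signature; a plain HDF5 file normally starts with it.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_hdf5_files(root):
    """Return every *.hdf5 file found anywhere under root, sorted by path."""
    return sorted(Path(root).rglob("*.hdf5"))


def looks_like_hdf5(path):
    """Return True if the file begins with the HDF5 signature bytes."""
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

Calling `find_hdf5_files` on the local experiment directory shows where resnet50.hdf5 actually landed, including any doubly nested folder. That location can then be compared with the “pretrained_model_file” entry in the training spec, bearing in mind that the spec uses the path as seen inside the container (/workspace/...), not the local one.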
Finally, a number of folder variable names seem to vary between the documentation and the Jupyter notebook, e.g. SPECS_DIR is sometimes referred to as LOCAL_SPEC_DIR, and LOCAL_PROJECT_DIR is sometimes USER_PROJECT_DIR.
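To see which of these names are actually defined in a given notebook session, something like the sketch below can be run in a cell. The function name `report_names` is hypothetical, and the list contains only the four names mentioned above. It relies on the fact that variables set with `%env` in a notebook appear in `os.environ`.

```python
import os

# Pairs of names that appear to be used interchangeably (taken from this thread).
NAME_PAIRS = [
    ("SPECS_DIR", "LOCAL_SPEC_DIR"),
    ("LOCAL_PROJECT_DIR", "USER_PROJECT_DIR"),
]


def report_names(env=None):
    """Map each name to its value in env, or None if it is not set."""
    env = os.environ if env is None else env
    return {name: env.get(name) for pair in NAME_PAIRS for name in pair}
```

Any name that maps to None is not set in the current session, so a cell that uses it would expand to an empty path.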
I will now complete the original 3-stage experiment.
Hi Morganh.
I have hit a small problem at the export stage:
I moved an old experiment_dir_final out of my LOCAL_EXPERIMENT_DIR and then ran the Jupyter notebook cell.
The first line of code:
“!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_final” creates a new empty folder and then the notebook throws a PermissionError associated with this folder.
Following the advice and deleting
    ,
    "DockerOptions":{
        "user": "1000:1000"
    }
gets rid of the PermissionError, but the notebook throws a new error:
“ValueError: Cannot find input file name”
This apparently relates to the -k KEY (see “Can’t export the model to int8”).
As you know from above I am using the key “123” as suggested and it is the same key that was used to train a 1 epoch test for this ‘quick experiment’, which is proving to be anything but quick.
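For reference, the removal of the “DockerOptions” entry described above can also be done programmatically rather than by hand-editing, which avoids leaving a stray comma behind. This is a hedged sketch: the function names are invented for illustration, the "Mounts" key in the usage below is only a placeholder, and the location of the launcher's mounts file (commonly ~/.tao_mounts.json) is an assumption that should be checked against your own setup.

```python
import json
from pathlib import Path


def drop_docker_options(config):
    """Return a copy of the config dict without the "DockerOptions" entry."""
    return {key: value for key, value in config.items() if key != "DockerOptions"}


def rewrite_mounts_file(path):
    """Back up the JSON file next to itself, then rewrite it without DockerOptions."""
    path = Path(path)
    config = json.loads(path.read_text())
    backup = path.with_name(path.name + ".bak")  # e.g. mounts.json -> mounts.json.bak
    backup.write_text(json.dumps(config, indent=4))
    path.write_text(json.dumps(drop_docker_options(config), indent=4))
```

The backup copy makes it easy to restore the “user”: “1000:1000” setting later if the change turns out not to be wanted.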
This has indeed downloaded the latest version of tao-toolkit; however, I am not sure where it was downloaded to.
When “detectnet_v2 train” is run from inside the docker I get:
It appears that the syntax for “detectnet_v2 train” isn’t fully recognised inside the docker and it seems to be suggesting that I need to include the task that I want performed. Do I add a flag for “train”, despite explicitly using “train” after “detectnet_v2”?
If so, what is the syntax that I should use?
Thank you