Hi
I’m able to run detectnet_v2 without any issue from within the container (without the tlt wrapper).
I tried to follow the tutorial below to train a PeopleNet model. We need a person detector and have a relatively small amount of data, so we prefer to start from PeopleNet rather than from detectnet_v2.
https://developer.nvidia.com/blog/training-custom-pretrained-models-using-tlt/
For detectnet_v2, NGC provides resnet34.hdf5, which is in HDF5 format, but for PeopleNet the pretrained model is in .tlt format: tlt_peoplenet_vunpruned_v2.1/resnet34_peoplenet.tlt. I tried v2.1 and v2.0, both unpruned and pruned, and they all fail the same way.
Command for training inside the container (spec file attached):
detectnet_v2 train -e /workspace/detectnet_v2/specs/peoplenet_train_resnet34_person_kitti.txt -r /workspace/data/detectnet_v2/experiment_dir_unpruned_resnet34_person -k tlt_encode -n peoplenet_resnet34_detector --gpus 1
I tried adding “load_graph: true” to model_config, as in the retrain config, which accepts .tlt, but no luck: I got other errors.
tlt.log (62.8 KB)
peoplenet_train_resnet34_person_kitti.txt (2.8 KB)