Tlt-infer using a pretrained hdf5 model

Hi,
I am trying to run tlt-infer with detectnet_v2 (resnet10) on my own dataset. I downloaded the network from NGC. Contrary to peoplenet, for instance, this file is an hdf5 file. Can I use it directly with tlt-infer?
I get the following error:

2020-06-05 19:01:31,272 [INFO] iva.detectnet_v2.scripts.inference: Overlain images will be saved in the output path.
2020-06-05 19:01:31,272 [INFO] iva.detectnet_v2.inferencer.build_inferencer: Constructing inferencer
2020-06-05 19:01:31,665 [INFO] iva.detectnet_v2.inferencer.tlt_inferencer: Loading model from /pretrained_models/tlt_pretrained_detectnet_v2_vresnet10/resnet10.hdf5:
Traceback (most recent call last):
  File "/usr/local/bin/tlt-infer", line 8, in <module>
    sys.exit(main())
  File "./common/magnet_infer.py", line 56, in main
  File "./detectnet_v2/scripts/inference.py", line 194, in main
  File "./detectnet_v2/scripts/inference.py", line 117, in inference_wrapper_batch
  File "./detectnet_v2/inferencer/tlt_inferencer.py", line 110, in network_init
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/network.py", line 358, in get_layer
    raise ValueError('No such layer: ' + name)
ValueError: No such layer: output_cov

My config file is the following:

inferencer_config{
 # defining target class names for the experiment.
 # Note: This must be mentioned in order of the networks classes.
 target_classes: "Car"
 target_classes: "Bycicle"
 target_classes: "Person"
 target_classes: "RoadSign"
 # Inference dimensions.
 image_width: 640
 image_height: 368
 # Must match what the model was trained for.
 image_channels: 3
 batch_size: 4
 gpu_index: 0
 # model handler config
 tlt_config{
   model: "/pretrained_models/tlt_pretrained_detectnet_v2_vresnet10/resnet10.hdf5"
 }
}
bbox_handler_config{
 kitti_dump: true
 disable_overlay: false
 overlay_linewidth: 2
 classwise_bbox_handler_config{
         key:"Car"
         value: {
                 confidence_model: "aggregate_cov"
                 output_map: "car"
                 confidence_threshold: 0.2
                 bbox_color{
                         R: 0
                         G: 255
                         B: 0
                 }
                 clustering_config{
                         coverage_threshold: 0.00
                         dbscan_eps: 0.7
                         dbscan_min_samples: 0.05
                         minimum_bounding_box_height: 4
                 }
         }
   }
 classwise_bbox_handler_config{
         key:"Bycicle"
         value: {
                 confidence_model: "aggregate_cov"
                 output_map: "bicycle"
                 confidence_threshold: 0.2
                 bbox_color{
                         R: 0
                         G: 255
                         B: 0
                 }
                 clustering_config{
                         coverage_threshold: 0.00
                         dbscan_eps: 0.7
                         dbscan_min_samples: 0.05
                         minimum_bounding_box_height: 4
                 }
         }
        }
 classwise_bbox_handler_config{
         key:"Person"
         value: {
                 confidence_model: "aggregate_cov"
                 output_map: "person"
                 confidence_threshold: 0.2
                 bbox_color{
                         R: 0
                         G: 255
                         B: 0
                 }
                 clustering_config{
                         coverage_threshold: 0.00
                         dbscan_eps: 0.7
                         dbscan_min_samples: 0.05
                         minimum_bounding_box_height: 4
                 }
         }
        }
 classwise_bbox_handler_config{
         key:"RoadSign"
         value: {
                 confidence_model: "aggregate_cov"
                 output_map: "road_sign"
                 confidence_threshold: 0.2
                 bbox_color{
                         R: 0
                         G: 255
                         B: 0
                 }
                 clustering_config{
                         coverage_threshold: 0.00
                         dbscan_eps: 0.7
                         dbscan_min_samples: 0.05
                         minimum_bounding_box_height: 4
                 }
         }
      }
}

The ngc pretrained hdf5 file is not compatible with tlt-infer.

Thanks for your answer. Is there a way to achieve the same purpose in a different way?
Can I download the pretrained detectnet resnet10 model in another format compatible with tlt-infer? Or can I use tlt-convert to convert the network into a .tlt?

The ngc hdf5 file can be set as the pretrained model. Trigger training, and you will then get a .tlt model, which is compatible with tlt-infer.
Alternatively, you can find the pruned version of the peoplenet model at https://ngc.nvidia.com/catalog/models/nvidia:tlt_peoplenet/files. It is a .tlt model for peoplenet, and tlt-infer can run with it.
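For reference, the "set as pretrained model" step above happens in the training spec, not on the command line. A minimal sketch of the relevant fragment, with field names taken from the sample detectnet_v2 specs (the path and layer count are assumptions to adjust for your setup):

```
# Fragment of a detectnet_v2 training spec (illustrative; paths are placeholders).
model_config {
  pretrained_model_file: "/pretrained_models/tlt_pretrained_detectnet_v2_vresnet10/resnet10.hdf5"
  arch: "resnet"
  num_layers: 10
}
```

After tlt-train runs with this spec, the checkpoints written to the results directory are .tlt files that tlt-infer accepts.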

Sounds good, thanks.

If I actually fine-tune the ngc hdf5 network and use the resulting model.step-0.tlt file, is the last layer pretrained (the layer that depends on the number of output classes)? Or does tlt-train drop the last layer (to accommodate a potentially different number of classes)?

In the TLT process, the ngc hdf5 file works as a pretrained model only. Sorry, I do not quite understand your requirement here.

When using tlt-train with the pretrained network, I can still freely specify how many output classes I want the model to predict. The very last layer of the network will differ depending on this number of output classes.
My question is the following: how are the last-layer weights initialized?

See the "pretrained model file" explanation in the user guide:

This parameter defines the path to a pretrained tlt model file. If the load_graph flag is set to False, it is assumed that only the weights of the pretrained model file are to be used. In this case, TLT train constructs the feature extractor graph in the experiment and loads the weights from the pretrained model file for the layers whose names match. Thus, transfer learning across different resolutions and domains is supported.

For layers that may be absent in the pretrained model, the tool initializes them with random weights and skips the import for those layers.
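In other words, weight transfer is by layer name: matching layers are copied, and any new layers (such as a class-dependent head like output_cov) are randomly initialized. The following is an illustrative pure-Python sketch of that matching rule, not TLT code; all names (load_pretrained_weights, the layer names) are hypothetical:

```python
import random

def load_pretrained_weights(model_layers, pretrained):
    """Mimic by-name weight transfer (illustrative only, not TLT code).

    model_layers: dict mapping layer name -> number of weight values
    pretrained:   dict mapping layer name -> list of weight values
    Returns a dict of weights: pretrained values where the name and size
    match, random values otherwise.
    """
    weights = {}
    for name, size in model_layers.items():
        if name in pretrained and len(pretrained[name]) == size:
            weights[name] = list(pretrained[name])  # copy matching layer
        else:
            # layer absent from the pretrained file: random initialization
            weights[name] = [random.gauss(0.0, 0.05) for _ in range(size)]
    return weights

# A 4-class head ("output_cov") absent from the backbone-only hdf5 file
# is randomly initialized; the backbone layers are copied over.
pretrained = {"conv1": [0.1, 0.2], "conv2": [0.3, 0.4]}
new_model = {"conv1": 2, "conv2": 2, "output_cov": 4}
w = load_pretrained_weights(new_model, pretrained)
print(w["conv1"])  # -> [0.1, 0.2]
```

This also explains the original "No such layer: output_cov" error: the NGC hdf5 file is a backbone without the detection head, so tlt-infer cannot find that layer until training has added (and initialized) it.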

The pruned version of the model comes with an .etlt extension; when run with tlt-infer, it is not accepted.

Let me clarify.

  1. The pruned model is a .tlt format model.
  2. If you run tlt-export against a .tlt format model, an .etlt model is generated.
  3. In detectnet_v2, tlt-infer can run inference against a .tlt format model.
    See the spec (detectnet_v2_inference_kitti_tlt.txt).
    tlt-infer can also run inference against a TRT engine (a TRT engine can be generated from the .etlt model).
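The three steps above can be sketched as shell commands. This is a hedged outline only: the command names match the TLT 2.x CLI, but the paths, the $KEY encryption key, output node names, and input dimensions are placeholders to adapt to your model:

```shell
# 1. tlt-infer runs directly on a .tlt model:
tlt-infer detectnet_v2 -e detectnet_v2_inference_kitti_tlt.txt \
    -i /data/images -o /results -k $KEY

# 2. tlt-export turns the .tlt model into a deployable .etlt:
tlt-export detectnet_v2 -m model.tlt -o model.etlt -k $KEY

# 3. tlt-converter builds a TensorRT engine from the .etlt,
#    which tlt-infer (and DeepStream) can also consume:
tlt-converter model.etlt -k $KEY \
    -o output_cov/Sigmoid,output_bbox/BiasAdd \
    -d 3,368,640 -e model.engine
```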