Hi @Morganh ,
Thank you for your suggestions. I made the recommended changes to the spec file, but I’m still encountering the same issues. Kindly see my updates and follow-up questions below:
1. Invalid Detections Persist After Applying Suggested Config Changes
I’ve updated the preprocessing and evaluation_box_config sections as recommended:
preprocessing {
output_image_width: 832
output_image_height: 272
min_bbox_width: 1.0
min_bbox_height: 1.0
output_image_channel: 3
enable_auto_resize: true
}
evaluation_box_config {
key: "head"
value {
minimum_height: 4
maximum_height: 9999
minimum_width: 4
maximum_width: 9999
}
}
Despite these updates, I’m still seeing invalid or incorrect detections during inference — even with a very low confidence threshold (e.g., 0.0001).
2. Compatibility and Model Conversion Issues with TAO Toolkit 4.0.1
I tried training the same setup using tao-toolkit:4.0.1-tf1.15.5. Here’s what I observed:
- I used a .hdf5 pretrained model as a starting point, since the training outputs are still in .tlt format.
- After training, I exported the .tlt model to .etlt using this command:
tao detectnet_v2 export \
-m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
-e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
-o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
-k my_custom_key \
--data_type fp16 \
--batch_size 8 \
--gen_ds_config
However, I was unable to generate the engine file from the .etlt inside the DeepStream pipeline. I then converted it to a TensorRT engine using:
tao-deploy detectnet_v2 gen_trt_engine \
-m $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
-k my_custom_key \
--data_type int8 \
--batches 10 \
--batch_size 8 \
--max_batch_size 64 \
--engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
--cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
-e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
--verbose
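As a fallback, I’m also considering building the engine with tao-converter directly on the deployment machine, so it is built against the same GPU and TensorRT version DeepStream uses. A rough sketch of what I have in mind (the key, file paths, and output blob names below are my assumptions, not verified values):

```shell
# Sketch: build the TensorRT engine with tao-converter on the deployment
# machine. Paths, the key, and the output blob names are my assumptions.
MODEL=resnet18_detector.etlt
KEY=my_custom_key
CMD="tao-converter -k $KEY \
  -d 3,272,832 \
  -o output_cov/Sigmoid,output_bbox/BiasAdd \
  -t int8 -c calibration.bin \
  -m 64 \
  -e resnet18_detector.trt.int8 \
  $MODEL"
echo "$CMD"
# Only execute where tao-converter is actually installed:
command -v tao-converter >/dev/null && eval "$CMD" || true
```

Would this be the recommended route instead of tao-deploy when the engine has to match the DeepStream host?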
Issue: when I run this .trt engine with DeepStream, it results in a “core dumped” error.
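For reference, these are the debugging steps I plan to try next; the file names here are assumptions based on my setup, not actual paths:

```shell
# Debugging steps I plan to try; file names are assumptions from my setup.
ENGINE=resnet18_detector.trt.int8
DS_CONFIG=ds_detectnet_config.txt   # hypothetical name for my DeepStream config

# 1) Check that the engine deserializes outside DeepStream
#    (trtexec ships with TensorRT):
command -v trtexec >/dev/null && trtexec --loadEngine="$ENGINE" || echo "trtexec not available on this machine"

# 2) Reproduce with GStreamer debug logging to localize the failing element:
# GST_DEBUG=3 deepstream-app -c "$DS_CONFIG"

# 3) Capture a backtrace from the crash:
# ulimit -c unlimited
# gdb -ex run -ex bt --args deepstream-app -c "$DS_CONFIG"
```

Please let me know if there is a better way to narrow down where the crash happens.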
My DeepStream config looks like this:
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
model-engine-file=resnet18_detector.trt.int8
labelfile-path=labels_face.txt
batch-size=1
network-mode=1
num-detected-classes=1
interval=0
gie-unique-id=1
process-mode=1
network-type=0
infer-dims=3;272;832
[class-attrs-all]
nms-iou-threshold=0.4
pre-cluster-threshold=0.2
topk=50
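I’m also wondering whether I should let DeepStream build the engine from the .etlt itself rather than loading a pre-built engine. Something like the fragment below is what I have in mind; the property names are from the Gst-nvinfer documentation, while the paths, key, and clustering settings are my assumptions:

```
[property]
tlt-encoded-model-file=resnet18_detector.etlt
tlt-model-key=my_custom_key
int8-calib-file=calibration.bin
model-engine-file=resnet18_detector.etlt_b1_gpu0_int8.engine
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
cluster-mode=2
```

My understanding is that with tlt-encoded-model-file set, nvinfer builds the engine on first run and caches it at model-engine-file; please correct me if that is wrong for this TAO/DeepStream combination.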
Questions:
- Can we use an .hdf5 pretrained model directly in TAO 4.0.x for training and exporting, or is this incompatible with the current DetectNetV2 export pipeline?
- Is there any additional step required when converting from .tlt to .etlt, or any known issues that could prevent engine file generation from the .etlt in DeepStream?
- Could you suggest any debugging steps for the DeepStream crash when loading the .trt engine?
Any further guidance on debugging this or ensuring proper compatibility between TAO versions and DeepStream would be much appreciated.
Thank you again for your support!