I’m running the TrafficCamNet model on a Jetson Nano with Python and TensorRT, without DeepStream. First I needed to convert the model, so I followed the same steps as in this thread.
To convert the model with TAO instead of TLT:
tao-converter resnet18_trafficcamnet_pruned.etlt -k tlt_encode -c trafficnet_int8.txt -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,544,960 -i nchw -e trafficnet_int8.engine -m 1 -t int8 -b 1
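Before running inference, the image has to match the `-d 3,544,960` / `-i nchw` layout from the command above. A minimal sketch of the layout conversion I assume is needed (the 1/255 scale factor is my assumption from the model card, and the frame here is a random stand-in for an already-resized image):

```python
import numpy as np

H, W = 544, 960
# Stand-in for a decoded RGB frame already resized to the network input size
frame = np.random.randint(0, 256, (H, W, 3), dtype=np.uint8)

# Scale to [0, 1] (net-scale-factor 1/255 is my assumption from the model card)
chw = (frame.astype(np.float32) / 255.0).transpose(2, 0, 1)  # HWC -> CHW
input_buf = np.ascontiguousarray(chw[None])  # add batch dim -> [1, 3, 544, 960]
print(input_buf.shape)
```

The flattened `input_buf` is what gets copied into the engine's input binding.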
Then I run the model on one image and get results. I’m trying to understand the output. The model page says that the model outputs:
- a 60x34x16 bbox coordinate tensor, and
- a 60x34x4 class confidence tensor.
My output seems ok (I think), as I got a flattened 60x34x16 array and a flattened 60x34x4 array.
To understand this output I read this comment, which says:
The model has the following two outputs:
- output_cov/Sigmoid : [batchSize, Class_Num, gridcell_h, gridcell_w]
- output_bbox/BiasAdd : [batchSize, Class_Num, 4]
How can I extract this data from the output I get? How should I split the flattened arrays to recover these shapes? And what does batchSize mean if I only use 1 image?
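If I understand the comment above correctly, the flattened buffers can be reshaped into those layouts. This is my assumption only: a 34x60 grid (544x960 input with stride 16), 4 classes, and batchSize = 1 for a single image; the random arrays stand in for the real output buffers:

```python
import numpy as np

# Assumed grid dimensions for a 960x544 input with stride 16 (my assumption)
GRID_H, GRID_W = 544 // 16, 960 // 16   # 34, 60
NUM_CLASSES = 4

# Stand-ins for the flattened host buffers copied back from TensorRT
flat_cov = np.random.rand(1 * NUM_CLASSES * GRID_H * GRID_W).astype(np.float32)
flat_bbox = np.random.rand(1 * NUM_CLASSES * 4 * GRID_H * GRID_W).astype(np.float32)

# batchSize is just the leading dimension; with one image it is 1
cov = flat_cov.reshape(1, NUM_CLASSES, GRID_H, GRID_W)        # [batch, class, h, w]
bbox = flat_bbox.reshape(1, NUM_CLASSES * 4, GRID_H, GRID_W)  # [batch, class*4, h, w]

# Confidence of class c at grid cell (y, x):
c, y, x = 0, 10, 20
conf = cov[0, c, y, x]
# The 4 box values for class c at the same cell:
box = bbox[0, c * 4:(c + 1) * 4, y, x]
print(cov.shape, bbox.shape, box.shape)
```

Is that the right way to index a class's box values per grid cell?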
I know that I then have to pass these boxes and confidences to an NMS algorithm, without normalization, to get the final detections, but first I need to organize the bboxes and confidences.
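For reference, this is the kind of NMS I was planning to feed the organized boxes into. A minimal NumPy sketch (my own implementation, not from TensorRT or TAO), assuming boxes are [x1, y1, x2, y2] in pixels with one score each:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping ones, repeat."""
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top box with the remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]  # keep only low-overlap boxes
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # the second box overlaps the first and is suppressed
```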
TensorRT Version: 22.214.171.124
GPU Type: Jetson Nano 2GB
Python Version: 3.6.9