Description
I’m running the TrafficCamNet model on a Jetson Nano with Python and TensorRT, without DeepStream. First I needed to convert the model, so I followed the same steps as in this thread.
To convert the model with TAO instead of TLT:
tao-converter resnet18_trafficcamnet_pruned.etlt -k tlt_encode -c trafficnet_int8.txt -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,544,960 -i nchw -e trafficnet_int8.engine -m 1 -t int8 -b 1
Then I run the model on one image and get the results. I’m trying to understand the output. The model page says that the output of the model is a 60x34x16
bbox coordinate tensor and a 60x34x4
class confidence tensor.
My output looks correct (I think), as I got a flattened 60x34x16 array and a flattened 60x34x4 array (with values like this):
In order to understand this output I read this comment, which says:
The model has the following two outputs:
- output_cov/Sigmoid : [batchSize, Class_Num, gridcell_h, gridcell_w]
- output_bbox/BiasAdd : [batchSize, Class_Num * 4, gridcell_h, gridcell_w]
How can I extract this data from the output I get? How should I split the flattened arrays to recover these shapes? And what does batchSize
mean if I only use 1 image?
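As a sketch of what the reshape could look like: for TrafficCamNet at a 960x544 input, the grid is 60x34 (stride 16) and there are 4 classes, so with a single image batchSize is simply 1 and the two flattened buffers can be viewed in the NCHW layouts from the quoted comment. The constants below are my assumptions from the model card, not something taken from your code:

```python
import numpy as np

# Assumed values for TrafficCamNet with a 960x544 input:
# 4 classes, a 34x60 grid (544/16 x 960/16), and batch size 1
# because a single image was run.
NUM_CLASSES = 4
GRID_H, GRID_W = 34, 60
BATCH = 1

# Stand-ins for the flattened arrays TensorRT returns; replace these
# with your real output buffers.
flat_cov = np.zeros(BATCH * NUM_CLASSES * GRID_H * GRID_W, dtype=np.float32)
flat_bbox = np.zeros(BATCH * NUM_CLASSES * 4 * GRID_H * GRID_W, dtype=np.float32)

# Reshape into the NCHW layouts from the quoted comment.
cov = flat_cov.reshape(BATCH, NUM_CLASSES, GRID_H, GRID_W)        # confidences
bbox = flat_bbox.reshape(BATCH, NUM_CLASSES * 4, GRID_H, GRID_W)  # box coords

# cov[0, c, y, x]            -> confidence of class c at grid cell (y, x)
# bbox[0, 4*c:4*c+4, y, x]   -> the 4 box values for class c at cell (y, x)
print(cov.shape, bbox.shape)
```

So the "60x34x16" from the model page shows up here as 16 channels (4 classes x 4 coordinates) over a 34x60 grid; only the axis ordering differs.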
I know that I have to pass this data to an NMS algorithm, with the bboxes unnormalized and paired with their confidences, in order to get a final result from it, but first I need to organize these bboxes and confidences.
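To organize the boxes and confidences for NMS, one approach is the DetectNet_v2-style post-processing used in NVIDIA's samples: threshold the coverage map, decode each surviving grid cell's 4 box values into absolute pixel coordinates, then run greedy IoU suppression. The stride (16), cell offset (0.5) and box scale (35.0) below are the values commonly seen in NVIDIA's DetectNet_v2 post-processing code, but they are assumptions here and should be checked against the model card:

```python
import numpy as np

# Assumed DetectNet_v2 post-processing constants; verify against the
# TrafficCamNet model card before relying on them.
STRIDE, OFFSET, BOX_SCALE = 16.0, 0.5, 35.0
CONF_THRESH, IOU_THRESH = 0.4, 0.5

def decode(cov, bbox):
    """cov: (classes, H, W), bbox: (classes*4, H, W).
    Returns a list of (x1, y1, x2, y2, score, class_id) in pixels."""
    n_cls, h, w = cov.shape
    dets = []
    for c in range(n_cls):
        ys, xs = np.where(cov[c] >= CONF_THRESH)
        for y, x in zip(ys, xs):
            cx, cy = x * STRIDE + OFFSET, y * STRIDE + OFFSET
            o = bbox[4 * c:4 * c + 4, y, x]
            dets.append((cx - o[0] * BOX_SCALE, cy - o[1] * BOX_SCALE,
                         cx + o[2] * BOX_SCALE, cy + o[3] * BOX_SCALE,
                         float(cov[c, y, x]), c))
    return dets

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms(dets):
    """Greedy per-class non-maximum suppression on decoded detections."""
    keep = []
    for d in sorted(dets, key=lambda d: -d[4]):
        if all(d[5] != k[5] or iou(d, k) < IOU_THRESH for k in keep):
            keep.append(d)
    return keep
```

With the reshaped tensors from a single image you would call something like `nms(decode(cov[0], bbox[0]))`; the thresholds are tunable and only illustrative here.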
Environment
TensorRT Version: 8.0.1.6
GPU Type: Jetson Nano 2GB
Python Version: 3.6.9