Performance difference between various TLT models when deployed on DeepStream.

I trained a 5-class object detector using TLT. Since the default deepstream-reference app uses a DetectNet_v2 ResNet10 model, I used the same architecture for TLT training. The issue is that I am not getting good results in terms of accuracy, so I want to switch to another network such as SSD. Is there any performance chart or plot (in terms of speed and accuracy) comparing all the supported TLT architectures when deployed via DeepStream on the Nano? Also, what kind of performance degradation should I expect when deploying my own custom models (other than TLT)?

Hi neophyte1,
Since this is related to TLT, I am moving this topic from the DS forum to the TLT forum.

Hi neophyte1,
Different datasets give different results for the same TLT network.

For KITTI dataset,
The TLT detectnet_v2 network has the better mAP result, around 80% mAP.

For the COCO dataset (resized to 300x300),
As far as I know, the TLT SSD network has the better mAP result. As of now, I can get around 40% mAP @ IoU 0.5 for 80 classes.

And regarding FPS, every kind of TLT network needs pruning.
A good approach is to prune lightly, retrain, then repeat (prune --> retrain --> prune --> retrain) until you find a good balance between mAP and FPS.
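As a rough illustration of that loop, the sketch below just prints the prune and retrain commands for a few increasing pruning thresholds rather than executing them. The flag names (-m/-o/-pth/-k) and file names are my assumptions based on the TLT documentation; verify them with `tlt-prune --help` and `tlt-train --help` in your own TLT container before running anything for real.

```shell
# Dry-run sketch of the iterative prune -> retrain loop (assumed flag
# names; prints the commands instead of running them).
KEY="YOUR_NGC_KEY"
MODEL="resnet10_detector.tlt"
for PTH in 0.1 0.2 0.3; do
  # Prune with a gradually increasing threshold, then retrain the
  # pruned model, and compare mAP/FPS at each step.
  echo "tlt-prune -m ${MODEL} -o pruned_pth${PTH}.tlt -pth ${PTH} -k ${KEY}"
  echo "tlt-train detectnet_v2 -e retrain_spec_pth${PTH}.txt -r results_pth${PTH} -k ${KEY}"
done
```

After each retrain, you would evaluate mAP and measure FPS on the Nano, and stop increasing the threshold once accuracy starts to drop noticeably.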

For your case, I need to reproduce your experiment with COCO 5-class data and TLT detectnet_v2.
Your experiment: COCO data, 5 classes (person, bicycle, motorbike, bus, car), 512x512.
Correct me if anything is wrong. Thanks.

Hi Morganh,
By performance, I don’t just mean accuracy but also speed (hence, not dependent on a particular dataset).

“And regarding FPS, every kind of TLT network needs pruning.
A good approach is to prune lightly, retrain, then repeat (prune --> retrain --> prune --> retrain) until you find a good balance between mAP and FPS.”

But from my understanding, there is a limit to how much pruning one can do while maintaining respectable accuracy compared to the unpruned model. My fear is that if I pick too heavy a network, I might get the accuracy but not good FPS on the Nano even after pruning, and if I prune that network too aggressively, accuracy might suffer. Since the deepstream-reference app ships with the DetectNet_v2 architecture (ResNet10), and it works great in terms of both accuracy and speed, I have so far tried to base my training on it (the reason for my aversion to SSD). Hence, I wanted to know about any prior experiments with other networks on the Nano, to get a better idea of which ones might work in terms of both speed and accuracy.

“For your case, I need to reproduce your experiment with COCO 5-class data and TLT detectnet_v2.
Your experiment: COCO data, 5 classes (person, bicycle, motorbike, bus, car), 512x512”

Yes, my experimental setup is more or less the same, except for some added custom augmentations.

As mentioned in our previous discussions, I am seeing far too many false positives, mainly for the person class (often quite large and with high confidence).

Thanks for your help.

I can share my experience with pruning TLT detectnet_v2. For the KITTI or VOC datasets, even with a one-off heavy pruning, the mAP stays similar to the unpruned model. You can run experiments to confirm this.
So, the important thing for detectnet_v2 is to train a model with a better mAP in the first place.

Also, I am not sure why you set the COCO dataset to 512x512. Is that a requirement?
A smaller input size means fewer grid cells (512/16 * 512/16) for detectnet_v2, and that may hurt mAP.
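To make the grid arithmetic concrete: detectnet_v2 predicts on a stride-16 output grid, so an NxN input yields (N/16)x(N/16) cells. The input sizes below are just illustrative, not recommendations:

```shell
# Number of detectnet_v2 output grid cells for a few square input
# sizes, assuming the usual stride of 16.
for SIZE in 384 512 640 960; do
  G=$((SIZE / 16))
  echo "${SIZE}x${SIZE} input -> ${G}x${G} grid = $((G * G)) cells"
done
```

So 512x512 gives a 32x32 grid (1024 cells); shrinking the input further reduces the number of cells quadratically, which is why a smaller size can hurt mAP.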

Hi Morganh,
Thanks for the update. Kindly share your experience with other datasets as well; it would be quite helpful, and I might gain some insight into the whole training process.

As mentioned before, mAP is not my primary concern. From my past experience (outside TLT), I have found that a good mAP on a dataset does not necessarily translate into good real-life detections. I am using COCO because it is a difficult dataset to train on, and even though the mAP generally comes out lower than on other datasets, the real-world inferences tend to be good.

No, 512x512 is not a requirement. I chose this size for the same reason I am not going for bigger architectures: it is large enough to get good mAP, yet small enough that performance will not suffer much when I deploy on the Nano.

I might go for other input sizes if they give a good accuracy-vs-speed tradeoff.

Another thing I would like to know: does TLT generate negative samples internally during training?

No, TLT does not generate negative samples internally during training.