Knowledge Distillation on backbone models

Hello everyone.
I’ve been using SSD networks for object detection tasks for some time now. This type of architecture uses a network as a backbone (e.g., MobileNet). My question is: is it possible to apply the Knowledge Distillation technique to the backbone network and then use the result as the base network in the SSD class? Do you think there will be problems? Thank you.


It should be possible.

Since the Nano has limited resources, it’s recommended to run the training on a desktop environment.
You can then copy the output model to Jetson for inferencing.

This way, you can leverage a general desktop implementation for the knowledge distillation itself.


Thanks for the reply @AastaLLL.

Yes, my idea is to create a student model to replace the base_net (the teacher model) present in the script. Of course, I will only use the Jetson Nano to do the full SSD model inference.
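To illustrate the idea: if the SSD implementation accepts its backbone as a constructor argument (as pytorch-ssd-style code does with `base_net`), a distilled student backbone can be dropped in unchanged, provided its output feature channels match what the SSD heads expect. The `SSD` class and layer shapes below are simplified stand-ins, not the actual script’s code.

```python
import torch
import torch.nn as nn

# Simplified stand-in for an SSD wrapper: real implementations build the
# detection heads on top of whatever backbone is passed in as base_net.
class SSD(nn.Module):
    def __init__(self, base_net: nn.Module, num_classes: int):
        super().__init__()
        self.base_net = base_net  # backbone: teacher or distilled student
        # Stand-in for the SSD classification/regression heads.
        self.head = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, x):
        return self.head(self.base_net(x))

# Hypothetical distilled student backbone: smaller than the teacher, but
# producing the same number of output channels (64 here) so the heads fit.
student_backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 64, 3, padding=1),
)
model = SSD(student_backbone, num_classes=21)
out = model(torch.randn(1, 3, 300, 300))  # SSD commonly uses 300x300 inputs
```

The key constraint is the interface, not the internals: as long as the student’s feature maps have the channel counts (and strides, in a real multi-scale SSD) the heads were built for, swapping the backbone should not break the rest of the model.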
I hope not to run into various problems.

Thank you.


Ideally, you should be able to train a student model with an implementation like the one below:
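As a rough sketch of backbone-level distillation in PyTorch: freeze the teacher backbone, and train the student to reproduce the teacher’s feature maps with a regression loss (MSE feature matching is one common choice). The tiny convolutional backbones and the dummy batch here are placeholders, assumptions for illustration rather than the actual MobileNet code.

```python
import torch
import torch.nn as nn

# Placeholder backbones: any nn.Module producing feature maps works here.
teacher = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1),
)
student = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 64, 3, padding=1),
)

teacher.eval()  # teacher weights stay frozen during distillation
for p in teacher.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
criterion = nn.MSELoss()  # match student features to teacher features

# One distillation step on a dummy batch (replace with a real data loader).
images = torch.randn(4, 3, 64, 64)
with torch.no_grad():
    t_feat = teacher(images)
s_feat = student(images)
loss = criterion(s_feat, t_feat)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In practice you would loop this over your training set; if the student and teacher feature maps differ in channel count, a small 1x1 conv adapter on the student side is the usual fix.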

And the output model will have an architecture similar to the teacher SSD MobileNet.
So you can deploy it on Jetson.


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.