I wanted to get an idea of how pruning affects inference speed. Any insights would be really helpful. For example, could a pruned ResNet-50 / ResNet-34 backbone run at 5-10 FPS on a 512x512 image?
Regards,
Yash
Normally, pruning reduces the model size and increases inference speed. However, the mAP usually drops after pruning, so you need to retrain the pruned model to recover accuracy. More experiments are needed to find a good trade-off between mAP and FPS.
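
As a rough illustration of why pruning shrinks the model, here is a minimal sketch that counts parameters and multiply-accumulates (MACs) for a single convolution layer before and after pruning. It assumes structured (channel) pruning, which is not stated above, and the layer shape, the 50% keep ratio, and the helper names `conv_cost` and `prune_channels` are all hypothetical. The MAC reduction is only a theoretical indicator; the actual FPS depends on the hardware and the inference runtime, so it still has to be measured.

```python
def conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, MACs) for one k x k convolution without bias."""
    params = c_in * c_out * k * k
    macs = params * h_out * w_out  # each weight is applied once per output position
    return params, macs


def prune_channels(channels, keep_ratio):
    """Number of channels left after keeping `keep_ratio` of them (at least 1)."""
    return max(1, int(round(channels * keep_ratio)))


# Hypothetical layer: 3x3 conv, 256 -> 256 channels, 64x64 output feature map.
base_params, base_macs = conv_cost(256, 256, 3, 64, 64)

# Assumed pruning setting: keep half of the input and output channels.
kept = prune_channels(256, 0.5)
pruned_params, pruned_macs = conv_cost(kept, kept, 3, 64, 64)

param_ratio = pruned_params / base_params
mac_ratio = pruned_macs / base_macs

print(f"channels kept: {kept}")
print(f"params: {base_params} -> {pruned_params} ({param_ratio:.2f}x)")
print(f"MACs:   {base_macs} -> {pruned_macs} ({mac_ratio:.2f}x)")
```

Because both the input and output channels are halved in this sketch, the parameter and MAC counts of the layer drop to a quarter, which is why retraining is usually needed to recover the lost mAP.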