Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Classification TF2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here) v5.0.0
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

I want to understand how pruning works in classification models like MobilenetV2 or EfficientNet_B0.

I used a threshold of 0.68. As I understand it, that means that for a Conv2D layer, at least 68% of the filters will be removed, in increments of the granularity (8), while making sure that at least the minimum number of filters (16) is retained. Is that right?
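If that reading is right, the retained count can be sketched as below. This is a toy illustration of that arithmetic only, not the actual TAO pruning code; `retained_filters` is a hypothetical helper:

```python
import math

def retained_filters(num_filters, frac_removed, granularity=8, min_filters=16):
    """Hypothetical sketch: how a removal fraction, a granularity, and a
    minimum-filter floor could combine to give the number of filters kept.
    (Illustration only -- not the actual TAO implementation.)"""
    keep = num_filters - int(num_filters * frac_removed)
    # Round the retained count up to a multiple of the granularity.
    keep = math.ceil(keep / granularity) * granularity
    # Never drop below the minimum number of filters.
    return max(keep, min_filters)

print(retained_filters(32, 0.68))  # -> 16
```

With 32 filters and a 0.68 threshold, this toy rule lands on exactly 16 retained filters, which matches the counts discussed below.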

And if X filters are going to be removed from a layer containing 32 filters, then the filters are sorted by the magnitude of their L2 norm, and the top (32 − X) filters should be retained, right?

My question is: when I export the trained model and the pruned model (before retraining) and compare them, the filters retained in the pruned version are not necessarily the top ones. While most of them are, there are some towards the end that are not supposed to be retained based on the L2 norm I calculated, yet they have been retained. Is there a gap in my understanding? If not, what am I missing? Does it have anything to do with batch normalization?
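On the batch-normalization question: if the BN scale were folded into the per-filter ranking, the top-K set could differ from a ranking on the raw kernel weights alone. This is a purely illustrative sketch with random, hypothetical weights; I don't know whether the pruner actually does this:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical Conv2D kernel (out_ch, in_ch, kH, kW) and per-filter BN gammas.
W = rng.normal(size=(32, 16, 3, 3))
gamma = rng.normal(size=32)

# Ranking by raw kernel L2 norm per output filter...
raw_norm = np.sqrt((W ** 2).sum(axis=(1, 2, 3)))
# ...vs a ranking where the BN scale is folded into each filter.
folded_norm = np.abs(gamma) * raw_norm

top_raw = set(np.argsort(raw_norm)[-16:])
top_folded = set(np.argsort(folded_norm)[-16:])
# The two top-16 sets generally differ, which could explain a mismatch.
print(sorted(top_raw ^ top_folded))
```

The symmetric difference printed at the end shows which filter indices would move in or out of the retained set under the folded ranking.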

But when I test this on the dense and pruned models, I see that the filters being kept are not necessarily the top K by L2 norm, which is what has me confused. Any insights would be helpful.

My process -

Export the dense model and the pruned model to ONNX and save the weights of the same Conv2D layer.

By looking at which biases are retained, figure out which filters have been removed and make a list of the retained indices.
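This step can also be done by matching weights directly. A minimal sketch, assuming this layer's input channels were not pruned (so filters compare element-wise); `retained_indices` is a hypothetical helper:

```python
import numpy as np

def retained_indices(dense_w, pruned_w):
    """Sketch: map each pruned filter back to its index in the dense layer
    by exact weight match. Assumes weight layout (out_ch, in_ch, kH, kW)
    and that the layer's input channels were not pruned."""
    idx = []
    for f in pruned_w:
        matches = np.flatnonzero((dense_w == f).all(axis=(1, 2, 3)))
        idx.append(int(matches[0]))
    return sorted(idx)

# Toy example: a pruned layer keeping filters 1 and 3 of a 4-filter layer.
dense = np.arange(4 * 2 * 3 * 3, dtype=float).reshape(4, 2, 3, 3)
pruned = dense[[1, 3]]
print(retained_indices(dense, pruned))  # -> [1, 3]
```

Matching on weights rather than biases avoids ambiguity when two filters happen to share the same bias value.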

Sort the indices with np.argsort(np.sqrt(np.sum(conv2D_weights**2, axis=tuple(range(1,conv2D_weights.ndim))))) and check whether the top K correspond to the actual list of retained indices. They don't.
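The check in this last step can be sketched as follows (toy weights; `top_k_by_l2` is a hypothetical helper wrapping the same formula):

```python
import numpy as np

def top_k_by_l2(conv2d_weights, k):
    """Per-filter L2 norms over all non-filter axes, then the indices of
    the k largest (np.argsort is ascending, so take the tail)."""
    norms = np.sqrt(np.sum(conv2d_weights ** 2,
                           axis=tuple(range(1, conv2d_weights.ndim))))
    return set(np.argsort(norms)[-k:].tolist())

# Toy check: filter norms grow with the filter index,
# so the top 2 of these 4 filters are indices 2 and 3.
w = np.stack([np.full((2, 3, 3), i, dtype=float) for i in range(4)])
print(sorted(top_k_by_l2(w, 2)))  # -> [2, 3]
```

The comparison in the thread is then just `top_k_by_l2(dense_weights, 16)` against the set of retained indices recovered in step 2.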

I can also upload the models if necessary. I have tried this with EfficientNet_B0 and MobileNetV2 on PASCAL VOC, pruned with a 0.68 threshold and the default pruning parameters otherwise.

Yes, could you please share the models? It would also be appreciated if you could share more detailed steps for your findings, for example, in step 2, the result when you "make a list of the indices". Thanks.

Now I visualize the ONNX model using Netron and try to see which filter indices have been retained after pruning. I used 0.68 as the pruning threshold.

Comparing these two models, the filter indices retained are 3, 4, 5, 6, 7, 8, 13, 14, 17, 18, 19, 20, 21, 22, 25, 28.

After that, I save the weights of these models as dense.npy and prune.npy, because I read here that only the weights are used to calculate the L2 norm for the filters. dense.npy (3.5 KB) prune.npy (1.8 KB)

Now I calculate the norm of the filters using the same formula as the one in the codebase and sort the filter indices by it.

The top 16 in this case aren't the same as the list of filters actually retained: 15, 0, and 27 should have been retained, but 3, 22, and 19 are instead. I am trying to understand what is happening here.

I added a print statement, and the indices retained are 3, 4, 5, 6, 7, 8, 13, 14, 17, 18, 19, 20, 21, 22, 25, 28, which are exactly the same indices I see in the Netron visualization of the ONNX version of the model.

L313 still gets the indices during the explore stage. L728 gets the retained indices once the explore stage is done.
When I print at L738, the indices match the ones I see in Netron.