Pruning ratio is always 1.0 regardless of the pth value

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
Ubuntu 20, x86, RTX3090
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

I’ve retrained a model on my own small dataset for 40 epochs with batch size 4. The average precision looks good in both the training and evaluation stages. In the prune stage, however, whatever value I pass to -pth (I tried the default 5.2e-6, plus 7.0e-6, 7.2e-6, 9.9e-6, and 0.01), the reported pruning ratio is always 1.0.
I’m sure it worked a few days ago with another similar custom model, so I have no idea what’s going on here:

!tao detectnet_v2 prune \
-m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
-o $USER_EXPERIMENT_DIR/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt \
-eq union \
-pth 0.0000070 \
-k $KEY

2022-01-24 13:23:22,611 [INFO] root: Registry: ['nvcr.io']
2022-01-24 13:23:22,639 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3
2022-01-24 13:23:22,648 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/ewew/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2022-01-24 05:23:27,161 [INFO] modulus.pruning.pruning: Exploring graph for retainable indices
2022-01-24 05:23:27,436 [INFO] modulus.pruning.pruning: Pruning model and appending pruned nodes to new graph
2022-01-24 05:23:34,049 [INFO] iva.common.magnet_prune: Pruning ratio (pruned model / original model): 1.0
2022-01-24 05:23:34,463 [INFO] root: Pruning ratio (pruned model / original model): 1.0
2022-01-24 05:23:34,463 [INFO] root: {
"pruning_ratio": 1.0,
"size": 42.937171936035156,
"param_count": 11.203023
}
2022-01-24 13:23:35,091 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Could you please try more pth values?
For example, 5.2e-7, 5.2e-5, etc.

I tried 0.00000052 and 0.000052; still the same, the ratio is still 1.0.
I then tried 0.8, 0.9, and 1.0, and the ratio changed, ranging from 0.66 to 0.22, so pruning does work.
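
A toy sketch of why tiny thresholds can leave the ratio at 1.0 while values near 1.0 start removing filters. It assumes the threshold is compared against each filter's L2 norm divided by the largest norm in the layer; that is an assumption for illustration only, and this thread does not confirm it is what the TAO prune tool does internally. `kept_fraction` and `toy_filters` are hypothetical names and made-up numbers.

```python
import math


def kept_fraction(filters, threshold):
    """Fraction of filters kept after dropping those whose normalized
    L2 norm (norm / largest norm in the layer) is below the threshold.

    Assumption: this normalization is only a plausible model of a
    norm-based pruning criterion, not the actual TAO implementation.
    """
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    max_norm = max(norms)
    kept = [n for n in norms if n / max_norm >= threshold]
    return len(kept) / len(norms)


# Made-up weights for four filters of one layer.
toy_filters = [[0.9, 0.4], [0.5, 0.5], [0.3, 0.1], [0.05, 0.02]]

for pth in (7e-6, 0.01, 0.5, 0.9):
    print(pth, kept_fraction(toy_filters, pth))
```

In this toy layer every normalized norm is far above 7e-6 or 0.01, so nothing is removed until the threshold reaches the same order of magnitude as the normalized norms themselves.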

I’m confused; it worked with 0.0000072 a few days ago.

Are you using the same backbone as you did a few days ago?

Yes, always.
Do the number of training epochs and the batch size affect the prune rate setting?

No, they do not.
I’m not sure whether something has changed on your side since then.

I’m quite sure I didn’t change anything.
Does my current pth make sense, given that the prune rate looks correct even though the value is far larger than the recommended ones?

Forget about the recommended pth value. Just check the pruning ratio and the number of trainable parameters in the training log.
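
When comparing several pth values, it can help to pull the ratio out of each prune log instead of reading it by eye. A minimal sketch, assuming the log contains a line in the same format as the output posted above; `parse_pruning_ratio` and `sample_log` are hypothetical names, not part of TAO.

```python
import re

# Matches lines such as:
# "... [INFO] root: Pruning ratio (pruned model / original model): 1.0"
RATIO_RE = re.compile(
    r"Pruning ratio \(pruned model / original model\): ([0-9.]+)"
)


def parse_pruning_ratio(log_text):
    """Return the last pruning ratio reported in the log, or None."""
    matches = RATIO_RE.findall(log_text)
    return float(matches[-1]) if matches else None


# Two lines copied from the log in this thread.
sample_log = """\
2022-01-24 05:23:34,049 [INFO] iva.common.magnet_prune: Pruning ratio (pruned model / original model): 1.0
2022-01-24 05:23:34,463 [INFO] root: Pruning ratio (pruned model / original model): 1.0
"""

ratio = parse_pruning_ratio(sample_log)
if ratio is not None and ratio >= 1.0:
    print("Nothing was pruned; try a larger -pth value.")
```

A ratio of 1.0 means the pruned model is the same size as the original, so the threshold was too low to remove anything.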

