Pruning Dino model

rishikesan · December 19, 2023, 6:17am

Please provide the following information when requesting support.

• Hardware (T4)
• Network Type (Dino)

Hi, I converted a DINO model to FP32, but the inference speed with batch size 1 is not satisfactory.
I want to try some optimization techniques using TAO Toolkit.

How can I do that?
I was looking into pruning, but there is not much good reference on how to do it for DINO.

Can you please advise on this?

Morganh · December 19, 2023, 6:48am

Currently, the DINO network does not support pruning.
It may be implemented in the future.
There are other examples for transformer-based networks, such as OCRNet and OCDNet in TAO 5.2. For OCRNet, you can refer to https://github.com/NVIDIA/tao_pytorch_backend/blob/99e0a38a0d3ac00997c41c7e6ea6f02c6586bf4f/nvidia_tao_pytorch/cv/ocrnet/scripts/prune.py.

rishikesan · December 19, 2023, 7:23am

Thanks for the information.

Is there any way I can improve the speed of DINO?

I tried DINO inference with dino_fan_large in FP32 and FP16 with batch size 1. Can I use a higher batch size, like 2 or 4, to improve the inference speed? (FYI: I am running on a single T4 GPU.)
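
One rough way to check whether larger batches help on a given GPU is to time the same inference call at each batch size and compare images per second. The sketch below is only a timing harness: `run_batch` is a hypothetical callable standing in for whatever actually runs DINO inference (it is not a TAO API), and the `fake_inference` workload exists only so the example runs on its own. GPU work is asynchronous, so a real `run_batch` should block until the result is ready (for PyTorch, for example, by calling `torch.cuda.synchronize()`).

```python
import time
from typing import Callable, Dict, Iterable


def measure_throughput(
    run_batch: Callable[[int], None],
    batch_sizes: Iterable[int] = (1, 2, 4),
    warmup: int = 3,
    iters: int = 10,
) -> Dict[int, float]:
    """Return images per second for each batch size.

    run_batch(batch_size) is a hypothetical callable that runs one
    inference call on a batch of that size and blocks until it finishes.
    """
    results = {}
    for bs in batch_sizes:
        for _ in range(warmup):  # warm-up calls are not timed
            run_batch(bs)
        start = time.perf_counter()
        for _ in range(iters):
            run_batch(bs)
        elapsed = time.perf_counter() - start
        # Guard against a zero reading from a very fast stand-in workload.
        results[bs] = bs * iters / max(elapsed, 1e-9)
    return results


if __name__ == "__main__":
    # Stand-in workload: replace with a real DINO inference call.
    def fake_inference(batch_size: int) -> None:
        time.sleep(0.001 * batch_size)

    for bs, ips in measure_throughput(fake_inference).items():
        print(f"batch size {bs}: {ips:.1f} images/s")
```

If images per second barely changes between batch sizes, the GPU is probably already saturated at batch size 1 and batching alone will not help much.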

Morganh · December 19, 2023, 7:33am

You can train a model with a smaller backbone, for example fan_tiny or resnet_50.
DINO supports the backbones below.

More can be found in DINO - NVIDIA Docs

According to Overview - NVIDIA Docs, resnet_50 is faster on T4.

rishikesan · December 21, 2023, 8:52am

Thanks for the suggestions on different models. Yes, ResNet is faster, but I would like to keep the same performance as DINO FAN-L FP32 (even FP16 performs a little worse than expected).

Currently I am trying to use the following references to implement a pruning step for DINO, so that I can have FP32 with improved inference speed.

The pruning general method

Pruning example from OCRNET

https://github.com/NVIDIA/tao_pytorch_backend/blob/99e0a38a0d3ac00997c41c7e6ea6f02c6586bf4f/nvidia_tao_pytorch/cv/ocrnet/scripts/prune.py

Some Dino PTL module usage

https://github.com/NVIDIA/tao_pytorch_backend/blob/99e0a38a0d3ac00997c41c7e6ea6f02c6586bf4f/nvidia_tao_pytorch/cv/dino/scripts/inference.py

The script I have drafted is below:

#################################################
# Torch Pruning (Begin)
#################################################
import torch_pruning as tp
import sys
import os

current_dir = os.getcwd()
tao_base = os.path.abspath(os.path.join(current_dir, "../../../../"))
sys.path.append(tao_base)

from nvidia_tao_pytorch.core.hydra.hydra_runner import hydra_runner
from nvidia_tao_pytorch.cv.dino.model.pl_dino_model import DINOPlModel
from nvidia_tao_pytorch.cv.dino.config.default_config import ExperimentConfig
import torch
from nvidia_tao_pytorch.core.utilities import update_results_dir

prune_ratio = 0.2
granularity = 8

model_path = "dino_fan_large_imagenet22k_36ep.pth"
# cfg = ExperimentConfig

# Initialize Hydra
spec_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
@hydra_runner(
    config_path=os.path.join(spec_root, "experiment_specs"), config_name="infer", schema=ExperimentConfig
)

def main(cfg: ExperimentConfig) -> None:


    cfg = update_results_dir(cfg, task="inference")
    model = DINOPlModel.load_from_checkpoint(model_path,
                                                    map_location="cpu",
                                                    experiment_spec=cfg)
    input_size = 620
    model.eval()
    # # 1. build dependency graph
    num_params_before_pruning = tp.utils.count_params( model )
    DG = tp.DependencyGraph()
    example_inputs = torch.randn([1, 3, input_size, input_size])  # CPU dummy input; `device` was never defined
    DG.build_dependency(model, example_inputs=example_inputs)
    excluded_layers = list(model.model[-1].modules())
    print(excluded_layers)

    # # 2. get global threshold
    global_thresh, module2scores = tp.utils.get_global_thresh(model, prune_ratio=prune_ratio)
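    # The shortcut-merging loop that stood here came from a YOLOv5 sample
    # and depended on that repository's C3 block:
    #     from models.common import C3
    # DINO has no C3 module and the name is never imported in this script,
    # so the loop would raise NameError. It is disabled, and no layers are
    # merged until DINO's own residual connections are mapped.
    merged_sets = {}
    # for name, m in model.named_modules():
    #     if isinstance(m, C3) and m.shortcut: ...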

    # # 3. Execute pruning
    tp.utils.execute_custom_score_prune(model,
                                        global_thresh=global_thresh,
                                        module2scores=module2scores,
                                        dep_graph=DG,
                                        granularity=granularity,
                                        excluded_layers=excluded_layers,
                                        merged_sets=merged_sets)
    num_params_after_pruning = tp.utils.count_params( model )
    print( "  Params: %s => %s"%( num_params_before_pruning, num_params_after_pruning))
    # #################################################
    # # Torch Pruning (End)
    # #################################################
    # The pruned model is not saved or moved to a device yet; the script
    # stops here after reporting the parameter counts.


if __name__ == "__main__":
    main()

When I run this, I get the error below while loading the model. Can you please advise on the approach I am trying and on how to resolve the error?

sys:1: UserWarning: 
'infer' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
/workspace/tao_pytorch_backend/nvidia_tao_pytorch/core/hydra/hydra_runner.py:107: UserWarning: 
'infer' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  _run_hydra(
/usr/local/lib/python3.8/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Inference results will be saved at: ./inference
No pretrained configuration specified for convnext_base_in22k model. Using a default. Please add a config to the model pretrained_cfg registry or pass explicitly.
Error executing job with overrides: []
Traceback (most recent call last):
  File "prune.py", line 77, in <module>
    main()
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/core/hydra/hydra_runner.py", line 107, in wrapper
    _run_hydra(
  File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 389, in _run_hydra
    _run_app(
  File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 452, in _run_app
    run_and_report(
  File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 216, in run_and_report
    raise ex
  File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 453, in <lambda>
    lambda: hydra.run(
  File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "prune.py", line 34, in main
    model = DINOPlModel.load_from_checkpoint(model_path,
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
    return _load_from_checkpoint(
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/saving.py", line 180, in _load_from_checkpoint
    return _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/saving.py", line 225, in _load_state
    obj = cls(**_cls_kwargs)
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/pl_dino_model.py", line 62, in __init__
    self._build_model(export)
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/pl_dino_model.py", line 69, in _build_model
    self.model = build_model(experiment_config=self.experiment_spec, export=export)
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/build_nn_model.py", line 306, in build_model
    model = DINOModel(num_classes=num_classes,
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/build_nn_model.py", line 180, in __init__
    transformer = DeformableTransformer(
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/deformable_transformer.py", line 139, in __init__
    encoder_layer = DeformableTransformerEncoderLayer(d_model, dim_feedforward,
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/deformable_transformer.py", line 836, in __init__
    self.self_attn = MSDeformAttn(d_model, n_levels, n_heads, n_points)
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops/modules.py", line 85, in __init__
    load_ops(ops_dir, lib_name)
  File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops/functions.py", line 35, in load_ops
    torch.ops.load_library(module_path)
  File "/usr/local/lib/python3.8/dist-packages/torch/_ops.py", line 852, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops/MultiScaleDeformableAttention.cpython-310-x86_64-linux-gnu.so: cannot open shared object file: No such file or directory

Can you please advise on the approach and the error?

Morganh · December 23, 2023, 4:28pm

The library should be available at the path below.

$ docker run --runtime=nvidia -it --rm -v /home/morganh:/home/morganh -it --rm nvcr.io/nvidia/tao/tao-toolkit:5.2.0-pyt2.1.0 /bin/bash

ls /usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/cv/deformable_detr/model/ops/MultiScaleDeformableAttention.cpython-310-x86_64-linux-gnu.so

Again, the DINO network does not currently support pruning. We plan to add it in a future release.
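
The traceback above runs under /usr/local/lib/python3.8, while the missing file carries a cpython-310 tag and the container path above is under python3.10. That suggests, though it is not confirmed in this thread, that the script is being run with a different interpreter than the one the library was built for. A minimal sketch for checking this is below: it compares the extension filename the current interpreter would expect with the files actually present in a directory. The helper name `find_extension_builds` is hypothetical and not part of TAO; the directory to inspect is passed on the command line.

```python
import sys
import sysconfig
from pathlib import Path
from typing import List, Tuple


def find_extension_builds(ops_dir: str, module_name: str) -> Tuple[str, List[str]]:
    """Return (filename expected by this interpreter, filenames found on disk)."""
    # EXT_SUFFIX looks like ".cpython-310-x86_64-linux-gnu.so" on Linux.
    ext_suffix = sysconfig.get_config_var("EXT_SUFFIX") or ""
    expected = f"{module_name}{ext_suffix}"
    found = sorted(
        p.name for p in Path(ops_dir).glob(f"{module_name}*") if p.is_file()
    )
    return expected, found


if __name__ == "__main__":
    # Directory to inspect, e.g. the .../deformable_detr/model/ops folder.
    ops_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    expected, found = find_extension_builds(ops_dir, "MultiScaleDeformableAttention")
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version} expects: {expected}")
    print(f"Found in {ops_dir}: {found or 'nothing'}")
    if expected not in found:
        print("No build in this directory matches the running interpreter.")
```

Running it once inside the TAO container and once in the environment that produced the traceback should show whether the two use the same Python version and whether the compiled library exists in the directory being searched.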

rishikesan · December 26, 2023, 2:09pm

"Currently DINO is not supported for pruning": does that mean we can't even create a pruning script ourselves from the resources above and try the pruning?

Or does it just mean that the DINO code base doesn't have a pruning script specifically for DINO, but we can create one using the pruning sample scripts?

Morganh · December 26, 2023, 3:36pm

There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks

Officially, DINO does not implement pruning yet. That means the DINO code base doesn't have a pruning script specifically for DINO.
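
For readers unfamiliar with the `prune_ratio` and `granularity` settings in the draft script earlier in this thread, the sketch below illustrates the general idea of global magnitude-based channel pruning in plain Python. It is an assumption-laden illustration, not the TAO or torch_pruning implementation: how `tp.utils.get_global_thresh` actually scores channels is not shown in this thread, and the L2-norm scoring, the function names, and the toy numbers here are all hypothetical.

```python
import math
from typing import Dict, List


def channel_scores(weights: List[List[float]]) -> List[float]:
    """L2 norm of each output channel (one row of weights per channel)."""
    return [math.sqrt(sum(w * w for w in row)) for row in weights]


def global_threshold(layer_scores: Dict[str, List[float]], prune_ratio: float) -> float:
    """Score below which roughly `prune_ratio` of all channels fall."""
    all_scores = sorted(s for scores in layer_scores.values() for s in scores)
    k = int(len(all_scores) * prune_ratio)
    return all_scores[k] if k < len(all_scores) else float("inf")


def channels_to_keep(scores: List[float], thresh: float, granularity: int) -> List[int]:
    """Indices of kept channels; the count is rounded up to a multiple of `granularity`."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_keep = sum(s >= thresh for s in scores)
    n_keep = max(granularity, math.ceil(n_keep / granularity) * granularity)
    n_keep = min(n_keep, len(scores))  # never keep more channels than exist
    return sorted(order[:n_keep])


if __name__ == "__main__":
    # Toy per-channel scores for two hypothetical layers.
    layer_scores = {
        "layer_a": [0.1, 0.9, 0.05, 0.7],
        "layer_b": [0.2, 0.3, 0.8, 0.6],
    }
    thresh = global_threshold(layer_scores, prune_ratio=0.25)
    for name, scores in layer_scores.items():
        kept = channels_to_keep(scores, thresh, granularity=2)
        print(f"{name}: keep channels {kept}")
```

Because the threshold is global, layers with many low-scoring channels lose more of them than others, and the granularity rounding keeps channel counts at hardware-friendly multiples. In a real network, removing channels from one layer also requires adjusting every layer that consumes them, which is what the dependency graph in the draft script is for.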
