Please provide the following information when requesting support.
• Hardware (T4)
• Network Type (Dino)
Hi, I converted the Dino model to FP32, but the inference speed with batch size 1 is not satisfactory.
I want to try some optimization techniques using tao_toolkit.
How can I do that?
I was looking into pruning, but there is not much good reference material on how to do it for Dino.
I tried Dino inference with Dino_fan_large in FP32 and FP16 with batch size 1. Can I use a higher batch size such as 2 or 4 (FYI: I am running on a single T4 GPU) to improve the inference speed?
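Whether batch size 2 or 4 helps on a single T4 is easiest to settle by measuring. Below is a minimal timing sketch, not a TAO utility: `benchmark_batch_sizes` and `run_batch` are hypothetical names, and `run_batch(bs)` stands for whatever call runs one full inference on a batch of `bs` images (TensorRT engine, PyTorch model, etc.).

```python
import time
from typing import Callable, Dict, Iterable


def benchmark_batch_sizes(
    run_batch: Callable[[int], None],
    batch_sizes: Iterable[int] = (1, 2, 4),
    warmup: int = 3,
    iters: int = 10,
) -> Dict[int, Dict[str, float]]:
    """Time run_batch(batch_size) and report latency and throughput.

    run_batch is a hypothetical callable supplied by the caller. On a GPU it
    must block until the work has finished (for PyTorch, call
    torch.cuda.synchronize() before returning); otherwise only the kernel
    launch is timed.
    """
    results = {}
    for bs in batch_sizes:
        # Warm-up runs are excluded from the measurement.
        for _ in range(warmup):
            run_batch(bs)
        start = time.perf_counter()
        for _ in range(iters):
            run_batch(bs)
        # Guard against a zero reading for extremely fast callables.
        elapsed = max(time.perf_counter() - start, 1e-9)
        results[bs] = {
            "latency_ms": 1000.0 * elapsed / iters,   # time per batch
            "images_per_s": bs * iters / elapsed,     # throughput
        }
    return results


if __name__ == "__main__":
    # Stand-in workload: fixed overhead plus a per-image cost.
    def fake_infer(bs: int) -> None:
        time.sleep(0.002 + 0.001 * bs)

    for bs, r in benchmark_batch_sizes(fake_infer).items():
        print(f"batch={bs}: {r['latency_ms']:.1f} ms/batch, {r['images_per_s']:.0f} img/s")
```

A larger batch usually raises latency per batch, so the number to compare is images per second. If it stays flat from batch size 1 to 4, the GPU is already saturated and batching will not help.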
Thanks for the suggestions on different models. Yes, ResNet is faster, but I thought it would be good to keep the same performance as Dino FAN-L FP32 (even FP16 performs a little worse than expected).
Currently I am trying to use the following references to implement a pruning step for Dino, so that I can keep FP32 with improved inference speed:
The pruning general method
Pruning example from OCRNET
Some Dino PTL module usage
The script I have drafted is below:
#################################################
# Torch Pruning (Begin)
#################################################
import os
import sys

current_dir = os.getcwd()
tao_base = os.path.abspath(os.path.join(current_dir, "../../../../"))
sys.path.append(tao_base)

import torch
import torch_pruning as tp
from nvidia_tao_pytorch.core.hydra.hydra_runner import hydra_runner
from nvidia_tao_pytorch.core.utilities import update_results_dir
from nvidia_tao_pytorch.cv.dino.config.default_config import ExperimentConfig
from nvidia_tao_pytorch.cv.dino.model.pl_dino_model import DINOPlModel

prune_ratio = 0.2
granularity = 8
model_path = "dino_fan_large_imagenet22k_36ep.pth"
input_size = 620
# `device` was used below but never defined in the first draft
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Initialize Hydra
spec_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))


@hydra_runner(
    config_path=os.path.join(spec_root, "experiment_specs"), config_name="infer", schema=ExperimentConfig
)
def main(cfg: ExperimentConfig) -> None:
    cfg = update_results_dir(cfg, task="inference")
    model = DINOPlModel.load_from_checkpoint(model_path,
                                             map_location="cpu",
                                             experiment_spec=cfg)
    # Move the model before the forward pass so model and input share a device
    model = model.to(device)
    model.eval()

    # 1. build dependency graph
    num_params_before_pruning = tp.utils.count_params(model)
    example_inputs = torch.randn([1, 3, input_size, input_size]).to(device)
    out = model(example_inputs)  # sanity-check forward pass
    DG = tp.DependencyGraph()
    DG.build_dependency(model, example_inputs=example_inputs)
    # Copied from the reference example: model.model[-1] assumes an indexable
    # (Sequential-style) container and may need adjusting for Dino
    excluded_layers = list(model.model[-1].modules())
    print(excluded_layers)

    # 2. get global threshold
    global_thresh, module2scores = tp.utils.get_global_thresh(model, prune_ratio=prune_ratio)

    # The reference example fills merged_sets from YOLOv5 C3 shortcut blocks.
    # C3 is never imported here (NameError) and is a YOLOv5 class, so the
    # loop is dropped and merged_sets stays empty.
    merged_sets = {}

    # 3. Execute pruning
    tp.utils.execute_custom_score_prune(model,
                                        global_thresh=global_thresh,
                                        module2scores=module2scores,
                                        dep_graph=DG,
                                        granularity=granularity,
                                        excluded_layers=excluded_layers,
                                        merged_sets=merged_sets)
    num_params_after_pruning = tp.utils.count_params(model)
    print(" Params: %s => %s" % (num_params_before_pruning, num_params_after_pruning))
#################################################
# Torch Pruning (End)
#################################################


if __name__ == "__main__":
    main()
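To make clear what I expect `prune_ratio` and `granularity` to do, here is a small pure-Python sketch of the idea. It is my own assumption about the general technique (a global score threshold, then rounding the kept channel count to a multiple of the granularity), not what `tp.utils.get_global_thresh` or `tp.utils.execute_custom_score_prune` actually implement. The function names and the toy scores are hypothetical; in practice the scores would be something like per-channel weight norms.

```python
from typing import Dict, List


def global_threshold(scores: Dict[str, List[float]], prune_ratio: float) -> float:
    """Score below which roughly prune_ratio of all channels (across layers) fall."""
    all_scores = sorted(s for layer in scores.values() for s in layer)
    k = min(int(len(all_scores) * prune_ratio), len(all_scores) - 1)
    return all_scores[k]


def channels_to_keep(layer_scores: List[float], thresh: float, granularity: int) -> List[int]:
    """Indices of kept channels; the count is rounded up to a multiple of granularity."""
    n = len(layer_scores)
    n_above = sum(1 for s in layer_scores if s >= thresh)
    # Round up to the granularity, never keep zero channels,
    # and never keep more channels than the layer has.
    n_keep = -(-max(n_above, 1) // granularity) * granularity
    n_keep = min(n_keep, n)
    # Keep the highest-scoring channels.
    ranked = sorted(range(n), key=lambda i: layer_scores[i], reverse=True)
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    toy_scores = {
        "conv1": [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4],
        "conv2": [0.05, 0.95, 0.15, 0.85],
    }
    thresh = global_threshold(toy_scores, prune_ratio=0.25)
    for name, layer in toy_scores.items():
        print(name, channels_to_keep(layer, thresh, granularity=2))
```

With these toy numbers, `conv1` has 7 channels at or above the threshold, which rounds up to 8, so nothing is removed there, while `conv2` drops to 2 channels. So the rounding can cancel out pruning on a layer, and the real parameter reduction can be smaller than `prune_ratio` suggests.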
I get an error when the model is loaded.
When I run the script I get the error below. Can you please advise on the approach I am taking and on how to resolve the error?
sys:1: UserWarning:
'infer' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
/workspace/tao_pytorch_backend/nvidia_tao_pytorch/core/hydra/hydra_runner.py:107: UserWarning:
'infer' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
_run_hydra(
/usr/local/lib/python3.8/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
Inference results will be saved at: ./inference
No pretrained configuration specified for convnext_base_in22k model. Using a default. Please add a config to the model pretrained_cfg registry or pass explicitly.
Error executing job with overrides: []
Traceback (most recent call last):
File "prune.py", line 77, in <module>
main()
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/core/hydra/hydra_runner.py", line 107, in wrapper
_run_hydra(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 389, in _run_hydra
_run_app(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 452, in _run_app
run_and_report(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 216, in run_and_report
raise ex
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 213, in run_and_report
return func()
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py", line 453, in <lambda>
lambda: hydra.run(
File "/usr/local/lib/python3.8/dist-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/usr/local/lib/python3.8/dist-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "prune.py", line 34, in main
model = DINOPlModel.load_from_checkpoint(model_path,
File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
return _load_from_checkpoint(
File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/saving.py", line 180, in _load_from_checkpoint
return _load_state(cls, checkpoint, strict=strict, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/saving.py", line 225, in _load_state
obj = cls(**_cls_kwargs)
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/pl_dino_model.py", line 62, in __init__
self._build_model(export)
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/pl_dino_model.py", line 69, in _build_model
self.model = build_model(experiment_config=self.experiment_spec, export=export)
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/build_nn_model.py", line 306, in build_model
model = DINOModel(num_classes=num_classes,
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/build_nn_model.py", line 180, in __init__
transformer = DeformableTransformer(
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/deformable_transformer.py", line 139, in __init__
encoder_layer = DeformableTransformerEncoderLayer(d_model, dim_feedforward,
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/dino/model/deformable_transformer.py", line 836, in __init__
self.self_attn = MSDeformAttn(d_model, n_levels, n_heads, n_points)
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops/modules.py", line 85, in __init__
load_ops(ops_dir, lib_name)
File "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops/functions.py", line 35, in load_ops
torch.ops.load_library(module_path)
File "/usr/local/lib/python3.8/dist-packages/torch/_ops.py", line 852, in load_library
ctypes.CDLL(path)
File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops/MultiScaleDeformableAttention.cpython-310-x86_64-linux-gnu.so: cannot open shared object file: No such file or directory
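One thing that stands out in the traceback: the Python paths are all `python3.8`, but the shared object being opened is tagged `cpython-310`. I have not confirmed that this is the cause. A quick way to see which compiled op files actually exist and which file name the running interpreter would expect is sketched below. `check_compiled_op` is a hypothetical helper, not part of TAO; the directory and library name are taken from the traceback.

```python
import glob
import os
import sysconfig
from typing import Dict


def check_compiled_op(ops_dir: str, lib_name: str = "MultiScaleDeformableAttention") -> Dict[str, object]:
    """Compare the compiled op files in ops_dir with what this interpreter expects."""
    # Extension suffix of the running interpreter, e.g. ".cpython-38-x86_64-linux-gnu.so"
    expected_name = lib_name + sysconfig.get_config_var("EXT_SUFFIX")
    # Every file in ops_dir whose name starts with lib_name
    found = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(ops_dir, lib_name + "*"))
        if os.path.isfile(p)
    )
    return {"expected": expected_name, "found": found, "match": expected_name in found}


if __name__ == "__main__":
    # Directory taken from the traceback above
    ops_dir = "/workspace/tao_pytorch_backend/nvidia_tao_pytorch/cv/deformable_detr/model/ops"
    report = check_compiled_op(ops_dir)
    print("interpreter expects:", report["expected"])
    print("files present:      ", report["found"])
    print("match:              ", report["match"])
```

If `found` is empty, the op was never built in this environment. If it lists a file with a different `cpython-XY` tag, the op was built for a different Python version than the one running the script.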
Can you please advise on the approach and the error?
"Currently Dino is not supported for pruning": does that mean we cannot even write a pruning script ourselves from the resources above and try pruning with it?
Or does it just mean that the Dino code base has no pruning script specifically for Dino, but we can create one from the pruning sample scripts?
There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks
Officially, Dino does not implement pruning yet. That means the Dino code base does not have a pruning script specifically for Dino.