Description
Goal: convert a PyTorch model (trt_pose resnet18_baseline_att) to TensorRT using torch2trt.
Script:
import json
import torch
import torch2trt
import trt_pose.coco
import trt_pose.models

# Load the pose topology (keypoints and skeleton links).
with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)
topology = trt_pose.coco.coco_category_to_topology(human_pose)
num_parts = len(human_pose['keypoints'])
num_links = len(human_pose['skeleton'])

# Build the model and load the trained weights.
MODEL_WEIGHTS = 'resnet18_baseline_att_224x224_A_epoch_249.pth'
model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()
model.load_state_dict(torch.load(MODEL_WEIGHTS))

# Dummy input used by torch2trt to trace the network.
WIDTH = 224
HEIGHT = 224
data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()

# Convert to TensorRT (this is the call that fails) and save the result.
model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
OPTIMIZED_MODEL = 'resnet18_baseline_att_224x224_A_epoch_249_trt.pth'
torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)
Running the script above on an x86_64 machine with an A100-SXM4-40GB GPU fails with the following error:
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=ResNet18_Weights.IMAGENET1K_V1. You can also use weights=ResNet18_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
torch.Size([1, 3, 224, 224])
[06/20/2024-18:34:25] [TRT] [E] Error Code: 3: 1.cmap_up.0:0:DECONVOLUTION:GPU:kernel weights has count 2097152 but 4194304 was expected
[06/20/2024-18:34:25] [TRT] [E] Error Code: 4: 1.cmap_up.0:0:DECONVOLUTION:GPU: count of 2097152 weights in kernel, but kernel dimensions (4,4) with 512 input channels, 512 output channels and 1 groups were specified. Expected Weights count is 512 * 4*4 * 512 / 1 = 4194304
[06/20/2024-18:34:25] [TRT] [E] ITensor::getDimensions: Error Code 4: Internal Error (Output shape can not be computed for node 1.cmap_up.0:0:DECONVOLUTION:GPU.)
[06/20/2024-18:34:25] [TRT] [E] INetworkDefinition::addScaleNd: Error Code 3: API Usage Error (Parameter check failed, condition: qdqScale || basicScale. )
Traceback (most recent call last):
  File "/app/notebooks/poses/colab-trt-pose/convert_model_to_trt.py", line 29, in <module>
    model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
  File "/usr/local/lib/python3.10/dist-packages/torch2trt-0.5.0-py3.10-linux-x86_64.egg/torch2trt/torch2trt.py", line 643, in torch2trt
    outputs = module(*inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/trt_pose-0.0.1-py3.10-linux-x86_64.egg/trt_pose/models/common.py", line 70, in forward
    xc = self.cmap_up(x)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/batchnorm.py", line 175, in forward
    return F.batch_norm(
  File "/usr/local/lib/python3.10/dist-packages/torch2trt-0.5.0-py3.10-linux-x86_64.egg/torch2trt/torch2trt.py", line 262, in wrapper
    converter["converter"](ctx)
  File "/usr/local/lib/python3.10/dist-packages/torch2trt-0.5.0-py3.10-linux-x86_64.egg/torch2trt/converters/native_converters.py", line 183, in convert_batch_norm
    output._trt = layer.get_output(0)
AttributeError: 'NoneType' object has no attribute 'get_output'
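The two weight counts in the TensorRT deconvolution error can be checked by hand. The sketch below is plain Python, not part of the failing script, and `deconv_weight_count` is a hypothetical helper introduced only for this check. It relies on the fact that PyTorch stores `ConvTranspose2d` weights as (in_channels, out_channels / groups, kH, kW). The value of 256 output channels is an assumption inferred from the reported count of 2097152; it was not read from the trt_pose source.

```python
from math import prod


def deconv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of elements in a transposed-convolution kernel.

    PyTorch lays out ConvTranspose2d weights as
    (in_channels, out_channels // groups, kH, kW).
    """
    return in_channels * (out_channels // groups) * prod(kernel_size)


# Count TensorRT says it expected: 512 in, 512 out, 4x4 kernel, 1 group.
expected = deconv_weight_count(512, 512, (4, 4))

# ASSUMPTION: the layer actually has 256 output channels. This is inferred
# only from the count in the log, not from the model definition.
actual = deconv_weight_count(512, 256, (4, 4))

print(expected)  # 4194304
print(actual)    # 2097152
```

If that assumption holds, the layer was registered with 512 output channels while its kernel only holds weights for 256, which would be consistent with the channel count being read from the wrong weight dimension. This is a guess and has not been confirmed against the torch2trt source. The later errors, including the final AttributeError in convert_batch_norm, appear to follow from this first failure, since the deconvolution output shape could not be computed.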
Environment
TensorRT Version : 8.6.1.6
GPU Type : NVIDIA A100-SXM4-40GB
Nvidia Driver Version : 535.161.07
CUDA Version : 12.2
Operating System + Version : Ubuntu 22.04
Python Version (if applicable) : 3.10.12
PyTorch Version (if applicable) : 2.3.1+cu121
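For anyone comparing environments, the installed package versions can be collected with the standard library alone. This is a minimal sketch: `collect_versions` is a hypothetical helper, and the distribution names in the list are assumptions taken from the paths in the traceback, so they may differ depending on how each package was installed.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Map each distribution name to its installed version string."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is missing or installed under a different name.
            found[name] = "not installed"
    return found


# ASSUMPTION: these distribution names match the local installation.
PACKAGES = ["torch", "torchvision", "tensorrt", "torch2trt", "trt_pose"]

for name, ver in collect_versions(PACKAGES).items():
    print(f"{name}: {ver}")
```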
Relevant Files
Model: resnet18_baseline_att_224x224_A resnet18_baseline_att_224x224_A_epoch_249.pth - Google Drive
torch2trt: GitHub - NVIDIA-AI-IOT/torch2trt: An easy to use PyTorch to TensorRT converter
trt-pose: GitHub - NVIDIA-AI-IOT/trt_pose: Real-time pose estimation accelerated with NVIDIA TensorRT