Description
Goal: convert a PyTorch trt_pose model to TensorRT using torch2trt.
Script:
import os
import torch
import torch2trt

data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()

if not os.path.exists(OPTIMIZED_MODEL):
    print('-- Converting TensorRT models. This may take several minutes...')
    model.load_state_dict(torch.load(MODEL_WEIGHTS))
    model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1 << 25)
    torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)
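For context, the snippet assumes `model`, `HEIGHT`, `WIDTH`, `MODEL_WEIGHTS`, and `OPTIMIZED_MODEL` are defined earlier in the script. A minimal sketch of that setup, following the trt_pose human_pose demo (the file names and paths here are assumptions, not taken from the failing script):

import json
import trt_pose.models

HEIGHT, WIDTH = 224, 224
MODEL_WEIGHTS = 'resnet18_baseline_att_224x224_A_epoch_249.pth'        # assumed local path
OPTIMIZED_MODEL = 'resnet18_baseline_att_224x224_A_epoch_249_trt.pth'  # assumed output path

# Topology file shipped with trt_pose; defines keypoints and skeleton links.
with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)
num_parts = len(human_pose['keypoints'])
num_links = len(human_pose['skeleton'])

# resnet18_baseline_att matches the linked checkpoint architecture.
model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()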
(1) The script above works on a Jetson.
(2) Running the same script on an x86 machine with a T4 GPU fails. I get the following error:
------ model = resnet--------
/home/user/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/user/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
-- Converting TensorRT models. This may take several minutes...
HEIGHT WIDTH 224 224
input shape torch.Size([64, 3, 7, 7])
[06/18/2024-02:25:14] [TRT] [E] Error Code: 3: 1.cmap_up.0:0:DECONVOLUTION:GPU:kernel weights has count 2097152 but 4194304 was expected
[06/18/2024-02:25:14] [TRT] [E] Error Code: 4: 1.cmap_up.0:0:DECONVOLUTION:GPU: count of 2097152 weights in kernel, but kernel dimensions (4,4) with 512 input channels, 512 output channels and 1 groups were specified. Expected Weights count is 512 * 4*4 * 512 / 1 = 4194304
[06/18/2024-02:25:14] [TRT] [E] ITensor::getDimensions: Error Code 4: Internal Error (Output shape can not be computed for node 1.cmap_up.0:0:DECONVOLUTION:GPU.)
[06/18/2024-02:25:14] [TRT] [E] INetworkDefinition::addScaleNd: Error Code 3: API Usage Error (Parameter check failed, condition: qdqScale || basicScale. )
Traceback (most recent call last):
  File "/home/user/pose_310/trt_pose/tasks/human_pose/video-pose-3.py", line 228, in <module>
    model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
  File "/home/user/.local/lib/python3.10/site-packages/torch2trt-0.5.0-py3.10-linux-x86_64.egg/torch2trt/torch2trt.py", line 643, in torch2trt
    outputs = module(*inputs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/container.py", line 215, in forward
    input = module(input)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/trt_pose-0.0.1-py3.10-linux-x86_64.egg/trt_pose/models/common.py", line 70, in forward
    xc = self.cmap_up(x)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/container.py", line 215, in forward
    input = module(input)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py", line 171, in forward
    return F.batch_norm(
  File "/home/user/.local/lib/python3.10/site-packages/torch2trt-0.5.0-py3.10-linux-x86_64.egg/torch2trt/torch2trt.py", line 262, in wrapper
    converter['converter'](ctx)
  File "/home/user/.local/lib/python3.10/site-packages/torch2trt-0.5.0-py3.10-linux-x86_64.egg/torch2trt/converters/native_converters.py", line 183, in convert_batch_norm
    output._trt = layer.get_output(0)
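Note that the failing layer reports exactly half the expected weight count (2097152 = 4194304 / 2), so the ConvTranspose2d weights seem to reach the TensorRT builder at half their true size. As a sanity check on the PyTorch side, this sketch (assuming `model` is the loaded trt_pose model) prints what PyTorch actually holds:

import torch

# List every transposed-convolution layer; for the cmap_up deconv the weights
# should contain 512 * 512 * 4 * 4 = 4194304 elements, as TensorRT expects.
for name, module in model.named_modules():
    if isinstance(module, torch.nn.ConvTranspose2d):
        print(name, tuple(module.weight.shape), module.weight.numel(), module.weight.dtype)

If PyTorch reports the full 4194304 elements, the halving happens inside the torch2trt converter, which may point at an incompatibility between torch2trt 0.5.0 and TensorRT 10.x rather than a problem with the model weights themselves.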
Environment
TensorRT Version: 10.1.0
GPU Type: T4
Nvidia Driver Version: 525.125.06
CUDA Version: 12.1
CUDNN Version: 8.9
Operating System + Version: Ubuntu 20.04 LTS
Python Version (if applicable): 3.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.1.2+cu121
Baremetal or Container (if container which image + tag):
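To double-check the versions above on the failing host, a quick sketch:

import torch
import tensorrt

print('PyTorch :', torch.__version__)               # expect 2.1.2+cu121
print('TensorRT:', tensorrt.__version__)            # expect 10.1.0
print('CUDA    :', torch.version.cuda)              # expect 12.1
print('cuDNN   :', torch.backends.cudnn.version())  # expect 8.9.x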
Relevant Files
Model: resnet18_baseline_att_224x224_A (resnet18_baseline_att_224x224_A_epoch_249.pth, shared via Google Drive)
Steps To Reproduce
1. On the x86 + T4 machine described under Environment, install trt_pose (0.0.1) and torch2trt (0.5.0) against PyTorch 2.1.2 and TensorRT 10.1.0.
2. Download resnet18_baseline_att_224x224_A_epoch_249.pth (linked under Relevant Files).
3. Run the conversion script shown in the Description (here invoked as tasks/human_pose/video-pose-3.py).
4. The full traceback produced by this run is included in the Description above.