Detectron2 → ONNX → TensorRT [Bug] KeyError: 'UNKNOWN_SCALAR' when running export_model.py

1. I followed the instructions in the README carefully.

2. The only difference is that I used my own custom data, so I'm wondering whether the conversion fails because of that.
(As I understand it, training on a custom dataset only changes the model's weights, not its structure, so Detectron2 → ONNX → TensorRT should still work. Is the conversion currently supported only for Mask R-CNN R50-FPN 3x, or can a transfer-learning model trained on a custom dataset also be converted?)

3. The .yaml file was changed slightly for transfer learning.
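To sanity-check point 2, here is a minimal PyTorch sketch (a toy model, not the actual Mask R-CNN) illustrating my understanding: fine-tuning perturbs parameter values but leaves parameter names and shapes untouched, so the exportable structure should be identical.

```python
import torch
import torch.nn as nn

# Toy stand-in for "pretrained" vs. "fine-tuned on custom data".
def make_model():
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

pretrained = make_model()
finetuned = make_model()
finetuned.load_state_dict(pretrained.state_dict())

with torch.no_grad():
    for p in finetuned.parameters():
        p.add_(0.1)  # simulate weight updates from training on custom data

# Same parameter names and shapes, different values.
same_keys = set(pretrained.state_dict()) == set(finetuned.state_dict())
same_shapes = all(
    pretrained.state_dict()[k].shape == finetuned.state_dict()[k].shape
    for k in pretrained.state_dict()
)
weights_changed = not torch.equal(
    pretrained.state_dict()["0.weight"], finetuned.state_dict()["0.weight"]
)
print(same_keys, same_shapes, weights_changed)  # True True True
```

If the same holds for the real checkpoint (identical state_dict keys and shapes as the zoo model), the export path itself should not care that the weights came from transfer learning.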

--------------My Code--------------------------

%run /content/detectron2/tools/deploy/export_model.py \
  --sample-image /content/new_1344_1344.jpg \
  --config-file /content/output.yaml \
  --export-method tracing \
  --format onnx \
  --output /content \
  MODEL.WEIGHTS /content/output/model_final.pth \
  MODEL.DEVICE cuda

--------------My Code--------------------------

--------------The Bug--------------------------

KeyError                                  Traceback (most recent call last)

/content/detectron2/tools/deploy/export_model.py in <module>
    224         exported_model = export_scripting(torch_model)
    225     elif args.export_method == "tracing":
--> 226         exported_model = export_tracing(torch_model, sample_inputs)
    227
    228     # run evaluation with the converted model

8 frames

/usr/local/lib/python3.8/dist-packages/torch/onnx/symbolic_opset9.py in to(g, self, *args)
   1994             # aten::to(Tensor, Tensor, bool, bool, memory_format)
   1995             dtype = args[0].type().scalarType()
-> 1996             return g.op("Cast", self, to_i=sym_help.cast_pytorch_to_onnx[dtype])
   1997         else:
   1998             # aten::to(Tensor, ScalarType, bool, bool, memory_format)

KeyError: 'UNKNOWN_SCALAR'

--------------The Bug--------------------------

--------------Environment Information--------------------------

PyTorch version: 1.10.1+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.22.6
Libc version: glibc-2.31

Python version: 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.10.147+-x86_64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: 11.2.152
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 510.47.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.1.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) CPU @ 2.00GHz
Stepping: 3
CPU MHz: 2000.210
BogoMIPS: 4000.42
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB
L1i cache: 32 KiB
L2 cache: 1 MiB
L3 cache: 38.5 MiB
NUMA node0 CPU(s): 0,1
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable; SMT Host state unknown
Vulnerability Meltdown: Vulnerable
Vulnerability Mmio stale data: Vulnerable
Vulnerability Retbleed: Vulnerable
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.21.6
[pip3] torch==1.10.1+cu111
[pip3] torchaudio==0.10.1+rocm4.1
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.14.1
[pip3] torchvision==0.11.2+cu111
[conda] Could not collect

--------------Environment Information--------------------------

Hi,

We recommend that you reach out to Issues · NVIDIA/TensorRT · GitHub for better help with the above setup.

Thank you.

Hi Moderator,

I have been trying to solve this problem for a month, but still haven't succeeded.

I already posted on Issues · NVIDIA/TensorRT · GitHub, but no one has answered.

Could you please help me?