Error Code 10: Internal Error (Could not find any implementation for node

alsozatch · January 23, 2024, 10:11pm

Description

Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)

I hit the error above when exporting a PyTorch model to TensorRT with the INT8 flag set. I used the calibration class from int8/calibration/ImagenetCalibrator.py in the rmccorm4/tensorrt-utils repository on GitHub (master branch).

I’m exporting with Ultralytics (YOLOv8), but it doesn’t support INT8, so I copied the Exporter class and added INT8 support myself, following tensorrt-utils above. It works for FP16 but not INT8.

I tried every suggestion I could find online for similar errors. I tried increasing the workspace size; nothing changes, and the build still uses at most 15 GB of RAM. I tried removing the workspace size setting so the default is used, with the same result. I tried adding TensorRT to LD_LIBRARY_PATH, with no effect.
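
For reference, the builder config takes the workspace limit in bytes, while the export call shown in the traceback below passes workspace=50, meaning GiB. A minimal sketch of that conversion follows; gib_to_bytes is a hypothetical helper name, not part of TensorRT or Ultralytics.

```python
def gib_to_bytes(gib: int) -> int:
    """Convert a workspace size in GiB to bytes (1 GiB = 2**30 bytes)."""
    if gib < 0:
        raise ValueError("workspace size must be non-negative")
    return gib << 30


# workspace=50 in the export call corresponds to 50 GiB.
workspace_bytes = gib_to_bytes(50)
print(workspace_bytes)
```

Because * binds tighter than << in Python, an expression of the form workspace * 1 << 30 gives the same value as this helper.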

Suggestions? Thanks.

Environment

TensorRT Version: 8.5.2.2
GPU Type: Ampere (AGX Orin)
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.0.0
Baremetal or Container (if container which image + tag): Baremetal

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

[01/23/2024-15:39:58] [TRT] [I] [GpuLayer] COPY: /model.22/Mul_2_output_0 copy
[01/23/2024-15:39:58] [TRT] [I] [GpuLayer] COPY: /model.22/Sigmoid_output_0 copy
[01/23/2024-15:39:58] [TRT] [I] [GpuLayer] COPY: /model.22/Concat_output_0 copy
[01/23/2024-15:39:58] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +34, GPU +88, now: CPU 3123, GPU 15213 (MiB)
[01/23/2024-15:39:58] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +36, GPU +76, now: CPU 3159, GPU 15289 (MiB)
[01/23/2024-15:39:58] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x5dcda6f3b1eea89a due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x6c9b9925c4cc67b0 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x4798bd5eea3be0d6 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xfbca5e767c4ed4f2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xfdf7509af98902e0 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x214bdfa026549ff2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xc985777c89c6b3a4 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x6176c23707257237 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x01cd56dfbdb5c0ee due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x00c7d39818f4aff2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x552ac687d7891695 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xad8a45d1c06da185 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x6fd15a9d85252b17 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x57f2a1d1b8552d02 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xafad4a0ea10d6400 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x2f5bc3e6bb27ae43 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x179844a379940fc2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x698ab7d6de17ffeb due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xc722efd60bc6ea84 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x3ac8602b2543f50d due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xbd976ef514eaa406 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x7251b68d123da92b due to exception Cask convolution execution
[01/23/2024-15:48:38] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)
TensorRT: export failure ❌ 543.7s: __enter__
Traceback (most recent call last):
  File "tensorrt_export.py", line 33, in <module>
    main()
  File "tensorrt_export.py", line 29, in main
    export(model=model, format='engine', imgsz=(960,1280), workspace=50, int8=True)
  File "tensorrt_export.py", line 16, in export
    return Exporter(overrides=args, _callbacks=model.callbacks)(model=model.model)
  File "/home/nvidia/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 254, in __call__
    f[1], _ = self.export_engine()
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 124, in outer_func
    raise e
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 119, in outer_func
    f, model = inner_func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 421, in export_engine
    with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
AttributeError: __enter__
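
The log above is long, so it can help to pull out just the skipped tactics and the node the builder failed on. A minimal sketch using only the standard library follows; summarize_trt_log is a hypothetical helper, and the regular expressions assume the log line formats shown in this post.

```python
import re

# Patterns assume the warning and error formats seen in the log above.
TACTIC_RE = re.compile(r"Skipping tactic (0x[0-9a-fA-F]+) due to exception (.+)$")
NODE_RE = re.compile(r"Could not find any implementation for node (.+?)\.\)\s*$")


def summarize_trt_log(text: str) -> dict:
    """Collect skipped tactic IDs and failing node names from builder log text."""
    tactics, nodes = [], []
    for line in text.splitlines():
        m = TACTIC_RE.search(line)
        if m:
            tactics.append(m.group(1))
            continue
        m = NODE_RE.search(line)
        if m:
            nodes.append(m.group(1))
    return {"skipped_tactics": tactics, "failed_nodes": nodes}


sample = """\
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x5dcda6f3b1eea89a due to exception Cask convolution execution
[01/23/2024-15:48:38] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)
"""
summary = summarize_trt_log(sample)
print(len(summary["skipped_tactics"]), summary["failed_nodes"])
```

Running it over the full build log gives a quick count of how many tactics were skipped before the builder gave up on the node.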

AakankshaS · January 24, 2024, 5:45am

Can you please confirm which opset you are using?
When parsing your model with the trtexec command, you can see the opset in the logs.

For Sigmoid you would need opset 17.

alsozatch · January 24, 2024, 2:29pm

Thanks for the quick reply. I’m not using trtexec but the Builder object in Python. I’ll include the function below. It is mostly from Ultralytics, and it works for FP16. I only added the calibrator and config.set_flag(trt.BuilderFlag.INT8). The calibrator itself seems to have no issues.

    @try_export
    def export_engine(self, prefix=colorstr('TensorRT:')):
        """YOLOv8 TensorRT export https://developer.nvidia.com/tensorrt."""
        assert self.im.device.type != 'cpu', "export running on CPU but must be on GPU, i.e. use 'device=0'"
        try:
            import tensorrt as trt  # noqa
        except ImportError:
            if LINUX:
                check_requirements('nvidia-tensorrt', cmds='-U --index-url https://pypi.ngc.nvidia.com')
            import tensorrt as trt  # noqa

        check_version(trt.__version__, '7.0.0', hard=True)  # require tensorrt>=7.0.0
        self.args.simplify = True
        f_onnx, _ = self.export_onnx()

        LOGGER.info(f'\n{prefix} starting export with TensorRT {trt.__version__}...')
        assert Path(f_onnx).exists(), f'failed to export ONNX file: {f_onnx}'
        f = self.file.with_suffix('.engine')  # TensorRT engine file
        logger = trt.Logger(trt.Logger.INFO)
        if self.args.verbose:
            logger.min_severity = trt.Logger.Severity.VERBOSE

        builder = trt.Builder(logger)
        config = builder.create_builder_config()
        config.max_workspace_size = self.args.workspace * 1 << 30
        #config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, self.args.workspace << 30)  # fix TRT 8.4 deprecation notice

        flag = (1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        network = builder.create_network(flag)
        parser = trt.OnnxParser(network, logger)
        if not parser.parse_from_file(f_onnx):
            raise RuntimeError(f'failed to load ONNX file: {f_onnx}')

        inputs = [network.get_input(i) for i in range(network.num_inputs)]
        outputs = [network.get_output(i) for i in range(network.num_outputs)]
        for inp in inputs:
            LOGGER.info(f'{prefix} input "{inp.name}" with shape{inp.shape} {inp.dtype}')
        for out in outputs:
            LOGGER.info(f'{prefix} output "{out.name}" with shape{out.shape} {out.dtype}')

        if self.args.dynamic:
            shape = self.im.shape
            if shape[0] <= 1:
                LOGGER.warning(f"{prefix} WARNING ⚠️ 'dynamic=True' model requires max batch size, i.e. 'batch=16'")
            profile = builder.create_optimization_profile()
            for inp in inputs:
                profile.set_shape(inp.name, (1, *shape[1:]), (max(1, shape[0] // 2), *shape[1:]), shape)
            config.add_optimization_profile(profile)

        dtype = "FP32"
        if builder.platform_has_fast_fp16 and self.args.half:
            dtype = "FP16"
            config.set_flag(trt.BuilderFlag.FP16)
        elif builder.platform_has_fast_int8 and self.args.int8:
            from calibrator import ImagenetCalibrator, get_int8_calibrator
            dtype = "INT8"
            config.set_flag(trt.BuilderFlag.INT8)
            config.int8_calibrator = get_int8_calibrator(calib_cache="calibration.cache", calib_data="./ims2/", \
                                                         max_calib_size=16, preprocess_func_name='preprocess_yolo', calib_batch_size=8)
        LOGGER.info(
            f'{prefix} building {dtype} engine as {f}')

        del self.model
        torch.cuda.empty_cache()

        # Write file
        with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
            # Metadata
            meta = json.dumps(self.metadata)
            t.write(len(meta).to_bytes(4, byteorder='little', signed=True))
            t.write(meta.encode())
            # Model
            t.write(engine.serialize())

        return f, None
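
One detail in the function above: the traceback in the first post ends in an AttributeError raised at the with builder.build_engine(...) line. That is what Python 3.8 reports when the with expression evaluates to None, presumably because the build had already failed with Error Code 10, so the AttributeError looks like a secondary symptom. A minimal sketch of a guard that surfaces the build failure directly follows. write_engine and FakeEngine are hypothetical names for illustration, and the engine is only assumed to expose serialize() as in the function above.

```python
import io
import json


def write_engine(engine, stream, metadata: dict) -> None:
    """Write metadata length, metadata JSON, then the serialized engine to stream."""
    if engine is None:
        # Fail with a clear message instead of an AttributeError from `with None`.
        raise RuntimeError("TensorRT engine build failed; see the builder log")
    meta = json.dumps(metadata)
    stream.write(len(meta).to_bytes(4, byteorder="little", signed=True))
    stream.write(meta.encode())
    stream.write(engine.serialize())


class FakeEngine:
    """Stand-in for a built engine, used only to exercise the helper."""

    def serialize(self) -> bytes:
        return b"ENGINE"


buf = io.BytesIO()
write_engine(FakeEngine(), buf, {"task": "segment"})
print(len(buf.getvalue()))
```

In the export function, the result of the build call would be assigned to a variable first and passed to a helper like this one, rather than used directly as a context manager.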

alsozatch · January 24, 2024, 2:35pm

Ah, I found it anyways.

ONNX: starting export with onnx 1.15.0 opset 17…
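
The opset also shows up in the TensorRT ONNX parser log as an "Opset version" line, so it can be checked without trtexec. A minimal sketch that reads it from log text follows; find_opset is a hypothetical helper, and the patterns assume the two line formats seen in this thread.

```python
import re
from typing import Optional

OPSET_PATTERNS = (
    # TensorRT ONNX parser log, e.g. "Opset version:    17"
    re.compile(r"Opset version:\s*(\d+)"),
    # Exporter log, e.g. "starting export with onnx 1.15.0 opset 17"
    re.compile(r"starting export with onnx \S+ opset (\d+)"),
)


def find_opset(log_text: str) -> Optional[int]:
    """Return the first opset number found in the log text, or None."""
    for line in log_text.splitlines():
        for pattern in OPSET_PATTERNS:
            m = pattern.search(line)
            if m:
                return int(m.group(1))
    return None


print(find_opset("ONNX: starting export with onnx 1.15.0 opset 17..."))
```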

alsozatch · January 24, 2024, 3:44pm

I tried enabling both FP16 and INT8 (previously only INT8 was enabled), with the same error. The error also occurs on the very first layer. I think it might have to do with PWN. Here is the full output so you can see the layers and some warnings. The only warning that seems like it could cause the issue is “The CUDA context changed between createInferBuilder and buildSerializedNetwork. A Builder holds CUDA resources which cannot be shared across CUDA contexts, so access these in different CUDA context results in undefined behavior. If using pycuda, try import pycuda.autoinit before importing tensorrt.”
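
Note that in the export function posted earlier, the FP16 and INT8 branches are joined by if/elif, so at most one of the two flags is set per build; setting both needs independent checks. A minimal sketch of that selection logic follows, using plain strings in place of trt.BuilderFlag values so it runs without TensorRT. select_precision_flags is a hypothetical helper, and the fast_fp16 and fast_int8 arguments stand in for builder.platform_has_fast_fp16 and builder.platform_has_fast_int8.

```python
from typing import List


def select_precision_flags(half: bool, int8: bool,
                           fast_fp16: bool = True,
                           fast_int8: bool = True) -> List[str]:
    """Return the precision flag names to enable; both may be selected."""
    flags = []
    if half and fast_fp16:
        flags.append("FP16")
    # Independent `if` (not `elif`) so INT8 can be combined with FP16.
    if int8 and fast_int8:
        flags.append("INT8")
    return flags


print(select_precision_flags(half=True, int8=True))
```

Each returned name would map to a config.set_flag(trt.BuilderFlag.<NAME>) call, with the calibrator attached only when INT8 is among them.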

YOLOv8x-seg summary (fused): 295 layers, 71724508 parameters, 0 gradients, 343.7 GFLOPs

PyTorch: starting from 'oct_sliced_9_1_2023.pt' with input shape (1, 3, 960, 1280) BCHW and output shape(s) ((1, 40, 25200), (1, 32, 240, 320)) (137.4 MB)

ONNX: starting export with onnx 1.15.0 opset 17...
============ Diagnostic Run torch.onnx.export version 2.0.0+nv23.05 ============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

ONNX: simplifying with onnxsim 0.4.33...
ONNX: export success ✅ 7.8s, saved as 'oct_sliced_9_1_2023.onnx' (137.0 MB)

TensorRT: starting export with TensorRT 8.5.2.2...
[01/24/2024-08:55:37] [TRT] [I] [MemUsageChange] Init CUDA: CPU +215, GPU +0, now: CPU 1990, GPU 15055 (MiB)
[01/24/2024-08:55:39] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +303, GPU +149, now: CPU 2315, GPU 15203 (MiB)
[01/24/2024-08:55:39] [TRT] [I] ----------------------------------------------------------------
[01/24/2024-08:55:39] [TRT] [I] Input filename:   oct_sliced_9_1_2023.onnx
[01/24/2024-08:55:39] [TRT] [I] ONNX IR version:  0.0.8
[01/24/2024-08:55:39] [TRT] [I] Opset version:    17
[01/24/2024-08:55:39] [TRT] [I] Producer name:    pytorch
[01/24/2024-08:55:39] [TRT] [I] Producer version: 2.0.0
[01/24/2024-08:55:39] [TRT] [I] Domain:           
[01/24/2024-08:55:39] [TRT] [I] Model version:    0
[01/24/2024-08:55:39] [TRT] [I] Doc string:       
[01/24/2024-08:55:39] [TRT] [I] ----------------------------------------------------------------
[01/24/2024-08:55:39] [TRT] [W] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
TensorRT: input "images" with shape(1, 3, 960, 1280) DataType.HALF
TensorRT: output "output0" with shape(1, 40, 25200) DataType.HALF
TensorRT: output "output1" with shape(1, 32, 240, 320) DataType.HALF
2024-01-24 08:55:39 - calibrator - INFO - Skipping calibration files, using calibration cache: calibration.cache
TensorRT: building INT8 engine as oct_sliced_9_1_2023.engine
[01/24/2024-08:55:41] [TRT] [W] The CUDA context changed between createInferBuilder and buildSerializedNetwork. A Builder holds CUDA resources which cannot be shared across CUDA contexts, so access these in different CUDA context results in undefined behavior. If using pycuda, try import pycuda.autoinit before importing tensorrt.
2024-01-24 08:55:41 - calibrator - INFO - Using calibration cache to save time: calibration.cache
[01/24/2024-08:55:41] [TRT] [I] Reading Calibration Cache for calibrator: EntropyCalibration2
[01/24/2024-08:55:41] [TRT] [I] Generated calibration scales using calibration cache. Make sure that calibration cache has latest scales.
[01/24/2024-08:55:41] [TRT] [I] To regenerate calibration cache, please delete the existing one. TensorRT will generate a new calibration cache.
2024-01-24 08:55:41 - calibrator - INFO - Using calibration cache to save time: calibration.cache
[01/24/2024-08:55:41] [TRT] [W] Missing scale and zero-point for tensor /model.22/dfl/Softmax_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[01/24/2024-08:55:41] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 405) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[01/24/2024-08:55:41] [TRT] [I] ---------- Layers Running on DLA ----------
[01/24/2024-08:55:41] [TRT] [I] ---------- Layers Running on GPU ----------
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.1/conv/Conv + PWN(PWN(/model.1/act/Sigmoid), /model.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/cv1/conv/Conv + PWN(PWN(/model.2/cv1/act/Sigmoid), /model.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.0/cv1/conv/Conv + PWN(PWN(/model.2/m.0/cv1/act/Sigmoid), /model.2/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.2/m.0/cv2/act/Sigmoid), /model.2/m.0/cv2/act/Mul), /model.2/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.1/cv1/conv/Conv + PWN(PWN(/model.2/m.1/cv1/act/Sigmoid), /model.2/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.2/m.1/cv2/act/Sigmoid), /model.2/m.1/cv2/act/Mul), /model.2/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.2/cv1/conv/Conv + PWN(PWN(/model.2/m.2/cv1/act/Sigmoid), /model.2/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.2/m.2/cv2/act/Sigmoid), /model.2/m.2/cv2/act/Mul), /model.2/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/cv2/conv/Conv + PWN(PWN(/model.2/cv2/act/Sigmoid), /model.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.3/conv/Conv + PWN(PWN(/model.3/act/Sigmoid), /model.3/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/cv1/conv/Conv + PWN(PWN(/model.4/cv1/act/Sigmoid), /model.4/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.0/cv1/conv/Conv + PWN(PWN(/model.4/m.0/cv1/act/Sigmoid), /model.4/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.0/cv2/act/Sigmoid), /model.4/m.0/cv2/act/Mul), /model.4/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.1/cv1/conv/Conv + PWN(PWN(/model.4/m.1/cv1/act/Sigmoid), /model.4/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.1/cv2/act/Sigmoid), /model.4/m.1/cv2/act/Mul), /model.4/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.2/cv1/conv/Conv + PWN(PWN(/model.4/m.2/cv1/act/Sigmoid), /model.4/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.2/cv2/act/Sigmoid), /model.4/m.2/cv2/act/Mul), /model.4/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.3/cv1/conv/Conv + PWN(PWN(/model.4/m.3/cv1/act/Sigmoid), /model.4/m.3/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.3/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.3/cv2/act/Sigmoid), /model.4/m.3/cv2/act/Mul), /model.4/m.3/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.4/cv1/conv/Conv + PWN(PWN(/model.4/m.4/cv1/act/Sigmoid), /model.4/m.4/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.4/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.4/cv2/act/Sigmoid), /model.4/m.4/cv2/act/Mul), /model.4/m.4/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.5/cv1/conv/Conv + PWN(PWN(/model.4/m.5/cv1/act/Sigmoid), /model.4/m.5/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.5/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.5/cv2/act/Sigmoid), /model.4/m.5/cv2/act/Mul), /model.4/m.5/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.2/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.3/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.4/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/cv2/conv/Conv + PWN(PWN(/model.4/cv2/act/Sigmoid), /model.4/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.5/conv/Conv + PWN(PWN(/model.5/act/Sigmoid), /model.5/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/cv1/conv/Conv + PWN(PWN(/model.6/cv1/act/Sigmoid), /model.6/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.0/cv1/conv/Conv + PWN(PWN(/model.6/m.0/cv1/act/Sigmoid), /model.6/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.0/cv2/act/Sigmoid), /model.6/m.0/cv2/act/Mul), /model.6/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.1/cv1/conv/Conv + PWN(PWN(/model.6/m.1/cv1/act/Sigmoid), /model.6/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.1/cv2/act/Sigmoid), /model.6/m.1/cv2/act/Mul), /model.6/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.2/cv1/conv/Conv + PWN(PWN(/model.6/m.2/cv1/act/Sigmoid), /model.6/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.2/cv2/act/Sigmoid), /model.6/m.2/cv2/act/Mul), /model.6/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.3/cv1/conv/Conv + PWN(PWN(/model.6/m.3/cv1/act/Sigmoid), /model.6/m.3/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.3/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.3/cv2/act/Sigmoid), /model.6/m.3/cv2/act/Mul), /model.6/m.3/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.4/cv1/conv/Conv + PWN(PWN(/model.6/m.4/cv1/act/Sigmoid), /model.6/m.4/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.4/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.4/cv2/act/Sigmoid), /model.6/m.4/cv2/act/Mul), /model.6/m.4/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.5/cv1/conv/Conv + PWN(PWN(/model.6/m.5/cv1/act/Sigmoid), /model.6/m.5/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.5/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.5/cv2/act/Sigmoid), /model.6/m.5/cv2/act/Mul), /model.6/m.5/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.2/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.3/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.4/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/cv2/conv/Conv + PWN(PWN(/model.6/cv2/act/Sigmoid), /model.6/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.7/conv/Conv + PWN(PWN(/model.7/act/Sigmoid), /model.7/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/cv1/conv/Conv + PWN(PWN(/model.8/cv1/act/Sigmoid), /model.8/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.0/cv1/conv/Conv + PWN(PWN(/model.8/m.0/cv1/act/Sigmoid), /model.8/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.8/m.0/cv2/act/Sigmoid), /model.8/m.0/cv2/act/Mul), /model.8/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.1/cv1/conv/Conv + PWN(PWN(/model.8/m.1/cv1/act/Sigmoid), /model.8/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.8/m.1/cv2/act/Sigmoid), /model.8/m.1/cv2/act/Mul), /model.8/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.2/cv1/conv/Conv + PWN(PWN(/model.8/m.2/cv1/act/Sigmoid), /model.8/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.8/m.2/cv2/act/Sigmoid), /model.8/m.2/cv2/act/Mul), /model.8/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/cv2/conv/Conv + PWN(PWN(/model.8/cv2/act/Sigmoid), /model.8/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.9/cv1/conv/Conv + PWN(PWN(/model.9/cv1/act/Sigmoid), /model.9/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POOLING: /model.9/m/MaxPool
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POOLING: /model.9/m_1/MaxPool
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POOLING: /model.9/m_2/MaxPool
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/cv1/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/m/MaxPool_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/m_1/MaxPool_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.9/cv2/conv/Conv + PWN(PWN(/model.9/cv2/act/Sigmoid), /model.9/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] RESIZE: /model.10/Resize
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.10/Resize_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/cv1/conv/Conv + PWN(PWN(/model.12/cv1/act/Sigmoid), /model.12/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.0/cv1/conv/Conv + PWN(PWN(/model.12/m.0/cv1/act/Sigmoid), /model.12/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.0/cv2/conv/Conv + PWN(PWN(/model.12/m.0/cv2/act/Sigmoid), /model.12/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.1/cv1/conv/Conv + PWN(PWN(/model.12/m.1/cv1/act/Sigmoid), /model.12/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.1/cv2/conv/Conv + PWN(PWN(/model.12/m.1/cv2/act/Sigmoid), /model.12/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.2/cv1/conv/Conv + PWN(PWN(/model.12/m.2/cv1/act/Sigmoid), /model.12/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.2/cv2/conv/Conv + PWN(PWN(/model.12/m.2/cv2/act/Sigmoid), /model.12/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/cv2/conv/Conv + PWN(PWN(/model.12/cv2/act/Sigmoid), /model.12/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] RESIZE: /model.13/Resize
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.13/Resize_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/cv1/conv/Conv + PWN(PWN(/model.15/cv1/act/Sigmoid), /model.15/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.0/cv1/conv/Conv + PWN(PWN(/model.15/m.0/cv1/act/Sigmoid), /model.15/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.0/cv2/conv/Conv + PWN(PWN(/model.15/m.0/cv2/act/Sigmoid), /model.15/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.1/cv1/conv/Conv + PWN(PWN(/model.15/m.1/cv1/act/Sigmoid), /model.15/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.1/cv2/conv/Conv + PWN(PWN(/model.15/m.1/cv2/act/Sigmoid), /model.15/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.2/cv1/conv/Conv + PWN(PWN(/model.15/m.2/cv1/act/Sigmoid), /model.15/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.2/cv2/conv/Conv + PWN(PWN(/model.15/m.2/cv2/act/Sigmoid), /model.15/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/cv2/conv/Conv + PWN(PWN(/model.15/cv2/act/Sigmoid), /model.15/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.16/conv/Conv + PWN(PWN(/model.16/act/Sigmoid), /model.16/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/cv1/conv/Conv + PWN(PWN(/model.18/cv1/act/Sigmoid), /model.18/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.0/cv1/conv/Conv + PWN(PWN(/model.18/m.0/cv1/act/Sigmoid), /model.18/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.0/cv2/conv/Conv + PWN(PWN(/model.18/m.0/cv2/act/Sigmoid), /model.18/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.1/cv1/conv/Conv + PWN(PWN(/model.18/m.1/cv1/act/Sigmoid), /model.18/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.1/cv2/conv/Conv + PWN(PWN(/model.18/m.1/cv2/act/Sigmoid), /model.18/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.2/cv1/conv/Conv + PWN(PWN(/model.18/m.2/cv1/act/Sigmoid), /model.18/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.2/cv2/conv/Conv + PWN(PWN(/model.18/m.2/cv2/act/Sigmoid), /model.18/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/cv2/conv/Conv + PWN(PWN(/model.18/cv2/act/Sigmoid), /model.18/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.19/conv/Conv + PWN(PWN(/model.19/act/Sigmoid), /model.19/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/cv1/conv/Conv + PWN(PWN(/model.21/cv1/act/Sigmoid), /model.21/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.0/cv1/conv/Conv + PWN(PWN(/model.21/m.0/cv1/act/Sigmoid), /model.21/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.0/cv2/conv/Conv + PWN(PWN(/model.21/m.0/cv2/act/Sigmoid), /model.21/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.1/cv1/conv/Conv + PWN(PWN(/model.21/m.1/cv1/act/Sigmoid), /model.21/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.1/cv2/conv/Conv + PWN(PWN(/model.21/m.1/cv2/act/Sigmoid), /model.21/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.2/cv1/conv/Conv + PWN(PWN(/model.21/m.2/cv1/act/Sigmoid), /model.21/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.2/cv2/conv/Conv + PWN(PWN(/model.21/m.2/cv2/act/Sigmoid), /model.21/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/cv2/conv/Conv + PWN(PWN(/model.21/cv2/act/Sigmoid), /model.21/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/proto/cv1/conv/Conv + PWN(PWN(/model.22/proto/cv1/act/Sigmoid), /model.22/proto/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] DECONVOLUTION: /model.22/proto/upsample/ConvTranspose
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/proto/cv2/conv/Conv + PWN(PWN(/model.22/proto/cv2/act/Sigmoid), /model.22/proto/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/proto/cv3/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(/model.22/proto/cv3/act/Sigmoid), /model.22/proto/cv3/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.0/cv4.0.0/conv/Conv + PWN(PWN(/model.22/cv4.0/cv4.0.0/act/Sigmoid), /model.22/cv4.0/cv4.0.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.0/cv4.0.1/conv/Conv + PWN(PWN(/model.22/cv4.0/cv4.0.1/act/Sigmoid), /model.22/cv4.0/cv4.0.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.0/cv4.0.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.1/cv4.1.0/conv/Conv + PWN(PWN(/model.22/cv4.1/cv4.1.0/act/Sigmoid), /model.22/cv4.1/cv4.1.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.1/cv4.1.1/conv/Conv + PWN(PWN(/model.22/cv4.1/cv4.1.1/act/Sigmoid), /model.22/cv4.1/cv4.1.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.1/cv4.1.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_1_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.2/cv4.2.0/conv/Conv + PWN(PWN(/model.22/cv4.2/cv4.2.0/act/Sigmoid), /model.22/cv4.2/cv4.2.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.2/cv4.2.1/conv/Conv + PWN(PWN(/model.22/cv4.2/cv4.2.1/act/Sigmoid), /model.22/cv4.2/cv4.2.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.2/cv4.2.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_2
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_2_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.0/cv2.0.0/conv/Conv + PWN(PWN(/model.22/cv2.0/cv2.0.0/act/Sigmoid), /model.22/cv2.0/cv2.0.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.0/cv2.0.1/conv/Conv + PWN(PWN(/model.22/cv2.0/cv2.0.1/act/Sigmoid), /model.22/cv2.0/cv2.0.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.0/cv2.0.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.0/cv3.0.0/conv/Conv + PWN(PWN(/model.22/cv3.0/cv3.0.0/act/Sigmoid), /model.22/cv3.0/cv3.0.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.0/cv3.0.1/conv/Conv + PWN(PWN(/model.22/cv3.0/cv3.0.1/act/Sigmoid), /model.22/cv3.0/cv3.0.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.0/cv3.0.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.1/cv2.1.0/conv/Conv + PWN(PWN(/model.22/cv2.1/cv2.1.0/act/Sigmoid), /model.22/cv2.1/cv2.1.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.1/cv2.1.1/conv/Conv + PWN(PWN(/model.22/cv2.1/cv2.1.1/act/Sigmoid), /model.22/cv2.1/cv2.1.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.1/cv2.1.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.1/cv3.1.0/conv/Conv + PWN(PWN(/model.22/cv3.1/cv3.1.0/act/Sigmoid), /model.22/cv3.1/cv3.1.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.1/cv3.1.1/conv/Conv + PWN(PWN(/model.22/cv3.1/cv3.1.1/act/Sigmoid), /model.22/cv3.1/cv3.1.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.1/cv3.1.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.2/cv2.2.0/conv/Conv + PWN(PWN(/model.22/cv2.2/cv2.2.0/act/Sigmoid), /model.22/cv2.2/cv2.2.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.2/cv2.2.1/conv/Conv + PWN(PWN(/model.22/cv2.2/cv2.2.1/act/Sigmoid), /model.22/cv2.2/cv2.2.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.2/cv2.2.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.2/cv3.2.0/conv/Conv + PWN(PWN(/model.22/cv3.2/cv3.2.0/act/Sigmoid), /model.22/cv3.2/cv3.2.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.2/cv3.2.1/conv/Conv + PWN(PWN(/model.22/cv3.2/cv3.2.1/act/Sigmoid), /model.22/cv3.2/cv3.2.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.2/cv3.2.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_3
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_3_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_4
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_4_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_5
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_5_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/dfl/Reshape + /model.22/dfl/Transpose
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SOFTMAX: /model.22/dfl/Softmax
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/dfl/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/dfl/Reshape_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONSTANT: /model.22/Constant_12_output_0
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONSTANT: /model.22/Constant_12_output_0_clone_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] ELEMENTWISE: /model.22/Sub
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(/model.22/Add_1)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(/model.22/Constant_14_output_0 + (Unnamed Layer* 406) [Shuffle], PWN(/model.22/Add_2, /model.22/Div_1))
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] ELEMENTWISE: /model.22/Sub_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONSTANT: /model.22/Constant_15_output_0 + (Unnamed Layer* 411) [Shuffle]
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] ELEMENTWISE: /model.22/Mul_2
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(/model.22/Sigmoid)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Mul_2_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Sigmoid_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Concat_output_0 copy
[01/24/2024-08:55:42] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +34, GPU +116, now: CPU 2848, GPU 15590 (MiB)
[01/24/2024-08:55:42] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +36, GPU +121, now: CPU 2884, GPU 15711 (MiB)
[01/24/2024-08:55:42] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0x5dcda6f3b1eea89a due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0x6c9b9925c4cc67b0 due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0x4798bd5eea3be0d6 due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0xfbca5e767c4ed4f2 due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0xfdf7509af98902e0 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x214bdfa026549ff2 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xc985777c89c6b3a4 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x6176c23707257237 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x01cd56dfbdb5c0ee due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x00c7d39818f4aff2 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x552ac687d7891695 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xad8a45d1c06da185 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x6fd15a9d85252b17 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x57f2a1d1b8552d02 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xafad4a0ea10d6400 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x2f5bc3e6bb27ae43 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x179844a379940fc2 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x698ab7d6de17ffeb due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xc722efd60bc6ea84 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x3ac8602b2543f50d due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xbd976ef514eaa406 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x7251b68d123da92b due to exception Cask convolution execution
[01/24/2024-09:38:02] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)
TensorRT: export failure ❌ 2553.1s: __enter__
Traceback (most recent call last):
  File "tensorrt_export.py", line 33, in <module>
    main()
  File "tensorrt_export.py", line 29, in main
    export(model=model, format='engine', imgsz=(960,1280), workspace=24, half=True, int8=True)
  File "tensorrt_export.py", line 16, in export
    return Exporter(overrides=args, _callbacks=model.callbacks)(model=model.model)
  File "/home/nvidia/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 254, in __call__
    f[1], _ = self.export_engine()
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 124, in outer_func
    raise e
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 119, in outer_func
    f, model = inner_func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 421, in export_engine
    with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
AttributeError: __enter__
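
A note on the traceback above: the final `AttributeError: __enter__` is most likely a side effect rather than the root cause. `builder.build_engine` returns `None` when the build fails, and on Python 3.8 entering a `with` statement on `None` raises exactly this error, which hides the Error Code 10 line logged just before it. A minimal sketch of a guard, assuming the exporter is free to wrap the call (the helper name `require_engine` is hypothetical and not part of TensorRT or Ultralytics):

```python
def require_engine(engine, what="TensorRT engine"):
    """Return `engine` unchanged, or raise a descriptive error if the build returned None."""
    if engine is None:
        raise RuntimeError(
            f"{what} build failed (builder returned None); "
            "check the [TRT] [E] lines in the log for the underlying cause"
        )
    return engine


# Hypothetical use inside export_engine, replacing the bare `with` on line 421:
#   with require_engine(builder.build_engine(network, config)) as engine, open(f, 'wb') as t:
#       t.write(engine.serialize())
```

With this in place the export would stop with a message pointing back at the TensorRT error instead of `__enter__`.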

AakankshaS · January 31, 2024, 6:45am

Hi @alsozatch ,
Checking on this.

ashish-roopan · February 5, 2024, 10:03am

I am also getting the same error, which says there is no implementation for /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).

I used this script for the export: TensorRT-For-YOLO-Series/export.py at main · Linaom1214/TensorRT-For-YOLO-Series · GitHub

Here are the things I found.

The export works for fp32 and fp16 and fails only when int8 calibration is run with an existing calibration cache file.
With trtexec, fp32, fp16, and int8 all work without calibration. (I only tested it without any calibration.)
The export works if int8 calibration is run without supplying a cache.

I think the error was caused by a bad calibration cache file. It works fine once I create a new cache.
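
If a stale cache is the suspect, one way to rule it out is to delete the cache before each build so the calibrator has to regenerate it. The sketch below is only an illustration under assumptions: it treats a cache as stale when it is older than the ONNX file, which is a heuristic, and the function name and paths are hypothetical. A cache written for a different model or calibrator type may also be unusable even when it is newer, so `force=True` is the safer choice while debugging.

```python
import os


def drop_stale_cache(cache_path, model_path, force=False):
    """Delete the calibration cache if forced or older than the model.

    Returns True if a cache file was removed, False otherwise.
    """
    if not os.path.exists(cache_path):
        return False
    # Heuristic (assumption): a cache older than the model was built for a previous export.
    stale = force or os.path.getmtime(cache_path) < os.path.getmtime(model_path)
    if stale:
        os.remove(cache_path)
    return stale


# Hypothetical usage before building the int8 engine:
#   drop_stale_cache("calibration.cache", "yolov8m-seg.onnx", force=True)
```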

alsozatch · February 5, 2024, 6:25pm

Thanks for the info. When you say it’s working fine, have you checked if it actually provides speedup over fp16? I’m currently using the yolo export since I want the file saved as a .engine which can be loaded directly into ultralytics again (output of trtexec doesn’t have attached metadata and can’t be loaded into yolo models after quantization). I might try that repo but I can’t really compare the results of that to my previous tests because of the different format I think.

ashish-roopan · February 7, 2024, 1:10pm

Yes, I converted the model to int8 on a 3090 GPU, and the int8 model runs faster than the fp16 model.

But the issue is still there when I try the int8 conversion on an Orin Nano 4GB.

894492123 · May 9, 2024, 3:24am

I have the same problem. Have you solved it?

894492123 · May 9, 2024, 3:33am

Hi, have you solved it? I removed the cache that the int8 engine build created, but it still failed… Or did you remove the related code? Thanks.

894492123 · May 9, 2024, 3:41am

I also use a 3090, but with yolov5. The fp16 inference time was 0.9 ms and the int8 inference time was 0.7 ms, but building the int8 engine failed on Jetson.

ashish-roopan · May 9, 2024, 6:28am

No, the error is still there. I'm sure this is a memory issue, since it works on an 8GB Nano. The error message does not show the real problem here.

894492123 · May 9, 2024, 8:01am

I solved this just now, but it only works with the MinMax calibration method. This is unbelievable: both the 3090 and the Jetson work with MinMax calibration but fail with entropy calibration. Maybe entropy calibration is stricter about the model and inputs?

gabrielmstefanello058 · May 9, 2024, 6:25pm

Hello, did you use trtexec to create the engine? Could you please share the script you used? I’m trying to create an int8 yolov8 model but it keeps failing. I’m using a Jetson Orin Nano with JetPack 6 (TRT 8.6.2 and CUDA 12.2), and when I run trtexec I get this error:

Error[10]: Could not find any implementation for node /model.22/proto/cv3/conv/Conv + PWN(PWN(/model.22/proto/cv3/act/Sigmoid), /model.22/proto/cv3/act/Mul).

gabrielmstefanello058 · May 9, 2024, 7:57pm

Uploading the output with --verbose to expose a bit more the error.
output.txt (3.1 MB)
Command that I used:
/usr/src/tensorrt/bin/trtexec --onnx=yolov8m-seg.onnx --saveEngine=yolov8m-int8-seg-b4-onnx.engine --optShapes='images':4x3x480x480 --int8 --calib="calibrations.txt" --verbose
where calibrations.txt is a file listing the paths of 1000 images from the dataset

orfeasfil2000 · September 25, 2024, 3:26pm

Hey guys,

Any update on this?
I have the same error on yolov5large:

[09/25/2024-15:23:07] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node /model.1/conv/Conv + PWN(PWN(/model.1/act/Sigmoid), PWN(/model.1/act/Mul)).)

(The /model.0/* layer works fine, lol, even though it’s the same thing according to the layer structure…)

I have:

opset:17
tensorrt version: 10.4

@AakankshaS could you please help me on this?

2086825354 · September 29, 2024, 5:40am

Did you solve it?

2086825354 · September 29, 2024, 6:11am

Which torch version are you using?

orfeasfil2000 · September 29, 2024, 7:51am

I managed to solve this by using IInt8EntropyCalibrator2, the calibrator suggested for CNNs, instead of IInt8EntropyCalibrator. I think the problem was that IInt8EntropyCalibrator calibrates after layer fusion while IInt8EntropyCalibrator2 calibrates before layer fusion, and this somehow causes a problem. More information in the TensorRT docs: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#enable_int8_c
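
Since different posters above report success with different calibrators (MinMax in one case, IInt8EntropyCalibrator2 in another), it can help to make the calibrator base class switchable while testing. A rough sketch, not taken from any of the scripts in this thread: the three class names are the ones exposed by the TensorRT Python API, while the short keys, the helper `calibrator_base`, and the `MyCalibrator` name in the comment are hypothetical. The module is passed in as an argument so the helper itself does not need TensorRT to be importable.

```python
# Map short, made-up keys to calibrator class names from the TensorRT Python API.
CALIBRATOR_BASES = {
    "entropy2": "IInt8EntropyCalibrator2",       # calibrates before layer fusion
    "minmax": "IInt8MinMaxCalibrator",           # calibrates before layer fusion
    "entropy_legacy": "IInt8EntropyCalibrator",  # calibrates after layer fusion
}


def calibrator_base(trt_module, kind="entropy2"):
    """Look up the calibrator base class for `kind` on the given tensorrt module."""
    if kind not in CALIBRATOR_BASES:
        raise ValueError(f"unknown calibrator kind {kind!r}; expected one of {sorted(CALIBRATOR_BASES)}")
    return getattr(trt_module, CALIBRATOR_BASES[kind])


# Hypothetical usage in an export script:
#   import tensorrt as trt
#   class MyCalibrator(calibrator_base(trt, "entropy2")):
#       ...  # get_batch_size, get_batch, read_calibration_cache, write_calibration_cache
```

When switching calibrators it is probably worth deleting any existing calibration cache as well, so that the new calibrator does not pick up scales computed by the old one.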
