Error Code 10: Internal Error (Could not find any implementation for node

Description

Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)

I hit the error above when exporting a PyTorch model to TensorRT with the INT8 flag set. I used the calibration class from tensorrt-utils/int8/calibration/ImagenetCalibrator.py in rmccorm4/tensorrt-utils on GitHub.

I’m exporting with Ultralytics (YOLOv8), but it doesn’t support INT8, so I copied the Exporter class and added INT8 support myself, following tensorrt-utils above. The export works for FP16 but not for INT8.

I tried every suggestion I found online for similar errors. Increasing the workspace size changes nothing: the build still uses at most 15 GB of RAM. Removing the workspace size setting so the default is used gives the same result. Adding tensorrt to LD_LIBRARY_PATH had no effect.

Suggestions? Thanks.

Environment

TensorRT Version: 8.5.2.2
GPU Type: Ampere (AGX Orin)
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.0.0
Baremetal or Container (if container which image + tag): Baremetal

Relevant Files


Steps To Reproduce

[01/23/2024-15:39:58] [TRT] [I] [GpuLayer] COPY: /model.22/Mul_2_output_0 copy
[01/23/2024-15:39:58] [TRT] [I] [GpuLayer] COPY: /model.22/Sigmoid_output_0 copy
[01/23/2024-15:39:58] [TRT] [I] [GpuLayer] COPY: /model.22/Concat_output_0 copy
[01/23/2024-15:39:58] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +34, GPU +88, now: CPU 3123, GPU 15213 (MiB)
[01/23/2024-15:39:58] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +36, GPU +76, now: CPU 3159, GPU 15289 (MiB)
[01/23/2024-15:39:58] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x5dcda6f3b1eea89a due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x6c9b9925c4cc67b0 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x4798bd5eea3be0d6 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xfbca5e767c4ed4f2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xfdf7509af98902e0 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x214bdfa026549ff2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xc985777c89c6b3a4 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x6176c23707257237 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x01cd56dfbdb5c0ee due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x00c7d39818f4aff2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x552ac687d7891695 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xad8a45d1c06da185 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x6fd15a9d85252b17 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x57f2a1d1b8552d02 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xafad4a0ea10d6400 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x2f5bc3e6bb27ae43 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x179844a379940fc2 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x698ab7d6de17ffeb due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xc722efd60bc6ea84 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x3ac8602b2543f50d due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0xbd976ef514eaa406 due to exception Cask convolution execution
[01/23/2024-15:48:37] [TRT] [W] Skipping tactic 0x7251b68d123da92b due to exception Cask convolution execution
[01/23/2024-15:48:38] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)
TensorRT: export failure ❌ 543.7s: __enter__
Traceback (most recent call last):
  File "tensorrt_export.py", line 33, in <module>
    main()
  File "tensorrt_export.py", line 29, in main
    export(model=model, format='engine', imgsz=(960,1280), workspace=50, int8=True)
  File "tensorrt_export.py", line 16, in export
    return Exporter(overrides=args, _callbacks=model.callbacks)(model=model.model)
  File "/home/nvidia/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 254, in __call__
    f[1], _ = self.export_engine()
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 124, in outer_func
    raise e
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 119, in outer_func
    f, model = inner_func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 421, in export_engine
    with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
AttributeError: __enter__
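
A note on reading this traceback: the final AttributeError is most likely a secondary symptom. In the TensorRT Python API, `builder.build_engine(...)` returns `None` when the build fails, and using `None` in a `with` statement raises this error on Python 3.8. The real failure is the `[TRT] [E]` Error Code 10 line above it. Below is a minimal sketch of guarding against that. It assumes a hypothetical helper named `write_engine_file` (not part of Ultralytics or TensorRT) that mirrors the metadata-plus-engine layout the exporter writes.

```python
import json


def write_engine_file(engine, path, metadata):
    """Write a metadata header plus the serialized engine, failing loudly if the build failed."""
    # Assumption: `engine` is whatever builder.build_engine(network, config) returned.
    # That call returns None on failure, so check before using the result.
    if engine is None:
        raise RuntimeError("TensorRT engine build failed; check the [TRT] [E] lines in the log")
    meta = json.dumps(metadata).encode()
    with open(path, "wb") as t:
        # 4-byte little-endian length prefix, then the JSON metadata, then the engine bytes
        t.write(len(meta).to_bytes(4, byteorder="little", signed=True))
        t.write(meta)
        t.write(engine.serialize())
    return path
```

It would be called as `write_engine_file(builder.build_engine(network, config), f, self.metadata)`, so a failed build surfaces as a clear RuntimeError rather than an AttributeError from the `with` statement.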

Can you please confirm which opset you are using?
If you parse your model with the trtexec command, you can see the opset in the logs.

For sigmoid you would need opset 17.

Thanks for the quick reply. I’m not using trtexec, rather the Builder object in Python. I’ll include the function below. It is mostly from Ultralytics and it works for FP16. I only added the calibrator and config.set_flag(trt.BuilderFlag.INT8). Calibrator itself seems to have no issues.

    @try_export
    def export_engine(self, prefix=colorstr('TensorRT:')):
        """YOLOv8 TensorRT export https://developer.nvidia.com/tensorrt."""
        assert self.im.device.type != 'cpu', "export running on CPU but must be on GPU, i.e. use 'device=0'"
        try:
            import tensorrt as trt  # noqa
        except ImportError:
            if LINUX:
                check_requirements('nvidia-tensorrt', cmds='-U --index-url https://pypi.ngc.nvidia.com')
            import tensorrt as trt  # noqa

        check_version(trt.__version__, '7.0.0', hard=True)  # require tensorrt>=7.0.0
        self.args.simplify = True
        f_onnx, _ = self.export_onnx()

        LOGGER.info(f'\n{prefix} starting export with TensorRT {trt.__version__}...')
        assert Path(f_onnx).exists(), f'failed to export ONNX file: {f_onnx}'
        f = self.file.with_suffix('.engine')  # TensorRT engine file
        logger = trt.Logger(trt.Logger.INFO)
        if self.args.verbose:
            logger.min_severity = trt.Logger.Severity.VERBOSE

        builder = trt.Builder(logger)
        config = builder.create_builder_config()
        config.max_workspace_size = self.args.workspace * 1 << 30
        #config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, self.args.workspace << 30)  # fix TRT 8.4 deprecation notice

        flag = (1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        network = builder.create_network(flag)
        parser = trt.OnnxParser(network, logger)
        if not parser.parse_from_file(f_onnx):
            raise RuntimeError(f'failed to load ONNX file: {f_onnx}')

        inputs = [network.get_input(i) for i in range(network.num_inputs)]
        outputs = [network.get_output(i) for i in range(network.num_outputs)]
        for inp in inputs:
            LOGGER.info(f'{prefix} input "{inp.name}" with shape{inp.shape} {inp.dtype}')
        for out in outputs:
            LOGGER.info(f'{prefix} output "{out.name}" with shape{out.shape} {out.dtype}')

        if self.args.dynamic:
            shape = self.im.shape
            if shape[0] <= 1:
                LOGGER.warning(f"{prefix} WARNING ⚠️ 'dynamic=True' model requires max batch size, i.e. 'batch=16'")
            profile = builder.create_optimization_profile()
            for inp in inputs:
                profile.set_shape(inp.name, (1, *shape[1:]), (max(1, shape[0] // 2), *shape[1:]), shape)
            config.add_optimization_profile(profile)

        dtype = "FP32"
        if builder.platform_has_fast_fp16 and self.args.half:
            dtype = "FP16"
            config.set_flag(trt.BuilderFlag.FP16)
        elif builder.platform_has_fast_int8 and self.args.int8:
            from calibrator import ImagenetCalibrator, get_int8_calibrator
            dtype = "INT8"
            config.set_flag(trt.BuilderFlag.INT8)
            config.int8_calibrator = get_int8_calibrator(calib_cache="calibration.cache", calib_data="./ims2/", \
                                                         max_calib_size=16, preprocess_func_name='preprocess_yolo', calib_batch_size=8)
        LOGGER.info(
            f'{prefix} building {dtype} engine as {f}')

        del self.model
        torch.cuda.empty_cache()

        # Write file
        with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
            # Metadata
            meta = json.dumps(self.metadata)
            t.write(len(meta).to_bytes(4, byteorder='little', signed=True))
            t.write(meta.encode())
            # Model
            t.write(engine.serialize())

        return f, None
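
One thing worth noting about the function as posted: the INT8 branch is an `elif` under the FP16 check, so when `half=True` the INT8 flag and calibrator are never set. TensorRT accepts both builder flags together, so the two checks can be made independently. The sketch below shows that selection logic only. The helper name `choose_precision` and its boolean parameters are hypothetical stand-ins for `self.args.half`, `self.args.int8`, `builder.platform_has_fast_fp16` and `builder.platform_has_fast_int8`.

```python
from typing import List, Tuple


def choose_precision(half: bool, int8: bool, fast_fp16: bool, fast_int8: bool) -> Tuple[str, List[str]]:
    """Return (label, builder flag names), deciding FP16 and INT8 independently."""
    flags: List[str] = []
    if half and fast_fp16:
        flags.append("FP16")
    # Plain `if` rather than `elif`, so INT8 is not skipped when half=True
    if int8 and fast_int8:
        flags.append("INT8")
    label = "+".join(flags) if flags else "FP32"
    return label, flags


label, flags = choose_precision(half=True, int8=True, fast_fp16=True, fast_int8=True)
print(label, flags)
```

In the exporter, each returned name would map to `config.set_flag(getattr(trt.BuilderFlag, name))`, and the INT8 case would still need `config.int8_calibrator` to be assigned as in the original branch. Whether enabling both flags avoids Error Code 10 here is not established by the logs.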

Ah, I found it anyway.

ONNX: starting export with onnx 1.15.0 opset 17…

I tried enabling both FP16 and INT8 (previously only INT8 was enabled) and got the same error. The error also occurs on the very first layer, so I think it might have to do with PWN. Here is the full output so you can see the layers and some warnings. The only warning that seems like it could cause the issue is “The CUDA context changed between createInferBuilder and buildSerializedNetwork. A Builder holds CUDA resources which cannot be shared across CUDA contexts, so access these in different CUDA context results in undefined behavior. If using pycuda, try import pycuda.autoinit before importing tensorrt.”

YOLOv8x-seg summary (fused): 295 layers, 71724508 parameters, 0 gradients, 343.7 GFLOPs

PyTorch: starting from 'oct_sliced_9_1_2023.pt' with input shape (1, 3, 960, 1280) BCHW and output shape(s) ((1, 40, 25200), (1, 32, 240, 320)) (137.4 MB)

ONNX: starting export with onnx 1.15.0 opset 17...
============ Diagnostic Run torch.onnx.export version 2.0.0+nv23.05 ============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

ONNX: simplifying with onnxsim 0.4.33...
ONNX: export success ✅ 7.8s, saved as 'oct_sliced_9_1_2023.onnx' (137.0 MB)

TensorRT: starting export with TensorRT 8.5.2.2...
[01/24/2024-08:55:37] [TRT] [I] [MemUsageChange] Init CUDA: CPU +215, GPU +0, now: CPU 1990, GPU 15055 (MiB)
[01/24/2024-08:55:39] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +303, GPU +149, now: CPU 2315, GPU 15203 (MiB)
[01/24/2024-08:55:39] [TRT] [I] ----------------------------------------------------------------
[01/24/2024-08:55:39] [TRT] [I] Input filename:   oct_sliced_9_1_2023.onnx
[01/24/2024-08:55:39] [TRT] [I] ONNX IR version:  0.0.8
[01/24/2024-08:55:39] [TRT] [I] Opset version:    17
[01/24/2024-08:55:39] [TRT] [I] Producer name:    pytorch
[01/24/2024-08:55:39] [TRT] [I] Producer version: 2.0.0
[01/24/2024-08:55:39] [TRT] [I] Domain:           
[01/24/2024-08:55:39] [TRT] [I] Model version:    0
[01/24/2024-08:55:39] [TRT] [I] Doc string:       
[01/24/2024-08:55:39] [TRT] [I] ----------------------------------------------------------------
[01/24/2024-08:55:39] [TRT] [W] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
TensorRT: input "images" with shape(1, 3, 960, 1280) DataType.HALF
TensorRT: output "output0" with shape(1, 40, 25200) DataType.HALF
TensorRT: output "output1" with shape(1, 32, 240, 320) DataType.HALF
2024-01-24 08:55:39 - calibrator - INFO - Skipping calibration files, using calibration cache: calibration.cache
TensorRT: building INT8 engine as oct_sliced_9_1_2023.engine
[01/24/2024-08:55:41] [TRT] [W] The CUDA context changed between createInferBuilder and buildSerializedNetwork. A Builder holds CUDA resources which cannot be shared across CUDA contexts, so access these in different CUDA context results in undefined behavior. If using pycuda, try import pycuda.autoinit before importing tensorrt.
2024-01-24 08:55:41 - calibrator - INFO - Using calibration cache to save time: calibration.cache
[01/24/2024-08:55:41] [TRT] [I] Reading Calibration Cache for calibrator: EntropyCalibration2
[01/24/2024-08:55:41] [TRT] [I] Generated calibration scales using calibration cache. Make sure that calibration cache has latest scales.
[01/24/2024-08:55:41] [TRT] [I] To regenerate calibration cache, please delete the existing one. TensorRT will generate a new calibration cache.
2024-01-24 08:55:41 - calibrator - INFO - Using calibration cache to save time: calibration.cache
[01/24/2024-08:55:41] [TRT] [W] Missing scale and zero-point for tensor /model.22/dfl/Softmax_output_0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[01/24/2024-08:55:41] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 405) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[01/24/2024-08:55:41] [TRT] [I] ---------- Layers Running on DLA ----------
[01/24/2024-08:55:41] [TRT] [I] ---------- Layers Running on GPU ----------
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.1/conv/Conv + PWN(PWN(/model.1/act/Sigmoid), /model.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/cv1/conv/Conv + PWN(PWN(/model.2/cv1/act/Sigmoid), /model.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.0/cv1/conv/Conv + PWN(PWN(/model.2/m.0/cv1/act/Sigmoid), /model.2/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.2/m.0/cv2/act/Sigmoid), /model.2/m.0/cv2/act/Mul), /model.2/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.1/cv1/conv/Conv + PWN(PWN(/model.2/m.1/cv1/act/Sigmoid), /model.2/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.2/m.1/cv2/act/Sigmoid), /model.2/m.1/cv2/act/Mul), /model.2/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.2/cv1/conv/Conv + PWN(PWN(/model.2/m.2/cv1/act/Sigmoid), /model.2/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.2/m.2/cv2/act/Sigmoid), /model.2/m.2/cv2/act/Mul), /model.2/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.2/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.2/cv2/conv/Conv + PWN(PWN(/model.2/cv2/act/Sigmoid), /model.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.3/conv/Conv + PWN(PWN(/model.3/act/Sigmoid), /model.3/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/cv1/conv/Conv + PWN(PWN(/model.4/cv1/act/Sigmoid), /model.4/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.0/cv1/conv/Conv + PWN(PWN(/model.4/m.0/cv1/act/Sigmoid), /model.4/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.0/cv2/act/Sigmoid), /model.4/m.0/cv2/act/Mul), /model.4/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.1/cv1/conv/Conv + PWN(PWN(/model.4/m.1/cv1/act/Sigmoid), /model.4/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.1/cv2/act/Sigmoid), /model.4/m.1/cv2/act/Mul), /model.4/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.2/cv1/conv/Conv + PWN(PWN(/model.4/m.2/cv1/act/Sigmoid), /model.4/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.2/cv2/act/Sigmoid), /model.4/m.2/cv2/act/Mul), /model.4/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.3/cv1/conv/Conv + PWN(PWN(/model.4/m.3/cv1/act/Sigmoid), /model.4/m.3/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.3/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.3/cv2/act/Sigmoid), /model.4/m.3/cv2/act/Mul), /model.4/m.3/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.4/cv1/conv/Conv + PWN(PWN(/model.4/m.4/cv1/act/Sigmoid), /model.4/m.4/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.4/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.4/cv2/act/Sigmoid), /model.4/m.4/cv2/act/Mul), /model.4/m.4/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.5/cv1/conv/Conv + PWN(PWN(/model.4/m.5/cv1/act/Sigmoid), /model.4/m.5/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/m.5/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.4/m.5/cv2/act/Sigmoid), /model.4/m.5/cv2/act/Mul), /model.4/m.5/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.2/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.3/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/m.4/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.4/cv2/conv/Conv + PWN(PWN(/model.4/cv2/act/Sigmoid), /model.4/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.5/conv/Conv + PWN(PWN(/model.5/act/Sigmoid), /model.5/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/cv1/conv/Conv + PWN(PWN(/model.6/cv1/act/Sigmoid), /model.6/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.0/cv1/conv/Conv + PWN(PWN(/model.6/m.0/cv1/act/Sigmoid), /model.6/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.0/cv2/act/Sigmoid), /model.6/m.0/cv2/act/Mul), /model.6/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.1/cv1/conv/Conv + PWN(PWN(/model.6/m.1/cv1/act/Sigmoid), /model.6/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.1/cv2/act/Sigmoid), /model.6/m.1/cv2/act/Mul), /model.6/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.2/cv1/conv/Conv + PWN(PWN(/model.6/m.2/cv1/act/Sigmoid), /model.6/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.2/cv2/act/Sigmoid), /model.6/m.2/cv2/act/Mul), /model.6/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.3/cv1/conv/Conv + PWN(PWN(/model.6/m.3/cv1/act/Sigmoid), /model.6/m.3/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.3/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.3/cv2/act/Sigmoid), /model.6/m.3/cv2/act/Mul), /model.6/m.3/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.4/cv1/conv/Conv + PWN(PWN(/model.6/m.4/cv1/act/Sigmoid), /model.6/m.4/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.4/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.4/cv2/act/Sigmoid), /model.6/m.4/cv2/act/Mul), /model.6/m.4/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.5/cv1/conv/Conv + PWN(PWN(/model.6/m.5/cv1/act/Sigmoid), /model.6/m.5/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/m.5/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.6/m.5/cv2/act/Sigmoid), /model.6/m.5/cv2/act/Mul), /model.6/m.5/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.2/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.3/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/m.4/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.6/cv2/conv/Conv + PWN(PWN(/model.6/cv2/act/Sigmoid), /model.6/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.7/conv/Conv + PWN(PWN(/model.7/act/Sigmoid), /model.7/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/cv1/conv/Conv + PWN(PWN(/model.8/cv1/act/Sigmoid), /model.8/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.0/cv1/conv/Conv + PWN(PWN(/model.8/m.0/cv1/act/Sigmoid), /model.8/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.0/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.8/m.0/cv2/act/Sigmoid), /model.8/m.0/cv2/act/Mul), /model.8/m.0/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.1/cv1/conv/Conv + PWN(PWN(/model.8/m.1/cv1/act/Sigmoid), /model.8/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.1/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.8/m.1/cv2/act/Sigmoid), /model.8/m.1/cv2/act/Mul), /model.8/m.1/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.2/cv1/conv/Conv + PWN(PWN(/model.8/m.2/cv1/act/Sigmoid), /model.8/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/m.2/cv2/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(PWN(/model.8/m.2/cv2/act/Sigmoid), /model.8/m.2/cv2/act/Mul), /model.8/m.2/Add)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/m.0/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.8/m.1/Add_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.8/cv2/conv/Conv + PWN(PWN(/model.8/cv2/act/Sigmoid), /model.8/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.9/cv1/conv/Conv + PWN(PWN(/model.9/cv1/act/Sigmoid), /model.9/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POOLING: /model.9/m/MaxPool
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POOLING: /model.9/m_1/MaxPool
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POOLING: /model.9/m_2/MaxPool
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/cv1/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/m/MaxPool_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/m_1/MaxPool_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.9/cv2/conv/Conv + PWN(PWN(/model.9/cv2/act/Sigmoid), /model.9/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] RESIZE: /model.10/Resize
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.10/Resize_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.6/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/cv1/conv/Conv + PWN(PWN(/model.12/cv1/act/Sigmoid), /model.12/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.0/cv1/conv/Conv + PWN(PWN(/model.12/m.0/cv1/act/Sigmoid), /model.12/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.0/cv2/conv/Conv + PWN(PWN(/model.12/m.0/cv2/act/Sigmoid), /model.12/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.1/cv1/conv/Conv + PWN(PWN(/model.12/m.1/cv1/act/Sigmoid), /model.12/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.1/cv2/conv/Conv + PWN(PWN(/model.12/m.1/cv2/act/Sigmoid), /model.12/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.2/cv1/conv/Conv + PWN(PWN(/model.12/m.2/cv1/act/Sigmoid), /model.12/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/m.2/cv2/conv/Conv + PWN(PWN(/model.12/m.2/cv2/act/Sigmoid), /model.12/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.12/cv2/conv/Conv + PWN(PWN(/model.12/cv2/act/Sigmoid), /model.12/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] RESIZE: /model.13/Resize
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.13/Resize_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.4/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/cv1/conv/Conv + PWN(PWN(/model.15/cv1/act/Sigmoid), /model.15/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.0/cv1/conv/Conv + PWN(PWN(/model.15/m.0/cv1/act/Sigmoid), /model.15/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.0/cv2/conv/Conv + PWN(PWN(/model.15/m.0/cv2/act/Sigmoid), /model.15/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.1/cv1/conv/Conv + PWN(PWN(/model.15/m.1/cv1/act/Sigmoid), /model.15/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.1/cv2/conv/Conv + PWN(PWN(/model.15/m.1/cv2/act/Sigmoid), /model.15/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.2/cv1/conv/Conv + PWN(PWN(/model.15/m.2/cv1/act/Sigmoid), /model.15/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/m.2/cv2/conv/Conv + PWN(PWN(/model.15/m.2/cv2/act/Sigmoid), /model.15/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.15/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.15/cv2/conv/Conv + PWN(PWN(/model.15/cv2/act/Sigmoid), /model.15/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.16/conv/Conv + PWN(PWN(/model.16/act/Sigmoid), /model.16/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.12/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/cv1/conv/Conv + PWN(PWN(/model.18/cv1/act/Sigmoid), /model.18/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.0/cv1/conv/Conv + PWN(PWN(/model.18/m.0/cv1/act/Sigmoid), /model.18/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.0/cv2/conv/Conv + PWN(PWN(/model.18/m.0/cv2/act/Sigmoid), /model.18/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.1/cv1/conv/Conv + PWN(PWN(/model.18/m.1/cv1/act/Sigmoid), /model.18/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.1/cv2/conv/Conv + PWN(PWN(/model.18/m.1/cv2/act/Sigmoid), /model.18/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.2/cv1/conv/Conv + PWN(PWN(/model.18/m.2/cv1/act/Sigmoid), /model.18/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/m.2/cv2/conv/Conv + PWN(PWN(/model.18/m.2/cv2/act/Sigmoid), /model.18/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.18/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.18/cv2/conv/Conv + PWN(PWN(/model.18/cv2/act/Sigmoid), /model.18/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.19/conv/Conv + PWN(PWN(/model.19/act/Sigmoid), /model.19/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.9/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/cv1/conv/Conv + PWN(PWN(/model.21/cv1/act/Sigmoid), /model.21/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.0/cv1/conv/Conv + PWN(PWN(/model.21/m.0/cv1/act/Sigmoid), /model.21/m.0/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.0/cv2/conv/Conv + PWN(PWN(/model.21/m.0/cv2/act/Sigmoid), /model.21/m.0/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.1/cv1/conv/Conv + PWN(PWN(/model.21/m.1/cv1/act/Sigmoid), /model.21/m.1/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.1/cv2/conv/Conv + PWN(PWN(/model.21/m.1/cv2/act/Sigmoid), /model.21/m.1/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.2/cv1/conv/Conv + PWN(PWN(/model.21/m.2/cv1/act/Sigmoid), /model.21/m.2/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/m.2/cv2/conv/Conv + PWN(PWN(/model.21/m.2/cv2/act/Sigmoid), /model.21/m.2/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/Split_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/Split_output_1 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/m.0/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.21/m.1/cv2/act/Mul_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.21/cv2/conv/Conv + PWN(PWN(/model.21/cv2/act/Sigmoid), /model.21/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/proto/cv1/conv/Conv + PWN(PWN(/model.22/proto/cv1/act/Sigmoid), /model.22/proto/cv1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] DECONVOLUTION: /model.22/proto/upsample/ConvTranspose
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/proto/cv2/conv/Conv + PWN(PWN(/model.22/proto/cv2/act/Sigmoid), /model.22/proto/cv2/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/proto/cv3/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(PWN(/model.22/proto/cv3/act/Sigmoid), /model.22/proto/cv3/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.0/cv4.0.0/conv/Conv + PWN(PWN(/model.22/cv4.0/cv4.0.0/act/Sigmoid), /model.22/cv4.0/cv4.0.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.0/cv4.0.1/conv/Conv + PWN(PWN(/model.22/cv4.0/cv4.0.1/act/Sigmoid), /model.22/cv4.0/cv4.0.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.0/cv4.0.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.1/cv4.1.0/conv/Conv + PWN(PWN(/model.22/cv4.1/cv4.1.0/act/Sigmoid), /model.22/cv4.1/cv4.1.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.1/cv4.1.1/conv/Conv + PWN(PWN(/model.22/cv4.1/cv4.1.1/act/Sigmoid), /model.22/cv4.1/cv4.1.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.1/cv4.1.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_1_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.2/cv4.2.0/conv/Conv + PWN(PWN(/model.22/cv4.2/cv4.2.0/act/Sigmoid), /model.22/cv4.2/cv4.2.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.2/cv4.2.1/conv/Conv + PWN(PWN(/model.22/cv4.2/cv4.2.1/act/Sigmoid), /model.22/cv4.2/cv4.2.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv4.2/cv4.2.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_2
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_2_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.0/cv2.0.0/conv/Conv + PWN(PWN(/model.22/cv2.0/cv2.0.0/act/Sigmoid), /model.22/cv2.0/cv2.0.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.0/cv2.0.1/conv/Conv + PWN(PWN(/model.22/cv2.0/cv2.0.1/act/Sigmoid), /model.22/cv2.0/cv2.0.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.0/cv2.0.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.0/cv3.0.0/conv/Conv + PWN(PWN(/model.22/cv3.0/cv3.0.0/act/Sigmoid), /model.22/cv3.0/cv3.0.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.0/cv3.0.1/conv/Conv + PWN(PWN(/model.22/cv3.0/cv3.0.1/act/Sigmoid), /model.22/cv3.0/cv3.0.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.0/cv3.0.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.1/cv2.1.0/conv/Conv + PWN(PWN(/model.22/cv2.1/cv2.1.0/act/Sigmoid), /model.22/cv2.1/cv2.1.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.1/cv2.1.1/conv/Conv + PWN(PWN(/model.22/cv2.1/cv2.1.1/act/Sigmoid), /model.22/cv2.1/cv2.1.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.1/cv2.1.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.1/cv3.1.0/conv/Conv + PWN(PWN(/model.22/cv3.1/cv3.1.0/act/Sigmoid), /model.22/cv3.1/cv3.1.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.1/cv3.1.1/conv/Conv + PWN(PWN(/model.22/cv3.1/cv3.1.1/act/Sigmoid), /model.22/cv3.1/cv3.1.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.1/cv3.1.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.2/cv2.2.0/conv/Conv + PWN(PWN(/model.22/cv2.2/cv2.2.0/act/Sigmoid), /model.22/cv2.2/cv2.2.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.2/cv2.2.1/conv/Conv + PWN(PWN(/model.22/cv2.2/cv2.2.1/act/Sigmoid), /model.22/cv2.2/cv2.2.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv2.2/cv2.2.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.2/cv3.2.0/conv/Conv + PWN(PWN(/model.22/cv3.2/cv3.2.0/act/Sigmoid), /model.22/cv3.2/cv3.2.0/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.2/cv3.2.1/conv/Conv + PWN(PWN(/model.22/cv3.2/cv3.2.1/act/Sigmoid), /model.22/cv3.2/cv3.2.1/act/Mul)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/cv3.2/cv3.2.2/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_3
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_3_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_4
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_4_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/Reshape_5
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Reshape_5_copy_output
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/dfl/Reshape + /model.22/dfl/Transpose
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SOFTMAX: /model.22/dfl/Softmax
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONVOLUTION: /model.22/dfl/conv/Conv
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] SHUFFLE: /model.22/dfl/Reshape_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONSTANT: /model.22/Constant_12_output_0
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONSTANT: /model.22/Constant_12_output_0_clone_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] ELEMENTWISE: /model.22/Sub
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(/model.22/Add_1)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(/model.22/Constant_14_output_0 + (Unnamed Layer* 406) [Shuffle], PWN(/model.22/Add_2, /model.22/Div_1))
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] ELEMENTWISE: /model.22/Sub_1
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] CONSTANT: /model.22/Constant_15_output_0 + (Unnamed Layer* 411) [Shuffle]
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] ELEMENTWISE: /model.22/Mul_2
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] POINTWISE: PWN(/model.22/Sigmoid)
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Mul_2_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Sigmoid_output_0 copy
[01/24/2024-08:55:41] [TRT] [I] [GpuLayer] COPY: /model.22/Concat_output_0 copy
[01/24/2024-08:55:42] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +34, GPU +116, now: CPU 2848, GPU 15590 (MiB)
[01/24/2024-08:55:42] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +36, GPU +121, now: CPU 2884, GPU 15711 (MiB)
[01/24/2024-08:55:42] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0x5dcda6f3b1eea89a due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0x6c9b9925c4cc67b0 due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0x4798bd5eea3be0d6 due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0xfbca5e767c4ed4f2 due to exception Cask convolution execution
[01/24/2024-09:38:00] [TRT] [W] Skipping tactic 0xfdf7509af98902e0 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x214bdfa026549ff2 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xc985777c89c6b3a4 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x6176c23707257237 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x01cd56dfbdb5c0ee due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x00c7d39818f4aff2 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x552ac687d7891695 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xad8a45d1c06da185 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x6fd15a9d85252b17 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x57f2a1d1b8552d02 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xafad4a0ea10d6400 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x2f5bc3e6bb27ae43 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x179844a379940fc2 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x698ab7d6de17ffeb due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xc722efd60bc6ea84 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x3ac8602b2543f50d due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0xbd976ef514eaa406 due to exception Cask convolution execution
[01/24/2024-09:38:01] [TRT] [W] Skipping tactic 0x7251b68d123da92b due to exception Cask convolution execution
[01/24/2024-09:38:02] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).)
TensorRT: export failure ❌ 2553.1s: __enter__
Traceback (most recent call last):
  File "tensorrt_export.py", line 33, in <module>
    main()
  File "tensorrt_export.py", line 29, in main
    export(model=model, format='engine', imgsz=(960,1280), workspace=24, half=True, int8=True)
  File "tensorrt_export.py", line 16, in export
    return Exporter(overrides=args, _callbacks=model.callbacks)(model=model.model)
  File "/home/nvidia/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 254, in __call__
    f[1], _ = self.export_engine()
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 124, in outer_func
    raise e
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 119, in outer_func
    f, model = inner_func(*args, **kwargs)
  File "/home/nvidia/zach/yolo-jetson/int8_exporter.py", line 421, in export_engine
    with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
AttributeError: __enter__
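
Note that the `AttributeError: __enter__` at the end of this traceback is a secondary symptom. `builder.build_engine(...)` returns `None` when the build fails, and `None` cannot be used in a `with` statement. The real failure is the `[E] ... Error Code 10` line above it. A minimal sketch of a guard that surfaces this more clearly, assuming a hypothetical helper named `write_engine` (not part of Ultralytics or TensorRT) and an engine object that exposes `serialize()`:

```python
def write_engine(engine, path):
    """Write a built engine to disk, failing loudly if the build returned None.

    `engine` is assumed to be whatever builder.build_engine(network, config)
    returned: an engine object with a serialize() method, or None on failure.
    """
    if engine is None:
        # Without this check, `with None as engine:` raises AttributeError: __enter__
        raise RuntimeError(
            "TensorRT engine build failed; check the [E] lines in the builder log"
        )
    with open(path, "wb") as f:
        f.write(engine.serialize())
    return path
```

In the exporter this would be called as `write_engine(builder.build_engine(network, config), f)` in place of the `with builder.build_engine(...) as engine` line. It does not fix the INT8 build itself, it only makes the failure easier to read.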

Hi @alsozatch,
Checking on this.

I am also getting the same error, saying that there is no implementation for /model.0/conv/Conv + PWN(PWN(/model.0/act/Sigmoid), /model.0/act/Mul).

I used this script to do the export: TensorRT-For-YOLO-Series/export.py at main · Linaom1214/TensorRT-For-YOLO-Series · GitHub

Here are the things I found.

  1. The export works for fp32 and fp16, and only fails when we do int8 calibration with an existing calibration cache file.
  2. The export works for fp32, fp16 and int8 in trtexec without calibration. (I only tested it without any calibration.)
  3. The export works if we do int8 calibration without providing a cache.

I think the error was caused by a wrong calibration cache file. It works fine after I created a new cache.
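
For anyone trying the same fix, here is a minimal sketch of discarding an old cache before the build so that calibration runs from scratch. The helper name `discard_calibration_cache` and the file name `calibration.cache` are assumptions for illustration; use whatever path your calibrator's `read_calibration_cache` actually reads from.

```python
from pathlib import Path


def discard_calibration_cache(cache_path):
    """Delete an existing calibration cache file, if there is one.

    Returns True if a file was removed, False if there was nothing to remove.
    With no cache on disk, a calibrator that returns None from
    read_calibration_cache() will recalibrate on the next INT8 build.
    """
    path = Path(cache_path)
    if path.is_file():
        path.unlink()
        return True
    return False


# Hypothetical usage before building the INT8 engine:
# discard_calibration_cache("calibration.cache")
```

This matters most when the cache was produced for a different model, input size, or TensorRT version than the one being built now.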

Thanks for the info. When you say it’s working fine, have you checked whether it actually provides a speedup over fp16? I’m currently using the yolo export since I want the file saved as a .engine that can be loaded directly into ultralytics again (the output of trtexec doesn’t have the attached metadata and can’t be loaded into yolo models after quantization). I might try that repo, but I don’t think I can really compare its results to my previous tests because of the different format.

Yes, I converted the model to int8 on a 3090 GPU, and the int8 model runs faster than the fp16 model.

But the issue is still there when I try to do the int8 conversion on an Orin Nano 4GB.