&&&& RUNNING TensorRT.trtexec [TensorRT v8201] # trtexec --onnx=pyt_test_3d.onnx --int8 --workspace=8192 --noBuilderCache --verbose
[02/14/2022-02:18:35] [I] === Model Options ===
[02/14/2022-02:18:35] [I] Format: ONNX
[02/14/2022-02:18:35] [I] Model: pyt_test_3d.onnx
[02/14/2022-02:18:35] [I] Output:
[02/14/2022-02:18:35] [I] === Build Options ===
[02/14/2022-02:18:35] [I] Max batch: explicit batch
[02/14/2022-02:18:35] [I] Workspace: 8192 MiB
[02/14/2022-02:18:35] [I] minTiming: 1
[02/14/2022-02:18:35] [I] avgTiming: 8
[02/14/2022-02:18:35] [I] Precision: FP32+INT8
[02/14/2022-02:18:35] [I] Calibration: Dynamic
[02/14/2022-02:18:35] [I] Refit: Disabled
[02/14/2022-02:18:35] [I] Sparsity: Disabled
[02/14/2022-02:18:35] [I] Safe mode: Disabled
[02/14/2022-02:18:35] [I] DirectIO mode: Disabled
[02/14/2022-02:18:35] [I] Restricted mode: Disabled
[02/14/2022-02:18:35] [I] Save engine:
[02/14/2022-02:18:35] [I] Load engine:
[02/14/2022-02:18:35] [I] Profiling verbosity: 0
[02/14/2022-02:18:35] [I] Tactic sources: Using default tactic sources
[02/14/2022-02:18:35] [I] timingCacheMode: disable
[02/14/2022-02:18:35] [I] timingCacheFile:
[02/14/2022-02:18:35] [I] Input(s)s format: fp32:CHW
[02/14/2022-02:18:35] [I] Output(s)s format: fp32:CHW
[02/14/2022-02:18:35] [I] Input build shapes: model
[02/14/2022-02:18:35] [I] Input calibration shapes: model
[02/14/2022-02:18:35] [I] === System Options ===
[02/14/2022-02:18:35] [I] Device: 0
[02/14/2022-02:18:35] [I] DLACore:
[02/14/2022-02:18:35] [I] Plugins:
[02/14/2022-02:18:35] [I] === Inference Options ===
[02/14/2022-02:18:35] [I] Batch: Explicit
[02/14/2022-02:18:35] [I] Input inference shapes: model
[02/14/2022-02:18:35] [I] Iterations: 10
[02/14/2022-02:18:35] [I] Duration: 3s (+ 200ms warm up)
[02/14/2022-02:18:35] [I] Sleep time: 0ms
[02/14/2022-02:18:35] [I] Idle time: 0ms
[02/14/2022-02:18:35] [I] Streams: 1
[02/14/2022-02:18:35] [I] ExposeDMA: Disabled
[02/14/2022-02:18:35] [I] Data transfers: Enabled
[02/14/2022-02:18:35] [I] Spin-wait: Disabled
[02/14/2022-02:18:35] [I] Multithreading: Disabled
[02/14/2022-02:18:35] [I] CUDA Graph: Disabled
[02/14/2022-02:18:35] [I] Separate profiling: Disabled
[02/14/2022-02:18:35] [I] Time Deserialize: Disabled
[02/14/2022-02:18:35] [I] Time Refit: Disabled
[02/14/2022-02:18:35] [I] Skip inference: Disabled
[02/14/2022-02:18:35] [I] Inputs:
[02/14/2022-02:18:35] [I] === Reporting Options ===
[02/14/2022-02:18:35] [I] Verbose: Enabled
[02/14/2022-02:18:35] [I] Averages: 10 inferences
[02/14/2022-02:18:35] [I] Percentile: 99
[02/14/2022-02:18:35] [I] Dump refittable layers:Disabled
[02/14/2022-02:18:35] [I] Dump output: Disabled
[02/14/2022-02:18:35] [I] Profile: Disabled
[02/14/2022-02:18:35] [I] Export timing to JSON file:
[02/14/2022-02:18:35] [I] Export output to JSON file:
[02/14/2022-02:18:35] [I] Export profile to JSON file:
[02/14/2022-02:18:35] [I]
[02/14/2022-02:18:35] [I] === Device Information ===
[02/14/2022-02:18:35] [I] Selected Device: NVIDIA A100-PCIE-40GB
[02/14/2022-02:18:35] [I] Compute Capability: 8.0
[02/14/2022-02:18:35] [I] SMs: 108
[02/14/2022-02:18:35] [I] Compute Clock Rate: 1.41 GHz
[02/14/2022-02:18:35] [I] Device Global Memory: 40354 MiB
[02/14/2022-02:18:35] [I] Shared Memory per SM: 164 KiB
[02/14/2022-02:18:35] [I] Memory Bus Width: 5120 bits (ECC enabled)
[02/14/2022-02:18:35] [I] Memory Clock Rate: 1.215 GHz
[02/14/2022-02:18:35] [I]
[02/14/2022-02:18:35] [I] TensorRT version: 8.2.1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::CoordConvAC version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::EfficientNMS_TFTRT_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::Proposal version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::ProposalDynamic version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::ScatterND version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[02/14/2022-02:18:35] [V] [TRT] Registered plugin creator - ::Split version 1
[02/14/2022-02:18:36] [I] [TRT] [MemUsageChange] Init CUDA: CPU +425, GPU +0, now: CPU 437, GPU 3952 (MiB)
[02/14/2022-02:18:37] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 437 MiB, GPU 3952 MiB
[02/14/2022-02:18:37] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 654 MiB, GPU 4024 MiB
[02/14/2022-02:18:37] [I] Start parsing network model
[02/14/2022-02:18:37] [I] [TRT] ----------------------------------------------------------------
[02/14/2022-02:18:37] [I] [TRT] Input filename: pyt_test_3d.onnx
[02/14/2022-02:18:37] [I] [TRT] ONNX IR version: 0.0.7
[02/14/2022-02:18:37] [I] [TRT] Opset version: 13
[02/14/2022-02:18:37] [I] [TRT] Producer name: pytorch
[02/14/2022-02:18:37] [I] [TRT] Producer version: 1.11
[02/14/2022-02:18:37] [I] [TRT] Domain:
[02/14/2022-02:18:37] [I] [TRT] Model version: 0
[02/14/2022-02:18:37] [I] [TRT] Doc string:
[02/14/2022-02:18:37] [I] [TRT] ----------------------------------------------------------------
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::BatchTilePlugin_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::BatchedNMSDynamic_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::CoordConvAC version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::CropAndResizeDynamic version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::EfficientNMS_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::EfficientNMS_ONNX_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::EfficientNMS_TFTRT_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::GenerateDetection_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::GridAnchorRect_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::MultilevelCropAndResize_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::MultilevelProposeROI_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::NMSDynamic_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::Proposal version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::ProposalDynamic version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::ScatterND version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[02/14/2022-02:18:37] [V] [TRT] Plugin creator already registered - ::Split version 1
[02/14/2022-02:18:37] [V] [TRT] Adding network input: inputs.1 with dtype: float32, dimensions: (1, 1, 128, 128, 128)
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: inputs.1 for ONNX tensor: inputs.1
[02/14/2022-02:18:37] [V] [TRT] Importing initializer: op.weight
[02/14/2022-02:18:37] [V] [TRT] Importing initializer: op.bias
[02/14/2022-02:18:37] [V] [TRT] Importing initializer: op1.weight
[02/14/2022-02:18:37] [V] [TRT] Importing initializer: op1.bias
[02/14/2022-02:18:37] [V] [TRT] Importing initializer: 29
[02/14/2022-02:18:37] [V] [TRT] Importing initializer: 30
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_0 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_0 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_0 [Constant] outputs: [5 -> ()[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_1 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_1 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_1 [Constant] outputs: [6 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: QuantizeLinear_2 [QuantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: inputs.1
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 5
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 6
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_2 [QuantizeLinear] inputs: [inputs.1 -> (1, 1, 128, 128, 128)[FLOAT]], [5 -> ()[FLOAT]], [6 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 7 for ONNX tensor: 7
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_2 [QuantizeLinear] outputs: [7 -> (1, 1, 128, 128, 128)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_3 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_3 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_3 [Constant] outputs: [8 -> ()[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_4 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_4 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_4 [Constant] outputs: [9 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: DequantizeLinear_5 [DequantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 7
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 8
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 9
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_5 [DequantizeLinear] inputs: [7 -> (1, 1, 128, 128, 128)[FLOAT]], [8 -> ()[FLOAT]], [9 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 10 for ONNX tensor: 10
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_5 [DequantizeLinear] outputs: [10 -> (1, 1, 128, 128, 128)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_6 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_6 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_6 [Constant] outputs: [11 -> (32)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: QuantizeLinear_7 [QuantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: op.weight
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 11
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 29
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_7 [QuantizeLinear] inputs: [op.weight -> (32, 1, 3, 3, 3)[FLOAT]], [11 -> (32)[FLOAT]], [29 -> (32)[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering layer: op.weight for ONNX node: op.weight
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 14 for ONNX tensor: 14
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_7 [QuantizeLinear] outputs: [14 -> (32, 1, 3, 3, 3)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: DequantizeLinear_8 [DequantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 14
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 11
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 29
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_8 [DequantizeLinear] inputs: [14 -> (32, 1, 3, 3, 3)[FLOAT]], [11 -> (32)[FLOAT]], [29 -> (32)[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 15 for ONNX tensor: 15
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_8 [DequantizeLinear] outputs: [15 -> (32, 1, 3, 3, 3)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Conv_9 [Conv]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 10
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 15
[02/14/2022-02:18:37] [V] [TRT] Searching for input: op.bias
[02/14/2022-02:18:37] [V] [TRT] Conv_9 [Conv] inputs: [10 -> (1, 1, 128, 128, 128)[FLOAT]], [15 -> (32, 1, 3, 3, 3)[FLOAT]], [op.bias -> (32)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Convolution input dimensions: (1, 1, 128, 128, 128)
[02/14/2022-02:18:37] [V] [TRT] Kernel weights are not set yet. Kernel weights must be set using setInput(1, kernel_tensor) API call.
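The QuantizeLinear_2/DequantizeLinear_5 pair parsed above is the standard ONNX opset-13 affine INT8 mapping around the network input; the actual scale and zero-point live in the Constant nodes (tensors 5/6 and 8/9) and are not shown in the log. A minimal pure-Python sketch of the per-tensor math, using a made-up scale:

```python
def quantize_linear(x, scale, zero_point=0):
    # ONNX QuantizeLinear: divide by scale, round (Python's round is
    # round-half-to-even, matching the spec), then saturate to int8.
    q = int(round(x / scale)) + zero_point
    return max(-128, min(127, q))

def dequantize_linear(q, scale, zero_point=0):
    # ONNX DequantizeLinear: back to float; the quantization error remains.
    return (q - zero_point) * scale

scale = 0.02  # hypothetical per-tensor activation scale
q = quantize_linear(1.234, scale)
x_hat = dequantize_linear(q, scale)
assert q == 62                       # round(1.234 / 0.02) = round(61.7)
assert abs(x_hat - 1.234) <= scale / 2
assert quantize_linear(100.0, scale) == 127  # saturates at int8 max
```

Because both operators keep FLOAT element type in the ONNX graph (only the values are snapped to the int8 grid), the parser reports the Q/DQ outputs as `[FLOAT]` even though the engine will ultimately run them as Int8.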
[02/14/2022-02:18:37] [V] [TRT] Registering layer: Conv_9 for ONNX node: Conv_9
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 16 for ONNX tensor: 16
[02/14/2022-02:18:37] [V] [TRT] Conv_9 [Conv] outputs: [16 -> (1, 32, 128, 128, 128)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_10 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_10 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_10 [Constant] outputs: [17 -> ()[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_11 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_11 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_11 [Constant] outputs: [18 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: QuantizeLinear_12 [QuantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 16
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 17
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 18
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_12 [QuantizeLinear] inputs: [16 -> (1, 32, 128, 128, 128)[FLOAT]], [17 -> ()[FLOAT]], [18 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 19 for ONNX tensor: 19
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_12 [QuantizeLinear] outputs: [19 -> (1, 32, 128, 128, 128)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_13 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_13 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_13 [Constant] outputs: [20 -> ()[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_14 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_14 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_14 [Constant] outputs: [21 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: DequantizeLinear_15 [DequantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 19
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 20
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 21
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_15 [DequantizeLinear] inputs: [19 -> (1, 32, 128, 128, 128)[FLOAT]], [20 -> ()[FLOAT]], [21 -> ()[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 22 for ONNX tensor: 22
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_15 [DequantizeLinear] outputs: [22 -> (1, 32, 128, 128, 128)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Constant_16 [Constant]
[02/14/2022-02:18:37] [V] [TRT] Constant_16 [Constant] inputs:
[02/14/2022-02:18:37] [V] [TRT] Constant_16 [Constant] outputs: [23 -> (64)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: QuantizeLinear_17 [QuantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: op1.weight
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 23
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 30
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_17 [QuantizeLinear] inputs: [op1.weight -> (64, 32, 3, 3, 3)[FLOAT]], [23 -> (64)[FLOAT]], [30 -> (64)[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering layer: op1.weight for ONNX node: op1.weight
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 26 for ONNX tensor: 26
[02/14/2022-02:18:37] [V] [TRT] QuantizeLinear_17 [QuantizeLinear] outputs: [26 -> (64, 32, 3, 3, 3)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: DequantizeLinear_18 [DequantizeLinear]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 26
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 23
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 30
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_18 [DequantizeLinear] inputs: [26 -> (64, 32, 3, 3, 3)[FLOAT]], [23 -> (64)[FLOAT]], [30 -> (64)[INT8]],
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 27 for ONNX tensor: 27
[02/14/2022-02:18:37] [V] [TRT] DequantizeLinear_18 [DequantizeLinear] outputs: [27 -> (64, 32, 3, 3, 3)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Parsing node: Conv_19 [Conv]
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 22
[02/14/2022-02:18:37] [V] [TRT] Searching for input: 27
[02/14/2022-02:18:37] [V] [TRT] Searching for input: op1.bias
[02/14/2022-02:18:37] [V] [TRT] Conv_19 [Conv] inputs: [22 -> (1, 32, 128, 128, 128)[FLOAT]], [27 -> (64, 32, 3, 3, 3)[FLOAT]], [op1.bias -> (64)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Convolution input dimensions: (1, 32, 128, 128, 128)
[02/14/2022-02:18:37] [V] [TRT] Kernel weights are not set yet. Kernel weights must be set using setInput(1, kernel_tensor) API call.
[02/14/2022-02:18:37] [V] [TRT] Registering layer: Conv_19 for ONNX node: Conv_19
[02/14/2022-02:18:37] [V] [TRT] Registering tensor: 28_0 for ONNX tensor: 28
[02/14/2022-02:18:37] [V] [TRT] Conv_19 [Conv] outputs: [28 -> (1, 64, 128, 128, 128)[FLOAT]],
[02/14/2022-02:18:37] [V] [TRT] Marking 28_0 as output: 28
[02/14/2022-02:18:37] [I] Finish parsing network model
[02/14/2022-02:18:37] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[02/14/2022-02:18:38] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
[02/14/2022-02:18:38] [V] [TRT] Applying generic optimizations to the graph for inference.
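Note that the weight quantizers parsed above carry a (32)- and a (64)-element scale tensor (initializers 11/29 and 23/30): one symmetric scale per output channel, which is why the calibrator warning is harmless here; the scales were produced at training/export time, not by TensorRT calibration. As a rough sketch of how per-channel scales are commonly derived during quantization-aware training (symmetric max-abs calibration; the filter values below are invented):

```python
def flatten(nested):
    # Yield every scalar in an arbitrarily nested list.
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def per_channel_scales(weights, qmax=127):
    # One scale per output channel (axis 0), symmetric max-abs calibration.
    # This mirrors the (32,)-shaped scale input the log shows for op.weight.
    return [max(abs(w) for w in flatten(ch)) / qmax for ch in weights]

# Two hypothetical 1x2x2 filter channels.
w = [[[[0.5, -1.27]], [[0.2, 0.0]]],
     [[[0.01, 0.02]], [[-0.03, 0.254]]]]
scales = per_channel_scales(w)
assert abs(scales[0] - 0.01) < 1e-12   # max|w| = 1.27, 1.27 / 127
assert abs(scales[1] - 0.002) < 1e-12  # max|w| = 0.254, 0.254 / 127
```

Per-channel weight scales keep small-magnitude filters from being crushed by one large outlier filter, which is the usual reason QAT exporters emit axis-0 scale vectors rather than a single scalar.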
[02/14/2022-02:18:38] [V] [TRT] Original: 28 layers
[02/14/2022-02:18:38] [V] [TRT] After dead-layer removal: 28 layers
[02/14/2022-02:18:38] [V] [TRT] QDQ graph optimizer - constant folding of Q/DQ initializers
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 1) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 0) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 22) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 21) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 11) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 10) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 15) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 14) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 8) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 7) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 4) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 3) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 25) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 24) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Running: ConstQDQInitializersFusion
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 18) [Constant]
[02/14/2022-02:18:38] [V] [TRT] Removing (Unnamed Layer* 17) [Constant]
[02/14/2022-02:18:38] [V] [TRT] After Myelin optimization: 12 layers
[02/14/2022-02:18:38] [V] [TRT] QDQ graph optimizer - constant folding of Q/DQ initializers
[02/14/2022-02:18:38] [V] [TRT] QDQ graph optimizer forward pass - DQ motions and fusions
[02/14/2022-02:18:38] [V] [TRT] Running: ConstWeightsQuantizeFusion
[02/14/2022-02:18:38] [V] [TRT] ConstWeightsQuantizeFusion: Fusing op.weight with QuantizeLinear_7_quantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] Running: ConstWeightsQuantizeFusion
[02/14/2022-02:18:38] [V] [TRT] ConstWeightsQuantizeFusion: Fusing op1.weight with QuantizeLinear_17_quantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] QDQ graph optimizer quantization pass - Generate quantized ops
[02/14/2022-02:18:38] [V] [TRT] Running: QuantizeDoubleInputNodes
[02/14/2022-02:18:38] [V] [TRT] QuantizeDoubleInputNodes: fusing QuantizeLinear_12_quantize_scale_node into Conv_9
[02/14/2022-02:18:38] [V] [TRT] QuantizeDoubleInputNodes: fusing (DequantizeLinear_5_dequantize_scale_node and DequantizeLinear_8_dequantize_scale_node) into Conv_9
[02/14/2022-02:18:38] [V] [TRT] Removing QuantizeLinear_12_quantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] Removing DequantizeLinear_5_dequantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] Removing DequantizeLinear_8_dequantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] Running: QuantizeDoubleInputNodes
[02/14/2022-02:18:38] [V] [TRT] QuantizeDoubleInputNodes: fusing (DequantizeLinear_15_dequantize_scale_node and DequantizeLinear_18_dequantize_scale_node) into Conv_19
[02/14/2022-02:18:38] [V] [TRT] Removing DequantizeLinear_15_dequantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] Removing DequantizeLinear_18_dequantize_scale_node
[02/14/2022-02:18:38] [V] [TRT] Running: ConstWeightsFusion
[02/14/2022-02:18:38] [V] [TRT] ConstWeightsFusion: Fusing op.weight + QuantizeLinear_7_quantize_scale_node with Conv_9
[02/14/2022-02:18:38] [V] [TRT] Running: ConstWeightsFusion
[02/14/2022-02:18:38] [V] [TRT] ConstWeightsFusion: Fusing op1.weight + QuantizeLinear_17_quantize_scale_node with Conv_19
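The QuantizeDoubleInputNodes pass above folds the input DQ, the weight DQ, and the output Q into Conv_9, so the convolution can accumulate on INT8 data and apply a single rescale at the end. A pure-Python sketch of why that rewrite is numerically equivalent (a 3-tap dot product stands in for the 3-D convolution; all scales are invented):

```python
def int8_conv1d_requant(x_q, w_q, s_in, s_w, s_out):
    # What the fused kernel computes: accumulate the quantized integers
    # (int32 on hardware), then rescale once by s_in * s_w / s_out.
    acc = sum(a * b for a, b in zip(x_q, w_q))
    return max(-128, min(127, round(acc * s_in * s_w / s_out)))

def float_reference(x_q, w_q, s_in, s_w, s_out):
    # The unfused Q/DQ graph: dequantize both operands, convolve in
    # float, then quantize the output.
    y = sum((a * s_in) * (b * s_w) for a, b in zip(x_q, w_q))
    return max(-128, min(127, round(y / s_out)))

x_q, w_q = [10, -3, 7], [2, 5, -1]
args = (x_q, w_q, 0.05, 0.01, 0.1)
assert int8_conv1d_requant(*args) == float_reference(*args)
```

The scales factor out of the sum, so moving them past the accumulation changes nothing mathematically; what it buys on the GPU is an integer inner loop and one multiply per output instead of dequantized float arithmetic throughout.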
[02/14/2022-02:18:38] [V] [TRT] After vertical fusions: 3 layers
[02/14/2022-02:18:38] [V] [TRT] After dupe layer removal: 3 layers
[02/14/2022-02:18:38] [V] [TRT] After final dead-layer removal: 3 layers
[02/14/2022-02:18:38] [V] [TRT] After tensor merging: 3 layers
[02/14/2022-02:18:38] [V] [TRT] After concat removal: 3 layers
[02/14/2022-02:18:38] [V] [TRT] Graph construction and optimization completed in 0.221649 seconds.
[02/14/2022-02:18:39] [V] [TRT] Using cublasLt as a tactic source
[02/14/2022-02:18:39] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +806, GPU +352, now: CPU 1474, GPU 4390 (MiB)
[02/14/2022-02:18:39] [V] [TRT] Using cuDNN as a tactic source
[02/14/2022-02:18:39] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +126, GPU +58, now: CPU 1600, GPU 4448 (MiB)
[02/14/2022-02:18:39] [I] [TRT] Timing cache disabled. Turning it on will improve builder speed.
[02/14/2022-02:18:39] [V] [TRT] Constructing optimization profile number 0 [1/1].
[02/14/2022-02:18:40] [V] [TRT] Reserving memory for activation tensors. Host: 0 bytes Device: 545259520 bytes
[02/14/2022-02:18:40] [V] [TRT] =============== Computing reformatting costs
[02/14/2022-02:18:40] [V] [TRT] =============== Computing reformatting costs
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,2097152,16384,128,1) -> Int8(524288,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: Optimizer Reformat(7 -> ) (Reformat)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1002 Time: 0.025728
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.016256
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.016256
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,2097152,16384,128,1) -> Int8(65536,65536,16384:32,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: Optimizer Reformat(7 -> ) (Reformat)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1002 Time: 0.0256
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.063616
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 1002 Time: 0.0256
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,2097152,16384,128,1) -> Int8(2097152,2097152:32,16384,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(524288,524288,16384:4,128,1) -> Int8(65536,65536,16384:32,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: Optimizer Reformat(7 -> ) (Reformat)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1002 Time: 0.027776
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.017408
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.017408
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(524288,524288,16384:4,128,1) -> Int8(2097152,2097152:32,16384,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(65536,65536,16384:32,128,1) -> Int8(524288,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: Optimizer Reformat(7 -> ) (Reformat)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1002 Time: 0.034432
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.017152
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.017152
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(65536,65536,16384:32,128,1) -> Int8(2097152,2097152:32,16384,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,2097152:32,16384,128,1) -> Int8(524288,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,2097152:32,16384,128,1) -> Int8(65536,65536,16384:32,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] =============== Computing reformatting costs
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,65536,16384:32,128,1) -> Int8(16777216,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: Optimizer Reformat(19 -> ) (Reformat)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1002 Time: 0.64256
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.134528
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.134528
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning Reformat: Int8(2097152,2097152:32,16384,128,1) -> Int8(16777216,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] =============== Computing reformatting costs
[02/14/2022-02:18:40] [V] [TRT] =============== Computing costs for
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning format combination: Float(2097152,2097152,16384,128,1) -> Int8(2097152,2097152,16384,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: QuantizeLinear_2_quantize_scale_node (Scale)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.021632
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.021632
[02/14/2022-02:18:40] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Scale Tactic: 0
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning format combination: Float(2097152,2097152,16384,128,1) -> Int8(524288,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: QuantizeLinear_2_quantize_scale_node (Scale)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.024832
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.024832
[02/14/2022-02:18:40] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Scale Tactic: 0
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning format combination: Float(2097152,2097152,16384,128,1) -> Int8(65536,65536,16384:32,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: QuantizeLinear_2_quantize_scale_node (Scale)
[02/14/2022-02:18:40] [V] [TRT] Tactic: 0 Time: 0.024704
[02/14/2022-02:18:40] [V] [TRT] Fastest Tactic: 0 Time: 0.024704
[02/14/2022-02:18:40] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Scale Tactic: 0
[02/14/2022-02:18:40] [V] [TRT] =============== Computing costs for
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning format combination: Int8(524288,524288,16384:4,128,1) -> Int8(16777216,524288,16384:4,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 (CudaDepthwiseConvolution)
[02/14/2022-02:18:40] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning format combination: Int8(65536,65536,16384:32,128,1) -> Int8(2097152,65536,16384:32,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 (CudaGroupConvolution)
[02/14/2022-02:18:40] [V] [TRT] CudaGroupConvolution has no valid tactics for this config, skipping
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 (CudaDepthwiseConvolution)
[02/14/2022-02:18:40] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[02/14/2022-02:18:40] [V] [TRT] *************** Autotuning format combination: Int8(2097152,2097152:32,16384,128,1) -> Int8(2097152,2097152:32,16384,128,1) ***************
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 (CudnnConvolution)
[02/14/2022-02:18:40] [V] [TRT] CudnnConvolution has no valid tactics for this config, skipping
[02/14/2022-02:18:40] [V] [TRT] --------------- Timing Runner: op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 (CaskConvolution)
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x32x64_stage6_warpsize2x1x1_g1_tensor16x8x32_epifadd Tactic: 177040020707947851
[02/14/2022-02:18:40] [V] [TRT] Tactic: 177040020707947851 Time: 1.30227
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x96x64_stage3_warpsize4x1x1_g1_tensor16x8x32 Tactic: 328135613486708155
[02/14/2022-02:18:40] [V] [TRT] Tactic: 328135613486708155 Time: 3.47571
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x192x64_stage3_warpsize4x2x1_g1_tensor16x8x32 Tactic: 1111159740952609683
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1111159740952609683 Time: 2.9888
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize96x96x64_stage3_warpsize2x2x1_g1_tensor16x8x32 Tactic: 1134860903395928905
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1134860903395928905 Time: 1.6297
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x96x64_stage3_warpsize2x2x1_g1_tensor16x8x32 Tactic: 1276591930377039442
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1276591930377039442 Time: 1.83744
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x96x64_stage5_warpsize2x1x1_g1_tensor16x8x32 Tactic: 1399501420456320585
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1399501420456320585 Time: 2.05722
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x64x64_stage6_warpsize2x2x1_g1_tensor16x8x32_epifadd Tactic: 1550399266192842845
[02/14/2022-02:18:40] [V] [TRT] Tactic: 1550399266192842845 Time: 1.68269
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x128x64_stage1_warpsize2x2x1_g1_tensor8x8x16 Tactic: 2133329569091732311
[02/14/2022-02:18:40] [V] [TRT] Tactic: 2133329569091732311 Time: 1.87366
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x64x64_stage4_warpsize4x1x1_g1_tensor16x8x32 Tactic: 2325023763229477890
[02/14/2022-02:18:40] [V] [TRT] Tactic: 2325023763229477890 Time: 0.586112
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x64x64_stage1_warpsize2x2x1_g1_tensor8x8x16 Tactic: 2579824863892891529
[02/14/2022-02:18:40] [V] [TRT] Tactic: 2579824863892891529 Time: 1.3943
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize192x64x64_stage3_warpsize4x1x1_g1_tensor16x8x32 Tactic: 2783960536172159663
[02/14/2022-02:18:40] [V] [TRT] Tactic: 2783960536172159663 Time: 0.607872
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize48x128x64_stage3_warpsize1x2x1_g1_tensor16x8x32 Tactic: 2821711838552913693
[02/14/2022-02:18:40] [V] [TRT] Tactic: 2821711838552913693 Time: 1.16429
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x96x64_stage3_warpsize4x1x1_g1_tensor16x8x32 Tactic: 3456719996792527006
[02/14/2022-02:18:40] [V] [TRT] Tactic: 3456719996792527006 Time: 1.06598
[02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x128x64_stage6_warpsize2x2x1_g1_tensor16x8x32 Tactic: 4042202769383439184
[02/14/2022-02:18:40] [V] [TRT] Tactic: 4042202769383439184 Time: 1.33184
[02/14/2022-02:18:40] [V] [TRT] op.weight +
QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x32x64_stage4_warpsize4x1x1_g1_tensor16x8x32_epifadd Tactic: 4259547356717612415 [02/14/2022-02:18:40] [V] [TRT] Tactic: 4259547356717612415 Time: 0.441856 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x256x64_stage4_warpsize2x4x1_g1_tensor16x8x32 Tactic: 4734519122557206480 [02/14/2022-02:18:40] [V] [TRT] Tactic: 4734519122557206480 Time: 2.51008 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x192x64_stage3_warpsize2x2x1_g1_tensor16x8x32 Tactic: 4922297020351187339 [02/14/2022-02:18:40] [V] [TRT] Tactic: 4922297020351187339 Time: 1.75936 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x256x64_stage4_warpsize2x4x1_g1_tensor16x8x32_epifadd Tactic: 5121596860264626879 [02/14/2022-02:18:40] [V] [TRT] Tactic: 5121596860264626879 Time: 2.3104 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x128x64_stage6_warpsize2x2x1_g1_tensor16x8x32_epifadd Tactic: 5158259316594207439 [02/14/2022-02:18:40] [V] [TRT] Tactic: 5158259316594207439 Time: 1.33965 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: 
sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize96x32x64_stage4_warpsize2x1x1_g1_tensor16x8x32 Tactic: 5424417905073460656 [02/14/2022-02:18:40] [V] [TRT] Tactic: 5424417905073460656 Time: 0.502144 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize96x256x64_stage3_warpsize2x4x1_g1_tensor16x8x32 Tactic: 5442043907221427810 [02/14/2022-02:18:40] [V] [TRT] Tactic: 5442043907221427810 Time: 2.60826 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x32x64_stage1_warpsize2x1x1_g1_tensor8x8x16 Tactic: 6394572396369862482 [02/14/2022-02:18:40] [V] [TRT] Tactic: 6394572396369862482 Time: 0.884992 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x128x64_stage4_warpsize4x2x1_g1_tensor16x8x32_epifadd Tactic: 6434020722187266170 [02/14/2022-02:18:40] [V] [TRT] Tactic: 6434020722187266170 Time: 1.19091 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x64x64_stage6_warpsize2x2x1_g1_tensor16x8x32 Tactic: 6781129591847482048 [02/14/2022-02:18:40] [V] [TRT] Tactic: 6781129591847482048 Time: 0.740992 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize192x128x64_stage3_warpsize4x2x1_g1_tensor16x8x32 
Tactic: 7077570591813340966 [02/14/2022-02:18:40] [V] [TRT] Tactic: 7077570591813340966 Time: 1.3129 [02/14/2022-02:18:40] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x128x64_stage4_warpsize2x2x1_g1_tensor16x8x32_epifadd Tactic: 7504901284678552178 [02/14/2022-02:18:41] [V] [TRT] Tactic: 7504901284678552178 Time: 1.09696 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_indexed_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x128x64_stage4_warpsize2x2x1_g1_tensor16x8x32_epifadd Tactic: 8751622450593766232 [02/14/2022-02:18:41] [V] [TRT] Tactic: 8751622450593766232 Time: 1.18925 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_indexed_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x128x64_stage4_warpsize2x2x1_g1_tensor16x8x32 Tactic: 9064458886956700976 [02/14/2022-02:18:41] [V] [TRT] Tactic: 9064458886956700976 Time: 1.20768 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x128x64_stage4_warpsize4x2x1_g1_tensor16x8x32 Tactic: -9165697322068360861 [02/14/2022-02:18:41] [V] [TRT] Tactic: -9165697322068360861 Time: 1.29203 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x64x64_stage1_warpsize2x2x1_g1_tensor8x8x16 Tactic: -9108166971364503411 [02/14/2022-02:18:41] [V] [TRT] Tactic: -9108166971364503411 Time: 1.12102 [02/14/2022-02:18:41] [V] 
[TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x128x64_stage1_warpsize4x2x1_g1_tensor8x8x16 Tactic: -8861822316054763526 [02/14/2022-02:18:41] [V] [TRT] Tactic: -8861822316054763526 Time: 2.10611 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize96x128x64_stage3_warpsize2x2x1_g1_tensor16x8x32 Tactic: -8691377209893505057 [02/14/2022-02:18:41] [V] [TRT] Tactic: -8691377209893505057 Time: 1.09722 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize96x192x64_stage3_warpsize2x2x1_g1_tensor16x8x32 Tactic: -8520292213102999339 [02/14/2022-02:18:41] [V] [TRT] Tactic: -8520292213102999339 Time: 1.82554 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x128x64_stage4_warpsize2x2x1_g1_tensor16x8x32 Tactic: -8263994888336646547 [02/14/2022-02:18:41] [V] [TRT] Tactic: -8263994888336646547 Time: 1.08774 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x64x64_stage6_warpsize2x2x1_g1_tensor16x8x32 Tactic: -8205948405243401049 [02/14/2022-02:18:41] [V] [TRT] Tactic: -8205948405243401049 Time: 1.24941 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: 
sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x64x64_stage4_warpsize4x1x1_g1_tensor16x8x32_epifadd Tactic: -7842775553137511386 [02/14/2022-02:18:41] [V] [TRT] Tactic: -7842775553137511386 Time: 0.59712 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x32x64_stage6_warpsize2x1x1_g1_tensor16x8x32 Tactic: -7683887278997527517 [02/14/2022-02:18:41] [V] [TRT] Tactic: -7683887278997527517 Time: 0.627456 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize96x64x64_stage3_warpsize2x2x1_g1_tensor16x8x32 Tactic: -7381370635708568663 [02/14/2022-02:18:41] [V] [TRT] Tactic: -7381370635708568663 Time: 0.686976 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize192x96x64_stage4_warpsize4x2x1_g1_tensor16x8x32 Tactic: -6256128573036943404 [02/14/2022-02:18:41] [V] [TRT] Tactic: -6256128573036943404 Time: 1.33875 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x64x64_stage1_warpsize4x1x1_g1_tensor8x8x16 Tactic: -5180570335464125033 [02/14/2022-02:18:41] [V] [TRT] Tactic: -5180570335464125033 Time: 1.01747 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: 
sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x128x64_stage1_warpsize2x2x1_g1_tensor8x8x16 Tactic: -2499089240293650188 [02/14/2022-02:18:41] [V] [TRT] Tactic: -2499089240293650188 Time: 1.94957 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x64x64_stage1_warpsize2x2x1_g1_tensor8x8x16 Tactic: -2328318099174473157 [02/14/2022-02:18:41] [V] [TRT] Tactic: -2328318099174473157 Time: 1.12858 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x64x64_stage6_warpsize2x2x1_g1_tensor16x8x32_epifadd Tactic: -2083778562631872334 [02/14/2022-02:18:41] [V] [TRT] Tactic: -2083778562631872334 Time: 0.736256 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize48x128x64_stage3_warpsize1x4x1_g1_tensor16x8x32 Tactic: -2054375205435666404 [02/14/2022-02:18:41] [V] [TRT] Tactic: -2054375205435666404 Time: 1.37382 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize256x32x64_stage4_warpsize4x1x1_g1_tensor16x8x32 Tactic: -1498626619443284096 [02/14/2022-02:18:41] [V] [TRT] Tactic: -1498626619443284096 Time: 0.439936 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: 
sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x32x64_stage6_warpsize2x1x1_g1_tensor16x8x32 Tactic: -1283580231568512025 [02/14/2022-02:18:41] [V] [TRT] Tactic: -1283580231568512025 Time: 1.10413 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize32x32x64_stage6_warpsize2x1x1_g1_tensor16x8x32_epifadd Tactic: -1173968681844185579 [02/14/2022-02:18:41] [V] [TRT] Tactic: -1173968681844185579 Time: 1.1017 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x64x64_stage6_warpsize2x2x1_g1_tensor16x8x32 Tactic: -762222380308749469 [02/14/2022-02:18:41] [V] [TRT] Tactic: -762222380308749469 Time: 0.875904 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm80_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x64x64_stage6_warpsize2x2x1_g1_tensor16x8x32_epifadd Tactic: -556794153877490941 [02/14/2022-02:18:41] [V] [TRT] Tactic: -556794153877490941 Time: 0.902528 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize64x32x64_stage1_warpsize2x1x1_g1_tensor8x8x16 Tactic: -366411318217594794 [02/14/2022-02:18:41] [V] [TRT] Tactic: -366411318217594794 Time: 0.669184 [02/14/2022-02:18:41] [V] [TRT] op.weight + QuantizeLinear_7_quantize_scale_node + Conv_9 Set Tactic Name: 
sm75_xmma_fprop_implicit_gemm_interleaved_i8i8_i8i32_f32_nchw_vect_c_32kcrs_vect_c_32_nchw_vect_c_32_tilesize128x256x64_stage1_warpsize2x4x1_g1_tensor8x8x16 Tactic: -351548418071036983 [02/14/2022-02:18:41] [V] [TRT] Tactic: -351548418071036983 Time: 4.1481 [02/14/2022-02:18:41] [V] [TRT] Fastest Tactic: -1498626619443284096 Time: 0.439936 [02/14/2022-02:18:41] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -1498626619443284096 [02/14/2022-02:18:41] [V] [TRT] =============== Computing costs for [02/14/2022-02:18:41] [V] [TRT] *************** Autotuning format combination: Int8(16777216,524288,16384:4,128,1) -> Float(134217728,2097152,16384,128,1) *************** [02/14/2022-02:18:41] [V] [TRT] --------------- Timing Runner: op1.weight + QuantizeLinear_17_quantize_scale_node + Conv_19 (CudaDepthwiseConvolution) [02/14/2022-02:18:41] [V] [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [02/14/2022-02:18:41] [E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node op1.weight + QuantizeLinear_17_quantize_scale_node + Conv_19.) [02/14/2022-02:18:41] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. ) [02/14/2022-02:18:41] [E] Engine could not be created from network [02/14/2022-02:18:41] [E] Building engine failed [02/14/2022-02:18:41] [E] Failed to create engine from model. [02/14/2022-02:18:41] [E] Engine set up failed &&&& FAILED TensorRT.trtexec [TensorRT v8201] # trtexec --onnx=pyt_test_3d.onnx --int8 --workspace=8192 --noBuilderCache --verbose