[TRT] h_classifier --> imageNet -- create.....1111
[TRT] h_classifier --> imageNet -- init.....1111

imageNet -- loading classification network model from:
         -- prototxt     (null)
         -- model        networks/h_classifier/h_classifier.onnx
         -- class_labels networks/h_classifier/labels.txt
         -- input_blob   'data'
         -- output_blob  'prob'
         -- batch_size   1

[TRT] TensorRT version 7.1.0
[TRT] loading NVIDIA plugins...
[TRT] Plugin creator registration succeeded - ::GridAnchor_TRT
[TRT] Plugin creator registration succeeded - ::NMS_TRT
[TRT] Plugin creator registration succeeded - ::Reorg_TRT
[TRT] Plugin creator registration succeeded - ::Region_TRT
[TRT] Plugin creator registration succeeded - ::Clip_TRT
[TRT] Plugin creator registration succeeded - ::LReLU_TRT
[TRT] Plugin creator registration succeeded - ::PriorBox_TRT
[TRT] Plugin creator registration succeeded - ::Normalize_TRT
[TRT] Plugin creator registration succeeded - ::RPROI_TRT
[TRT] Plugin creator registration succeeded - ::BatchedNMS_TRT
[TRT] Plugin creator registration succeeded - ::FlattenConcat_TRT
[TRT] Plugin creator registration succeeded - ::CropAndResize
[TRT] Plugin creator registration succeeded - ::DetectionLayer_TRT
[TRT] Plugin creator registration succeeded - ::Proposal
[TRT] Plugin creator registration succeeded - ::ProposalLayer_TRT
[TRT] Plugin creator registration succeeded - ::PyramidROIAlign_TRT
[TRT] Plugin creator registration succeeded - ::ResizeNearest_TRT
[TRT] Plugin creator registration succeeded - ::Split
[TRT] Plugin creator registration succeeded - ::SpecialSlice_TRT
[TRT] Plugin creator registration succeeded - ::InstanceNormalization_TRT
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - ONNX (extension '.onnx')
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file networks/h_classifier/h_classifier.onnx.1.1.GPU.FP16.engine
[TRT] cache file not found, profiling network model on device GPU
[TRT] device GPU, loading /usr/local/sys_dev/test_jetson/aarch64/bin/ networks/h_classifier/h_classifier.onnx
----------------------------------------------------------------
Input filename:   networks/h_classifier/h_classifier.onnx
ONNX IR version:  0.0.6
Opset version:    9
Producer name:    pytorch
Producer version: 1.5
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
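The metadata block pins down the export path: the file was produced by PyTorch 1.5 at ONNX opset 9. For reference, an export call that yields a file with this signature looks roughly like the sketch below; HClassifier and the checkpoint path are hypothetical stand-ins (a reconstruction of the module itself follows the parse section further down), and the dummy input matches the (1, 3, 150, 150) network input the parser registers next.

    # Sketch only -- module and checkpoint names are placeholders.
    import torch

    model = HClassifier()                                  # hypothetical module
    model.load_state_dict(torch.load("h_classifier.pth"))  # hypothetical checkpoint
    model.eval()  # export inference-mode BatchNorm (running stats, not batch stats)

    dummy = torch.zeros(1, 3, 150, 150)  # matches the network input registered below
    torch.onnx.export(model, dummy, "h_classifier.onnx",
                      opset_version=9)   # matches "Opset version: 9" above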
[TRT] Plugin creator already registered - ::GridAnchor_TRT
[TRT] Plugin creator already registered - ::NMS_TRT
[TRT] Plugin creator already registered - ::Reorg_TRT
[TRT] Plugin creator already registered - ::Region_TRT
[TRT] Plugin creator already registered - ::Clip_TRT
[TRT] Plugin creator already registered - ::LReLU_TRT
[TRT] Plugin creator already registered - ::PriorBox_TRT
[TRT] Plugin creator already registered - ::Normalize_TRT
[TRT] Plugin creator already registered - ::RPROI_TRT
[TRT] Plugin creator already registered - ::BatchedNMS_TRT
[TRT] Plugin creator already registered - ::FlattenConcat_TRT
[TRT] Plugin creator already registered - ::CropAndResize
[TRT] Plugin creator already registered - ::DetectionLayer_TRT
[TRT] Plugin creator already registered - ::Proposal
[TRT] Plugin creator already registered - ::ProposalLayer_TRT
[TRT] Plugin creator already registered - ::PyramidROIAlign_TRT
[TRT] Plugin creator already registered - ::ResizeNearest_TRT
[TRT] Plugin creator already registered - ::Split
[TRT] Plugin creator already registered - ::SpecialSlice_TRT
[TRT] Plugin creator already registered - ::InstanceNormalization_TRT
[TRT] ModelImporter.cpp:205: Adding network input: input.1 with dtype: float32, dimensions: (1, 3, 150, 150)
[TRT] ImporterContext.hpp:97: Registering tensor: input.1 for ONNX tensor: input.1
[TRT] ModelImporter.cpp:90: Importing initializer: bn1.bias
[TRT] ModelImporter.cpp:90: Importing initializer: bn1.running_mean
[TRT] ModelImporter.cpp:90: Importing initializer: bn1.running_var
[TRT] ModelImporter.cpp:90: Importing initializer: bn1.weight
[TRT] ModelImporter.cpp:90: Importing initializer: bn2.bias
[TRT] ModelImporter.cpp:90: Importing initializer: bn2.running_mean
[TRT] ModelImporter.cpp:90: Importing initializer: bn2.running_var
[TRT] ModelImporter.cpp:90: Importing initializer: bn2.weight
[TRT] ModelImporter.cpp:90: Importing initializer: bn3.bias
[TRT] ModelImporter.cpp:90: Importing initializer: bn3.running_mean
[TRT] ModelImporter.cpp:90: Importing initializer: bn3.running_var
[TRT] ModelImporter.cpp:90: Importing initializer: bn3.weight
[TRT] ModelImporter.cpp:90: Importing initializer: bn4.bias
[TRT] ModelImporter.cpp:90: Importing initializer: bn4.running_mean
[TRT] ModelImporter.cpp:90: Importing initializer: bn4.running_var
[TRT] ModelImporter.cpp:90: Importing initializer: bn4.weight
[TRT] ModelImporter.cpp:90: Importing initializer: conv1.bias
[TRT] ModelImporter.cpp:90: Importing initializer: conv1.weight
[TRT] ModelImporter.cpp:90: Importing initializer: conv2.bias
[TRT] ModelImporter.cpp:90: Importing initializer: conv2.weight
[TRT] ModelImporter.cpp:90: Importing initializer: conv3.bias
[TRT] ModelImporter.cpp:90: Importing initializer: conv3.weight
[TRT] ModelImporter.cpp:90: Importing initializer: conv4.bias
[TRT] ModelImporter.cpp:90: Importing initializer: conv4.weight
[TRT] ModelImporter.cpp:90: Importing initializer: conv5.bias
[TRT] ModelImporter.cpp:90: Importing initializer: conv5.weight
[TRT] ModelImporter.cpp:90: Importing initializer: fc1.bias
[TRT] ModelImporter.cpp:90: Importing initializer: fc1.weight
[TRT] ModelImporter.cpp:90: Importing initializer: fc2.bias
[TRT] ModelImporter.cpp:90: Importing initializer: fc2.weight
[TRT] ModelImporter.cpp:90: Importing initializer: fc3.bias
[TRT] ModelImporter.cpp:90: Importing initializer: fc3.weight
[TRT] ModelImporter.cpp:103: Parsing node: Conv_0 [Conv]
[TRT] ModelImporter.cpp:119: Searching for input: input.1
[TRT] ModelImporter.cpp:119: Searching for input: conv1.weight
[TRT] ModelImporter.cpp:119: Searching for input: conv1.bias
[TRT] ModelImporter.cpp:125: Conv_0 [Conv] inputs: [input.1 -> (1, 3, 150, 150)], [conv1.weight -> (32, 3, 3, 3)], [conv1.bias -> (32)],
[TRT] builtin_op_importers.cpp:446: Convolution input dimensions: (1, 3, 150, 150)
[TRT] builtin_op_importers.cpp:528: Using kernel: (3, 3), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 32
[TRT] builtin_op_importers.cpp:529: Convolution output dimensions: (1, 32, 152, 152)
[TRT] ImporterContext.hpp:122: Registering layer: Conv_0 for ONNX node: Conv_0
[TRT] ImporterContext.hpp:97: Registering tensor: 49 for ONNX tensor: 49
[TRT] ModelImporter.cpp:182: Conv_0 [Conv] outputs: [49 -> (1, 32, 152, 152)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_1 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 49
[TRT] ModelImporter.cpp:125: Relu_1 [Relu] inputs: [49 -> (1, 32, 152, 152)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_1 for ONNX node: Relu_1
[TRT] ImporterContext.hpp:97: Registering tensor: 50 for ONNX tensor: 50
[TRT] ModelImporter.cpp:182: Relu_1 [Relu] outputs: [50 -> (1, 32, 152, 152)],
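Each convolution is imported with kernel (3, 3), stride (1, 1) and symmetric padding (2, 2), so the spatial size grows by 2 at every conv (150 -> 152 for Conv_0) and is then halved by the 2x2 max pools parsed below. The logged shapes all follow the standard output-size formula; a small sketch to check them:

    # Sketch: reproduce the spatial sizes the importer logs.
    def conv_out(n, k=3, pad=2, stride=1, dilation=1):
        return (n + 2 * pad - dilation * (k - 1) - 1) // stride + 1

    def pool_out(n, k=2, stride=2):
        return (n - k) // stride + 1

    assert conv_out(150) == 152  # Conv_0:    (1, 3, 150, 150) -> (1, 32, 152, 152)
    assert pool_out(152) == 76   # MaxPool_2: 152 -> 76 (parsed below)
    assert conv_out(39) == 41    # Conv_8
    assert pool_out(41) == 20    # MaxPool_10
    assert pool_out(13) == 6     # MaxPool_18: the (1, 512, 6, 6) tensor that gets flattened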
[TRT] ModelImporter.cpp:103: Parsing node: MaxPool_2 [MaxPool]
[TRT] ModelImporter.cpp:119: Searching for input: 50
[TRT] ModelImporter.cpp:125: MaxPool_2 [MaxPool] inputs: [50 -> (1, 32, 152, 152)],
[TRT] ImporterContext.hpp:122: Registering layer: MaxPool_2 for ONNX node: MaxPool_2
[TRT] ImporterContext.hpp:97: Registering tensor: 51 for ONNX tensor: 51
[TRT] ModelImporter.cpp:182: MaxPool_2 [MaxPool] outputs: [51 -> (1, 32, 76, 76)],
[TRT] ModelImporter.cpp:103: Parsing node: BatchNormalization_3 [BatchNormalization]
[TRT] ModelImporter.cpp:119: Searching for input: 51
[TRT] ModelImporter.cpp:119: Searching for input: bn1.weight
[TRT] ModelImporter.cpp:119: Searching for input: bn1.bias
[TRT] ModelImporter.cpp:119: Searching for input: bn1.running_mean
[TRT] ModelImporter.cpp:119: Searching for input: bn1.running_var
[TRT] ModelImporter.cpp:125: BatchNormalization_3 [BatchNormalization] inputs: [51 -> (1, 32, 76, 76)], [bn1.weight -> (32)], [bn1.bias -> (32)], [bn1.running_mean -> (32)], [bn1.running_var -> (32)],
[TRT] ImporterContext.hpp:122: Registering layer: BatchNormalization_3 for ONNX node: BatchNormalization_3
[TRT] ImporterContext.hpp:97: Registering tensor: 52 for ONNX tensor: 52
[TRT] ModelImporter.cpp:182: BatchNormalization_3 [BatchNormalization] outputs: [52 -> (1, 32, 76, 76)],
[TRT] ModelImporter.cpp:103: Parsing node: Conv_4 [Conv]
[TRT] ModelImporter.cpp:119: Searching for input: 52
[TRT] ModelImporter.cpp:119: Searching for input: conv2.weight
[TRT] ModelImporter.cpp:119: Searching for input: conv2.bias
[TRT] ModelImporter.cpp:125: Conv_4 [Conv] inputs: [52 -> (1, 32, 76, 76)], [conv2.weight -> (64, 32, 3, 3)], [conv2.bias -> (64)],
[TRT] builtin_op_importers.cpp:446: Convolution input dimensions: (1, 32, 76, 76)
[TRT] builtin_op_importers.cpp:528: Using kernel: (3, 3), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 64
[TRT] builtin_op_importers.cpp:529: Convolution output dimensions: (1, 64, 78, 78)
[TRT] ImporterContext.hpp:122: Registering layer: Conv_4 for ONNX node: Conv_4
[TRT] ImporterContext.hpp:97: Registering tensor: 53 for ONNX tensor: 53
[TRT] ModelImporter.cpp:182: Conv_4 [Conv] outputs: [53 -> (1, 64, 78, 78)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_5 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 53
[TRT] ModelImporter.cpp:125: Relu_5 [Relu] inputs: [53 -> (1, 64, 78, 78)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_5 for ONNX node: Relu_5
[TRT] ImporterContext.hpp:97: Registering tensor: 54 for ONNX tensor: 54
[TRT] ModelImporter.cpp:182: Relu_5 [Relu] outputs: [54 -> (1, 64, 78, 78)],
[TRT] ModelImporter.cpp:103: Parsing node: MaxPool_6 [MaxPool]
[TRT] ModelImporter.cpp:119: Searching for input: 54
[TRT] ModelImporter.cpp:125: MaxPool_6 [MaxPool] inputs: [54 -> (1, 64, 78, 78)],
[TRT] ImporterContext.hpp:122: Registering layer: MaxPool_6 for ONNX node: MaxPool_6
[TRT] ImporterContext.hpp:97: Registering tensor: 55 for ONNX tensor: 55
[TRT] ModelImporter.cpp:182: MaxPool_6 [MaxPool] outputs: [55 -> (1, 64, 39, 39)],
[TRT] ModelImporter.cpp:103: Parsing node: BatchNormalization_7 [BatchNormalization]
[TRT] ModelImporter.cpp:119: Searching for input: 55
[TRT] ModelImporter.cpp:119: Searching for input: bn2.weight
[TRT] ModelImporter.cpp:119: Searching for input: bn2.bias
[TRT] ModelImporter.cpp:119: Searching for input: bn2.running_mean
[TRT] ModelImporter.cpp:119: Searching for input: bn2.running_var
[TRT] ModelImporter.cpp:125: BatchNormalization_7 [BatchNormalization] inputs: [55 -> (1, 64, 39, 39)], [bn2.weight -> (64)], [bn2.bias -> (64)], [bn2.running_mean -> (64)], [bn2.running_var -> (64)],
[TRT] ImporterContext.hpp:122: Registering layer: BatchNormalization_7 for ONNX node: BatchNormalization_7
[TRT] ImporterContext.hpp:97: Registering tensor: 56 for ONNX tensor: 56
[TRT] ModelImporter.cpp:182: BatchNormalization_7 [BatchNormalization] outputs: [56 -> (1, 64, 39, 39)],
[TRT] ModelImporter.cpp:103: Parsing node: Conv_8 [Conv]
[TRT] ModelImporter.cpp:119: Searching for input: 56
[TRT] ModelImporter.cpp:119: Searching for input: conv3.weight
[TRT] ModelImporter.cpp:119: Searching for input: conv3.bias
[TRT] ModelImporter.cpp:125: Conv_8 [Conv] inputs: [56 -> (1, 64, 39, 39)], [conv3.weight -> (128, 64, 3, 3)], [conv3.bias -> (128)],
[TRT] builtin_op_importers.cpp:446: Convolution input dimensions: (1, 64, 39, 39)
[TRT] builtin_op_importers.cpp:528: Using kernel: (3, 3), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 128
[TRT] builtin_op_importers.cpp:529: Convolution output dimensions: (1, 128, 41, 41)
[TRT] ImporterContext.hpp:122: Registering layer: Conv_8 for ONNX node: Conv_8
[TRT] ImporterContext.hpp:97: Registering tensor: 57 for ONNX tensor: 57
[TRT] ModelImporter.cpp:182: Conv_8 [Conv] outputs: [57 -> (1, 128, 41, 41)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_9 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 57
[TRT] ModelImporter.cpp:125: Relu_9 [Relu] inputs: [57 -> (1, 128, 41, 41)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_9 for ONNX node: Relu_9
[TRT] ImporterContext.hpp:97: Registering tensor: 58 for ONNX tensor: 58
[TRT] ModelImporter.cpp:182: Relu_9 [Relu] outputs: [58 -> (1, 128, 41, 41)],
[TRT] ModelImporter.cpp:103: Parsing node: MaxPool_10 [MaxPool]
[TRT] ModelImporter.cpp:119: Searching for input: 58
[TRT] ModelImporter.cpp:125: MaxPool_10 [MaxPool] inputs: [58 -> (1, 128, 41, 41)],
[TRT] ImporterContext.hpp:122: Registering layer: MaxPool_10 for ONNX node: MaxPool_10
[TRT] ImporterContext.hpp:97: Registering tensor: 59 for ONNX tensor: 59
[TRT] ModelImporter.cpp:182: MaxPool_10 [MaxPool] outputs: [59 -> (1, 128, 20, 20)],
[TRT] ModelImporter.cpp:103: Parsing node: BatchNormalization_11 [BatchNormalization]
[TRT] ModelImporter.cpp:119: Searching for input: 59
[TRT] ModelImporter.cpp:119: Searching for input: bn3.weight
[TRT] ModelImporter.cpp:119: Searching for input: bn3.bias
[TRT] ModelImporter.cpp:119: Searching for input: bn3.running_mean
[TRT] ModelImporter.cpp:119: Searching for input: bn3.running_var
[TRT] ModelImporter.cpp:125: BatchNormalization_11 [BatchNormalization] inputs: [59 -> (1, 128, 20, 20)], [bn3.weight -> (128)], [bn3.bias -> (128)], [bn3.running_mean -> (128)], [bn3.running_var -> (128)],
[TRT] ImporterContext.hpp:122: Registering layer: BatchNormalization_11 for ONNX node: BatchNormalization_11
[TRT] ImporterContext.hpp:97: Registering tensor: 60 for ONNX tensor: 60
[TRT] ModelImporter.cpp:182: BatchNormalization_11 [BatchNormalization] outputs: [60 -> (1, 128, 20, 20)],
[TRT] ModelImporter.cpp:103: Parsing node: Conv_12 [Conv]
[TRT] ModelImporter.cpp:119: Searching for input: 60
[TRT] ModelImporter.cpp:119: Searching for input: conv4.weight
[TRT] ModelImporter.cpp:119: Searching for input: conv4.bias
[TRT] ModelImporter.cpp:125: Conv_12 [Conv] inputs: [60 -> (1, 128, 20, 20)], [conv4.weight -> (256, 128, 3, 3)], [conv4.bias -> (256)],
[TRT] builtin_op_importers.cpp:446: Convolution input dimensions: (1, 128, 20, 20)
[TRT] builtin_op_importers.cpp:528: Using kernel: (3, 3), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 256
[TRT] builtin_op_importers.cpp:529: Convolution output dimensions: (1, 256, 22, 22)
[TRT] ImporterContext.hpp:122: Registering layer: Conv_12 for ONNX node: Conv_12
[TRT] ImporterContext.hpp:97: Registering tensor: 61 for ONNX tensor: 61
[TRT] ModelImporter.cpp:182: Conv_12 [Conv] outputs: [61 -> (1, 256, 22, 22)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_13 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 61
[TRT] ModelImporter.cpp:125: Relu_13 [Relu] inputs: [61 -> (1, 256, 22, 22)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_13 for ONNX node: Relu_13
[TRT] ImporterContext.hpp:97: Registering tensor: 62 for ONNX tensor: 62
[TRT] ModelImporter.cpp:182: Relu_13 [Relu] outputs: [62 -> (1, 256, 22, 22)],
[TRT] ModelImporter.cpp:103: Parsing node: MaxPool_14 [MaxPool]
[TRT] ModelImporter.cpp:119: Searching for input: 62
[TRT] ModelImporter.cpp:125: MaxPool_14 [MaxPool] inputs: [62 -> (1, 256, 22, 22)],
[TRT] ImporterContext.hpp:122: Registering layer: MaxPool_14 for ONNX node: MaxPool_14
[TRT] ImporterContext.hpp:97: Registering tensor: 63 for ONNX tensor: 63
[TRT] ModelImporter.cpp:182: MaxPool_14 [MaxPool] outputs: [63 -> (1, 256, 11, 11)],
[TRT] ModelImporter.cpp:103: Parsing node: BatchNormalization_15 [BatchNormalization]
[TRT] ModelImporter.cpp:119: Searching for input: 63
[TRT] ModelImporter.cpp:119: Searching for input: bn4.weight
[TRT] ModelImporter.cpp:119: Searching for input: bn4.bias
[TRT] ModelImporter.cpp:119: Searching for input: bn4.running_mean
[TRT] ModelImporter.cpp:119: Searching for input: bn4.running_var
[TRT] ModelImporter.cpp:125: BatchNormalization_15 [BatchNormalization] inputs: [63 -> (1, 256, 11, 11)], [bn4.weight -> (256)], [bn4.bias -> (256)], [bn4.running_mean -> (256)], [bn4.running_var -> (256)],
[TRT] ImporterContext.hpp:122: Registering layer: BatchNormalization_15 for ONNX node: BatchNormalization_15
[TRT] ImporterContext.hpp:97: Registering tensor: 64 for ONNX tensor: 64
[TRT] ModelImporter.cpp:182: BatchNormalization_15 [BatchNormalization] outputs: [64 -> (1, 256, 11, 11)],
[TRT] ModelImporter.cpp:103: Parsing node: Conv_16 [Conv]
[TRT] ModelImporter.cpp:119: Searching for input: 64
[TRT] ModelImporter.cpp:119: Searching for input: conv5.weight
[TRT] ModelImporter.cpp:119: Searching for input: conv5.bias
[TRT] ModelImporter.cpp:125: Conv_16 [Conv] inputs: [64 -> (1, 256, 11, 11)], [conv5.weight -> (512, 256, 3, 3)], [conv5.bias -> (512)],
[TRT] builtin_op_importers.cpp:446: Convolution input dimensions: (1, 256, 11, 11)
[TRT] builtin_op_importers.cpp:528: Using kernel: (3, 3), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 512
[TRT] builtin_op_importers.cpp:529: Convolution output dimensions: (1, 512, 13, 13)
[TRT] ImporterContext.hpp:122: Registering layer: Conv_16 for ONNX node: Conv_16
[TRT] ImporterContext.hpp:97: Registering tensor: 65 for ONNX tensor: 65
[TRT] ModelImporter.cpp:182: Conv_16 [Conv] outputs: [65 -> (1, 512, 13, 13)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_17 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 65
[TRT] ModelImporter.cpp:125: Relu_17 [Relu] inputs: [65 -> (1, 512, 13, 13)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_17 for ONNX node: Relu_17
[TRT] ImporterContext.hpp:97: Registering tensor: 66 for ONNX tensor: 66
[TRT] ModelImporter.cpp:182: Relu_17 [Relu] outputs: [66 -> (1, 512, 13, 13)],
[TRT] ModelImporter.cpp:103: Parsing node: MaxPool_18 [MaxPool]
[TRT] ModelImporter.cpp:119: Searching for input: 66
[TRT] ModelImporter.cpp:125: MaxPool_18 [MaxPool] inputs: [66 -> (1, 512, 13, 13)],
[TRT] ImporterContext.hpp:122: Registering layer: MaxPool_18 for ONNX node: MaxPool_18
[TRT] ImporterContext.hpp:97: Registering tensor: 67 for ONNX tensor: 67
[TRT] ModelImporter.cpp:182: MaxPool_18 [MaxPool] outputs: [67 -> (1, 512, 6, 6)],
[TRT] ModelImporter.cpp:103: Parsing node: Constant_19 [Constant]
[TRT] ModelImporter.cpp:125: Constant_19 [Constant] inputs:
[TRT] onnx2trt_utils.cpp:217: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TRT] ModelImporter.cpp:182: Constant_19 [Constant] outputs: [68 -> (2)],
[TRT] ModelImporter.cpp:103: Parsing node: Reshape_20 [Reshape]
[TRT] ModelImporter.cpp:119: Searching for input: 67
[TRT] ModelImporter.cpp:119: Searching for input: 68
[TRT] ModelImporter.cpp:125: Reshape_20 [Reshape] inputs: [67 -> (1, 512, 6, 6)], [68 -> (2)],
[TRT] ImporterContext.hpp:122: Registering layer: Reshape_20 for ONNX node: Reshape_20
[TRT] ImporterContext.hpp:97: Registering tensor: 69 for ONNX tensor: 69
[TRT] ModelImporter.cpp:182: Reshape_20 [Reshape] outputs: [69 -> (1, 18432)],
[TRT] ModelImporter.cpp:103: Parsing node: Gemm_21 [Gemm]
[TRT] ModelImporter.cpp:119: Searching for input: 69
[TRT] ModelImporter.cpp:119: Searching for input: fc1.weight
[TRT] ModelImporter.cpp:119: Searching for input: fc1.bias
[TRT] ModelImporter.cpp:125: Gemm_21 [Gemm] inputs: [69 -> (1, 18432)], [fc1.weight -> (512, 18432)], [fc1.bias -> (512)],
[TRT] builtin_op_importers.cpp:1067: Using opA: 0 opB: 0
[TRT] builtin_op_importers.cpp:1068: GEMM: A, after squeezing: (1, 18432)
[TRT] ImporterContext.hpp:122: Registering layer: Gemm_21 for ONNX node: Gemm_21
[TRT] ImporterContext.hpp:97: Registering tensor: 70 for ONNX tensor: 70
[TRT] ModelImporter.cpp:182: Gemm_21 [Gemm] outputs: [70 -> (1, 512)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_22 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 70
[TRT] ModelImporter.cpp:125: Relu_22 [Relu] inputs: [70 -> (1, 512)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_22 for ONNX node: Relu_22
[TRT] ImporterContext.hpp:97: Registering tensor: 71 for ONNX tensor: 71
[TRT] ModelImporter.cpp:182: Relu_22 [Relu] outputs: [71 -> (1, 512)],
[TRT] ModelImporter.cpp:103: Parsing node: Gemm_23 [Gemm]
[TRT] ModelImporter.cpp:119: Searching for input: 71
[TRT] ModelImporter.cpp:119: Searching for input: fc2.weight
[TRT] ModelImporter.cpp:119: Searching for input: fc2.bias
[TRT] ModelImporter.cpp:125: Gemm_23 [Gemm] inputs: [71 -> (1, 512)], [fc2.weight -> (100, 512)], [fc2.bias -> (100)],
[TRT] builtin_op_importers.cpp:1067: Using opA: 0 opB: 0
[TRT] builtin_op_importers.cpp:1068: GEMM: A, after squeezing: (1, 512)
[TRT] ImporterContext.hpp:122: Registering layer: Gemm_23 for ONNX node: Gemm_23
[TRT] ImporterContext.hpp:97: Registering tensor: 72 for ONNX tensor: 72
[TRT] ModelImporter.cpp:182: Gemm_23 [Gemm] outputs: [72 -> (1, 100)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_24 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 72
[TRT] ModelImporter.cpp:125: Relu_24 [Relu] inputs: [72 -> (1, 100)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_24 for ONNX node: Relu_24
[TRT] ImporterContext.hpp:97: Registering tensor: 73 for ONNX tensor: 73
[TRT] ModelImporter.cpp:182: Relu_24 [Relu] outputs: [73 -> (1, 100)],
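The Constant_19/Reshape_20 pair above is the exported form of the flatten between MaxPool_18 and fc1: PyTorch emits the target shape as an INT64 constant, and the onnx2trt_utils.cpp:217 message simply records that TensorRT 7 casts it down to INT32 (informational, not an error). The flattened size also checks out against the imported fc1 weights; a quick sketch:

    # Sketch: the flatten that Constant_19 + Reshape_20 encode.
    # In the original module this is typically written as
    #   x = x.view(x.size(0), -1)   # exports as Constant(int64 shape) + Reshape
    channels, h, w = 512, 6, 6          # MaxPool_18 output: (1, 512, 6, 6)
    assert channels * h * w == 18432    # Reshape_20 output: (1, 18432),
                                        # matching fc1.weight -> (512, 18432)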
[TRT] ModelImporter.cpp:103: Parsing node: Gemm_25 [Gemm]
[TRT] ModelImporter.cpp:119: Searching for input: 73
[TRT] ModelImporter.cpp:119: Searching for input: fc3.weight
[TRT] ModelImporter.cpp:119: Searching for input: fc3.bias
[TRT] ModelImporter.cpp:125: Gemm_25 [Gemm] inputs: [73 -> (1, 100)], [fc3.weight -> (2, 100)], [fc3.bias -> (2)],
[TRT] builtin_op_importers.cpp:1067: Using opA: 0 opB: 0
[TRT] builtin_op_importers.cpp:1068: GEMM: A, after squeezing: (1, 100)
[TRT] ImporterContext.hpp:122: Registering layer: Gemm_25 for ONNX node: Gemm_25
[TRT] ImporterContext.hpp:97: Registering tensor: 74 for ONNX tensor: 74
[TRT] ModelImporter.cpp:182: Gemm_25 [Gemm] outputs: [74 -> (1, 2)],
[TRT] ModelImporter.cpp:103: Parsing node: Relu_26 [Relu]
[TRT] ModelImporter.cpp:119: Searching for input: 74
[TRT] ModelImporter.cpp:125: Relu_26 [Relu] inputs: [74 -> (1, 2)],
[TRT] ImporterContext.hpp:122: Registering layer: Relu_26 for ONNX node: Relu_26
[TRT] ImporterContext.hpp:97: Registering tensor: 75 for ONNX tensor: 75
[TRT] ModelImporter.cpp:182: Relu_26 [Relu] outputs: [75 -> (1, 2)],
[TRT] ModelImporter.cpp:103: Parsing node: Softmax_27 [Softmax]
[TRT] ModelImporter.cpp:119: Searching for input: 75
[TRT] ModelImporter.cpp:125: Softmax_27 [Softmax] inputs: [75 -> (1, 2)],
[TRT] ImporterContext.hpp:122: Registering layer: Softmax_27 for ONNX node: Softmax_27
[TRT] ImporterContext.hpp:97: Registering tensor: 76_1 for ONNX tensor: 76
[TRT] ModelImporter.cpp:182: Softmax_27 [Softmax] outputs: [76 -> (1, 2)],
[TRT] ModelImporter.cpp:494: Marking 76_1 as output: 76
[TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
[TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
[TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
----- Parsing of ONNX model networks/h_classifier/h_classifier.onnx is Done ----
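Read end to end, the parsed nodes describe a small VGG-style binary classifier: five conv/ReLU/max-pool stages (the first four followed by BatchNorm), a flatten, and three fully connected layers finishing in ReLU and a two-class Softmax. A hedged PyTorch reconstruction, with layer names taken from the imported initializers and hyperparameters inferred from the logged shapes (a sketch, not the author's source):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HClassifier(nn.Module):
        """Reconstructed from the parsed ONNX graph above; a sketch only."""
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 32, 3, padding=2)     # Conv_0:  150 -> 152
            self.conv2 = nn.Conv2d(32, 64, 3, padding=2)    # Conv_4:   76 -> 78
            self.conv3 = nn.Conv2d(64, 128, 3, padding=2)   # Conv_8:   39 -> 41
            self.conv4 = nn.Conv2d(128, 256, 3, padding=2)  # Conv_12:  20 -> 22
            self.conv5 = nn.Conv2d(256, 512, 3, padding=2)  # Conv_16:  11 -> 13
            self.bn1, self.bn2 = nn.BatchNorm2d(32), nn.BatchNorm2d(64)
            self.bn3, self.bn4 = nn.BatchNorm2d(128), nn.BatchNorm2d(256)
            self.pool = nn.MaxPool2d(2, 2)
            self.fc1 = nn.Linear(512 * 6 * 6, 512)          # Gemm_21: 18432 -> 512
            self.fc2 = nn.Linear(512, 100)                  # Gemm_23
            self.fc3 = nn.Linear(100, 2)                    # Gemm_25: two classes

        def forward(self, x):
            x = self.bn1(self.pool(F.relu(self.conv1(x))))  # BatchNorm after the pool,
            x = self.bn2(self.pool(F.relu(self.conv2(x))))  # matching the node order above
            x = self.bn3(self.pool(F.relu(self.conv3(x))))
            x = self.bn4(self.pool(F.relu(self.conv4(x))))
            x = self.pool(F.relu(self.conv5(x)))            # conv5 has no BatchNorm
            x = x.view(x.size(0), -1)                       # Constant_19 + Reshape_20
            x = F.relu(self.fc1(x))                         # Relu_22
            x = F.relu(self.fc2(x))                         # Relu_24
            x = F.relu(self.fc3(x))                         # Relu_26
            return F.softmax(x, dim=1)                      # Softmax_27

The ReLU directly before the Softmax is unusual (it clamps negative logits to zero before normalization), but it is exactly what the Relu_26/Softmax_27 nodes encode.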
[TRT] device GPU, configuring CUDA engine
[TRT] device GPU, building FP16: ON
[TRT] device GPU, building INT8: OFF
[TRT] device GPU, building CUDA engine (this may take a few minutes the first time a network is loaded)
[TRT] Applying generic optimizations to the graph for inference.
[TRT] Original: 41 layers
[TRT] After dead-layer removal: 41 layers
[TRT] Fusing (Unnamed Layer* 23) [Constant] with (Unnamed Layer* 24) [Shuffle]
[TRT] Fusing (Unnamed Layer* 30) [Constant] with (Unnamed Layer* 31) [Shuffle]
[TRT] Fusing (Unnamed Layer* 37) [Constant] with (Unnamed Layer* 38) [Shuffle]
[TRT] Removing Softmax_27
[TRT] Removing (Unnamed Layer* 43) [Shuffle]
[TRT] After Myelin optimization: 36 layers
[TRT] After scale fusion: 36 layers
[TRT] Fusing Conv_0 with Relu_1
[TRT] Fusing Conv_4 with Relu_5
[TRT] Fusing Conv_8 with Relu_9
[TRT] Fusing Conv_12 with Relu_13
[TRT] Fusing Conv_16 with Relu_17
[TRT] Fusing (Unnamed Layer* 25) [ElementWise] with Relu_22
[TRT] Fusing (Unnamed Layer* 32) [ElementWise] with Relu_24
[TRT] Fusing (Unnamed Layer* 39) [ElementWise] with Relu_26
[TRT] After vertical fusions: 28 layers
[TRT] After final dead-layer removal: 28 layers
[TRT] After tensor merging: 28 layers
[TRT] After concat removal: 28 layers
[TRT] Graph construction and optimization completed in 0.0117134 seconds.
[TRT] Constructing optimization profile number 0 [1/1].
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.203281
[TRT] Tactic: 0 time 0.426224
[TRT] Fastest Tactic: 1002 Time: 0.203281
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 2.26687
[TRT] Tactic: 0 time 0.152213
[TRT] Fastest Tactic: 0 Time: 0.152213
[TRT] *************** Autotuning format combination: Float(1,150,22500,67500) -> Float(1,152,23104,739328) ***************
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Tactic: 1062367460111450758 time 2.42177
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Tactic: 3827454225649558724 time 2.44651
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Tactic: 4337000649858996379 time 1.60622
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Tactic: 4501471010995462441 time 3.11177
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Tactic: 5137655947464784826 time 1.56437
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 5921334924264294896 time 1.66977
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Tactic: 6645123197870846056 time 1.57643
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Tactic: 7852627285308570038 time 2.59016
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Tactic: -9137461792520977713 time 3.14052
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Tactic: -8776506421218919509 time 2.10018
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Tactic: -6092040395344634144 time 0.949219
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Tactic: -3456450830548107839 time 0.852891
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Tactic: -2318106587342035239 time 2.20391
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Tactic: -1343271414618805657 time 1.42659
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Tactic: -410470605513481746 time 2.66724
[TRT] Fastest Tactic: -3456450830548107839 Time: 0.852891
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CudaConvolution)
[TRT] Tactic: 0 time 9.19393
[TRT] Tactic: 2 time 9.18672
[TRT] Tactic: 4 skipped. Scratch requested: 52313600, available: 16777216
[TRT] Tactic: 5 time 23.6289
[TRT] Tactic: 57 time 7.91229
[TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[TRT] Fastest Tactic: 57 Time: 7.91229
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3456450830548107839
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT]
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Conv_0 + Relu_1 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
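In the CudaConvolution timings above, "Tactic: 4 skipped. Scratch requested: 52313600, available: 16777216" means one cuDNN tactic wanted about 52 MB of scratch while the builder's workspace was capped at 16 MiB; that cap is also what the "Increasing workspace size may increase performance" hint refers to. jetson-inference sets this limit internally, but when building the same engine directly against the TensorRT 7 Python API, the workspace and FP16 mode are two lines on the builder config. A sketch under those assumptions (paths taken from this log):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open("networks/h_classifier/h_classifier.onnx", "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)   # the "building FP16: ON" path above
    config.max_workspace_size = 64 << 20    # 64 MiB instead of the 16 MiB in this log

    engine = builder.build_engine(network, config)
    with open("h_classifier.onnx.1.1.GPU.FP16.engine", "wb") as f:
        f.write(engine.serialize())         # same idea as the engine cache file above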
[TRT] *************** Autotuning format combination: Half(1,150,22500,67500) -> Half(1,152,23104,739328) ***************
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
[TRT] CaskConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CudaConvolution)
[TRT] Tactic: 0 time 2.8806
[TRT] Tactic: 1 time 3.43776
[TRT] Tactic: 2 time 2.77169
[TRT] Tactic: 4 skipped. Scratch requested: 52313600, available: 16777216
[TRT] Tactic: 5 time 23.2561
[TRT] Fastest Tactic: 2 Time: 2.77169
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 2
[TRT]
[TRT] *************** Autotuning format combination: Half(1,150,22500:2,45000) -> Half(1,152,23104:2,369664) ***************
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Tactic: 3564772625446233998 time 0.470912
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Tactic: 3650389455493082349 time 0.485807
[TRT] Conv_0 + Relu_1 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 4772821744921268633 time 0.80638
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Tactic: 5319956359050645452 time 0.438359
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Tactic: 7205456024582378848 time 0.6875
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Tactic: -6490690591794140522 time 0.700443
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Tactic: -4686027666808657977 time 1.36534
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Tactic: -4212163711445252890 time 1.31333
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Tactic: -3898373634979201110 time 1.37378
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] Tactic: -2409163523992614473 time 0.660312
[TRT] Fastest Tactic: 5319956359050645452 Time: 0.438359
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CudaConvolution)
[TRT] CudaConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5319956359050645452
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT]
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.443464
[TRT] Tactic: 0 time 0.747864
[TRT] Fastest Tactic: 1002 Time: 0.443464
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 1.59268
[TRT] Tactic: 0 time 0.598516
[TRT] Fastest Tactic: 0 Time: 0.598516
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.463046
[TRT] Tactic: 0 time 0.638932
[TRT] Fastest Tactic: 1002 Time: 0.463046
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 1.59242
[TRT] Tactic: 0 time 0.593255
[TRT] Fastest Tactic: 0 Time: 0.593255
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 1.86437
[TRT] Tactic: 0 time 0.549531
[TRT] Fastest Tactic: 0 Time: 0.549531
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 1.86932
[TRT] Tactic: 0 time 0.538672
[TRT] Fastest Tactic: 0 Time: 0.538672
[TRT] *************** Autotuning format combination: Float(1,152,23104,739328) -> Float(1,76,5776,184832) ***************
[TRT] --------------- Timing Runner: MaxPool_2 (Pooling)
[TRT] Tactic: -1 time 0.327292
[TRT] Fastest Tactic: -1 Time: 0.327292
[TRT] --------------- Timing Runner: MaxPool_2 (TiledPooling)
[TRT] Tactic: 5505281 time 0.741302
[TRT] Tactic: 5570817 time 0.449739
[TRT] Tactic: 5636353 time 0.36612
[TRT] Tactic: 5701889 time 0.31
[TRT] Tactic: 5767425 time 0.302604
[TRT] Tactic: 5832961 time 0.299271
[TRT] Tactic: 5898497 time 0.284349
[TRT] Tactic: 5964033 time 0.275599
[TRT] Tactic: 6029569 time 0.632005
[TRT] Tactic: 6095105 time 0.392291
[TRT] Tactic: 6160641 time 0.317995
[TRT] Tactic: 6226177 time 0.297343
[TRT] Tactic: 6291713 time 0.300573
[TRT] Tactic: 6357249 time 0.300703
[TRT] Tactic: 6422785 time 0.299322
[TRT] Tactic: 6488321 time 0.300469
[TRT] Fastest Tactic: 5964033 Time: 0.275599
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 5964033
[TRT]
[TRT] *************** Autotuning format combination: Half(1,152,23104,739328) -> Half(1,76,5776,184832) ***************
[TRT] --------------- Timing Runner: MaxPool_2 (Pooling)
[TRT] Tactic: -1 time 0.317526
[TRT] Fastest Tactic: -1 Time: 0.317526
[TRT] --------------- Timing Runner: MaxPool_2 (TiledPooling)
[TRT] TiledPooling has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,152,23104:2,369664) -> Half(1,76,5776:2,92416) ***************
[TRT] --------------- Timing Runner: MaxPool_2 (Pooling)
[TRT] Tactic: -3 time 0.204583
[TRT] Fastest Tactic: -3 Time: 0.204583
[TRT] --------------- Timing Runner: MaxPool_2 (TiledPooling)
[TRT] Tactic: 5505281 time 0.391224
[TRT] Tactic: 5570817 time 0.237474
[TRT] Tactic: 5636353 time 0.209062
[TRT] Tactic: 5701889 time 0.166692
[TRT] Tactic: 5767425 time 0.183724
[TRT] Tactic: 5832961 time 0.155208
[TRT] Tactic: 5898497 time 0.166276
[TRT] Tactic: 5964033 time 0.145182
[TRT] Tactic: 6029569 time 0.327916
[TRT] Tactic: 6095105 time 0.20711
[TRT] Tactic: 6160641 time 0.176744
[TRT] Tactic: 6226177 time 0.147031
[TRT] Tactic: 6291713 time 0.154895
[TRT] Tactic: 6357249 time 0.148334
[TRT] Tactic: 6422785 time 0.153021
[TRT] Tactic: 6488321 time 0.151276
[TRT] Fastest Tactic: 5964033 Time: 0.145182
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 5964033
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.119531
[TRT] Tactic: 0 time 0.194297
[TRT] Fastest Tactic: 1002 Time: 0.119531
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.415703
[TRT] Tactic: 0 time 0.155001
[TRT] Fastest Tactic: 0 Time: 0.155001
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.119609
[TRT] Tactic: 0 time 0.164193
[TRT] Fastest Tactic: 1002 Time: 0.119609
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.409453
[TRT] Tactic: 0 time 0.153489
[TRT] Fastest Tactic: 0 Time: 0.153489
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.475547
[TRT] Tactic: 0 time 0.141302
[TRT] Fastest Tactic: 0 Time: 0.141302
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.480573
[TRT] Tactic: 0 time 0.139843
[TRT] Fastest Tactic: 0 Time: 0.139843
[TRT] *************** Autotuning format combination: Float(1,76,5776,184832) -> Float(1,76,5776,184832) ***************
[TRT] --------------- Timing Runner: BatchNormalization_3 (Scale)
[TRT] Tactic: 0 time 0.122604
[TRT] Fastest Tactic: 0 Time: 0.122604
[TRT] *************** Autotuning format combination: Half(1,76,5776,184832) -> Half(1,76,5776,184832) ***************
[TRT] --------------- Timing Runner: BatchNormalization_3 (Scale)
[TRT] Tactic: 0 time 0.112292
[TRT] Fastest Tactic: 0 Time: 0.112292
[TRT] *************** Autotuning format combination: Half(1,76,5776:2,92416) -> Half(1,76,5776:2,92416) ***************
[TRT] --------------- Timing Runner: BatchNormalization_3 (Scale)
[TRT] Tactic: 0 time 0.13974
[TRT] Fastest Tactic: 0 Time: 0.13974
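Note that BatchNormalization_3 is timed as a Scale layer: at inference time batch norm is just a per-channel affine transform, y = scale * x + shift with scale = gamma / sqrt(var + eps) and shift = beta - mean * scale, so the importer folds the four 32-element bn1 initializers into a single scale node. A numpy sketch of that folding (random values for illustration):

    import numpy as np

    def fold_batchnorm(gamma, beta, mean, var, eps=1e-5):
        # Inference-time BatchNorm collapses to a per-channel scale and shift,
        # which is why TensorRT times BatchNormalization_3 as a Scale layer.
        scale = gamma / np.sqrt(var + eps)
        shift = beta - mean * scale
        return scale, shift

    rng = np.random.default_rng(0)
    gamma, beta = rng.standard_normal(32), rng.standard_normal(32)  # bn1.weight, bn1.bias
    mean, var = rng.standard_normal(32), rng.random(32) + 0.1       # bn1.running_mean/var

    scale, shift = fold_batchnorm(gamma, beta, mean, var)
    x = rng.standard_normal((1, 32, 76, 76))                        # MaxPool_2 output shape
    y = scale[:, None, None] * x + shift[:, None, None]             # the folded Scale layer

    ref = gamma[:, None, None] * (x - mean[:, None, None]) / \
          np.sqrt(var[:, None, None] + 1e-5) + beta[:, None, None]  # textbook BatchNorm
    assert np.allclose(y, ref)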
[TRT] *************** Autotuning format combination: Float(1,76,5776,184832) -> Float(1,78,6084,389376) ***************
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CaskConvolution)
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Tactic: 1062367460111450758 time 1.61255
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Tactic: 3827454225649558724 time 1.52029
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Tactic: 4337000649858996379 time 1.29857
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Tactic: 4501471010995462441 time 2.51073
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Tactic: 5137655947464784826 time 1.2299
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 5921334924264294896 time 1.07982
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Tactic: 6645123197870846056 time 1.26424
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Tactic: 7852627285308570038 time 1.54951
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Tactic: -9137461792520977713 time 2.54422
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Tactic: -8776506421218919509 time 1.50969
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Tactic: -6092040395344634144 time 1.66747
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Tactic: -3456450830548107839 time 1.4306
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Tactic: -2318106587342035239 time 1.52925
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Tactic: -1343271414618805657 time 1.00883
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Tactic: -410470605513481746 time 2.40565
[TRT] Fastest Tactic: -1343271414618805657 Time: 1.00883
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CudaConvolution)
[TRT] Tactic: 0 time 4.38318
[TRT] Tactic: 2 time 4.32161
[TRT] Tactic: 4 skipped. Scratch requested: 276963328, available: 16777216
[TRT] Tactic: 5 time 17.9283
[TRT] Tactic: 57 time 4.38286
[TRT] Fastest Tactic: 2 Time: 4.32161
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -1343271414618805657
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT]
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Conv_4 + Relu_5 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] *************** Autotuning format combination: Half(1,76,5776,184832) -> Half(1,78,6084,389376) ***************
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CaskConvolution)
[TRT] CaskConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CudaConvolution)
[TRT] Tactic: 0 time 4.38018
[TRT] Tactic: 1 time 4.64662
[TRT] Tactic: 2 time 4.07302
[TRT] Tactic: 4 skipped. Scratch requested: 276963328, available: 16777216
[TRT] Tactic: 5 time 17.7078
[TRT] Fastest Tactic: 2 Time: 4.07302
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 2
[TRT]
[TRT] *************** Autotuning format combination: Half(1,76,5776:2,92416) -> Half(1,78,6084:2,194688) ***************
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CaskConvolution)
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Tactic: 3564772625446233998 time 0.864349
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Tactic: 3650389455493082349 time 0.886433
[TRT] Conv_4 + Relu_5 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 4772821744921268633 time 0.597422
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Tactic: 5319956359050645452 time 0.769349
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Tactic: 7205456024582378848 time 0.668646
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Tactic: -6490690591794140522 time 0.680911
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Tactic: -4686027666808657977 time 1.31362
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Tactic: -4212163711445252890 time 1.255
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Tactic: -3898373634979201110 time 1.30357
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] Tactic: -2409163523992614473 time 0.638307
[TRT] Fastest Tactic: 4772821744921268633 Time: 0.597422
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CudaConvolution)
[TRT] CudaConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_4 + Relu_5 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 4772821744921268633
[TRT] Conv_4 + Relu_5 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT]
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.239896
[TRT] Tactic: 0 time 0.399427
[TRT] Fastest Tactic: 1002 Time: 0.239896
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.853099
[TRT] Tactic: 0 time 0.318386
[TRT] Fastest Tactic: 0 Time: 0.318386
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.247734
[TRT] Tactic: 0 time 0.340495
[TRT] Fastest Tactic: 1002 Time: 0.247734
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.847526
[TRT] Tactic: 0 time 0.315391
[TRT] Fastest Tactic: 0 Time: 0.315391
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.990755
[TRT] Tactic: 0 time 0.291381
[TRT] Fastest Tactic: 0 Time: 0.291381
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.993776
[TRT] Tactic: 0 time 0.286823
[TRT] Fastest Tactic: 0 Time: 0.286823
[TRT] *************** Autotuning format combination: Float(1,78,6084,389376) -> Float(1,39,1521,97344) ***************
[TRT] --------------- Timing Runner: MaxPool_6 (Pooling)
[TRT] Tactic: -1 time 0.17362
[TRT] Fastest Tactic: -1 Time: 0.17362
[TRT] --------------- Timing Runner: MaxPool_6 (TiledPooling)
[TRT] Tactic: 5505281 time 0.51052
[TRT] Tactic: 5570817 time 0.30836
[TRT] Tactic: 5636353 time 0.248775
[TRT] Tactic: 5701889 time 0.209714
[TRT] Tactic: 5767425 time 0.197656
[TRT] Tactic: 5832961 time 0.192917
[TRT] Tactic: 5898497 time 0.192578
[TRT] Tactic: 5964033 time 0.184296
[TRT] Tactic: 6029569 time 0.395261
[TRT] Tactic: 6095105 time 0.249844
[TRT] Tactic: 6160641 time 0.20664
[TRT] Tactic: 6226177 time 0.196641
[TRT] Tactic: 6291713 time 0.194479
[TRT] Tactic: 6357249 time 0.19052
[TRT] Tactic: 6422785 time 0.19737
[TRT] Tactic: 6488321 time 0.196406
[TRT] Fastest Tactic: 5964033 Time: 0.184296
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,78,6084,389376) -> Half(1,39,1521,97344) ***************
[TRT] --------------- Timing Runner: MaxPool_6 (Pooling)
[TRT] Tactic: -1 time 0.168723
[TRT] Fastest Tactic: -1 Time: 0.168723
[TRT] --------------- Timing Runner: MaxPool_6 (TiledPooling)
[TRT] TiledPooling has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1
[TRT]
[TRT] *************** Autotuning format combination: Half(1,78,6084:2,194688) -> Half(1,39,1521:2,48672) ***************
[TRT] --------------- Timing Runner: MaxPool_6 (Pooling)
[TRT] Tactic: -3 time 0.111953
[TRT] Fastest Tactic: -3 Time: 0.111953
[TRT] --------------- Timing Runner: MaxPool_6 (TiledPooling)
[TRT] Tactic: 5505281 time 0.272188
[TRT] Tactic: 5570817 time 0.165755
[TRT] Tactic: 5636353 time 0.136588
[TRT] Tactic: 5701889 time 0.116536
[TRT] Tactic: 5767425 time 0.113958
[TRT] Tactic: 5832961 time 0.110521
[TRT] Tactic: 5898497 time 0.107526
[TRT] Tactic: 5964033 time 0.101641
[TRT] Tactic: 6029569 time 0.208307
[TRT] Tactic: 6095105 time 0.133177
[TRT] Tactic: 6160641 time 0.110625
[TRT] Tactic: 6226177 time 0.10237
[TRT] Tactic: 6291713 time 0.101615
[TRT] Tactic: 6357249 time 0.100808
[TRT] Tactic: 6422785 time 0.101693
[TRT] Tactic: 6488321 time 0.101094
[TRT] Fastest Tactic: 6357249 Time: 0.100808
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 6357249
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.064349
[TRT] Tactic: 0 time 0.105938
[TRT] Fastest Tactic: 1002 Time: 0.064349
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.218958
[TRT] Tactic: 0 time 0.084089
[TRT] Fastest Tactic: 0 Time: 0.084089
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.065677
[TRT] Tactic: 0 time 0.0904425
[TRT] Fastest Tactic: 1002 Time: 0.065677
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.22164
[TRT] Tactic: 0 time 0.0832035
[TRT] Fastest Tactic: 0 Time: 0.0832035
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.263828
[TRT] Tactic: 0 time 0.076979
[TRT] Fastest Tactic: 0 Time: 0.076979
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.264271
[TRT] Tactic: 0 time 0.075833
[TRT] Fastest Tactic: 0 Time: 0.075833
[TRT] *************** Autotuning format combination: Float(1,39,1521,97344) -> Float(1,39,1521,97344) ***************
[TRT] --------------- Timing Runner: BatchNormalization_7 (Scale)
[TRT] Tactic: 0 time 0.0670575
[TRT] Fastest Tactic: 0 Time: 0.0670575
[TRT] *************** Autotuning format combination: Half(1,39,1521,97344) -> Half(1,39,1521,97344) ***************
[TRT] --------------- Timing Runner: BatchNormalization_7 (Scale)
[TRT] Tactic: 0 time 0.0619015
[TRT] Fastest Tactic: 0 Time: 0.0619015
[TRT] *************** Autotuning format combination: Half(1,39,1521:2,48672) -> Half(1,39,1521:2,48672) ***************
[TRT] --------------- Timing Runner: BatchNormalization_7 (Scale)
[TRT] Tactic: 0 time 0.0771355
[TRT] Fastest Tactic: 0 Time: 0.0771355
[TRT] *************** Autotuning format combination: Float(1,39,1521,97344) -> Float(1,41,1681,215168) ***************
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CaskConvolution)
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Tactic: 1062367460111450758 time 1.84461
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Tactic: 3827454225649558724 time 1.73635
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Tactic: 4337000649858996379 time 1.46708
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Tactic: 4501471010995462441 time 1.42583
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Tactic: 5137655947464784826 time 1.37099
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Tactic: 5921334924264294896 time 1.30016
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Tactic: 6645123197870846056 time 1.44344
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Tactic: 7852627285308570038 time 1.74039
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Tactic: -9137461792520977713 time 1.42865
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Tactic: -8776506421218919509 time 1.69867
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Tactic: -6092040395344634144 time 1.91974
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Tactic: -3456450830548107839 time 1.56372
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Tactic: -2318106587342035239 time 1.71977
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Tactic: -1343271414618805657 time 1.18588
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Tactic: -410470605513481746 time 1.3576
[TRT] Fastest Tactic: -1343271414618805657 Time: 1.18588
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CudaConvolution)
[TRT] Tactic: 0 time 2.58245
[TRT] Tactic: 2 time 2.56112
[TRT] Tactic: 4 skipped. Scratch requested: 279281664, available: 16777216
[TRT] Tactic: 5 skipped. Scratch requested: 37322752, available: 16777216
[TRT] Tactic: 57 time 3.09094
[TRT] Fastest Tactic: 2 Time: 2.56112
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CudaDepthwiseConvolution)
[TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -1343271414618805657
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT]
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] Conv_8 + Relu_9 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1
[TRT] Conv_8 + Relu_9 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0
[TRT] *************** Autotuning format combination: Half(1,39,1521,97344) -> Half(1,41,1681,215168) ***************
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (FusedConvActConvolution)
[TRT] FusedConvActConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CaskConvolution)
[TRT] CaskConvolution has no valid tactics for this config, skipping
[TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CudaConvolution)
[TRT] Tactic: 0 time 2.4763
[TRT] Tactic: 1 time 2.94534
[TRT] Tactic: 2 time 2.37289
[TRT] Tactic: 4 skipped. Scratch requested: 279281664, available: 16777216
[TRT] Tactic: 5 skipped.
Scratch requested: 37322752, available: 16777216 [TRT] Fastest Tactic: 2 Time: 2.37289 [TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 2 [TRT] [TRT] *************** Autotuning format combination: Half(1,39,1521:2,48672) -> Half(1,41,1681:2,107584) *************** [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] --------------- Timing Runner: Conv_8 + Relu_9 (FusedConvActConvolution) [TRT] FusedConvActConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CaskConvolution) [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Tactic: 3564772625446233998 time 0.969687 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Tactic: 3650389455493082349 time 0.991797 [TRT] Conv_8 + Relu_9 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Tactic: 4772821744921268633 time 0.705365 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Tactic: 5319956359050645452 time 0.826355 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Tactic: 7205456024582378848 time 0.746015 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Tactic: -6490690591794140522 time 0.755183 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Tactic: -4686027666808657977 time 0.732135 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Tactic: -4212163711445252890 time 0.691641 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Tactic: -3898373634979201110 time 0.729323 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] Tactic: -2409163523992614473 time 0.700312 [TRT] Fastest Tactic: -4212163711445252890 Time: 0.691641 [TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CudaConvolution) [TRT] CudaConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_8 + Relu_9 (CudaDepthwiseConvolution) [TRT] 
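The "Tactic: 4 skipped. Scratch requested: 279281664, available: 16777216" entries above mean those cuDNN algorithms were never even timed: they want roughly 266 MiB of scratch space, but this build only allows a 16 MiB workspace. Rebuilding with a larger workspace makes such tactics eligible, and they can sometimes win. A minimal sketch with the TensorRT 7 Python API (the 512 MiB figure is an arbitrary example; this is not the application's actual build code, which is C++ and not shown in the log):

    import tensorrt as trt

    # Sketch: give the autotuner a bigger scratch workspace than the 16 MiB
    # ("available: 16777216") seen in this build. The size is an example value.
    logger  = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(logger)
    config  = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)    # this build selected FP16 kernels
    config.max_workspace_size = 512 << 20    # 512 MiB

The workspace is only scratch memory for layer temporaries, so the trade-off is memory pressure at build and run time versus access to the scratch-hungry tactics.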
CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -4212163711445252890 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.134375 [TRT] Tactic: 0 time 0.224323 [TRT] Fastest Tactic: 1002 Time: 0.134375 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.488854 [TRT] Tactic: 0 time 0.17901 [TRT] Fastest Tactic: 0 Time: 0.17901 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.13724 [TRT] Tactic: 0 time 0.190235 [TRT] Fastest Tactic: 1002 Time: 0.13724 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.480937 [TRT] Tactic: 0 time 0.177786 [TRT] Fastest Tactic: 0 Time: 0.177786 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.593699 [TRT] Tactic: 0 time 0.163698 [TRT] Fastest Tactic: 0 Time: 0.163698 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.591719 [TRT] Tactic: 0 time 0.161303 [TRT] Fastest Tactic: 0 Time: 0.161303 [TRT] *************** Autotuning format combination: Float(1,41,1681,215168) -> Float(1,20,400,51200) *************** [TRT] --------------- Timing Runner: MaxPool_10 (Pooling) [TRT] Tactic: -1 time 0.107682 [TRT] Fastest Tactic: -1 Time: 0.107682 [TRT] --------------- Timing Runner: MaxPool_10 (TiledPooling) [TRT] Tactic: 5505281 time 0.258438 [TRT] Tactic: 5570817 time 0.152916 [TRT] Tactic: 5636353 time 0.119479 [TRT] Tactic: 5701889 time 0.101379 [TRT] Tactic: 5767425 time 0.0933595 [TRT] Tactic: 5832961 time 0.087109 [TRT] Tactic: 5898497 time 0.083255 [TRT] Tactic: 5964033 time 0.0798695 [TRT] Tactic: 6029569 time 0.257422 [TRT] Tactic: 6095105 time 0.155963 [TRT] Tactic: 6160641 time 0.124114 [TRT] Tactic: 6226177 time 0.109687 [TRT] Tactic: 6291713 time 0.106615 [TRT] Tactic: 6357249 time 0.102839 [TRT] Tactic: 6422785 time 0.102761 [TRT] Tactic: 6488321 time 0.101771 [TRT] Fastest Tactic: 5964033 Time: 0.0798695 [TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 5964033 [TRT] [TRT] *************** Autotuning format combination: Half(1,41,1681,215168) -> Half(1,20,400,51200) *************** [TRT] --------------- Timing Runner: MaxPool_10 (Pooling) [TRT] Tactic: -1 time 0.104011 [TRT] Fastest Tactic: 
-1 Time: 0.104011 [TRT] --------------- Timing Runner: MaxPool_10 (TiledPooling) [TRT] TiledPooling has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1 [TRT] [TRT] *************** Autotuning format combination: Half(1,41,1681:2,107584) -> Half(1,20,400:2,25600) *************** [TRT] --------------- Timing Runner: MaxPool_10 (Pooling) [TRT] Tactic: -3 time 0.061875 [TRT] Fastest Tactic: -3 Time: 0.061875 [TRT] --------------- Timing Runner: MaxPool_10 (TiledPooling) [TRT] Tactic: 5505281 time 0.139193 [TRT] Tactic: 5570817 time 0.0835155 [TRT] Tactic: 5636353 time 0.068828 [TRT] Tactic: 5701889 time 0.0591145 [TRT] Tactic: 5767425 time 0.0540885 [TRT] Tactic: 5832961 time 0.053151 [TRT] Tactic: 5898497 time 0.0533335 [TRT] Tactic: 5964033 time 0.048932 [TRT] Tactic: 6029569 time 0.138464 [TRT] Tactic: 6095105 time 0.084766 [TRT] Tactic: 6160641 time 0.0703645 [TRT] Tactic: 6226177 time 0.060052 [TRT] Tactic: 6291713 time 0.0569795 [TRT] Tactic: 6357249 time 0.0566405 [TRT] Tactic: 6422785 time 0.0574475 [TRT] Tactic: 6488321 time 0.0544275 [TRT] Fastest Tactic: 5964033 Time: 0.048932 [TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 5964033 [TRT] [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.036615 [TRT] Tactic: 0 time 0.0591145 [TRT] Fastest Tactic: 1002 Time: 0.036615 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.121458 [TRT] Tactic: 0 time 0.0464585 [TRT] Fastest Tactic: 0 Time: 0.0464585 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0376565 [TRT] Tactic: 0 time 0.04862 [TRT] Fastest Tactic: 1002 Time: 0.0376565 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.113854 [TRT] Tactic: 0 time 0.0452085 [TRT] Fastest Tactic: 0 Time: 0.0452085 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.180678 [TRT] Tactic: 0 time 0.043177 [TRT] Fastest Tactic: 0 Time: 0.043177 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.181823 [TRT] Tactic: 0 time 0.0411455 [TRT] Fastest Tactic: 0 Time: 0.0411455 [TRT] *************** Autotuning format combination: Float(1,20,400,51200) -> Float(1,20,400,51200) *************** [TRT] --------------- Timing Runner: BatchNormalization_11 (Scale) [TRT] Tactic: 0 time 0.038489 [TRT] Fastest Tactic: 0 Time: 0.038489 [TRT] *************** Autotuning format combination: Half(1,20,400,51200) -> Half(1,20,400,51200) *************** [TRT] --------------- Timing Runner: BatchNormalization_11 (Scale) [TRT] Tactic: 0 time 0.0331775 [TRT] Fastest Tactic: 0 Time: 0.0331775 [TRT] *************** Autotuning format combination: Half(1,20,400:2,25600) -> Half(1,20,400:2,25600) *************** [TRT] --------------- Timing Runner: BatchNormalization_11 (Scale) [TRT] Tactic: 0 time 0.041172 [TRT] Fastest Tactic: 0 Time: 0.041172 [TRT] *************** Autotuning format combination: Float(1,20,400,51200) -> Float(1,22,484,123904) *************** [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic 
Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (FusedConvActConvolution) [TRT] Tactic: 524287 time 2.2999 [TRT] Tactic: 720895 time 1.94784 [TRT] Tactic: 983039 time 2.27773 [TRT] Tactic: 1048575 time 1.99724 [TRT] Tactic: 1703935 time 1.99625 [TRT] Tactic: 1769471 time 2.63417 [TRT] Tactic: 1966079 time 1.83701 [TRT] Tactic: 2031615 time 2.1538 [TRT] Tactic: 2228223 time 2.04737 [TRT] Tactic: 2424831 time 2.46513 [TRT] Tactic: 2621439 time 2.57648 [TRT] Tactic: 2752511 time 2.34542 [TRT] Tactic: 2818047 time 3.25055 [TRT] Tactic: 2883583 time 2.07073 [TRT] Tactic: 3014655 time 2.26826 [TRT] Tactic: 3145727 time 2.34417 [TRT] Tactic: 3473407 time 2.22073 [TRT] Tactic: 3604479 time 2.23458 [TRT] Tactic: 3735551 time 2.74195 [TRT] Tactic: 4390911 time 2.3218 [TRT] Tactic: 5046271 time 2.31935 [TRT] Tactic: 5963775 time 2.22734 [TRT] Tactic: 6160383 time 2.29706 [TRT] Tactic: 6488063 time 2.73143 [TRT] Tactic: 6881279 time 2.38161 [TRT] Tactic: 7274495 time 3.27552 [TRT] Tactic: 7864319 time 2.77518 [TRT] Tactic: 7995391 time 1.91771 [TRT] Tactic: 8585215 time 2.31401 [TRT] Tactic: 8847359 time 3.12518 [TRT] Tactic: 8978431 time 2.23229 [TRT] Tactic: 9043967 time 2.73065 [TRT] Tactic: 9175039 time 2.23169 [TRT] Tactic: 9502719 time 2.34177 [TRT] Tactic: 9830399 time 2.47469 [TRT] Tactic: 9961471 time 2.59516 [TRT] Tactic: 10027007 time 2.30901 [TRT] Tactic: 10092543 time 2.33539 [TRT] Tactic: 10289151 time 1.83565 [TRT] Tactic: 10485759 time 2.34346 [TRT] Tactic: 10682367 time 2.4988 [TRT] Tactic: 10813439 time 1.93966 [TRT] Fastest Tactic: 10289151 Time: 1.83565 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution) [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1 [TRT] Tactic: 1062367460111450758 time 2.2568 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1 [TRT] Tactic: 3827454225649558724 time 1.95268 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1 [TRT] Tactic: 4337000649858996379 time 1.69807 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1 [TRT] Tactic: 4501471010995462441 time 1.61885 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1 [TRT] Tactic: 5137655947464784826 time 1.5126 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Tactic: 5921334924264294896 time 
1.48937 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1 [TRT] Tactic: 6645123197870846056 time 1.65888 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TRT] Tactic: 7852627285308570038 time 1.95094 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1 [TRT] Tactic: -9137461792520977713 time 1.63964 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0 [TRT] Tactic: -8776506421218919509 time 1.88599 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1 [TRT] Tactic: -6092040395344634144 time 2.34307 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1 [TRT] Tactic: -3456450830548107839 time 1.79417 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0 [TRT] Tactic: -2318106587342035239 time 1.90104 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] Tactic: -1343271414618805657 time 1.34315 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1 [TRT] Tactic: -410470605513481746 time 1.51081 [TRT] Fastest Tactic: -1343271414618805657 Time: 1.34315 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CudaConvolution) [TRT] Tactic: 0 time 3.0112 [TRT] Tactic: 2 time 2.5924 [TRT] Tactic: 4 skipped. Scratch requested: 287506432, available: 16777216 [TRT] Tactic: 5 skipped. Scratch requested: 144277504, available: 16777216 [TRT] Tactic: 57 time 4.17471 [TRT] Fastest Tactic: 2 Time: 2.5924 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -1343271414618805657 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: 
maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] Conv_12 + Relu_13 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] *************** Autotuning format combination: Half(1,20,400,51200) -> Half(1,22,484,123904) *************** [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (FusedConvActConvolution) [TRT] FusedConvActConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution) [TRT] CaskConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CudaConvolution) [TRT] Tactic: 0 time 3.02888 [TRT] Tactic: 1 time 3.02664 [TRT] Tactic: 2 time 2.51828 [TRT] Tactic: 4 skipped. Scratch requested: 287506432, available: 16777216 [TRT] Tactic: 5 skipped. Scratch requested: 144277504, available: 16777216 [TRT] Fastest Tactic: 2 Time: 2.51828 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 2 [TRT] [TRT] *************** Autotuning format combination: Half(1,20,400:2,25600) -> Half(1,22,484:2,61952) *************** [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (FusedConvActConvolution) [TRT] Tactic: 524287 time 1.2976 [TRT] Tactic: 720895 time 1.11698 [TRT] Tactic: 983039 time 1.17148 [TRT] Tactic: 1048575 time 1.23005 [TRT] Tactic: 1703935 time 1.16289 [TRT] Tactic: 1769471 time 8.94516 [TRT] Tactic: 1966079 time 1.11589 [TRT] Tactic: 2031615 time 1.15349 [TRT] Tactic: 2228223 time 1.33578 [TRT] Tactic: 2424831 time 1.79682 [TRT] Tactic: 2621439 time 1.41435 [TRT] Tactic: 2752511 time 1.33435 [TRT] Tactic: 2818047 time 1.79096 [TRT] Tactic: 2883583 time 1.21221 [TRT] Tactic: 3014655 time 1.29112 [TRT] Tactic: 3145727 time 1.34182 [TRT] Tactic: 3473407 time 1.27096 [TRT] Tactic: 3604479 time 1.27437 [TRT] Tactic: 3735551 time 1.53125 [TRT] Tactic: 4390911 time 1.34466 [TRT] Tactic: 5046271 time 1.2944 [TRT] Tactic: 5963775 time 1.16857 [TRT] Tactic: 6160383 time 1.35878 [TRT] Tactic: 6488063 time 1.53935 [TRT] Tactic: 6881279 time 1.31156 [TRT] Tactic: 7274495 time 1.8125 [TRT] Tactic: 7864319 time 1.56214 [TRT] Tactic: 7995391 time 1.13213 [TRT] Tactic: 8585215 time 
1.28581 [TRT] Tactic: 8847359 time 1.7025 [TRT] Tactic: 8978431 time 1.24146 [TRT] Tactic: 9043967 time 1.56096 [TRT] Tactic: 9175039 time 1.26977 [TRT] Tactic: 9502719 time 1.31724 [TRT] Tactic: 9830399 time 1.2907 [TRT] Tactic: 9961471 time 1.45654 [TRT] Tactic: 10027007 time 1.31896 [TRT] Tactic: 10092543 time 1.34656 [TRT] Tactic: 10289151 time 1.11508 [TRT] Tactic: 10485759 time 1.34276 [TRT] Tactic: 10682367 time 1.37164 [TRT] Tactic: 10813439 time 1.15984 [TRT] Fastest Tactic: 10289151 Time: 1.11508 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution) [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Tactic: 3564772625446233998 time 1.15878 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Tactic: 3650389455493082349 time 1.19878 [TRT] Conv_12 + Relu_13 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Tactic: 4772821744921268633 time 0.793907 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Tactic: 5319956359050645452 time 0.912865 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Tactic: 7205456024582378848 time 0.858411 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Tactic: -6490690591794140522 time 0.877864 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Tactic: -4686027666808657977 time 0.824636 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Tactic: -4212163711445252890 time 0.766537 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Tactic: -3898373634979201110 time 0.804505 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] Tactic: -2409163523992614473 time 0.780912 [TRT] Fastest Tactic: -4212163711445252890 Time: 0.766537 [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CudaConvolution) [TRT] CudaConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_12 + Relu_13 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -4212163711445252890 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 
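Every block in this log follows the same four-step pattern: list the candidate tactics for one layer and format combination, time each runner, report "Fastest Tactic", then commit with "Chose Runner Type". For long builds it is easier to skim just the decisions; a throwaway parser (the file name is hypothetical, and it assumes the log was saved with one entry per line):

    import re
    import sys

    # Print each autotuned format combination and the runner/tactic TensorRT chose.
    combo_re = re.compile(r"Autotuning format combination: (.*?) \*")
    chose_re = re.compile(r"Chose Runner Type: (\S+) Tactic: (-?\d+)")

    combo = None
    with open(sys.argv[1] if len(sys.argv) > 1 else "trt_build.log") as log:
        for line in log:
            m = combo_re.search(line)
            if m:
                combo = m.group(1)
            m = chose_re.search(line)
            if m:
                runner, tactic = m.groups()
                print(f"{combo} -> {runner} (tactic {tactic})")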
[TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0804425 [TRT] Tactic: 0 time 0.132969 [TRT] Fastest Tactic: 1002 Time: 0.0804425 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.294843 [TRT] Tactic: 0 time 0.105339 [TRT] Fastest Tactic: 0 Time: 0.105339 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0835935 [TRT] Tactic: 0 time 0.113126 [TRT] Fastest Tactic: 1002 Time: 0.0835935 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.291094 [TRT] Tactic: 0 time 0.104375 [TRT] Fastest Tactic: 0 Time: 0.104375 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.357474 [TRT] Tactic: 0 time 0.0976305 [TRT] Fastest Tactic: 0 Time: 0.0976305 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.357969 [TRT] Tactic: 0 time 0.095052 [TRT] Fastest Tactic: 0 Time: 0.095052 [TRT] *************** Autotuning format combination: Float(1,22,484,123904) -> Float(1,11,121,30976) *************** [TRT] --------------- Timing Runner: MaxPool_14 (Pooling) [TRT] Tactic: -1 time 0.06375 [TRT] Fastest Tactic: -1 Time: 0.06375 [TRT] --------------- Timing Runner: MaxPool_14 (TiledPooling) [TRT] Tactic: 5505281 time 0.299322 [TRT] Tactic: 5570817 time 0.178647 [TRT] Tactic: 5636353 time 0.13888 [TRT] Tactic: 5701889 time 0.117552 [TRT] Tactic: 5767425 time 0.107292 [TRT] Tactic: 5832961 time 0.0984895 [TRT] Tactic: 5898497 time 0.0923175 [TRT] Tactic: 5964033 time 0.0902865 [TRT] Tactic: 6029569 time 0.154766 [TRT] Tactic: 6095105 time 0.0925 [TRT] Tactic: 6160641 time 0.071406 [TRT] Tactic: 6226177 time 0.0621095 [TRT] Tactic: 6291713 time 0.057136 [TRT] Tactic: 6357249 time 0.052578 [TRT] Tactic: 6422785 time 0.0507025 [TRT] Tactic: 6488321 time 0.050703 [TRT] Fastest Tactic: 6422785 Time: 0.0507025 [TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 6422785 [TRT] [TRT] *************** Autotuning format combination: Half(1,22,484,123904) -> Half(1,11,121,30976) *************** [TRT] --------------- Timing Runner: MaxPool_14 (Pooling) [TRT] Tactic: -1 time 0.060625 [TRT] Fastest Tactic: -1 Time: 0.060625 [TRT] --------------- Timing Runner: MaxPool_14 (TiledPooling) [TRT] TiledPooling has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1 [TRT] [TRT] *************** Autotuning format combination: Half(1,22,484:2,61952) -> Half(1,11,121:2,15488) *************** [TRT] --------------- Timing Runner: MaxPool_14 (Pooling) [TRT] Tactic: -3 time 0.0407815 [TRT] Fastest Tactic: -3 Time: 0.0407815 [TRT] --------------- Timing Runner: MaxPool_14 (TiledPooling) [TRT] Tactic: 5505281 time 0.158985 [TRT] Tactic: 5570817 time 0.0979425 [TRT] Tactic: 5636353 time 0.0775 [TRT] Tactic: 5701889 time 0.0684115 [TRT] Tactic: 5767425 time 0.063359 [TRT] Tactic: 5832961 time 0.0602865 [TRT] Tactic: 5898497 time 0.0583335 [TRT] Tactic: 5964033 time 0.054896 [TRT] Tactic: 6029569 time 0.0833855 [TRT] Tactic: 6095105 time 0.051693 [TRT] Tactic: 6160641 time 0.0411455 [TRT] Tactic: 6226177 time 0.036797 [TRT] Tactic: 6291713 time 0.034323 [TRT] Tactic: 6357249 time 0.034115 [TRT] Tactic: 6422785 time 0.0320575 [TRT] Tactic: 6488321 time 
0.031901 [TRT] Fastest Tactic: 6488321 Time: 0.031901 [TRT] >>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 6488321 [TRT] [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0179685 [TRT] Tactic: 0 time 0.0354945 [TRT] Fastest Tactic: 1002 Time: 0.0179685 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0753905 [TRT] Tactic: 0 time 0.0286455 [TRT] Fastest Tactic: 0 Time: 0.0286455 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.017709 [TRT] Tactic: 0 time 0.0309375 [TRT] Fastest Tactic: 1002 Time: 0.017709 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0767185 [TRT] Tactic: 0 time 0.028281 [TRT] Fastest Tactic: 0 Time: 0.028281 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.142057 [TRT] Tactic: 0 time 0.026927 [TRT] Fastest Tactic: 0 Time: 0.026927 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.140989 [TRT] Tactic: 0 time 0.025807 [TRT] Fastest Tactic: 0 Time: 0.025807 [TRT] *************** Autotuning format combination: Float(1,11,121,30976) -> Float(1,11,121,30976) *************** [TRT] --------------- Timing Runner: BatchNormalization_15 (Scale) [TRT] Tactic: 0 time 0.0220315 [TRT] Fastest Tactic: 0 Time: 0.0220315 [TRT] *************** Autotuning format combination: Half(1,11,121,30976) -> Half(1,11,121,30976) *************** [TRT] --------------- Timing Runner: BatchNormalization_15 (Scale) [TRT] Tactic: 0 time 0.0208595 [TRT] Fastest Tactic: 0 Time: 0.0208595 [TRT] *************** Autotuning format combination: Half(1,11,121:2,15488) -> Half(1,11,121:2,15488) *************** [TRT] --------------- Timing Runner: BatchNormalization_15 (Scale) [TRT] Tactic: 0 time 0.0266925 [TRT] Fastest Tactic: 0 Time: 0.0266925 [TRT] *************** Autotuning format combination: Float(1,11,121,30976) -> Float(1,13,169,86528) *************** [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (FusedConvActConvolution) [TRT] 
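Worth noticing above: BatchNormalization_15 (like _7 and _11 earlier) is timed as a plain Scale layer. At inference time, with frozen statistics, batch norm collapses into a per-channel affine transform, so TensorRT folds gamma, beta, mean, and variance into a single scale and shift. A small numpy check of that identity (all values made up):

    import numpy as np

    # y = gamma * (x - mean) / sqrt(var + eps) + beta  ==  s * x + b
    gamma, beta = np.float32(1.2), np.float32(0.1)   # bn.weight, bn.bias
    mean, var   = np.float32(0.5), np.float32(4.0)   # bn.running_mean, bn.running_var
    eps = np.float32(1e-5)

    s = gamma / np.sqrt(var + eps)                   # folded per-channel scale
    b = beta - mean * s                              # folded per-channel shift
    x = np.float32(2.0)                              # an arbitrary activation value
    assert np.isclose(s * x + b, gamma * (x - mean) / np.sqrt(var + eps) + beta)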
Tactic: 524287 time 3.05667 [TRT] Tactic: 720895 time 3.38492 [TRT] Tactic: 983039 time 2.88036 [TRT] Tactic: 1048575 time 3.13034 [TRT] Tactic: 1703935 time 3.0813 [TRT] Tactic: 1769471 time 2.95406 [TRT] Tactic: 1966079 time 3.0826 [TRT] Tactic: 2031615 time 2.97141 [TRT] Tactic: 2228223 time 3.93563 [TRT] Tactic: 2424831 time 4.51727 [TRT] Tactic: 2621439 time 3.72518 [TRT] Tactic: 2752511 time 2.76023 [TRT] Tactic: 2818047 time 3.83164 [TRT] Tactic: 2883583 time 3.13404 [TRT] Tactic: 3014655 time 2.86898 [TRT] Tactic: 3145727 time 2.69865 [TRT] Tactic: 3473407 time 3.24049 [TRT] Tactic: 3604479 time 2.82893 [TRT] Tactic: 3735551 time 4.2574 [TRT] Tactic: 4390911 time 2.65828 [TRT] Tactic: 5046271 time 3.03529 [TRT] Tactic: 5963775 time 2.75901 [TRT] Tactic: 6160383 time 2.993 [TRT] Tactic: 6488063 time 2.66607 [TRT] Tactic: 6881279 time 2.69448 [TRT] Tactic: 7274495 time 3.78963 [TRT] Tactic: 7864319 time 3.81927 [TRT] Tactic: 7995391 time 3.33451 [TRT] Tactic: 8585215 time 2.75034 [TRT] Tactic: 8847359 time 2.90221 [TRT] Tactic: 8978431 time 2.81969 [TRT] Tactic: 9043967 time 2.62841 [TRT] Tactic: 9175039 time 2.82018 [TRT] Tactic: 9502719 time 2.69122 [TRT] Tactic: 9830399 time 3.00227 [TRT] Tactic: 9961471 time 3.20425 [TRT] Tactic: 10027007 time 2.73328 [TRT] Tactic: 10092543 time 2.66458 [TRT] Tactic: 10289151 time 3.08266 [TRT] Tactic: 10485759 time 2.73208 [TRT] Tactic: 10682367 time 3.59586 [TRT] Tactic: 10813439 time 3.26917 [TRT] Fastest Tactic: 9043967 Time: 2.62841 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CaskConvolution) [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1 [TRT] Tactic: 1062367460111450758 time 4.65547 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1 [TRT] Tactic: 3827454225649558724 time 2.37398 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1 [TRT] Tactic: 4337000649858996379 time 3.37175 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1 [TRT] Tactic: 4501471010995462441 time 3.20854 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1 [TRT] Tactic: 5137655947464784826 time 3.01911 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Tactic: 5921334924264294896 time 1.79451 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1 [TRT] Tactic: 6645123197870846056 time 3.33612 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TRT] Tactic: 7852627285308570038 time 2.23479 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1 [TRT] Tactic: -9137461792520977713 time 3.2369 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0 [TRT] Tactic: -8776506421218919509 time 2.19734 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1 [TRT] Tactic: -6092040395344634144 time 4.83919 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1 [TRT] Tactic: -3456450830548107839 time 3.52148 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0 [TRT] Tactic: -2318106587342035239 time 2.17203 [TRT] Conv_16 + Relu_17 
(scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] Tactic: -1343271414618805657 time 1.64188 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1 [TRT] Tactic: -410470605513481746 time 2.96763 [TRT] Fastest Tactic: -1343271414618805657 Time: 1.64188 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CudaConvolution) [TRT] Tactic: 0 time 4.07609 [TRT] Tactic: 2 time 3.48346 [TRT] Tactic: 4 skipped. Scratch requested: 307298304, available: 16777216 [TRT] Tactic: 5 skipped. Scratch requested: 573767680, available: 16777216 [TRT] Tactic: 57 time 6.72055 [TRT] Fastest Tactic: 2 Time: 3.48346 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -1343271414618805657 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x64_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148n_nt_v0 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v0 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] Conv_16 + Relu_17 (scudnn) Set Tactic Name: maxwell_scudnn_128x128_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (scudnn_winograd) Set Tactic Name: maxwell_scudnn_winograd_128x128_ldg1_ldg4_mobile_relu_tile148t_nt_v0 [TRT] *************** Autotuning format combination: Half(1,11,121,30976) -> Half(1,13,169,86528) *************** [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (FusedConvActConvolution) [TRT] FusedConvActConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CaskConvolution) [TRT] CaskConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CudaConvolution) [TRT] Tactic: 0 time 4.0595 [TRT] Tactic: 1 time 4.0506 [TRT] Tactic: 2 time 3.38617 [TRT] Tactic: 4 skipped. Scratch requested: 307298304, available: 16777216 [TRT] Tactic: 5 skipped. 
Scratch requested: 573767680, available: 16777216 [TRT] Fastest Tactic: 2 Time: 3.38617 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CudaConvolution Tactic: 2 [TRT] [TRT] *************** Autotuning format combination: Half(1,11,121:2,15488) -> Half(1,13,169:2,43264) *************** [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (FusedConvActConvolution) [TRT] Tactic: 524287 time 1.71109 [TRT] Tactic: 720895 time 1.89109 [TRT] Tactic: 983039 time 1.47531 [TRT] Tactic: 1048575 time 1.835 [TRT] Tactic: 1703935 time 1.73888 [TRT] Tactic: 1769471 time 10.5511 [TRT] Tactic: 1966079 time 1.92138 [TRT] Tactic: 2031615 time 1.61677 [TRT] Tactic: 2228223 time 2.41044 [TRT] Tactic: 2424831 time 3.23289 [TRT] Tactic: 2621439 time 1.99563 [TRT] Tactic: 2752511 time 1.50622 [TRT] Tactic: 2818047 time 2.01643 [TRT] Tactic: 2883583 time 1.79797 [TRT] Tactic: 3014655 time 1.60135 [TRT] Tactic: 3145727 time 1.48422 [TRT] Tactic: 3473407 time 1.84245 [TRT] Tactic: 3604479 time 1.59391 [TRT] Tactic: 3735551 time 2.20859 [TRT] Tactic: 4390911 time 1.53518 [TRT] Tactic: 5046271 time 1.65807 [TRT] Tactic: 5963775 time 1.50703 [TRT] Tactic: 6160383 time 1.75826 [TRT] Tactic: 6488063 time 1.46052 [TRT] Tactic: 6881279 time 1.63453 [TRT] Tactic: 7274495 time 2.14826 [TRT] Tactic: 7864319 time 2.08732 [TRT] Tactic: 7995391 time 1.90662 [TRT] Tactic: 8585215 time 1.52797 [TRT] Tactic: 8847359 time 1.5037 [TRT] Tactic: 8978431 time 1.61995 [TRT] Tactic: 9043967 time 1.45031 [TRT] Tactic: 9175039 time 1.59461 [TRT] Tactic: 9502719 time 1.51711 [TRT] Tactic: 9830399 time 1.5774 [TRT] Tactic: 9961471 time 1.75117 [TRT] Tactic: 10027007 time 1.51984 [TRT] Tactic: 10092543 time 1.55604 [TRT] Tactic: 10289151 time 1.92867 [TRT] Tactic: 10485759 time 1.50794 [TRT] Tactic: 10682367 time 1.94992 [TRT] Tactic: 10813439 time 1.9176 [TRT] Fastest Tactic: 9043967 Time: 1.45031 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CaskConvolution) [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Tactic: 3564772625446233998 time 2.40781 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Tactic: 3650389455493082349 time 2.43021 [TRT] Conv_16 + Relu_17 (hcudnn_winograd) Set Tactic Name: 
maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Tactic: 4772821744921268633 time 0.945598 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Tactic: 5319956359050645452 time 1.75599 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Tactic: 7205456024582378848 time 1.68943 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Tactic: -6490690591794140522 time 1.7025 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Tactic: -4686027666808657977 time 1.62852 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Tactic: -4212163711445252890 time 1.48846 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Tactic: -3898373634979201110 time 1.61662 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] Tactic: -2409163523992614473 time 1.52003 [TRT] Fastest Tactic: 4772821744921268633 Time: 0.945598 [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CudaConvolution) [TRT] CudaConvolution has no valid tactics for this config, skipping [TRT] --------------- Timing Runner: Conv_16 + Relu_17 (CudaDepthwiseConvolution) [TRT] CudaDepthwiseConvolution has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 4772821744921268633 [TRT] Conv_16 + Relu_17 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_large_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_medium_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x64_relu_small_nn_v1 [TRT] Conv_16 + Relu_17 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.057839 [TRT] Tactic: 0 time 0.0953385 [TRT] Fastest Tactic: 1002 Time: 0.057839 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.221068 [TRT] Tactic: 0 time 0.0752345 [TRT] Fastest Tactic: 0 Time: 0.0752345 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0599215 [TRT] Tactic: 0 time 0.0797915 [TRT] Fastest Tactic: 1002 Time: 0.0599215 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.223022 [TRT] Tactic: 0 time 0.074609 [TRT] 
Fastest Tactic: 0 Time: 0.074609 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.318151 [TRT] Tactic: 0 time 0.068802 [TRT] Fastest Tactic: 0 Time: 0.068802 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.320468 [TRT] Tactic: 0 time 0.0684895 [TRT] Fastest Tactic: 0 Time: 0.0684895 [TRT] *************** Autotuning format combination: Float(1,13,169,86528) -> Float(1,6,36,18432) *************** [TRT] --------------- Timing Runner: MaxPool_18 (Pooling) [TRT] Tactic: -1 time 0.045729 [TRT] Fastest Tactic: -1 Time: 0.045729 [TRT] --------------- Timing Runner: MaxPool_18 (TiledPooling) [TRT] Tactic: 5505281 time 0.299662 [TRT] Tactic: 5570817 time 0.180287 [TRT] Tactic: 5636353 time 0.139193 [TRT] Tactic: 5701889 time 0.12026 [TRT] Tactic: 5767425 time 0.109193 [TRT] Tactic: 5832961 time 0.100678 [TRT] Tactic: 5898497 time 0.096406 [TRT] Tactic: 5964033 time 0.0942445 [TRT] Tactic: 6029569 time 0.155365 [TRT] Tactic: 6095105 time 0.092891 [TRT] Tactic: 6160641 time 0.0727085 [TRT] Tactic: 6226177 time 0.0643235 [TRT] Tactic: 6291713 time 0.058958 [TRT] Tactic: 6357249 time 0.0551305 [TRT] Tactic: 6422785 time 0.053542 [TRT] Tactic: 6488321 time 0.051771 [TRT] Fastest Tactic: 6488321 Time: 0.051771 [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1 [TRT] [TRT] *************** Autotuning format combination: Half(1,13,169,86528) -> Half(1,6,36,18432) *************** [TRT] --------------- Timing Runner: MaxPool_18 (Pooling) [TRT] Tactic: -1 time 0.036146 [TRT] Fastest Tactic: -1 Time: 0.036146 [TRT] --------------- Timing Runner: MaxPool_18 (TiledPooling) [TRT] TiledPooling has no valid tactics for this config, skipping [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -1 [TRT] [TRT] *************** Autotuning format combination: Half(1,13,169:2,43264) -> Half(1,6,36:2,9216) *************** [TRT] --------------- Timing Runner: MaxPool_18 (Pooling) [TRT] Tactic: -3 time 0.023125 [TRT] Fastest Tactic: -3 Time: 0.023125 [TRT] --------------- Timing Runner: MaxPool_18 (TiledPooling) [TRT] Tactic: 5505281 time 0.159714 [TRT] Tactic: 5570817 time 0.099818 [TRT] Tactic: 5636353 time 0.0804685 [TRT] Tactic: 5701889 time 0.071224 [TRT] Tactic: 5767425 time 0.067578 [TRT] Tactic: 5832961 time 0.063802 [TRT] Tactic: 5898497 time 0.0624475 [TRT] Tactic: 5964033 time 0.0596355 [TRT] Tactic: 6029569 time 0.0838545 [TRT] Tactic: 6095105 time 0.052057 [TRT] Tactic: 6160641 time 0.0427345 [TRT] Tactic: 6226177 time 0.03849 [TRT] Tactic: 6291713 time 0.0366145 [TRT] Tactic: 6357249 time 0.0359895 [TRT] Tactic: 6422785 time 0.03599 [TRT] Tactic: 6488321 time 0.035911 [TRT] Fastest Tactic: 6488321 Time: 0.035911 [TRT] >>>>>>>>>>>>>>> Chose Runner Type: Pooling Tactic: -3 [TRT] [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.012474 [TRT] Tactic: 0 time 0.021641 [TRT] Fastest Tactic: 1002 Time: 0.012474 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0129945 [TRT] Tactic: 0 time 0.0203385 [TRT] Fastest Tactic: 1002 Time: 0.0129945 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.243724 [TRT] Tactic: 0 time 0.017162 [TRT] Fastest Tactic: 0 Time: 0.017162 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.243099 [TRT] Tactic: 0 time 0.0173955 [TRT] Fastest Tactic: 0 Time: 0.0173955 [TRT] *************** Autotuning format combination: Float(1,6,36,18432) -> Float(1,18432) *************** [TRT] --------------- Timing Runner: Reshape_20 (Shuffle) [TRT] 
Tactic: 0 time 0.008412 [TRT] Tactic: 1 time 0.0222395 [TRT] Fastest Tactic: 0 Time: 0.008412 [TRT] *************** Autotuning format combination: Half(1,6,36,18432) -> Half(1,18432) *************** [TRT] --------------- Timing Runner: Reshape_20 (Shuffle) [TRT] Tactic: 0 time 0.0055205 [TRT] Tactic: 1 time 0.022396 [TRT] Fastest Tactic: 0 Time: 0.0055205 [TRT] *************** Autotuning format combination: -> Float(1,512) *************** [TRT] *************** Autotuning format combination: -> Half(1,512) *************** [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0120055 [TRT] Tactic: 0 time 0.022891 [TRT] Fastest Tactic: 1002 Time: 0.0120055 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0123955 [TRT] Tactic: 0 time 0.020391 [TRT] Fastest Tactic: 1002 Time: 0.0123955 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 6.4838 [TRT] Tactic: 0 time 9.43708 [TRT] Fastest Tactic: 1002 Time: 6.4838 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 6.74148 [TRT] Tactic: 0 time 8.06297 [TRT] Fastest Tactic: 1002 Time: 6.74148 [TRT] *************** Autotuning format combination: Float(1,18432), Float(1,512) -> Float(1,512) *************** [TRT] --------------- Timing Runner: (Unnamed Layer* 22) [Matrix Multiply] (MatrixMultiply) [TRT] Tactic: 0 time 2.58898 [TRT] Fastest Tactic: 0 Time: 2.58898 [TRT] *************** Autotuning format combination: Half(1,18432), Half(1,512) -> Half(1,512) *************** [TRT] --------------- Timing Runner: (Unnamed Layer* 22) [Matrix Multiply] (MatrixMultiply) [TRT] Tactic: 0 time 1.51661 [TRT] Fastest Tactic: 0 Time: 1.51661 [TRT] *************** Autotuning format combination: -> Float(1,512) *************** [TRT] *************** Autotuning format combination: -> Half(1,512) *************** [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.005781 [TRT] Tactic: 0 time 0.0059115 [TRT] Fastest Tactic: 1002 Time: 0.005781 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0059115 [TRT] Tactic: 0 time 0.005703 [TRT] Fastest Tactic: 0 Time: 0.005703 [TRT] *************** Autotuning format combination: Float(1,512), Float(1,512) -> Float(1,512) *************** [TRT] --------------- Timing Runner: (Unnamed Layer* 25) [ElementWise] + Relu_22 (ElementWise) [TRT] Tactic: 1 time 0.003437 [TRT] Tactic: 2 time 0.0083595 [TRT] Fastest Tactic: 1 Time: 0.003437 [TRT] *************** Autotuning format combination: Half(1,512), Half(1,512) -> Half(1,512) *************** [TRT] --------------- Timing Runner: (Unnamed Layer* 25) [ElementWise] + Relu_22 (ElementWise) [TRT] Tactic: 1 time 0.0034375 [TRT] Tactic: 2 time 0.005807 [TRT] Fastest Tactic: 1 Time: 0.0034375 [TRT] *************** Autotuning format combination: -> Float(1,100) *************** [TRT] *************** Autotuning format combination: -> Half(1,100) *************** [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0433855 [TRT] Tactic: 0 time 0.06013 [TRT] Fastest Tactic: 1002 Time: 0.0433855 [TRT] --------------- Timing Runner: (Reformat) [TRT] Tactic: 1002 time 0.0438805 [TRT] Tactic: 0 time 0.050026 [TRT] Fastest Tactic: 1002 Time: 0.0438805 [TRT] *************** Autotuning format combination: Float(1,512), Float(1,100) -> Float(1,100) *************** [TRT] --------------- Timing Runner: (Unnamed Layer* 29) [Matrix Multiply] (MatrixMultiply) [TRT] Tactic: 0 time 0.08099 [TRT] Fastest Tactic: 0 Time: 0.08099 [TRT] *************** Autotuning format 
[TRT] *************** Autotuning format combination: Half(1,512), Half(1,100) -> Half(1,100) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 29) [Matrix Multiply] (MatrixMultiply)
[TRT] Tactic: 0 time 0.018047
[TRT] Fastest Tactic: 0 Time: 0.018047
[TRT] *************** Autotuning format combination: -> Float(1,100) ***************
[TRT] *************** Autotuning format combination: -> Half(1,100) ***************
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.005886
[TRT] Tactic: 0 time 0.005912
[TRT] Fastest Tactic: 1002 Time: 0.005886
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.005625
[TRT] Tactic: 0 time 0.005521
[TRT] Fastest Tactic: 0 Time: 0.005521
[TRT] *************** Autotuning format combination: Float(1,100), Float(1,100) -> Float(1,100) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 32) [ElementWise] + Relu_24 (ElementWise)
[TRT] Tactic: 1 time 0.0033335
[TRT] Tactic: 2 time 0.005781
[TRT] Fastest Tactic: 1 Time: 0.0033335
[TRT] *************** Autotuning format combination: Half(1,100), Half(1,100) -> Half(1,100) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 32) [ElementWise] + Relu_24 (ElementWise)
[TRT] Tactic: 1 time 0.0033855
[TRT] Tactic: 2 time 0.0046615
[TRT] Fastest Tactic: 1 Time: 0.0033855
[TRT] *************** Autotuning format combination: -> Float(1,2) ***************
[TRT] *************** Autotuning format combination: -> Half(1,2) ***************
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.0057555
[TRT] Tactic: 0 time 0.0055985
[TRT] Fastest Tactic: 0 Time: 0.0055985
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.005885
[TRT] Tactic: 0 time 0.005651
[TRT] Fastest Tactic: 0 Time: 0.005651
[TRT] *************** Autotuning format combination: Float(1,100), Float(1,2) -> Float(1,2) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 36) [Matrix Multiply] (MatrixMultiply)
[TRT] Tactic: 0 time 0.006536
[TRT] Fastest Tactic: 0 Time: 0.006536
[TRT] *************** Autotuning format combination: Half(1,100), Half(1,2) -> Half(1,2) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 36) [Matrix Multiply] (MatrixMultiply)
[TRT] Tactic: 0 time 0.0172655
[TRT] Fastest Tactic: 0 Time: 0.0172655
[TRT] *************** Autotuning format combination: -> Float(1,2) ***************
[TRT] *************** Autotuning format combination: -> Half(1,2) ***************
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.0056255
[TRT] Tactic: 0 time 0.003438
[TRT] Fastest Tactic: 0 Time: 0.003438
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.005755
[TRT] Tactic: 0 time 0.0034375
[TRT] Fastest Tactic: 0 Time: 0.0034375
[TRT] *************** Autotuning format combination: Float(1,2), Float(1,2) -> Float(1,2) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 39) [ElementWise] + Relu_26 (ElementWise)
[TRT] Tactic: 1 time 0.003333
[TRT] Fastest Tactic: 1 Time: 0.003333
[TRT] *************** Autotuning format combination: Half(1,2), Half(1,2) -> Half(1,2) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 39) [ElementWise] + Relu_26 (ElementWise)
[TRT] Tactic: 1 time 0.003359
[TRT] Fastest Tactic: 1 Time: 0.003359
[TRT] *************** Autotuning format combination: Float(1,2) -> Float(1,2) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 42) [Softmax] (SoftMax)
[TRT] Tactic: 1001 time 0.008099
[TRT] Fastest Tactic: 1001 Time: 0.008099
[TRT] --------------- Timing Runner: (Unnamed Layer* 42) [Softmax] (ExtSoftMax)
[TRT] Tactic: 0 time 0.005677
[TRT] Fastest Tactic: 0 Time: 0.005677
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: ExtSoftMax Tactic: 0
[TRT]
[TRT] *************** Autotuning format combination: Half(1,2) -> Half(1,2) ***************
[TRT] --------------- Timing Runner: (Unnamed Layer* 42) [Softmax] (SoftMax)
[TRT] Tactic: 1001 time 0.008177
[TRT] Fastest Tactic: 1001 Time: 0.008177
[TRT] --------------- Timing Runner: (Unnamed Layer* 42) [Softmax] (ExtSoftMax)
[TRT] Tactic: 0 time 0.0057035
[TRT] Fastest Tactic: 0 Time: 0.0057035
[TRT] >>>>>>>>>>>>>>> Chose Runner Type: ExtSoftMax Tactic: 0
[TRT]
[TRT] --------------- Timing Runner: (Reformat)
[TRT] Tactic: 1002 time 0.0057815
[TRT] Tactic: 0 time 0.003307
[TRT] Fastest Tactic: 0 Time: 0.003307
[TRT] Adding reformat layer: Conv_0 + Relu_1 reformatted input 0 (input.1) from Float(1,150,22500,67500) to Half(1,150,22500:2,45000)
[TRT] Adding reformat layer: Reshape_20 reformatted input 0 (67) from Half(1,6,36:2,9216) to Half(1,6,36,18432)
[TRT] Adding reformat layer: (Unnamed Layer* 32) [ElementWise] + Relu_24 reformatted input 0 ((Unnamed Layer* 29) [Matrix Multiply]_output) from Half(1,100) to Float(1,100)
[TRT] For layer (Unnamed Layer* 30) [Constant] + (Unnamed Layer* 31) [Shuffle] a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
[TRT] For layer (Unnamed Layer* 32) [ElementWise] + Relu_24 a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
[TRT] For layer (Unnamed Layer* 35) [Constant] a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
[TRT] For layer (Unnamed Layer* 36) [Matrix Multiply] a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
[TRT] For layer (Unnamed Layer* 37) [Constant] + (Unnamed Layer* 38) [Shuffle] a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
[TRT] For layer (Unnamed Layer* 39) [ElementWise] + Relu_26 a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
[TRT] For layer (Unnamed Layer* 42) [Softmax] a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation.
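The "non-conforming implementation" warnings above are benign: the builder ignored the requested per-layer precision where a mixed-precision kernel was faster, which is expected when FP16 is allowed. If exact precision conformance mattered, "strict mode" can be requested at build time; a minimal sketch, again assuming the TensorRT 7.x Python builder API rather than the C++ path this log was produced by:

    import tensorrt as trt

    builder = trt.Builder(trt.Logger())
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)
    # Force the builder to honor requested layer precisions, at the cost
    # of possibly slower kernels (the trade-off these warnings describe).
    config.set_flag(trt.BuilderFlag.STRICT_TYPES)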
[TRT] Formats and tactics selection completed in 19.1517 seconds.
[TRT] After reformat layers: 31 layers
[TRT] Block size 16777216
[TRT] Block size 1478656
[TRT] Block size 369664
[TRT] Block size 369664
[TRT] Total Activation Memory: 18995200
[TRT] Detected 1 inputs and 1 output network tensors.
[TRT] Conv_0 + Relu_1 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x32_relu_small_nn_v1
[TRT] Conv_4 + Relu_5 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Conv_8 + Relu_9 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Conv_12 + Relu_13 (hcudnn) Set Tactic Name: maxwell_fp16x2_hcudnn_fp16x2_128x128_relu_small_nn_v1
[TRT] Conv_16 + Relu_17 (hcudnn_winograd) Set Tactic Name: maxwell_fp16x2_hcudnn_winograd_fp16x2_128x128_ldg1_ldg4_relu_tile148m_nt_v1
[TRT] Layer: Conv_0 + Relu_1 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: Conv_0 + Relu_1 Weights: 0 HostPersistent: 1664 DevicePersistent: 141312
[TRT] Layer: MaxPool_2 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: BatchNormalization_3 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: Conv_4 + Relu_5 Weights: 0 HostPersistent: 512 DevicePersistent: 102912
[TRT] Layer: MaxPool_6 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: BatchNormalization_7 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: Conv_8 + Relu_9 Weights: 0 HostPersistent: 1664 DevicePersistent: 158208
[TRT] Layer: MaxPool_10 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: BatchNormalization_11 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: Conv_12 + Relu_13 Weights: 0 HostPersistent: 1664 DevicePersistent: 593408
[TRT] Layer: MaxPool_14 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: BatchNormalization_15 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: Conv_16 + Relu_17 Weights: 0 HostPersistent: 512 DevicePersistent: 6554624
[TRT] Layer: MaxPool_18 Weights: 0 HostPersistent: 16 DevicePersistent: 0
[TRT] Layer: Reshape_20 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 21) [Constant] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 22) [Matrix Multiply] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 23) [Constant] + (Unnamed Layer* 24) [Shuffle] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 25) [ElementWise] + Relu_22 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 28) [Constant] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 29) [Matrix Multiply] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 30) [Constant] + (Unnamed Layer* 31) [Shuffle] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 32) [ElementWise] + Relu_24 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 32) [ElementWise] + Relu_24 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 35) [Constant] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 36) [Matrix Multiply] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 37) [Constant] + (Unnamed Layer* 38) [Shuffle] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 39) [ElementWise] + Relu_26 Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Layer: (Unnamed Layer* 42) [Softmax] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TRT] Total Host Persistent Memory: 6032
[TRT] Total Device Persistent Memory: 7550464
[TRT] Total Weight Memory: 0
[TRT] Builder timing cache: created 90 entries, 36 hit(s)
[TRT] Engine generation completed in 24.0256 seconds.
[TRT] Engine Layer Information:
[TRT] Layer(Reformat): Conv_0 + Relu_1 input reformatter 0, Tactic: 0, input.1[Float(3,150,150)] -> Conv_0 + Relu_1 reformatted input 0[Half(3,150,150)]
[TRT] Layer(hcudnn): Conv_0 + Relu_1, Tactic: 5319956359050645452, Conv_0 + Relu_1 reformatted input 0[Half(3,150,150)] -> 50[Half(32,152,152)]
[TRT] Layer(PoolingTiled): MaxPool_2, Tactic: 5964033, 50[Half(32,152,152)] -> 51[Half(32,76,76)]
[TRT] Layer(Scale): BatchNormalization_3, Tactic: 0, 51[Half(32,76,76)] -> 52[Half(32,76,76)]
[TRT] Layer(hcudnn_winograd): Conv_4 + Relu_5, Tactic: 4772821744921268633, 52[Half(32,76,76)] -> 54[Half(64,78,78)]
[TRT] Layer(PoolingTiled): MaxPool_6, Tactic: 6357249, 54[Half(64,78,78)] -> 55[Half(64,39,39)]
[TRT] Layer(Scale): BatchNormalization_7, Tactic: 0, 55[Half(64,39,39)] -> 56[Half(64,39,39)]
[TRT] Layer(hcudnn): Conv_8 + Relu_9, Tactic: -4212163711445252890, 56[Half(64,39,39)] -> 58[Half(128,41,41)]
[TRT] Layer(PoolingTiled): MaxPool_10, Tactic: 5964033, 58[Half(128,41,41)] -> 59[Half(128,20,20)]
[TRT] Layer(Scale): BatchNormalization_11, Tactic: 0, 59[Half(128,20,20)] -> 60[Half(128,20,20)]
[TRT] Layer(hcudnn): Conv_12 + Relu_13, Tactic: -4212163711445252890, 60[Half(128,20,20)] -> 62[Half(256,22,22)]
[TRT] Layer(PoolingTiled): MaxPool_14, Tactic: 6488321, 62[Half(256,22,22)] -> 63[Half(256,11,11)]
[TRT] Layer(Scale): BatchNormalization_15, Tactic: 0, 63[Half(256,11,11)] -> 64[Half(256,11,11)]
[TRT] Layer(hcudnn_winograd): Conv_16 + Relu_17, Tactic: 4772821744921268633, 64[Half(256,11,11)] -> 66[Half(512,13,13)]
[TRT] Layer(Pooling): MaxPool_18, Tactic: -3, 66[Half(512,13,13)] -> 67[Half(512,6,6)]
[TRT] Layer(Reformat): Reshape_20 input reformatter 0, Tactic: 0, 67[Half(512,6,6)] -> Reshape_20 reformatted input 0[Half(512,6,6)]
[TRT] Layer(Constant): (Unnamed Layer* 21) [Constant], Tactic: 0, -> (Unnamed Layer* 21) [Constant]_output[Half(512)]
[TRT] Layer(MatrixMultiply): (Unnamed Layer* 22) [Matrix Multiply], Tactic: 0, 69[Half(18432)], (Unnamed Layer* 21) [Constant]_output[Half(512)] -> (Unnamed Layer* 22) [Matrix Multiply]_output[Half(512)]
[TRT] Layer(Constant): (Unnamed Layer* 23) [Constant] + (Unnamed Layer* 24) [Shuffle], Tactic: 0, -> (Unnamed Layer* 24) [Shuffle]_output[Half(512)]
[TRT] Layer(ElementWise): (Unnamed Layer* 25) [ElementWise] + Relu_22, Tactic: 1, (Unnamed Layer* 22) [Matrix Multiply]_output[Half(512)], (Unnamed Layer* 24) [Shuffle]_output[Half(512)] -> 71[Half(512)]
[TRT] Layer(Constant): (Unnamed Layer* 28) [Constant], Tactic: 0, -> (Unnamed Layer* 28) [Constant]_output[Half(100)]
[TRT] Layer(MatrixMultiply): (Unnamed Layer* 29) [Matrix Multiply], Tactic: 0, 71[Half(512)], (Unnamed Layer* 28) [Constant]_output[Half(100)] -> (Unnamed Layer* 29) [Matrix Multiply]_output[Half(100)]
[TRT] Layer(Constant): (Unnamed Layer* 30) [Constant] + (Unnamed Layer* 31) [Shuffle], Tactic: 0, -> (Unnamed Layer* 31) [Shuffle]_output[Float(100)]
[TRT] Layer(Reformat): (Unnamed Layer* 32) [ElementWise] + Relu_24 input reformatter 0, Tactic: 0, (Unnamed Layer* 29) [Matrix Multiply]_output[Half(100)] -> (Unnamed Layer* 32) [ElementWise] + Relu_24 reformatted input 0[Float(100)]
[TRT] Layer(ElementWise): (Unnamed Layer* 32) [ElementWise] + Relu_24, Tactic: 1, (Unnamed Layer* 32) [ElementWise] + Relu_24 reformatted input 0[Float(100)], (Unnamed Layer* 31) [Shuffle]_output[Float(100)] -> 73[Float(100)]
[TRT] Layer(Constant): (Unnamed Layer* 35) [Constant], Tactic: 0, -> (Unnamed Layer* 35) [Constant]_output[Float(2)]
[TRT] Layer(MatrixMultiply): (Unnamed Layer* 36) [Matrix Multiply], Tactic: 0, 73[Float(100)], (Unnamed Layer* 35) [Constant]_output[Float(2)] -> (Unnamed Layer* 36) [Matrix Multiply]_output[Float(2)]
[TRT] Layer(Constant): (Unnamed Layer* 37) [Constant] + (Unnamed Layer* 38) [Shuffle], Tactic: 0, -> (Unnamed Layer* 38) [Shuffle]_output[Float(2)]
[TRT] Layer(ElementWise): (Unnamed Layer* 39) [ElementWise] + Relu_26, Tactic: 1, (Unnamed Layer* 36) [Matrix Multiply]_output[Float(2)], (Unnamed Layer* 38) [Shuffle]_output[Float(2)] -> 75[Float(2)]
[TRT] Layer(ExtSoftMax): (Unnamed Layer* 42) [Softmax], Tactic: 0, 75[Float(2)] -> 76[Float(2)]
[TRT] device GPU, completed building CUDA engine
[TRT] network profiling complete, writing engine cache to networks/h_classifier/h_classifier.onnx.1.1.GPU.FP16.engine
[TRT] device GPU, completed writing engine cache to networks/h_classifier/h_classifier.onnx.1.1.GPU.FP16.engine
[TRT] device GPU, networks/h_classifier/h_classifier.onnx loaded
[TRT] Deserialize required 434160 microseconds.
[TRT] device GPU, CUDA engine context initialized with 2 bindings
[TRT] binding -- index 0 -- name 'input.1' -- type FP32 -- in/out INPUT -- # dims 4 -- dim #0 1 (SPATIAL) -- dim #1 3 (SPATIAL) -- dim #2 150 (SPATIAL) -- dim #3 150 (SPATIAL)
[TRT] binding -- index 1 -- name '76' -- type FP32 -- in/out OUTPUT -- # dims 2 -- dim #0 1 (SPATIAL) -- dim #1 2 (SPATIAL)
[TRT] INVALID_ARGUMENT: Cannot find binding of given name: data
[TRT] binding to input 0 data binding index: -1
[TRT] Parameter check failed at: engine.cpp::getBindingDimensions::1977, condition: bindIndex >= 0 && bindIndex < getNbBindings()
[TRT] binding to input 0 data dims (b=1 c=1 h=1 w=1) size=4
[TRT] INVALID_ARGUMENT: Cannot find binding of given name: prob
[TRT] binding to output 0 prob binding index: -1
[TRT] Parameter check failed at: engine.cpp::getBindingDimensions::1977, condition: bindIndex >= 0 && bindIndex < getNbBindings()
[TRT] binding to output 0 prob dims (b=1 c=1 h=1 w=1) size=4
device GPU, networks/h_classifier/h_classifier.onnx initialized.
[TRT] networks/h_classifier/h_classifier.onnx loaded
imageNet -- loaded 1 class info entries
networks/h_classifier/h_classifier.onnx initialized.
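The engine itself built and serialized fine; the failure is confined to the binding lookup at the end. The loader asked TensorRT for tensors named 'data' and 'prob', but the engine's actual bindings are 'input.1' (PyTorch's default name for the first graph input when exporting to ONNX) and '76' (an auto-numbered output tensor). Note that the loader then falls back to 1x1x1x1 dummy dimensions (size=4 bytes) and still prints "initialized", so this misconfiguration is easy to miss until inference returns garbage. A quick way to confirm the real I/O names is to inspect the ONNX graph; a minimal sketch assuming the `onnx` Python package is installed:

    import onnx

    model = onnx.load("networks/h_classifier/h_classifier.onnx")
    # Some exporters list initializers among graph inputs; filter them out.
    initializers = {t.name for t in model.graph.initializer}
    print("inputs: ", [i.name for i in model.graph.input if i.name not in initializers])
    print("outputs:", [o.name for o in model.graph.output])
    # Expected from this log: inputs ['input.1'], outputs ['76']

With the names confirmed, either pass them to the loader in place of 'data'/'prob', or re-export the model so the graph I/O carries the names the code already expects. A hedged sketch of the latter, where `net` stands for the trained classifier module (hypothetical, not shown in this log):

    import torch

    dummy = torch.randn(1, 3, 150, 150)  # matches the (1, 3, 150, 150) input above
    torch.onnx.export(net, dummy, "h_classifier.onnx",
                      input_names=["data"], output_names=["prob"],
                      opset_version=9)

One further observation: the output binding '76' has 2 elements, yet the log reports "loaded 1 class info entries", so labels.txt may also be missing an entry for the second class; worth checking alongside the binding-name fix.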