[12/20/2021-23:14:28] [TRT] [I] [MemUsageChange] Init CUDA: CPU -4027752001, GPU +0, now: CPU 302, GPU 244 (MiB)
[12/20/2021-23:14:28] [TRT] [I] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 306 MiB, GPU 244 MiB
[12/20/2021-23:14:28] [TRT] [I] [MemUsageSnapshot] End constructing builder kernel library: CPU 399 MiB, GPU 278 MiB
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::GridAnchor_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::GridAnchorRect_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::NMS_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::Reorg_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::Region_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::Clip_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::LReLU_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::PriorBox_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::Normalize_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::ScatterND version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::RPROI_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::BatchedNMS_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::FlattenConcat_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::CropAndResize version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::DetectionLayer_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::EfficientNMS_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::EfficientNMS_TFTRT_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::Proposal version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::ProposalLayer_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::ResizeNearest_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::Split version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::SpecialSlice_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Registered plugin creator - ::InstanceNormalization_TRT version 1
[12/20/2021-23:14:28] [TRT] [V] Adding network input: data with dtype: float32, dimensions: (10, 64, 1, 16)
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: data for ONNX tensor: data
[12/20/2021-23:14:28] [TRT] [V] Importing initializer: 145
[12/20/2021-23:14:28] [TRT] [W] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/20/2021-23:14:28] [TRT] [V] Importing initializer: 146
[12/20/2021-23:14:28] [TRT] [V] Importing initializer: 187
[12/20/2021-23:14:28] [TRT] [V] Importing initializer: 188
[12/20/2021-23:14:28] [TRT] [V] Importing initializer: 189
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Transpose_0 [Transpose]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: data
[12/20/2021-23:14:28] [TRT] [V] Transpose_0 [Transpose] inputs: [data -> (10, 64, 1, 16)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Transpose_0 for ONNX node: Transpose_0
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 9 for ONNX tensor: 9
[12/20/2021-23:14:28] [TRT] [V] Transpose_0 [Transpose] outputs: [9 -> (10, 16, 64, 1)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Squeeze_1 [Squeeze]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 9
[12/20/2021-23:14:28] [TRT] [V] Squeeze_1 [Squeeze] inputs: [9 -> (10, 16, 64, 1)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Original shape: (10, 16, 64, 1), squeezing to: (10, 16, 64)
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Squeeze_1 for ONNX node: Squeeze_1
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 10 for ONNX tensor: 10
[12/20/2021-23:14:28] [TRT] [V] Squeeze_1 [Squeeze] outputs: [10 -> (10, 16, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Shape_2 [Shape]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 10
[12/20/2021-23:14:28] [TRT] [V] Shape_2 [Shape] inputs: [10 -> (10, 16, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Shape_2 for ONNX node: Shape_2
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 11 for ONNX tensor: 11
[12/20/2021-23:14:28] [TRT] [V] Shape_2 [Shape] outputs: [11 -> (3)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Constant_3 [Constant]
[12/20/2021-23:14:28] [TRT] [V] Constant_3 [Constant] inputs:
[12/20/2021-23:14:28] [TRT] [V] Constant_3 [Constant] outputs: [12 -> ()[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Gather_4 [Gather]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 11
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 12
[12/20/2021-23:14:28] [TRT] [V] Gather_4 [Gather] inputs: [11 -> (3)[INT32]], [12 -> ()[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: 12 for ONNX node: 12
[12/20/2021-23:14:28] [TRT] [V] Using Gather axis: 0
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Gather_4 for ONNX node: Gather_4
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 13 for ONNX tensor: 13
[12/20/2021-23:14:28] [TRT] [V] Gather_4 [Gather] outputs: [13 -> ()[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Unsqueeze_5 [Unsqueeze]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 13
[12/20/2021-23:14:28] [TRT] [V] Unsqueeze_5 [Unsqueeze] inputs: [13 -> ()[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Original shape: (), unsqueezing to: (1,)
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Unsqueeze_5 for ONNX node: Unsqueeze_5
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 17 for ONNX tensor: 17
[12/20/2021-23:14:28] [TRT] [V] Unsqueeze_5 [Unsqueeze] outputs: [17 -> (1)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Concat_6 [Concat]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 145
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 17
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 146
[12/20/2021-23:14:28] [TRT] [V] Concat_6 [Concat] inputs: [145 -> (1)[INT32]], [17 -> (1)[INT32]], [146 -> (1)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: 145 for ONNX node: 145
[12/20/2021-23:14:28] [TRT] [V] Registering layer: 146 for ONNX node: 146
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Concat_6 for ONNX node: Concat_6
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 19 for ONNX tensor: 19
[12/20/2021-23:14:28] [TRT] [V] Concat_6 [Concat] outputs: [19 -> (3)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: ConstantOfShape_7 [ConstantOfShape]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 19
[12/20/2021-23:14:28] [TRT] [V] ConstantOfShape_7 [ConstantOfShape] inputs: [19 -> (3)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: ConstantOfShape_7 for ONNX node: ConstantOfShape_7
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 20 for ONNX tensor: 20
[12/20/2021-23:14:28] [TRT] [V] ConstantOfShape_7 [ConstantOfShape] outputs: [20 -> (2, 10, 32)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Transpose_8 [Transpose]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 10
[12/20/2021-23:14:28] [TRT] [V] Transpose_8 [Transpose] inputs: [10 -> (10, 16, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Transpose_8 for ONNX node: Transpose_8
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 21 for ONNX tensor: 21
[12/20/2021-23:14:28] [TRT] [V] Transpose_8 [Transpose] outputs: [21 -> (16, 10, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: LSTM_9 [LSTM]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 21
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 188
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 189
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 187
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 20
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 20
[12/20/2021-23:14:28] [TRT] [V] LSTM_9 [LSTM] inputs: [21 -> (16, 10, 64)[FLOAT]], [188 -> (2, 128, 64)[FLOAT]], [189 -> (2, 128, 32)[FLOAT]], [187 -> (2, 256)[FLOAT]], [optional input, not set], [20 -> (2, 10, 32)[FLOAT]], [20 -> (2, 10, 32)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: 188 for ONNX node: 188
[12/20/2021-23:14:28] [TRT] [V] Registering layer: 189 for ONNX node: 189
[12/20/2021-23:14:28] [TRT] [V] Registering layer: 187 for ONNX node: 187
[12/20/2021-23:14:28] [TRT] [V] Bias shape is: (2, 256)
[12/20/2021-23:14:28] [TRT] [V] Reshaping bias to: (2, 2, 128)
[12/20/2021-23:14:28] [TRT] [V] After reduction, bias shape is: (2, 1, 128)
[12/20/2021-23:14:28] [TRT] [V] numDirectionsTensor shape: (1)
[12/20/2021-23:14:28] [TRT] [V] hiddenSizeTensor shape: (1)
[12/20/2021-23:14:28] [TRT] [V] batchSizeTensor shape: (1)
[12/20/2021-23:14:28] [TRT] [V] Gate output rank (equal to initial hidden/cell state rank): (3)
[12/20/2021-23:14:28] [TRT] [V] Initial hidden state shape: (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] Initial cell state shape: (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] Entering Loop
[12/20/2021-23:14:28] [TRT] [V] Original shape: (10, 64), unsqueezing to: (1, 10, 64)
[12/20/2021-23:14:28] [TRT] [V] Original shape: (10, 64), unsqueezing to: (1, 10, 64)
[12/20/2021-23:14:28] [TRT] [V] Input shape: (2, 10, 64)
[12/20/2021-23:14:28] [TRT] [V] Registering layer: LSTM_9 for ONNX node: LSTM_9
[12/20/2021-23:14:28] [TRT] [V] Hidden state shape: (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] Cell state shape: (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] X(t) * W^T -> (2, 10, 128)
[12/20/2021-23:14:28] [TRT] [V] H(t-1) * R^T -> (2, 10, 128)
[12/20/2021-23:14:28] [TRT] [V] intermediate(t) -> (2, 10, 128)
[12/20/2021-23:14:28] [TRT] [V] c(t) -> (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] C(t) -> (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] H(t) -> (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] Concatenated output shape: (2, 10, 32)
[12/20/2021-23:14:28] [TRT] [V] Forward pass shape: (0, 0, 0)
[12/20/2021-23:14:28] [TRT] [V] Reverse pass shape: (0, 0, 0)
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 138 for ONNX tensor: 138
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 139 for ONNX tensor: 139
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 140 for ONNX tensor: 140
[12/20/2021-23:14:28] [TRT] [V] LSTM_9 [LSTM] outputs: [138 -> (16, 2, 10, 32)[FLOAT]], [139 -> (2, 10, 32)[FLOAT]], [140 -> (2, 10, 32)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Transpose_10 [Transpose]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 138
[12/20/2021-23:14:28] [TRT] [V] Transpose_10 [Transpose] inputs: [138 -> (16, 2, 10, 32)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Transpose_10 for ONNX node: Transpose_10
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 141 for ONNX tensor: 141
[12/20/2021-23:14:28] [TRT] [V] Transpose_10 [Transpose] outputs: [141 -> (16, 10, 2, 32)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Constant_11 [Constant]
[12/20/2021-23:14:28] [TRT] [V] Constant_11 [Constant] inputs:
[12/20/2021-23:14:28] [TRT] [V] Constant_11 [Constant] outputs: [142 -> (3)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Reshape_12 [Reshape]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 141
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 142
[12/20/2021-23:14:28] [TRT] [V] Reshape_12 [Reshape] inputs: [141 -> (16, 10, 2, 32)[FLOAT]], [142 -> (3)[INT32]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Reshape_12 for ONNX node: Reshape_12
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 143 for ONNX tensor: 143
[12/20/2021-23:14:28] [TRT] [V] Reshape_12 [Reshape] outputs: [143 -> (16, 10, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Parsing node: Transpose_13 [Transpose]
[12/20/2021-23:14:28] [TRT] [V] Searching for input: 143
[12/20/2021-23:14:28] [TRT] [V] Transpose_13 [Transpose] inputs: [143 -> (16, 10, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Registering layer: Transpose_13 for ONNX node: Transpose_13
[12/20/2021-23:14:28] [TRT] [V] Registering tensor: 144_0 for ONNX tensor: 144
[12/20/2021-23:14:28] [TRT] [V] Transpose_13 [Transpose] outputs: [144 -> (10, 16, 64)[FLOAT]],
[12/20/2021-23:14:28] [TRT] [V] Marking 144_0 as output: 144
[12/20/2021-23:14:28] [TRT] [V] Applying generic optimizations to the graph for inference.
[12/20/2021-23:14:28] [TRT] [V] Original: 46 layers
[12/20/2021-23:14:28] [TRT] [V] After dead-layer removal: 46 layers
[12/20/2021-23:14:28] [TRT] [V] Running: ConstShuffleFusion
[12/20/2021-23:14:28] [TRT] [V] ConstShuffleFusion: Fusing (Unnamed Layer* 9) [Constant] with (Unnamed Layer* 10) [Shuffle]
[12/20/2021-23:14:28] [TRT] [V] Running: ConstShuffleFusion
[12/20/2021-23:14:28] [TRT] [V] ConstShuffleFusion: Fusing 187 with (Unnamed Layer* 17) [Shuffle]
[12/20/2021-23:14:28] [TRT] [V] Running: ShuffleShuffleFusion
[12/20/2021-23:14:28] [TRT] [V] ShuffleShuffleFusion: Fusing Transpose_0 with Squeeze_1
[12/20/2021-23:14:28] [TRT] [V] Running: ShuffleShuffleFusion
[12/20/2021-23:14:28] [TRT] [V] ShuffleShuffleFusion: Fusing Transpose_0 + Squeeze_1 with Transpose_8
[12/20/2021-23:14:28] [TRT] [V] Running: ShuffleShuffleFusion
[12/20/2021-23:14:28] [TRT] [V] ShuffleShuffleFusion: Fusing Transpose_10 with Reshape_12
[12/20/2021-23:14:28] [TRT] [V] Running: ShuffleShuffleFusion
[12/20/2021-23:14:28] [TRT] [V] ShuffleShuffleFusion: Fusing Transpose_10 + Reshape_12 with Transpose_13
[12/20/2021-23:14:28] [TRT] [V] After Myelin optimization: 1 layers
[12/20/2021-23:14:28] [TRT] [V] Applying ScaleNodes fusions.
[12/20/2021-23:14:28] [TRT] [V] After scale fusion: 1 layers
[12/20/2021-23:14:28] [TRT] [V] After vertical fusions: 1 layers
[12/20/2021-23:14:28] [TRT] [V] After dupe layer removal: 1 layers
[12/20/2021-23:14:28] [TRT] [V] After final dead-layer removal: 1 layers
[12/20/2021-23:14:28] [TRT] [V] After tensor merging: 1 layers
[12/20/2021-23:14:28] [TRT] [V] After concat removal: 1 layers
[12/20/2021-23:14:28] [TRT] [V] Graph construction and optimization completed in 0.00200262 seconds.
[12/20/2021-23:14:29] [TRT] [V] Using cublasLt as a tactic source
[12/20/2021-23:14:29] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU -4027753276, GPU +70, now: CPU 572, GPU 348 (MiB)
[12/20/2021-23:14:29] [TRT] [V] Using cuDNN as a tactic source
[12/20/2021-23:14:29] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +136, GPU +88, now: CPU 708, GPU 436 (MiB)
[12/20/2021-23:14:29] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[12/20/2021-23:14:29] [TRT] [V] Constructing optimization profile number 0 [1/1].
[12/20/2021-23:14:29] [TRT] [V] Reserving memory for activation tensors. Host: 0 bytes Device: 81920 bytes
[12/20/2021-23:14:29] [TRT] [V] =============== Computing reformatting costs
[12/20/2021-23:14:29] [TRT] [V] =============== Computing reformatting costs
[12/20/2021-23:14:29] [TRT] [V] =============== Computing costs for
[12/20/2021-23:14:29] [TRT] [V] *************** Autotuning format combination: Float(1024,16,16,1) -> Float(1024,64,1) ***************
[12/20/2021-23:14:29] [TRT] [V] --------------- Timing Runner: {ForeignNode[(Unnamed Layer* 61) [LoopOutput][length][Constant]...Transpose_10 + Reshape_12 + Transpose_13]} (Myelin)
python: /root/gpgpu/MachineLearning/myelin/src/compiler/optimizer/formats.cpp:3052: bool myelin::ir::no_data_move(const myelin::tensor_descriptor_t*, const std::vector&): Assertion `perm[i] >= 0 && perm[i] < (int) out->get_const_dimensions().size()' failed.