Hi guys,
I have a model with 130 ops in total, 91 of which were made sparse with the ASP tool. After converting the sparse ONNX model to a TensorRT engine with trtexec, only 43 ops call a "…sparse_conv…" kernel. So trtexec does not map all of the sparse ops to TensorRT sparse kernels. The hardware is an RTX 3080.
A sparse kernel looks like “sm80_xmma_fprop_sparse_conv_f16f16_f16f16_f16_nhwckrsc_nhwc_tilesize128x128x64_stage3_warpsize2x2x1_g1_sptensor16x8x32_t1r1s1_execute_kernel_trt”.
A dense kernel looks like “trt_ampere_h1688cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1” or “sm80_xmma_fprop_implicit_gemm_f16f16_f16f16_f16_nhwckrsc_nhwc_tilesize64x32x64_stage5_warpsize2x2x1_g1_tensor16x8x16_simple_t1r1s1_execute_kernel_trt”.
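A minimal sketch of how the sparse/dense split could be counted from a list of kernel names, assuming (as in the names above) that a sparse kernel can be recognised by the substring "sparse_conv". The helper names `classify_kernel` and `count_kernels` are hypothetical, and how the kernel names are collected (profiler output, engine inspection, etc.) is left out.

```python
from collections import Counter

# Assumption: sparse kernels contain this substring, as in the example name above.
SPARSE_MARKER = "sparse_conv"


def classify_kernel(name: str) -> str:
    """Label a kernel name as 'sparse' or 'dense' based on the marker substring."""
    return "sparse" if SPARSE_MARKER in name else "dense"


def count_kernels(names):
    """Count how many kernel names fall into each category."""
    return Counter(classify_kernel(n) for n in names)


# The three kernel names quoted in this post.
kernels = [
    "sm80_xmma_fprop_sparse_conv_f16f16_f16f16_f16_nhwckrsc_nhwc_"
    "tilesize128x128x64_stage3_warpsize2x2x1_g1_sptensor16x8x32_t1r1s1_execute_kernel_trt",
    "trt_ampere_h1688cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1",
    "sm80_xmma_fprop_implicit_gemm_f16f16_f16f16_f16_nhwckrsc_nhwc_"
    "tilesize64x32x64_stage5_warpsize2x2x1_g1_tensor16x8x16_simple_t1r1s1_execute_kernel_trt",
]

counts = count_kernels(kernels)
print(counts["sparse"], counts["dense"])
```

Run over the full per-layer kernel list, the same count would give the number of ops that actually ended up on a sparse kernel.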
Please correct me if I’m wrong.
The Sparse Conv2d kernel:
The Dense Conv2d kernel:
So what conditions does an op have to meet for TensorRT to run it with a sparse kernel? Thanks.
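One thing that can be checked independently of TensorRT is whether a given weight really follows the 2:4 pattern. The sketch below assumes ASP prunes weights to 2:4 structured sparsity (at most 2 non-zero values in every aligned group of 4 along the pruned dimension) and that the weight has already been flattened into rows along that dimension. The function name `is_2_4_sparse` is hypothetical, and this is only a sanity check on the weights, not a description of how TensorRT selects kernels.

```python
def is_2_4_sparse(row, group=4, max_nonzero=2):
    """Return True if every aligned group of `group` values in `row`
    has at most `max_nonzero` non-zero entries."""
    if len(row) % group != 0:
        # Rows that are not a multiple of the group size cannot be fully 2:4.
        return False
    for start in range(0, len(row), group):
        nonzero = sum(1 for v in row[start:start + group] if v != 0)
        if nonzero > max_nonzero:
            return False
    return True


# Illustrative values only, not taken from the model in this post.
pruned_row = [0.0, 1.5, 0.0, -0.3, 2.0, 0.0, 0.0, 0.7]
dense_row = [0.1, 1.5, 0.2, -0.3, 2.0, 0.0, 0.0, 0.7]

print(is_2_4_sparse(pruned_row), is_2_4_sparse(dense_row))
```

If some of the 91 ops fail a check like this, they would not be candidates for a sparse kernel in the first place; if they all pass, the remaining difference would be down to TensorRT's own selection.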