When will an op with a 2:4 sparsity weight call a sparse_conv kernel in TensorRT?

Hi guys,

I have a model with 91 sparse ops (130 ops in total), generated with the ASP tool. After converting the sparse ONNX model to a TRT engine, only 43 ops end up calling a "…sparse_conv…" kernel. In other words, trtexec does not map all of the sparse ops to TRT sparse kernels. The hardware is an RTX 3080.
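For context, the conversion is done with a trtexec command roughly like the one below (the file names here are placeholders; `--sparsity=enable` tells the builder to consider sparse tactics for weights that already follow the 2:4 pattern):

```
trtexec --onnx=model_sparse.onnx \
        --fp16 \
        --sparsity=enable \
        --saveEngine=model_sparse.plan
```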

The sparse kernel looks like “sm80_xmma_fprop_sparse_conv_f16f16_f16f16_f16_nhwckrsc_nhwc_tilesize128x128x64_stage3_warpsize2x2x1_g1_sptensor16x8x32_t1r1s1_execute_kernel_trt”.
The dense kernels look like “trt_ampere_h1688cudnn_256x64_ldg8_relu_exp_small_nhwc_tn_v1” or “sm80_xmma_fprop_implicit_gemm_f16f16_f16f16_f16_nhwckrsc_nhwc_tilesize64x32x64_stage5_warpsize2x2x1_g1_tensor16x8x16_simple_t1r1s1_execute_kernel_trt”.
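(For reference, kernel names like these can be captured by profiling the engine with Nsight Systems, for example along these lines, with the engine file name again being a placeholder:)

```
# Inspect the CUDA kernel names in the resulting timeline
nsys profile -o sparse_trace trtexec --loadEngine=model_sparse.plan
```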

Please correct me if I’m wrong.

The Sparse Conv2d kernel:

The Dense Conv2d kernel:

So what conditions must an op meet for TensorRT to pick the sparse kernel? Thanks.

Hi @spolisetty, any comments on this topic?

Hi,

Usually, for those specific convolutions, a dense conv kernel is faster than the sparse conv kernel, and TRT always picks the fastest available tactic, so it chooses the dense kernel.
Sparse kernels only outperform dense kernels when the problem size is large enough, which roughly means that both channel dimensions (C and K) are greater than 256.
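As a rough illustration of that rule of thumb, a minimal sketch like the following (the ONNX file name is a placeholder) lists which Conv ops in your model have channel dimensions large enough for a sparse kernel to be worthwhile:

```python
import onnx

# Minimal sketch: apply the C/K > 256 rule of thumb to every Conv op.
model = onnx.load("model_sparse.onnx")
weights = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type != "Conv":
        continue
    w = weights.get(node.input[1])
    if w is None:  # weight comes from another node, not an initializer
        continue
    # ONNX Conv weights are laid out as K x (C / groups) x R x S
    k, c_per_group = w.dims[0], w.dims[1]
    verdict = ("sparse kernel plausible"
               if k > 256 and c_per_group > 256
               else "dense kernel likely faster")
    print(f"{node.name or node.output[0]}: K={k}, C/groups={c_per_group} -> {verdict}")
```

Ops below that threshold will generally keep using the dense kernels you saw, simply because those are faster for that problem size.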

Thank you.