Hello NVIDIA experts,
I'm studying CUTLASS, specifically the implicit GEMM implementation of conv2d.
I have been following the "fprop_per_channel_bias" example in the CUTLASS examples.
In cutlass/include/cutlass/conv/kernel/default_conv2d_fprop.h, line 171, the code is as follows:
static cutlass::arch::CacheOperation::Kind const CacheOpB =
    ((sizeof_bits<ElementB>::value * AlignmentB) == 128)
        ? cutlass::arch::CacheOperation::Global
        : cutlass::arch::CacheOperation::Always;

// Define the Mma
using Mma = threadblock::ImplicitGemmMultistage<
    ThreadblockShape,
    IteratorA,
    SmemIteratorA,
    arch::CacheOperation::Always,
    IteratorB,
    SmemIteratorB,
    CacheOpB,
    MmaPolicy,
    Stages
>;
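So the CacheOperation for A is always CacheOperation::Always, while for B it is CacheOperation::Global when each access is 128 bits wide (sizeof_bits<ElementB>::value * AlignmentB == 128) and CacheOperation::Always otherwise.
Here A is the activation and B is the filter.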
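To check that I am reading the ternary correctly, here is a minimal standalone sketch of the selection logic for B. The enum CacheOp and the function select_cache_op_b are hypothetical names I made up for illustration and are not part of CUTLASS; only the 128-bit condition is taken from the code above. One detail that may be relevant: as far as I know, the PTX cp.async.cg form only supports 16-byte (128-bit) copies, which could be why Global is only chosen in that case.

```cpp
// Hypothetical stand-in for cutlass::arch::CacheOperation::Kind
// (only the two values that appear in the snippet above).
enum class CacheOp { Always, Global };

// Mirrors the quoted ternary: Global is picked only when one access
// (element width in bits * alignment in elements) is exactly 128 bits.
constexpr CacheOp select_cache_op_b(int element_bits, int alignment) {
  return (element_bits * alignment == 128) ? CacheOp::Global
                                           : CacheOp::Always;
}

// 16-bit elements with alignment 8: 16 * 8 = 128 bits
static_assert(select_cache_op_b(16, 8) == CacheOp::Global,
              "128-bit access should select Global");
// 32-bit elements with alignment 4: 32 * 4 = 128 bits
static_assert(select_cache_op_b(32, 4) == CacheOp::Global,
              "128-bit access should select Global");
// 16-bit elements with alignment 4: 16 * 4 = 64 bits
static_assert(select_cache_op_b(16, 4) == CacheOp::Always,
              "narrower access should fall back to Always");
```

So with 16-bit filter elements and AlignmentB = 8, B would use Global, whereas A uses Always regardless of its element type and alignment.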
I don't understand why CUTLASS configures the cache operations this way, in particular why the activation (A) and the filter (B) are treated differently.
Could anyone explain the reasoning?