Hello NVIDIA experts,
I'm studying the implicit GEMM implementation of conv2d in CUTLASS.
I followed the “fprop_per_channel_bias” example in the CUTLASS examples.
In cutlass/include/cutlass/conv/kernel/default_conv2d_fprop.h, around line 171, the code is as follows:
  static cutlass::arch::CacheOperation::Kind const CacheOpB =
    ((sizeof_bits<ElementB>::value * AlignmentB) == 128)
      ? cutlass::arch::CacheOperation::Global
      : cutlass::arch::CacheOperation::Always;

  // Define the Mma
  using Mma = threadblock::ImplicitGemmMultistage<
    ThreadblockShape,
    IteratorA,
    SmemIteratorA,
    arch::CacheOperation::Always,
    IteratorB,
    SmemIteratorB,
    CacheOpB,
    MmaPolicy,
    Stages
  >;
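From this, the CacheOperation for A is always CacheOperation::Always, while for B it is CacheOperation::Global when sizeof_bits<ElementB>::value * AlignmentB equals 128, and CacheOperation::Always otherwise.
A is the activation and B is the filter.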
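To check my reading of the CacheOpB condition, here is a minimal standalone sketch in plain C++ that does not include any CUTLASS headers. The enum, `sizeof_bits`, `select_cache_op_b`, and `half_standin` below are stand-ins I wrote for illustration and are not the CUTLASS definitions; only the condition itself is copied from the snippet above.

```cpp
#include <cstdint>

// Hypothetical stand-in for cutlass::arch::CacheOperation::Kind
// (only the two values that appear in the snippet above).
enum class CacheOperation { Always, Global };

// Hypothetical stand-in for cutlass::sizeof_bits: size of T in bits.
template <typename T>
struct sizeof_bits {
  static constexpr int value = static_cast<int>(sizeof(T) * 8);
};

// Same condition as CacheOpB above: Global only when one access
// (element width in bits * alignment in elements) is exactly 128 bits.
template <typename ElementB, int AlignmentB>
constexpr CacheOperation select_cache_op_b() {
  return (sizeof_bits<ElementB>::value * AlignmentB == 128)
             ? CacheOperation::Global
             : CacheOperation::Always;
}

// 16-bit stand-in for a half-precision element (assumption for illustration).
using half_standin = std::uint16_t;

// 16 bits * 8 elements = 128 bits -> Global
static_assert(select_cache_op_b<half_standin, 8>() == CacheOperation::Global,
              "128-bit access should select Global");
// 16 bits * 2 elements = 32 bits -> Always
static_assert(select_cache_op_b<half_standin, 2>() == CacheOperation::Always,
              "32-bit access should select Always");
// 32 bits * 4 elements = 128 bits -> Global
static_assert(select_cache_op_b<float, 4>() == CacheOperation::Global,
              "128-bit access should select Global");
```

So, if I read it correctly, B only gets CacheOperation::Global when each access is exactly 128 bits wide (for example 8 half elements or 4 float elements), and falls back to CacheOperation::Always for narrower accesses, while A uses CacheOperation::Always regardless of its alignment.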
Why does CUTLASS configure the cache operations for A and B this way?
Could anyone explain the reasoning?