Hi, guys,
I found that large-kernel depthwise convolutions (LKDWconvs) implemented with torch.nn.Conv2d (pytorch/issues/85252) are not as fast as their counterpart in megengine.
My guess is that this is because torch.nn.Conv2d is based on cuDNN, while megengine uses a self-developed CUDA operator specially optimized for LKDWconvs.
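For reference, here is a minimal sketch of how the PyTorch side could be timed. The helper names build_lkdw_conv and time_forward, the 31x31 kernel, and the tensor shapes are all my own assumptions for illustration, not taken from the linked issue; it only measures torch.nn.Conv2d and does not include the megengine counterpart.

```python
import time

import torch
import torch.nn as nn


def build_lkdw_conv(channels: int, kernel_size: int) -> nn.Conv2d:
    """Depthwise conv: groups == channels, with 'same' padding for odd kernels."""
    return nn.Conv2d(
        channels, channels, kernel_size,
        padding=kernel_size // 2, groups=channels, bias=False,
    )


def time_forward(module: nn.Module, x: torch.Tensor,
                 warmup: int = 3, iters: int = 10) -> float:
    """Return the mean forward time in milliseconds."""
    use_cuda = x.is_cuda
    with torch.no_grad():
        for _ in range(warmup):
            module(x)
        if use_cuda:
            # CUDA kernels launch asynchronously, so wait before starting the clock.
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            module(x)
        if use_cuda:
            # Wait for all queued kernels to finish before stopping the clock.
            torch.cuda.synchronize()
    return (time.perf_counter() - start) * 1000 / iters


if __name__ == "__main__":
    # Hypothetical sizes; adjust to match the model being profiled.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    conv = build_lkdw_conv(32, 31).to(device)
    x = torch.randn(4, 32, 56, 56, device=device)
    print(f"{device}: {time_forward(conv, x):.3f} ms per forward")
```

Running the same shapes through the megengine operator would give a like-for-like comparison; the numbers depend heavily on GPU, cuDNN version, and input size, so I have not listed any here.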
Your answer will be appreciated!