Is it possible to fuse gather-gemm-scatter using cuDNN?

My program first gathers the input, then performs a GEMM, and finally scatters the GEMM result to produce the output. I want to fuse these three kernels, and I found that CUTLASS provides such a fusion. Is it possible to implement this fusion using cuDNN?
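For concreteness, here is a minimal unfused reference of the pattern in plain Python. It is only a sketch of the semantics, not GPU code and not a cuDNN or CUTLASS API. The function name, the index lists, row-wise gather and scatter, and overwrite (rather than accumulate) on scatter are all assumptions made for illustration.

```python
def gather_gemm_scatter(x, w, gather_idx, scatter_idx, out_rows):
    """Unfused reference: out[scatter_idx[i]] = x[gather_idx[i]] @ w.

    Hypothetical helper for illustration only. Assumes row-wise gather
    on the input and row-wise scatter (overwrite) on the output.
    """
    n_cols = len(w[0])

    # Step 1 (gather): pick the selected input rows.
    gathered = [x[i] for i in gather_idx]

    # Step 2 (GEMM): dense product of the gathered rows with w.
    product = [
        [sum(a * w[k][j] for k, a in enumerate(row)) for j in range(n_cols)]
        for row in gathered
    ]

    # Step 3 (scatter): write each result row to its destination row.
    # Rows that are never written stay zero.
    out = [[0.0] * n_cols for _ in range(out_rows)]
    for src, dst in enumerate(scatter_idx):
        out[dst] = list(product[src])
    return out


x = [[1, 2], [3, 4], [5, 6]]
w = [[1, 2], [3, 4]]
result = gather_gemm_scatter(x, w, gather_idx=[2, 0], scatter_idx=[1, 3], out_rows=4)
```

A fused kernel would apply the same index lists while loading and storing GEMM tiles, so the intermediate `gathered` and `product` buffers would never be written to global memory.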

Thanks for the request! This is not supported today, but it’s under consideration for our future roadmap.

Do you have any context that you’d be able to share about your use case? For example, which workload and which framework?
