Hi NVIDIA Team,
I am working on registering a plugin for an operator (Einsum) that is not currently supported in TensorRT. Instead of implementing a CUDA kernel, I want to use the cuBLAS library for batched matrix multiplication.
The equations I want to implement (from the Einsum operator) are:
"ntg, ncg -> nct" and "nct, ncp -> ntp" (both are batched matrix multiplications)
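To make the two equations concrete, here is a minimal CPU reference sketch of what each one computes. It assumes row-major, contiguous float tensors stored in flat vectors; the function names einsum_ntg_ncg_nct and einsum_nct_ncp_ntp are made up for illustration and are not part of TensorRT or cuBLAS. It is only meant as a correctness reference to compare a cuBLAS-based plugin against, not as the plugin itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "ntg,ncg->nct": out[n][c][t] = sum over g of a[n][t][g] * b[n][c][g]
// Assumption: a is N x T x G, b is N x C x G, both row-major and contiguous.
std::vector<float> einsum_ntg_ncg_nct(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t N, std::size_t T,
                                      std::size_t C, std::size_t G) {
    assert(a.size() == N * T * G);
    assert(b.size() == N * C * G);
    std::vector<float> out(N * C * T, 0.0f);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t c = 0; c < C; ++c)
            for (std::size_t t = 0; t < T; ++t) {
                float acc = 0.0f;
                for (std::size_t g = 0; g < G; ++g)
                    acc += a[(n * T + t) * G + g] * b[(n * C + c) * G + g];
                out[(n * C + c) * T + t] = acc;
            }
    return out;
}

// "nct,ncp->ntp": out[n][t][p] = sum over c of a[n][c][t] * b[n][c][p]
// Assumption: a is N x C x T, b is N x C x P, both row-major and contiguous.
std::vector<float> einsum_nct_ncp_ntp(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t N, std::size_t C,
                                      std::size_t T, std::size_t P) {
    assert(a.size() == N * C * T);
    assert(b.size() == N * C * P);
    std::vector<float> out(N * T * P, 0.0f);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t t = 0; t < T; ++t)
            for (std::size_t p = 0; p < P; ++p) {
                float acc = 0.0f;
                for (std::size_t c = 0; c < C; ++c)
                    acc += a[(n * C + c) * T + t] * b[(n * C + c) * P + p];
                out[(n * T + t) * P + p] = acc;
            }
    return out;
}
```

Per batch n, the first equation is the product B[n] * A[n]^T (a C x G matrix times a G x T matrix) and the second is A[n]^T * B[n] (a T x C matrix times a C x P matrix). So I assume each one should map to a single strided batched GEMM with one operand transposed, but I am unsure of the correct transpose flags, leading dimensions, and strides given that cuBLAS is column-major.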
Information about the Einsum op: https://github.com/onnx/onnx/blob/master/docs/Operators.md#Einsum
I need guidance on using the cuBLAS library for batched matrix multiplication for the above two equations.
I have been going through the available references (cuBLAS :: CUDA Toolkit Documentation, https://developer.nvidia.com/blog/cublas-strided-batched-matrix-multiply/), but I cannot work out how to apply them to the above equations.
Could you please help me with this?
Thanks in advance,
Darshan C G