Hi NVIDIA Team,
I am working on registering a plugin for an operator (Einsum) that is not currently supported in TensorRT. Instead of implementing a CUDA kernel, I want to use the cuBLAS library for batched matrix multiplication.
The equations I want to implement (from the Einsum operator) are:
"ntg, ncg -> nct" and "nct, ncp -> ntp" (both are batched matrix multiplications)
Info about Einsum op: onnx/Operators.md at master · onnx/onnx · GitHub
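To make the intended semantics concrete, here is a naive CPU reference of how I read the two equations. It is only a minimal sketch, assuming contiguous row-major float tensors; the function names are my own (hypothetical) and do not come from any library. Per batch n, the first equation is a (c x g) times (g x t) product with the first input transposed, and the second is a (t x c) times (c x p) product with its first input transposed. These are the products I would like to map onto a batched GEMM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference for "ntg,ncg->nct":
//   out[n][c][t] = sum over g of a[n][t][g] * b[n][c][g]
// Assumes contiguous row-major storage for all tensors.
std::vector<float> einsum_ntg_ncg_nct(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t N, std::size_t T,
                                      std::size_t G, std::size_t C) {
    std::vector<float> out(N * C * T, 0.0f);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t c = 0; c < C; ++c)
            for (std::size_t t = 0; t < T; ++t)
                for (std::size_t g = 0; g < G; ++g)
                    out[(n * C + c) * T + t] +=
                        a[(n * T + t) * G + g] * b[(n * C + c) * G + g];
    return out;
}

// Hypothetical reference for "nct,ncp->ntp":
//   out[n][t][p] = sum over c of a[n][c][t] * b[n][c][p]
// Assumes contiguous row-major storage for all tensors.
std::vector<float> einsum_nct_ncp_ntp(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t N, std::size_t C,
                                      std::size_t T, std::size_t P) {
    std::vector<float> out(N * T * P, 0.0f);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t t = 0; t < T; ++t)
            for (std::size_t p = 0; p < P; ++p)
                for (std::size_t c = 0; c < C; ++c)
                    out[(n * T + t) * P + p] +=
                        a[(n * C + c) * T + t] * b[(n * C + c) * P + p];
    return out;
}
```

A cuBLAS-based version could be checked against these loops on small inputs before it is wired into the plugin.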
I need guidance on using the cuBLAS library to perform batched matrix multiplication for the two equations above.
I have been reading the available references (cuBLAS :: CUDA Toolkit Documentation; Pro Tip: cuBLAS Strided Batched Matrix Multiply | NVIDIA Developer Blog), but I do not understand how to apply them to the equations above.
Could you please help me with this?
Thanks in advance,
Darshan C G