I’m looking for someone knowledgeable to help me accelerate my matrix multiplication code, which is written in C++ with the Eigen library, by using BLAS on an NVIDIA GPU.
System setup:
- Platform: Intel/oneAPI
- Version: Basekit 2023.0.0-devel
- Operating System: Ubuntu 20.04
In other words, I want to use a CUDA-backed BLAS so the multiplication runs on the GPU, without modifying my Eigen-based C++ code. I tried to set this up through CMake using the link-and-drop-in approach, but I’m struggling with it. Can anyone help?