Seeking Expertise in Accelerating Eigen C++ Code with NVIDIA GPU & BLAS

I’m looking for someone knowledgeable to help me accelerate my matrix multiplication code, which is written in C++ with the Eigen library, by using BLAS on an NVIDIA GPU.

System Requirements:

  • Platform: Intel/oneAPI
  • Version: Basekit 2023.0.0-devel
  • Operating System: Ubuntu 20.04

In other words, I want to use a CUDA-backed BLAS so that the multiplication runs on the GPU, without modifying my Eigen-based C++ code. I tried to set this up in CMake using the link-and-drop-in method, but I’m struggling with it. Can anyone help?
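For context on what "drop-in" means here: when Eigen is compiled with the EIGEN_USE_BLAS macro defined, it forwards large dense products to external BLAS routines such as dgemm_, and a drop-in library like NVIDIA's NVBLAS works by providing those same Level-3 symbols at link time. The sketch below is not Eigen or NVBLAS code. It is a plain C++ stand-in, under the hypothetical name sketch_dgemm, that only mirrors the Fortran-style dgemm_ argument layout (everything passed by pointer, column-major storage) to show the kind of call that would be intercepted. It handles the no-transpose case only, and demo_product is likewise a made-up helper for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in that mirrors the argument layout of the Fortran BLAS
// routine dgemm_: every argument is a pointer and matrices are column-major.
// Computes C = alpha * A * B + beta * C for the no-transpose case only.
void sketch_dgemm(const char* transa, const char* transb,
                  const int* m, const int* n, const int* k,
                  const double* alpha, const double* a, const int* lda,
                  const double* b, const int* ldb,
                  const double* beta, double* c, const int* ldc)
{
    assert(*transa == 'N' && *transb == 'N');  // this sketch skips transposes
    const int rows = *m, cols = *n, inner = *k;
    const int sa = *lda, sb = *ldb, sc = *ldc;  // leading dimensions (column strides)
    assert(sa >= rows && sb >= inner && sc >= rows);

    for (int j = 0; j < cols; ++j) {
        for (int i = 0; i < rows; ++i) {
            double acc = 0.0;
            for (int p = 0; p < inner; ++p) {
                acc += a[i + p * sa] * b[p + j * sb];
            }
            c[i + j * sc] = (*alpha) * acc + (*beta) * c[i + j * sc];
        }
    }
}

// Multiplies two 2x2 column-major matrices through the BLAS-style entry point.
std::vector<double> demo_product()
{
    const int n = 2;
    const double alpha = 1.0, beta = 0.0;
    const std::vector<double> a = {1.0, 3.0, 2.0, 4.0};  // [[1, 2], [3, 4]]
    const std::vector<double> b = {5.0, 7.0, 6.0, 8.0};  // [[5, 6], [7, 8]]
    std::vector<double> c(4, 0.0);
    sketch_dgemm("N", "N", &n, &n, &n, &alpha,
                 a.data(), &n, b.data(), &n, &beta, c.data(), &n);
    return c;  // column-major result of A * B
}
```

With a real setup, the Eigen source stays unchanged: the build defines EIGEN_USE_BLAS and links a BLAS library, and the product `A * B` on large matrices ends up in a dgemm_ call of this shape. Which library resolves that symbol is decided purely at link or load time, which is why the CMake link order matters for this approach.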

Platform: Intel/oneAPI

That is not CUDA and not an NVIDIA product. You might want to try the Intel developer forums.

This may be of interest.