Hello,
I have recently been working on a numerical acoustics project using the traditional boundary element method. The solution to the problem can be divided into the following two parts:

Compute a matrix A. This is a large, dense, complex matrix of size around 10000 x 10000. It is not guaranteed to be square, because additional points are added for some special frequencies. Every element of the matrix has to be computed from the geometry of an object, but since each element is independent of the others, a GPU can be used to compute them in parallel.
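To make this step concrete, here is a minimal host-side sketch of the assembly. The names `Point`, `kernel_entry` and `assemble` are hypothetical, and the free-space Helmholtz Green's function is only an assumed stand-in for the real boundary element integrand. The point is the structure: entry (i, j) reads only its own two points, so the flat loop over `idx` could in principle become one GPU thread per element.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

struct Point { double x, y, z; };

// Assumed stand-in for the real BEM integrand: the free-space Helmholtz
// Green's function exp(i k r) / (4 pi r) between two distinct points.
cplx kernel_entry(const Point& p, const Point& q, double k) {
    const double pi = 3.14159265358979323846;
    const double dx = p.x - q.x, dy = p.y - q.y, dz = p.z - q.z;
    const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
    assert(r > 0.0);  // coincident points need singular treatment, not shown here
    return std::exp(cplx(0.0, k * r)) / (4.0 * pi * r);
}

// Fill an m x n matrix (row-major, m and n may differ). Entry (i, j) depends
// only on field[i] and source[j], so no iteration reads another's result.
std::vector<cplx> assemble(const std::vector<Point>& field,
                           const std::vector<Point>& source, double k) {
    const std::size_t m = field.size(), n = source.size();
    std::vector<cplx> A(m * n);
    for (std::size_t idx = 0; idx < m * n; ++idx) {
        const std::size_t i = idx / n, j = idx % n;  // flat index -> (row, column)
        A[idx] = kernel_entry(field[i], source[j], k);
    }
    return A;
}
```

Calling `assemble(field, source, k)` with 3 field points and 2 source points returns a 3 x 2 matrix stored as 6 contiguous values, which is the kind of plain buffer that can be copied between host and device.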

Solve the least-squares problem min_x ||Ax - b||_2. Since A is much larger than what I can allocate on the GPU, the CUDA-provided libraries (cuSolver especially) are not applicable, or at least not efficient. What I plan to do is solve this problem separately on the host, because a problem of this size fits in host virtual memory. However, I don't know whether there is any BLAS library that can be used together with CUDA. I tried Eigen, but the build failed.
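For reference, this is a minimal sketch of the host-side solve I have in mind, written with the standard library only so that it does not depend on any particular BLAS package. It uses modified Gram-Schmidt QR and assumes a row-major matrix with at least as many rows as columns and full column rank. The function name `least_squares` is hypothetical. For the real 10000 x 10000 problem a tuned library routine (LAPACK's `zgels`, for example) would be the natural replacement for this loop nest.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Solve min_x ||A x - b||_2 for a row-major m x n complex matrix (m >= n,
// full column rank) by modified Gram-Schmidt QR. A and b are taken by value
// because they are overwritten during the factorization.
std::vector<cplx> least_squares(std::vector<cplx> A, std::vector<cplx> b,
                                std::size_t m, std::size_t n) {
    assert(m >= n && A.size() == m * n && b.size() == m);
    std::vector<cplx> R(n * n, cplx(0.0)), c(n), x(n);
    for (std::size_t j = 0; j < n; ++j) {
        // Normalize column j; it becomes column j of Q.
        double norm2 = 0.0;
        for (std::size_t i = 0; i < m; ++i) norm2 += std::norm(A[i * n + j]);
        const double rjj = std::sqrt(norm2);
        assert(rjj > 0.0);  // rank-deficient input is not handled in this sketch
        R[j * n + j] = rjj;
        for (std::size_t i = 0; i < m; ++i) A[i * n + j] /= rjj;
        // c_j = q_j^H b, then remove that component from b.
        cplx cj(0.0);
        for (std::size_t i = 0; i < m; ++i) cj += std::conj(A[i * n + j]) * b[i];
        c[j] = cj;
        for (std::size_t i = 0; i < m; ++i) b[i] -= A[i * n + j] * cj;
        // Orthogonalize the remaining columns against q_j.
        for (std::size_t l = j + 1; l < n; ++l) {
            cplx r(0.0);
            for (std::size_t i = 0; i < m; ++i)
                r += std::conj(A[i * n + j]) * A[i * n + l];
            R[j * n + l] = r;
            for (std::size_t i = 0; i < m; ++i) A[i * n + l] -= A[i * n + j] * r;
        }
    }
    // Back substitution on the upper-triangular system R x = c.
    for (std::size_t j = n; j-- > 0;) {
        cplx s = c[j];
        for (std::size_t l = j + 1; l < n; ++l) s -= R[j * n + l] * x[l];
        x[j] = s / R[j * n + j];
    }
    return x;
}
```

The function only needs a contiguous host buffer of complex values, so a matrix computed on the GPU and copied back to the host could be passed in directly.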
As I understand it, using CUDA should not mean having to abandon all host-based libraries. Does anyone have experience using a host library together with CUDA? Thank you for reading this. I appreciate your interest in this problem.
Best,
Ziqi