I am using cuSolver to solve 3104 linear systems AX = B, where each A is 186x186 and each B is a vector of length 186.
I would like to just use a batched function, but I don't see any batched functions for cuSolverDN like there are for cuFFT.
Instead, I am trying to launch a kernel that calls cusolverDnSpotrs() in each thread, but I get an error saying “calling a __host__ function("cusolverDnSpotrs") from a __global__ function("kernelCall") is not allowed”.
I have done this before with cuBLAS and it works wonderfully. Even the cuBLAS documentation says “the recommended programming model is to create one CUBLAS handle per thread and use that CUBLAS handle for the entire life of the thread”.
I am also trying to create a cuSolver handle per thread, but that gives me the same kind of error as above: “calling a __host__ function("cusolverDnCreate") from a __global__ function("kernelCall") is not allowed”.
I really don’t want a 3104-iteration loop on the host that runs the solver once per system. I understand that I could use streams to overlap the calls, but there has got to be a better way.
Could someone point me in the right direction?