Hi, I’m currently running cuSOLVER on 28x28 matrices, using gesvdj and syevj for comparison. For some reason, I start to get incorrect values from gesvdj when I use a batch size greater than 128. It also happens with smaller matrix sizes.
Is there any reason why this might happen?
I'm fairly sure this is well below the allowed memory limit. The code is attached (it is based on an example I found here). I am running on sm_87 on the Jetson Orin. example.zip (11.9 KB)
I’ve run your test case (512 matrices) on both CUDA 11.4 and CUDA 12.0. The output data differs slightly between the two, but either way some of the gesvdj output is zeros. Furthermore, running your code under compute-sanitizer shows invalid writes in one of the cuSOLVER kernels.
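To narrow down which matrices in the batch come back as zeros, a host-side check along these lines might help. This is only a minimal sketch, not code from the attached example: `find_suspect_batches` is a hypothetical helper, and it assumes the singular values have already been copied back to the host and are stored as `n` values per matrix, one matrix after another.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of cuSOLVER): returns the indices of batch
// entries whose n singular values are all exactly zero, or that contain a
// non-finite value (NaN or Inf).
// Assumption: `s` is a host copy of the singular values, stored contiguously
// as n values per matrix, matrix after matrix.
std::vector<std::size_t> find_suspect_batches(const std::vector<float>& s,
                                              std::size_t batch_size,
                                              std::size_t n) {
    assert(s.size() == batch_size * n);  // layout precondition
    std::vector<std::size_t> suspect;
    for (std::size_t b = 0; b < batch_size; ++b) {
        bool all_zero = true;
        bool non_finite = false;
        for (std::size_t i = 0; i < n; ++i) {
            const float v = s[b * n + i];
            if (v != 0.0f) all_zero = false;
            if (!std::isfinite(v)) non_finite = true;
        }
        if (all_zero || non_finite) suspect.push_back(b);
    }
    return suspect;
}
```

If the returned indices all sit above some threshold (for example, everything past index 128), that would point at the batch size rather than at the input data. Note that a genuinely all-zero input matrix would also be flagged, so this is only meaningful for inputs known to be nonzero.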