Hi all,
I recently upgraded to CUDA 10.1, and I can no longer compute the general SVD of tall matrices, that is, cusolverDnSgesvd with large m and tiny n. After some inspection, I noticed that the size returned by cusolverDnSgesvd_bufferSize grows quadratically in m, whereas I believe the memory needed to compute the SVD should be linear in m. Here is a plot of the values in log scale: sgesvd_bufferSize (Imgur). I actually get an overflow for m larger than ~32k, which is not that tall.
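To make the arithmetic concrete, here is a rough host-only sketch of why a quadratic size would overflow so early. The only thing taken from the API is that cusolverDnSgesvd_bufferSize reports its result through an `int *lwork`. Everything else is an assumption for illustration: the helper names are hypothetical, and the `2 * m * m` model is just a guess chosen because it crosses the 32-bit limit near the ~32k I observe. It is not the library's actual formula.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical model (an assumption, not cuSOLVER's real formula):
// workspace element count grows as coeff * m * m, independent of n.
std::int64_t quadratic_workspace(std::int64_t m, std::int64_t coeff) {
    return coeff * m * m;
}

// Linear-in-m estimate for comparison: the element count of the
// m x n input matrix itself.
std::int64_t linear_workspace(std::int64_t m, std::int64_t n) {
    return m * n;
}

// True if the count does not fit in the 32-bit int that
// cusolverDnSgesvd_bufferSize writes through its int* lwork argument.
bool overflows_int(std::int64_t count) {
    return count > static_cast<std::int64_t>(INT_MAX);
}
```

Under this assumed model, `overflows_int(quadratic_workspace(32768, 2))` is true while `overflows_int(quadratic_workspace(32767, 2))` is false, whereas a linear estimate such as `linear_workspace(1000000, 8)` stays far below the limit.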
Is there any known workaround?