I need to operate on parts of a matrix, performing an SVD on each of them. Can I call cusolverDnDgesvd (the truncated version) from each kernel, pointing it at the part of the matrix where the calculation should run?
No, cuSOLVER is a host-only API at this time.
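Since the call has to come from the host, "pointing to a part of the matrix" comes down to a pointer offset plus the leading dimension of the full matrix. Here is a minimal host-side sketch of that bookkeeping, assuming the matrix is stored column-major (as cuSOLVER expects) and is cut into rectangular tiles. `SubBlock` and `make_sub_blocks` are hypothetical names made up for illustration, not part of cuSOLVER.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical descriptor for one sub-block of a column-major matrix.
struct SubBlock {
    std::size_t offset;  // element offset of the block's top-left entry from the start of A
    int rows;            // number of rows in the block (m for the solver call)
    int cols;            // number of columns in the block (n for the solver call)
};

// Split an M x N column-major matrix with leading dimension lda into tiles of
// at most tile_rows x tile_cols. Tiles on the bottom and right edges may be smaller.
std::vector<SubBlock> make_sub_blocks(int M, int N, int lda,
                                      int tile_rows, int tile_cols) {
    assert(lda >= M && tile_rows > 0 && tile_cols > 0);
    std::vector<SubBlock> blocks;
    for (int c = 0; c < N; c += tile_cols) {
        for (int r = 0; r < M; r += tile_rows) {
            SubBlock b;
            // Column-major: entry (r, c) lives at index c * lda + r.
            b.offset = static_cast<std::size_t>(c) * lda + r;
            b.rows = std::min(tile_rows, M - r);
            b.cols = std::min(tile_cols, N - c);
            blocks.push_back(b);
        }
    }
    return blocks;
}
```

For each block `b`, the host-side call would then take `d_A + b.offset` as the matrix pointer (with `d_A` being your device pointer to the full matrix), `b.rows` and `b.cols` as m and n, and the leading dimension of the full matrix as `lda`. Keep in mind that cusolverDnDgesvd only supports m >= n, so tiles that are wider than they are tall would need separate handling.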
If I call cuSOLVER from the host, can I somehow parallelise the kernels that act on different parts of the matrix?
To be clear, I don’t want to call n kernels in a loop, one for each of the n parts of the matrix.