Hi everybody,
I implemented a very simple eigensolver in CUDA C using the cuSOLVER library. It computes the eigenpairs without complaint, but there is one issue I have not been able to solve. After computing the eigenpairs with cusolverDnDsyevd and the option jobz = CUSOLVER_EIG_MODE_VECTOR, I copy the eigenvalue vector and the eigenvector matrix from device to host with
cudaStat1 = cudaMemcpy(h_eigvals, d_eigvals, sizeof(double) * nRows, cudaMemcpyDeviceToHost);
cudaStat2 = cudaMemcpy(h_eigvecs, d_matrix, sizeof(double) * nRows * nCols, cudaMemcpyDeviceToHost);
cudaStat3 = cudaMemcpy(&info_gpu, devInfo, sizeof(int), cudaMemcpyDeviceToHost);
assert(cudaSuccess == cudaStat1);
assert(cudaSuccess == cudaStat2);
assert(cudaSuccess == cudaStat3);
where h_eigvals is the host eigenvalue vector, d_eigvals is the device eigenvalue vector, h_eigvecs is the host eigenvector matrix, and d_matrix is the device copy of the input matrix, i.e., the matrix whose eigenpairs are being computed. As far as I know, cusolverDnDsyevd overwrites the input matrix with the eigenvectors. All three asserts pass, so the memory copies report success. Nevertheless, when I compile and run the code, I get a Segmentation fault (core dumped) error when I try to print the eigenvector matrix. I have not been able to track down the cause.
The host arrays mentioned above are allocated and initialized with
double **h_matrix;
h_matrix = (double **)malloc(nRows * sizeof(double *));
for (int i = 0; i < nRows; i++) {
    h_matrix[i] = (double *)malloc(nCols * sizeof(double));
}
initialData(h_matrix, nRows, nCols);

double *h_eigvals = (double *)malloc(nRows * sizeof(double));
for (int i = 0; i < nRows; i++) {
    h_eigvals[i] = 0;
}

double **h_eigvecs;
h_eigvecs = (double **)malloc(nRows * sizeof(double *));
for (int i = 0; i < nRows; i++) {
    h_eigvecs[i] = (double *)malloc(nCols * sizeof(double));
}
for (int i = 0; i < nRows; i++) {
    for (int j = 0; j < nCols; j++) {
        h_eigvecs[i][j] = 0;
    }
}
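One detail that may be relevant here: cudaMemcpy copies a single contiguous range of bytes, while h_eigvecs above is a block of nRows row pointers, each pointing to a separate allocation. Copying sizeof(double) * nRows * nCols bytes into that pointer block writes past its end and overwrites the row pointers, which could explain a crash when the matrix is printed later. Below is a minimal host-only sketch of a flat, contiguous layout. The helper names allocFlatMatrix, copyFlatMatrix and colMajorIndex are hypothetical and not part of the code above, and memcpy only stands in for the device-to-host cudaMemcpy so that the sketch runs without a GPU.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Column-major offset of element (row, col), the layout cuSOLVER uses.
   nRows acts as the leading dimension. Hypothetical helper. */
size_t colMajorIndex(int row, int col, int nRows) {
    return (size_t)col * (size_t)nRows + (size_t)row;
}

/* Allocate one zero-initialized contiguous block of nRows * nCols doubles,
   instead of nRows separate row allocations. Hypothetical helper.
   Returns NULL if the allocation fails. */
double *allocFlatMatrix(int nRows, int nCols) {
    assert(nRows > 0 && nCols > 0);
    return (double *)calloc((size_t)nRows * (size_t)nCols, sizeof(double));
}

/* Host-only stand-in for the device-to-host copy. In the real program this
   would be cudaMemcpy(dst, d_matrix, bytes, cudaMemcpyDeviceToHost) with the
   same byte count. Hypothetical helper. */
void copyFlatMatrix(double *dst, const double *src, int nRows, int nCols) {
    size_t bytes = sizeof(double) * (size_t)nRows * (size_t)nCols;
    memcpy(dst, src, bytes);
}
```

With this layout, h_eigvecs would be declared as double *h_eigvecs = allocFlatMatrix(nRows, nCols); and element (i, j) read as h_eigvecs[colMajorIndex(i, j, nRows)]. Since cusolverDnDsyevd stores the eigenvectors as columns, eigenvector j would occupy the nRows consecutive entries starting at offset j * nRows.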
Thank you very much in advance.