Hi everybody,

I implemented a very simple eigensolver in CUDA C using the cuSOLVER library. The computation itself works, but I have run into an issue I have not been able to solve. After computing the eigenpairs via *cusolverDnDsyevd* with the option *jobz = CUSOLVER_EIG_MODE_VECTOR*, I copy the eigenvalue vector and the eigenvector matrix from device to host with

```
cudaStat1 = cudaMemcpy(h_eigvals, d_eigvals, sizeof(double) * nRows, cudaMemcpyDeviceToHost);
cudaStat2 = cudaMemcpy(h_eigvecs, d_matrix, sizeof(double) * nRows * nCols, cudaMemcpyDeviceToHost);
cudaStat3 = cudaMemcpy(&info_gpu, devInfo, sizeof(int), cudaMemcpyDeviceToHost);
assert(cudaSuccess == cudaStat1);
assert(cudaSuccess == cudaStat2);
assert(cudaSuccess == cudaStat3);
```

where *h_eigvals* is the host eigenvalue vector, *d_eigvals* is the device eigenvalue vector, *h_eigvecs* is the host eigenvector matrix, and *d_matrix* is the initial matrix on the device, i.e., the matrix whose eigenpairs we are computing. As far as I know, *cusolverDnDsyevd* overwrites the original matrix with the eigenvectors. The three *assert* calls pass, so the memory copies report success. Nevertheless, when I compile and run the code, I get a *Segmentation fault (core dumped)* error when trying to print the eigenvector matrix. I have not been able to debug this error.

The host arrays mentioned above are allocated and initialized as follows:

```
double **h_matrix;
h_matrix = (double **)malloc(nRows * sizeof(double *));
for (int i = 0; i < nRows; i++) {
    h_matrix[i] = (double *)malloc(nCols * sizeof(double));
}
initialData(h_matrix, nRows, nCols);
double *h_eigvals = (double *)malloc(nRows * sizeof(double));
for (int i = 0; i < nRows; i++) {
    h_eigvals[i] = 0;
}
double **h_eigvecs;
h_eigvecs = (double **)malloc(nRows * sizeof(double *));
for (int i = 0; i < nRows; i++) {
    h_eigvecs[i] = (double *)malloc(nCols * sizeof(double));
}
for (int i = 0; i < nRows; i++) {
    for (int j = 0; j < nCols; j++) {
        h_eigvecs[i][j] = 0;
    }
}
```
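
One thing that may be relevant here: *h_eigvecs* is allocated as an array of row pointers (*double \*\**), while the *cudaMemcpy* call copies *nRows \* nCols* doubles into it as if it were a single contiguous buffer. *cudaMemcpy* does not check the size of the host allocation, so the copy can report success and still overwrite the row pointers, which would be consistent with a crash when printing. The following is a minimal host-only sketch of a flat, column-major layout (the layout cuSOLVER uses). The helper names *alloc_flat_matrix*, *copy_back*, *eigvec_entry*, and the *IDX* macro are hypothetical, and *memcpy* stands in for the device-to-host *cudaMemcpy* so the sketch runs without a GPU.

```c
#include <stdlib.h>
#include <string.h>

/* Column-major index into a flat buffer with leading dimension ld.
   Hypothetical helper, not part of cuSOLVER. */
#define IDX(i, j, ld) ((size_t)(j) * (size_t)(ld) + (size_t)(i))

/* Allocate one contiguous, zero-initialized block of nRows * nCols doubles.
   This replaces the double** array of separately allocated rows. */
double *alloc_flat_matrix(int nRows, int nCols) {
    return (double *)calloc((size_t)nRows * (size_t)nCols, sizeof(double));
}

/* Assumption: stand-in for
   cudaMemcpy(h_dst, d_src, bytes, cudaMemcpyDeviceToHost).
   With a flat destination, the byte count matches the allocation. */
void copy_back(double *h_dst, const double *d_src, int nRows, int nCols) {
    memcpy(h_dst, d_src, sizeof(double) * (size_t)nRows * (size_t)nCols);
}

/* Entry i of eigenvector j: eigenvectors are stored as columns. */
double eigvec_entry(const double *h_eigvecs, int i, int j, int nRows) {
    return h_eigvecs[IDX(i, j, nRows)];
}
```

With this layout, *h_eigvecs* would be declared as *double \*h_eigvecs = alloc_flat_matrix(nRows, nCols);* and printed by looping over *eigvec_entry(h_eigvecs, i, j, nRows)* instead of *h_eigvecs[i][j]*. The same reasoning would apply to *h_matrix* if it is copied to the device in one call.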

Thank you very much in advance.