Problem finding the cause of devInfo being -1 for a cuSolver function

Hi!

I am trying to use the cusolverDnDgetrf function from the cuSolver library for LU factorization, but I am running into an issue with the devInfo parameter.

devInfo gets a value of -1, which according to the documentation means that the 1st parameter (excluding handle) is wrong.
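
To make that reading of the documentation concrete, here is a minimal sketch in plain C of how I interpret the value copied back from devInfo. The helper name describe_getrf_info is hypothetical (it is not part of cuSolver); the three cases follow the documented convention as I understand it: 0 for success, -i for a wrong i-th parameter not counting the handle, and i > 0 for U(i,i) = 0.

```c
#include <stdio.h>

/* Hypothetical helper (not part of cuSolver): writes a human-readable
   description of a getrf info value into buf and returns buf. */
const char *describe_getrf_info(int info, char *buf, size_t len)
{
    if (info == 0) {
        snprintf(buf, len, "success");
    } else if (info < 0) {
        /* Negative value: the (-info)-th argument after the handle is invalid. */
        snprintf(buf, len, "parameter %d (not counting handle) is wrong", -info);
    } else {
        /* Positive value: the factorization hit a zero pivot at U(info,info). */
        snprintf(buf, len, "U(%d,%d) is exactly zero", info, info);
    }
    return buf;
}
```

For the -1 I am seeing, this points at the first argument after the handle, which for cusolverDnDgetrf is the row count (matrix_dim in my call).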

The first parameter is matrix_dim, but I cannot figure out why it is wrong.

My matrix_dim is an integer with the value 4, which gives a 4x4 matrix.

m is a one-dimensional array with allocated size matrix_dim*matrix_dim*sizeof(double).

The values of the matrix are:

10.000000 9.990005 9.980020 9.970045
9.990005 10.000000 9.990005 9.980020
9.980020 9.990005 10.000000 9.990005
9.970045 9.980020 9.990005 10.000000
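
For clarity, this is roughly how the matrix sits in the one-dimensional array. It is a sketch in plain C rather than my actual setup code: the names IDX and fill_example are made up for illustration. The indexing is column-major, which is the layout cuSolver expects; since this matrix is symmetric, row-major storage would give the same array.

```c
#include <stdlib.h>

/* Column-major index of element (row, col) with leading dimension ld. */
#define IDX(row, col, ld) ((row) + (col) * (ld))

/* Hypothetical helper: allocates a dim x dim matrix as a flat array and
   fills it with the values listed above (only supported for dim == 4).
   Returns NULL on allocation failure or unsupported dim; caller frees. */
double *fill_example(int dim)
{
    /* Entry (row, col) depends only on |row - col| in the matrix above. */
    static const double band[4] = {10.000000, 9.990005, 9.980020, 9.970045};
    if (dim != 4) return NULL;

    double *m = malloc((size_t)dim * dim * sizeof(double));
    if (m == NULL) return NULL;

    for (int col = 0; col < dim; ++col)
        for (int row = 0; row < dim; ++row)
            m[IDX(row, col, dim)] = band[abs(row - col)];
    return m;
}
```

With this layout the leading dimension passed to the solver is matrix_dim, and element (row, col) is m[row + col * matrix_dim].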

This is the output I get:

cusolverDnDgetrf failed. Status: 0, devInfo: -1

Here is the function where I try to use the cuSolver function:

void lud_cuda(double *m, int matrix_dim, int choice)
{
  int i=0;
  dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE);
  double *m_debug = (double*)malloc(matrix_dim*matrix_dim*sizeof(double));

  if(choice == 1)
  {
    printf("Using library...\n");
    cusolverDnHandle_t handle;
    cusolverDnCreate(&handle);

      int* devIpiv;
      cudaMalloc((void**)&devIpiv, matrix_dim * sizeof(int));

      int workspace_size;
      cusolverDnDgetrf_bufferSize(handle, matrix_dim, matrix_dim, m, matrix_dim, &workspace_size);

      double* devWorkspace;
      cudaMalloc((void**)&devWorkspace, workspace_size);

      int* devInfo;
      cudaMalloc((void**)&devInfo, sizeof(int));

    cusolverStatus_t status = cusolverDnDgetrf(handle, matrix_dim, matrix_dim, m, matrix_dim, devWorkspace, devIpiv, devInfo);
    cudaDeviceSynchronize();

    int* hostInfo = (int*)malloc(sizeof(int));
    cudaMemcpy(hostInfo, devInfo, sizeof(int), cudaMemcpyDeviceToHost);
    
    if (status != CUSOLVER_STATUS_SUCCESS || *hostInfo != 0) {
      fprintf(stderr, "cusolverDnDgetrf failed. Status: %d, devInfo: %d\n", status, *hostInfo);
    } else {
      printf("cusolverDnDgetrf succeeded\n");
    }

    cudaFree(devIpiv);
    cudaFree(devWorkspace);

    cusolverDnDestroy(handle);