Confusion about the batched Cholesky factorization example

Hi, I am reviewing the batched Cholesky example (E.1 batched Cholesky Factorization) in cuSOLVER, and I am a little confused about part of the code.

The code below declares two host arrays, A0 and A1, along with an array of pointers, Aarray. I assume *Aarray is located on the device, since it is allocated with cudaMalloc.

double A0[lda*m] = { 1.0, 2.0, 3.0, 2.0, 5.0, 5.0, 3.0, 5.0, 12.0 };
double A1[lda*m] = { 1.0, 2.0, 3.0, 2.0, 4.0, 5.0, 3.0, 5.0, 12.0 };

double *Aarray[batchSize];

for(int j = 0 ; j < batchSize ; j++){
    cudaStat1 = cudaMalloc ((void**)&Aarray[j], sizeof(double) * lda * m);
}

cudaStat1 = cudaMemcpy(Aarray[0], A0, sizeof(double) * lda * m, cudaMemcpyHostToDevice);
cudaStat2 = cudaMemcpy(Aarray[1], A1, sizeof(double) * lda * m, cudaMemcpyHostToDevice);

The next part is what confuses me: d_Aarray is copied from Aarray using the cudaMemcpyHostToDevice flag, which means Aarray is located on the host. That contradicts what I assumed above.

cudaStat1 = cudaMemcpy(d_Aarray, Aarray, sizeof(double*)*batchSize, cudaMemcpyHostToDevice);

Could someone tell me where the disconnect is? Much appreciated!

I assume the answer here will clear it up:

https://devtalk.nvidia.com/default/topic/1049419/gpu-accelerated-libraries/way-to-covert-pointer-d_a-to-array-d_array-/