Hi, I am reviewing the batched Cholesky example (E.1 Batched Cholesky Factorization) in the cuSOLVER documentation, and I am a little confused about part of the code.
The code below declares two host arrays, A0 and A1, along with an array of pointers, Aarray. I assume Aarray is located on the device, since it is allocated with cudaMalloc.
double A0[lda*m] = { 1.0, 2.0, 3.0, 2.0, 5.0, 5.0, 3.0, 5.0, 12.0 };
double A1[lda*m] = { 1.0, 2.0, 3.0, 2.0, 4.0, 5.0, 3.0, 5.0, 12.0 };
double *Aarray[batchSize];
for(int j = 0 ; j < batchSize ; j++){
    cudaStat1 = cudaMalloc ((void**)&Aarray[j], sizeof(double) * lda * m);
}
cudaStat1 = cudaMemcpy(Aarray[0], A0, sizeof(double) * lda * m, cudaMemcpyHostToDevice);
cudaStat2 = cudaMemcpy(Aarray[1], A1, sizeof(double) * lda * m, cudaMemcpyHostToDevice);
The next part is what confuses me: d_Aarray is copied from Aarray with the cudaMemcpyHostToDevice flag, which means Aarray is located on the host. That contradicts my assumption above.
cudaStat1 = cudaMemcpy(d_Aarray, Aarray, sizeof(double*)*batchSize, cudaMemcpyHostToDevice);
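To make the question concrete, here is a host-only C sketch of one reading that might reconcile the two snippets: the array Aarray itself is an ordinary host array, and only the buffers its elements point to are allocated elsewhere. This is an analogy I wrote to test my understanding, not code from the cuSOLVER sample. fake_device_alloc, build_pointer_table and copy_pointer_table are hypothetical names, and malloc merely stands in for cudaMalloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BATCH_SIZE 2
#define N 9 /* lda * m, assuming lda = m = 3 as in the arrays above */

/* Hypothetical stand-in for cudaMalloc: allocates on the heap.
   In the real sample this memory would live on the device. */
static double *fake_device_alloc(size_t count) {
    return (double *)malloc(count * sizeof(double));
}

/* Fills `table`, an ordinary array owned by the caller, with BATCH_SIZE
   pointers to separately allocated buffers. Returns 0 on success. */
int build_pointer_table(double *table[BATCH_SIZE]) {
    for (int j = 0; j < BATCH_SIZE; j++) {
        table[j] = fake_device_alloc(N);
        if (table[j] == NULL) {
            /* Roll back earlier allocations on failure. */
            for (int k = 0; k < j; k++) {
                free(table[k]);
                table[k] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Copies the pointer values themselves, not the doubles they point to.
   This mirrors the size argument sizeof(double*) * batchSize in the
   cudaMemcpy into d_Aarray. */
void copy_pointer_table(double **dst, double *const *src) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(double *) * BATCH_SIZE);
}
```

If this reading is right, then after the copy both tables hold the same addresses, so writing through one is visible through the other, and the storage of the table is separate from the storage of the buffers.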
Could someone tell me where the disconnect is? Much appreciated!