Hi, I ran into a problem while doing a matrix multiplication with CUBLAS.

I need to reshape a complex matrix from [rowsA, colsA] to [rowsA*colsA, 1], i.e. flatten it into a linear vector. The matrix is already on the device; it is the output of a kernel function. I know cublasSetMatrix can do the reshape while setting up the matrix, but that only works when copying memory from the CPU to the GPU. I don't want to copy the matrix back to the CPU and then to the GPU again with cublasSetMatrix, so I am looking for a way to do the reshape on the device.
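To make the layout in question concrete, here is a small host-side sketch. It assumes the matrix is stored column-major, which is the convention cuBLAS uses, and the helper names `matrix_offset`, `vector_index`, and `reshape_is_noop` are hypothetical. Under that assumption, when the leading dimension equals rowsA, element (i, j) of the matrix and element i + j*rowsA of the vector sit at the same offset.

```cpp
#include <cassert>
#include <cstddef>

// Offset (in elements) of A(i, j) in a column-major matrix whose columns
// start lda elements apart. Assumption: column-major storage.
std::size_t matrix_offset(std::size_t i, std::size_t j, std::size_t lda) {
    assert(i < lda);  // a row index must fit inside one column stride
    return i + j * lda;
}

// Index that A(i, j) maps to in the flattened [rows*cols, 1] vector.
std::size_t vector_index(std::size_t i, std::size_t j, std::size_t rows) {
    assert(i < rows);
    return i + j * rows;
}

// True when the rows x cols matrix is already laid out exactly like the
// flattened vector, so no data would have to move.
bool reshape_is_noop(std::size_t rows, std::size_t cols, std::size_t lda) {
    assert(lda >= rows);
    return lda == rows || cols <= 1;
}
```

If `reshape_is_noop` holds for the matrix, the same device pointer could probably be handed to cuBLAS with dimensions (rowsA*colsA, 1) and no copy at all. If the columns are padded (lda larger than rowsA), a packing copy would still be needed.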

I tried cudaMemcpy2D like this:

```
cudaMemcpy2D(d_A2, RowsA*ColsA*sizeof(cuComplex), d_C, RowsA*sizeof(cuComplex), RowsA*sizeof(cuComplex), ColsA, cudaMemcpyDeviceToDevice);
```

But the result is wrong.
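One thing worth checking is how the pitch arguments are interpreted. cudaMemcpy2D treats dpitch and spitch as the byte distance between the starts of consecutive rows of the copy, and width as the bytes copied per row. The sketch below mimics that arithmetic on the host so it runs without a GPU. It assumes a column-major source, so each copied "row" is one matrix column. The helpers `copy2d_host`, `dst_bytes_needed`, and `pack_columns` are hypothetical stand-ins, not CUDA API, and `lda` (the source leading dimension in elements) is an assumed parameter.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host stand-in for the pitch arithmetic of cudaMemcpy2D:
// copies `height` rows of `width` bytes; consecutive rows start `spitch`
// bytes apart in src and `dpitch` bytes apart in dst.
void copy2d_host(void* dst, std::size_t dpitch,
                 const void* src, std::size_t spitch,
                 std::size_t width, std::size_t height) {
    assert(width <= dpitch && width <= spitch);
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t r = 0; r < height; ++r) {
        std::memcpy(d + r * dpitch, s + r * spitch, width);
    }
}

// Number of bytes the destination must hold for a given dpitch.
std::size_t dst_bytes_needed(std::size_t dpitch, std::size_t width,
                             std::size_t height) {
    return height == 0 ? 0 : (height - 1) * dpitch + width;
}

// Pack a column-major rows x cols matrix with leading dimension lda
// (in elements) into a contiguous vector of rows*cols elements.
std::vector<std::complex<float>> pack_columns(
        const std::vector<std::complex<float>>& a,
        std::size_t rows, std::size_t cols, std::size_t lda) {
    using C = std::complex<float>;
    std::vector<C> v(rows * cols);
    copy2d_host(v.data(), rows * sizeof(C),   // dpitch: one packed column
                a.data(), lda * sizeof(C),    // spitch: one source column
                rows * sizeof(C), cols);      // width, height
    return v;
}
```

Under these assumptions, a dpitch of RowsA*ColsA*sizeof(cuComplex) would place the second column RowsA*ColsA elements into the destination, which is past the end of a buffer holding only RowsA*ColsA elements. A dpitch of RowsA*sizeof(cuComplex) would pack the columns back to back instead.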

Does anyone know how to solve this problem? Thank you in advance.