Hello,
cudaMallocPitch() does not seem to work properly for me. This is the code segment:
size_t pitch;
A_BD.numBlock=156800; //156800 is a multiple of 64.
cudaMallocPitch((void**)&A, &pitch, A_BD.numBlock*sizeof(float), 9);
However, it does not work correctly: after the call to cudaMallocPitch(), pitch is 627200 and memory accesses clash.
But the following works properly:
pitch=A_BD.numBlock;
cudaMalloc((void**)&A, pitch*9*sizeof(float));
It seems that I am not calling cudaMallocPitch() correctly, but what is wrong? Thank you.
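
One thing worth checking, offered as a sketch rather than a confirmed diagnosis: cudaMallocPitch() reports the pitch in bytes, and 156800 * sizeof(float) is exactly 627200 with a 4-byte float, so the value above is consistent with a byte count and no extra padding. If the rest of the code (not shown here) indexes rows as A[row * pitch + col], each row step would be four times too large. The host-only helpers below make no GPU calls; rowPtr and elementIndex are hypothetical names used only to illustrate the byte-based row addressing that the CUDA documentation describes for pitched memory.

```cpp
#include <cstddef>

// Hypothetical helper: pointer to the start of a row in a pitched allocation.
// pitchBytes is a byte count, which is how cudaMallocPitch reports the pitch.
inline float* rowPtr(void* base, std::size_t pitchBytes, std::size_t row) {
    return reinterpret_cast<float*>(static_cast<char*>(base) + row * pitchBytes);
}

// Hypothetical helper: index (in floats) of element (row, col), counted from base.
// Assumes pitchBytes is a multiple of sizeof(float).
inline std::size_t elementIndex(std::size_t pitchBytes, std::size_t row, std::size_t col) {
    return row * (pitchBytes / sizeof(float)) + col;
}
```

With the numbers from the post and a 4-byte float, elementIndex(627200, 8, 0) is 1254400, which lies inside the 9 * 156800 = 1411200 floats that were requested. Using the pitch as an element count instead gives 8 * 627200 = 5017600, which is far past the end of the allocation.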