A question about the matrixMulCUBLAS sample

Hi, I'm trying to test a huge matrix multiplication, such as (38400,64) x (64,38400).
I changed the parameters in the sample as follows:
matrix_size.uiHA = 240 * block_size * iSizeMultiple;
matrix_size.uiWA = 4 * block_size * iSizeMultiple/10;
matrix_size.uiHB = 4 * block_size * iSizeMultiple/10;
matrix_size.uiWB = 240 * block_size * iSizeMultiple;
matrix_size.uiHC = 240 * block_size * iSizeMultiple;
matrix_size.uiWC = 240 * block_size * iSizeMultiple;
but I get the error matrixMulCUBLAS.cpp:312 code=77(cudaErrorIllegalAddress) "cudaEventSynchronize(stop)",
whereas if uiHA and uiWB are less than 30000 there are no errors.
Why does this happen? What should I learn to avoid this kind of mistake?


Could you share how you are calling the GEMM function? A very basic piece of code that reproduces your error would be very helpful in this case.

On the other hand, have you had a look at the example code:

I am a bit confused about why you are setting the matrix size in that fashion, since cuBLAS deploys threads according to the matrix size. More details would make it easier to help you.
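
While you put a reproducer together, it may be worth checking the size arithmetic for the shapes you posted. The sketch below is only a guess at where to look, not a confirmed diagnosis: it assumes the sample stores element and byte counts in 32-bit unsigned int variables (I have not seen your modified copy), and the helper names element_count, float_bytes and overflows_u32 are hypothetical, not part of the sample or of cuBLAS.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers (not from the sample): do the size math in 64 bits
// so the result can be compared against what a 32-bit variable can hold.

// Number of elements in a rows x cols matrix.
constexpr std::uint64_t element_count(std::uint64_t rows, std::uint64_t cols) {
    return rows * cols;
}

// Bytes needed to store a rows x cols matrix of float.
constexpr std::uint64_t float_bytes(std::uint64_t rows, std::uint64_t cols) {
    return element_count(rows, cols) * sizeof(float);
}

// True when the value does not fit in a 32-bit unsigned integer.
// Assumption: the byte counts in the sample are held in such a variable.
constexpr bool overflows_u32(std::uint64_t value) {
    return value > std::numeric_limits<std::uint32_t>::max();
}
```

With your sizes, float_bytes(38400, 38400) for C is 5,898,240,000 bytes, which is more than a 32-bit unsigned int can hold (4,294,967,295), while float_bytes(30000, 30000) is 3,600,000,000 and still fits. If the byte count is truncated before it reaches cudaMalloc, the buffer for C would be smaller than what the GEMM writes, which could produce an illegal address error. Holding sizes in size_t would be one way to rule this out.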