matrixMul problem printDiff is flipping plz help me

jordyvaneijk · October 12, 2007, 8:39am

Hi all,

I have two questions:

1.

I’m using the matrixMul example as a benchmark to see the difference in time between the CPU and the GPU. To do this I made some changes to the original matrixMul code:

* The matrices A and B are both initialised to 1 (so every element in each matrix is 1.0f).

* The matrices are always square (N*N).

Now I have the following problem: when I take a blocksize of 32 and an N of 256 (65536 elements per matrix), printDiff tells me that the CPU and GPU results are not the same.

2.

The second question: when I set the blocksize to 32, is the number of threads per block 1024 (32*32), or are blocksize and thread count independent of each other? I ask because the maximum number of threads per block is 512, and the maximum per multiprocessor is 768. Yet with a blocksize of 32 and an N of 512, the CPU and GPU results are the same.
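The arithmetic behind this question can be written out as a small host-side check. This is only a sketch: it assumes the modified code still launches a square block of blocksize x blocksize threads, as the SDK sample does, and it takes the 512 threads-per-block limit from the figure quoted above. The names `threads_per_block` and `block_fits` are hypothetical helpers, not part of the SDK.

```c
#include <stdbool.h>

/* Limit quoted in the post for the GeForce 8800 GTS (assumption: taken
   from the post, not queried from the device). */
#define MAX_THREADS_PER_BLOCK 512

/* Assumption: the kernel is launched with a square block of
   block_size x block_size threads, as in the SDK matrixMul sample. */
int threads_per_block(int block_size) {
    return block_size * block_size;
}

/* True when the block does not exceed the per-block thread limit. */
bool block_fits(int block_size) {
    return threads_per_block(block_size) <= MAX_THREADS_PER_BLOCK;
}
```

Under these assumptions a blocksize of 16 gives 256 threads per block, which fits, while a blocksize of 32 gives 1024, which is over the limit regardless of N.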

Some additional information:

N     blocksize   CPU and GPU results match

256   32          false

256   16          true

512   16          true

512   32          true

Those are some of the parameters I tested.

System information:

Intel quad Xeon X5355 @ 2.66GHz

2GB ram

Nvidia 8800GTS 320MB

Linux Fedora core 6

CUDA SDK Version 1.0 for Linux

Below you can find part of the failing output of the printDiff function:

blocksize: 32 

Number of elements: 256 

Processing time GPU: 2.495000 (ms) 

Name: GeForce 8800 GTS

TotalGlobalMem: 334823424

SharedMemPerBlocks: 16384

RegsPerBlock: 8192

Processing time CPU: 36.185001 (ms) 

Test FAILED

diff(0,0) CPU=256.000000, GPU=2368.000000

diff(1,0) CPU=256.000000, GPU=2368.000000

diff(2,0) CPU=256.000000, GPU=2368.000000

diff(3,0) CPU=256.000000, GPU=2368.000000

diff(4,0) CPU=256.000000, GPU=2368.000000

diff(5,0) CPU=256.000000, GPU=2368.000000

diff(6,0) CPU=256.000000, GPU=2368.000000

diff(7,0) CPU=256.000000, GPU=2368.000000

diff(8,0) CPU=256.000000, GPU=2368.000000

diff(9,0) CPU=256.000000, GPU=2368.000000

..

..

..

diff(253,255) CPU=256.000000, GPU=1.000000

diff(254,255) CPU=256.000000, GPU=1.000000

diff(255,255) CPU=256.000000, GPU=1.000000

Total Errors = 65536

Press ENTER to exit...

Can someone please help me?

Thanks,

Jordy

wildcat4096 · October 14, 2007, 4:56pm

If you are still having the problem, would you be willing to zip up your code and post it as an attachment?

jordyvaneijk · October 15, 2007, 7:21am

I still have the issue, and here is the code. I hope you can reproduce the same problem.

If we use blocksize 32 with any given matrix size we get errors; if we then use blocksize 16 we get no errors. Returning to blocksize 32 afterwards, we also get no errors.
matrixMul.tar.gz (6.26 KB)
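
The pattern described above (blocksize 32 passing only after a blocksize 16 run) would be consistent with a kernel that never ran, leaving an earlier result in the output buffer. Nothing in the thread confirms this, so treat it as a guess. One way to tell the two cases apart is to fill the result buffer with a sentinel before the run and count how many elements still hold it afterwards. The sketch below is plain host-side C. `fill_sentinel` and `count_unwritten` are hypothetical helper names, and it assumes the sentinel-filled buffer is copied to the device output buffer before the launch and copied back afterwards.

```c
#include <math.h>
#include <stddef.h>

/* Fill the result buffer with NaN. Assumption: this buffer is then copied
   to the device output buffer before the kernel is launched. */
void fill_sentinel(float *buf, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        buf[i] = NAN;
    }
}

/* Count the elements that still hold the NaN sentinel after the results
   have been copied back; a non-zero count means they were never written. */
size_t count_unwritten(const float *buf, size_t count) {
    size_t unwritten = 0;
    for (size_t i = 0; i < count; ++i) {
        if (isnan(buf[i])) {
            ++unwritten;
        }
    }
    return unwritten;
}
```

If every element comes back as the sentinel, the kernel wrote nothing at all. Calling the CUDA runtime's `cudaGetLastError()` right after the launch is the more direct check for a rejected launch configuration.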
