Hi,
I have tested some programs on CUDA (CUDA 2.3, using device emulation). I notice a minor difference between the output of a simple C program and the corresponding CUDA program (the programs contain float calculations).
I want to know whether I am getting this difference because of device emulation mode, or for some other reason?
First of all, there shouldn’t be any device emulation actually enabled on 3.2; the toolkit is supposed to just ignore that flag. A lot of people will be happy to hear otherwise (there has been a lot of noise about it).
Second, what do you mean by “minor difference”? NEVER expect two different floating-point codes to produce identical results. Different compilers, different optimization flags, and certainly different hardware will give you different results. In general, with floating-point numbers, (a + b) + c != a + (b + c).
These days, (ANSI C) compilers are not even allowed to optimize based on that assumption. It does mean, though, that reordering code, or a different choice of MAD vs. FMAD vs. separate mul + add, will give different results.
Lastly, NVIDIA hardware is not fully IEEE-754 compliant (the last bit may give different results).