GTX 280s give different calculation results

Lomohai · August 6, 2008, 4:02pm

I have two GTX 280s doing some math calculations. When I run code on one of them I get certain results. When I run the exact same code on the other, I get different values for a small number of results (fewer than 50 out of 35,000). Has anyone else experienced something similar?

SPWorley · August 6, 2008, 8:10pm

I found I got different results on the same card when I overclocked it. But it wasn’t a hardware error! It was a race condition in my global memory writes, and different clock speeds changed which writes won the race.

You could diagnose your problem by comparing outputs vs the emulator, then cutting down your example to be as simple as possible. It certainly could be a bad board, but it’ll take some experiments to tell.
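A minimal host-side sketch of that comparison step, in plain C++ rather than CUDA. The helper name `mismatched_indices` and the relative tolerance are assumptions for illustration, not anything from the posts above. It lists the indices where two result arrays (say, one copied back from each card, or one from the card and one from the emulator) disagree:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: return the indices at which two result arrays
// differ by more than a relative tolerance.
std::vector<std::size_t> mismatched_indices(const std::vector<float>& a,
                                            const std::vector<float>& b,
                                            float rel_tol) {
    assert(a.size() == b.size());  // both runs must produce the same count
    std::vector<std::size_t> bad;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Scale the tolerance by the larger magnitude of the two values.
        float scale = std::fmax(std::fabs(a[i]), std::fabs(b[i]));
        // A NaN on either side is always reported as a mismatch.
        bool differs = std::isnan(a[i]) || std::isnan(b[i]) ||
                       std::fabs(a[i] - b[i]) > rel_tol * scale;
        if (differs) bad.push_back(i);
    }
    return bad;
}
```

Knowing which indices disagree, and whether they cluster around particular blocks or threads, is usually more useful for cutting the example down than a bare mismatch count.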

As an aside, it’d be cool if someone made a memory and execution tester in CUDA… kind of like memtest86 and Prime95 are used for CPUs.

Sarnath · August 7, 2008, 5:43am

If you are looking at this from a commercial angle, you should always go for Tesla. Those are the cards certified for computation! (NVIDIA guys, correct me if I am wrong here.)

And yes, it would be a great idea to have a memtest tool! If someone already has one, please pass it on!

There should be a way to mmap the whole graphics memory into a Windows application (except the part currently used for display) and test it. It would be slow, but it would definitely be helpful!

Lomohai · August 7, 2008, 1:29pm

That might have something to do with it…the 2 cards are running at different clock speeds. However, I changed threads/block from 512 to 256 and results are consistent and correct across both cards now. Weird.

Geka · August 7, 2008, 3:44pm

Have you tried comparing the outputs of your two cards with the output of the “emu” device? Since that one is (obviously) far better tested for computation than your cards, you would be able to find out what the error rate is.

Lomohai · August 7, 2008, 5:45pm

The problem with emu mode is that math operations are in double precision, as far as I can tell. I’ve found it quite rare that emu mode will return the same result as running on the card.

tmurray · August 7, 2008, 6:02pm

It won’t if you actually use single-precision arithmetic in your code, e.g., by appending f to your constants. There’s no reason (short of MADs causing different results) that the card and emulation should return different results. If they do return different results and you ever compile with -arch sm_13, your performance will tank because it will probably start using DP on the card.
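To illustrate the constant-suffix point, here is a small host-side C++ sketch (not CUDA; both function names are made up for this example). A literal without the `f` suffix is a `double`, so the multiplication is carried out in double precision and rounded to `float` only at the end, which can land on a different `float` than the all-single-precision version:

```cpp
#include <cassert>

// Hypothetical example: 0.1 is a double literal, so x is promoted,
// the product is computed in double, then rounded back to float.
float scale_with_double_constant(float x) {
    return x * 0.1;
}

// Hypothetical example: 0.1f keeps the whole expression in single precision.
float scale_with_float_constant(float x) {
    return x * 0.1f;
}
```

On a platform that evaluates `float` expressions in single precision (typical for x86-64 with SSE), the two functions agree for `x = 1.0f` but return adjacent, different floats for `x = 9.0f`. The difference is one unit in the last place, which is the kind of small discrepancy that shows up when one side silently uses double precision.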

tmurray · August 7, 2008, 11:00pm

Yeah, I was wrong. Besides the MAD issue, here are some other reasons why execution on the card and under -deviceemu will differ:

If x87 registers are used for storage of intermediate computations, those computations will use double precision, regardless of whether single-precision variables are used.
FDIV and FSQRT on the GPU are not IEEE-754 compliant for single precision.
Transcendentals return different values (some transcendental functions are mapped to double-precision versions in deviceemu).
Toolchain differences result in different orders of operations (thanks to different optimizations), which can cause different results.

Most of these should have occurred to me, and I feel dumb for missing them. Oh well.
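Two of the points in that list, wider intermediates and a different order of operations, can be reproduced on the host alone. The following is a minimal C++ sketch under the assumption that `float` arithmetic is evaluated in single precision (as on x86-64 with SSE); the function names are hypothetical and the double accumulator only stands in for the effect of wider intermediate storage:

```cpp
#include <cassert>
#include <vector>

// Accumulate in single precision, in the order the values are given.
float sum_float(const std::vector<float>& v) {
    float acc = 0.0f;
    for (float x : v) acc += x;
    return acc;
}

// Accumulate in double precision and round to float once at the end,
// loosely mimicking intermediates held in wider registers.
float sum_double_acc(const std::vector<float>& v) {
    double acc = 0.0;
    for (float x : v) acc += x;
    return static_cast<float>(acc);
}
```

With the inputs `{1e8f, 1.0f, -1e8f}` the single-precision sum is 0, because 1 is below the spacing of floats near 1e8 and is lost in the first addition. Reordering to `{1e8f, -1e8f, 1.0f}` gives 1, and the double accumulator gives 1 for both orders. Neither answer indicates faulty hardware; the same values were simply combined in a different order or at a different width.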

Sarnath · August 8, 2008, 8:54am

The most important of all: deviceemu does NOT emulate the hardware fully! For example, in deviceemu there is no concept of warps. This can change a result to a great extent!

None of my code will yield correct results in deviceemu, not because of precision problems but because of the incorrect emulation!

This is the most important point regarding device emulation. Everything else comes after it!

For example:

Consider a single warp in which every thread executes global_mem++. On the hardware, the value in global memory will increase by only 1. Under device emulation it will increase by 32, and so on…
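The difference can be modelled on the host. This is only a simplified C++ sketch of the two execution orders, not CUDA code and not a description of what the hardware guarantees; the function names and the "all lanes load, then all lanes store" model are assumptions made for illustration:

```cpp
#include <cassert>
#include <vector>

// Lockstep model: every lane loads the shared value before any lane stores,
// so all lanes store the same "old value + 1".
int increment_lockstep(int start, int lanes) {
    assert(lanes > 0);
    int mem = start;
    std::vector<int> reg(lanes);
    for (int l = 0; l < lanes; ++l) reg[l] = mem;      // all lanes load
    for (int l = 0; l < lanes; ++l) mem = reg[l] + 1;  // all lanes store
    return mem;
}

// Serial model: each thread finishes its load and store before the next
// thread starts, which is closer to running the threads one at a time.
int increment_serial(int start, int lanes) {
    assert(lanes > 0);
    int mem = start;
    for (int l = 0; l < lanes; ++l) {
        int reg = mem;  // load
        mem = reg + 1;  // store
    }
    return mem;
}
```

Starting from 0 with 32 lanes, the lockstep model ends at 1 and the serial model ends at 32. Code that needs every increment to count on the device would normally use an atomic operation such as `atomicAdd` rather than rely on either ordering.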

mandana · September 16, 2008, 2:51pm

I am having a similar problem. The only difference is that I run the same code on the same host and device, but each time I run it, it gives me different results: sometimes the correct one and sometimes a wrong one.

The code does some computations on a matrix. When I use a small input matrix (up to size 127), it gives me the right answer, but when I use larger matrices, such as 255, it gives me a different result.

Can anyone please help me?
