atomicAdd(): GT240 vs. GTX570

Hello.

I read the CUDA article on Wikipedia and then tried atomicAdd().

On the GTX570, compiled with -arch=sm_20, the result is correct. On the GT240, compiled with -arch=sm_13, it is incorrect.

Am I running into a hardware limitation of the GT240?

From that page: "Floating-point atomic addition operating on 32-bit words in global and shared memory"

The GTX570 has compute capability 2.x.

The GT240 may be 1.3.

GTX570 = correct

============GPU===============

0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0

1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0

2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0

3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0

4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0

5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0

6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0

7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0

demand = 6

GT240 = incorrect

============GPU===============

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 

demand = 0

Any help would be appreciated.
ii.zip (1.67 KB)

GT 240 is compute capability 1.2 - no double precision support!

Presenting it with an sm_13 kernel will cause it to barf. Try sm_12.

Now it returns nan and plenty of junk.

But surprisingly, demand is correct!

How come?

sarit@AH64D:~/cuda_example_sarit/sarit$ nvcc -arch=sm_12 tomtom.cu

tomtom.cu(20): warning: variable "d" was set but never used

tomtom.cu(20): warning: variable "l" was set but never used

tomtom.cu(20): warning: variable "r" was set but never used

tomtom.cu(20): warning: variable "d" was set but never used

tomtom.cu(20): warning: variable "l" was set but never used

tomtom.cu(20): warning: variable "r" was set but never used

./tomtom.h(36): Warning: Cannot tell what pointer points to, assuming global memory space

./tomtom.cu(6): Warning: Cannot tell what pointer points to, assuming global memory space

./tomtom.cu(28): Warning: Cannot tell what pointer points to, assuming global memory space

./tomtom.cu(28): Warning: Cannot tell what pointer points to, assuming global memory space

ptxas /tmp/tmpxft_000005af_00000000-2_tomtom.ptx, line 617; warning : Double is not supported. Demoting to float

sarit@AH64D:~/cuda_example_sarit/sarit$ ./a.out

============GPU===============

-19490470027253472396256342481609948194817939768793588743074538238665845062838028911966540957630605301838359666348414628363531728930723848946924521816876064709633565959261148517129972900549062402571056724043884882907168051152489644116733360539175519180734157576416100591372919654915516661760.0 nan -nan -nan -13407803147423818015473926521598890084348585413740038814987758327493301884004043104920944273857953534420332041331627572858807606985647829276780559769534464.0 9124843173895825187752051115509701099884622575854303542050768563631606893498255144340110934249830473226673586537837830197114649706875539803504546553242749962854349053771661835887672570855555072.0 -178364715964550230201873469966378800261843099133087594232260805768262514700924356974583947004896562983463691875585691104663043734698451172039792060271905723908017173640751308355790226336373324192098193087927842662830479617417186395084310950715054215530312192883374697619135642888109024563820798453916689760256.0 -1042363073090130565455000650743788751565975314820076107920619433747124335338672219694253471992621452393284633369537836565171969572854240113469216467318144894947968278038644851570075836652029014353351130050787554445531871940560295093607418610798354538244854756186278756191009593979307052644532172501865725952.0 

-nan -nan -nan nan -15426631908632328009708623425096572505555329015018926434034825502117017548925723958034933211628195868953288216363277749439729548751208974434523955299102691777205650178732047943726484856477986095992439638782182012028981684757260893422806929139715487066251208338784610067646663012133109760.0 -nan -nan -nan 

-nan -17053976567541219307642811613731093289689945153673613964053525133557291317749914266541243249301484770469094169762456463043221184437525872196380860574183328991333244059326808503172963302072674854976462681402941712521197699758709605156477175792087813685360116284332709249856273713867643682816.0 -nan -nan -4800356605867646501451304645868647505642769663192516342947977100604912914521829310679915520463477634022745627986514147943547067982090288680314178157667936631655759585771700527765546977095005433241086424350397896535234335418265050212149878887875445321403056991030778878371556588382937195750213617910284288.0 -nan -nan -nan 

-19490621061484270606632001798980747221278029760865218467093051099380633127831167378087025364038817768499920731049114046095859198503191324000488262949759976605074042365619521845863873536807278055333747505601850056920025388636658535442432500079479879178409691044693687642208205911502245330944.0 -nan -nan -nan -nan -nan -26815504029486876229358345213454129374362160809153522068185585659863763400455294213139939026608276811907180857892818539558055781991060380748622770779193344.0 -nan 

-363419232517372189382072643869218023182550854152814319577055122747829485926820923387385854420578372393972763329886775707786830782922752.0 -nan -nan -nan -1056588686373686602800426186491233093106176197595650827990679890345117547726305643858962033775627648330652395264315399573077795685987164027318243753187365242966415397768262858764576075786230370818082687079546107123798844740755595358484178206046010867380492861881425330176.0 -nan -12569815167411148448330511116119010850346956578696004942204891491556476430251305112969655230824052585507164905994106850287481103787141047051569883238105088.0 -nan 

-12988809152734860406428230471786554622658021537595124664598868183236153717689616500715651322160543905368510516923126814891034421043222474645242793544384512.0 -0.0 -nan -nan -nan -89171428657265522626583411884510768330813304325046713859205586441327129088315961040694268806320203902711123659292836876496784655255620818500185351858923803634458035303049441517900712102291107491521711940139263010063763244485425887912909780169687718220028524336345712898001390819261557649875421293801166077952.0 -nan -nan 

-178364801685238805103259145840382725061987944045472530674949401268293584329008446969473746460766868239132342083159524508914789749670074027424915548148526321496615605117293506949638109704970164691067328111561299887202279486072716534274451424039405784146815509452946519111472984171547677784815893151562034315264.0 -nan -nan -nan -nan -nan -1056072774237637052600830063941415348090517293416746154789587067494931500966359914326195517386216332882259599675745633109641181841659344001480056338119009130513393625965533623960984940221832761160076876843898876205390585439526148261106044901602684419035555319504883941376.0 -nan 

-nan -nan -nan -26710857826612444994267353286715269293757901339492680983361762839799097082776854012283280105964539166446755420369442534659386490716307082768737459719110656.0 -nan -nan -3105035081744894603895196328383427182024861815355852734705236516492512649426657589363629046416162337475795204343460529098490544426894750345132314555379837737778820899835763796538369436274338883220002723483918854951494219037908402176.0 -nan 

demand = 6

Maybe you’re printing with %lf instead of %f?

Also make sure you’re using floats in the host-side code (maybe use #if/#else and a typedef to allow compilation with floats or doubles as needed).

Christian

Where did the edit buttons go? I can’t edit my previous postings now.

Ah, this is what happened to the “edit” feature:

Amorphous@NVIDIA
Posted 11 July 2011 - 07:26 PM
Apologies for the inconvenience and confusion! I’ve temporarily removed the ability to edit posts. This will be restored at the completion of the current giveaway to prevent abuse. Thanks in advance for your understanding.

Thank you, Christian.

I had always assumed that the message “Demoting to float” meant the doubles were already being converted to floats everywhere, including on the host side.

:]

============GPU===============

0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 

1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 

2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 

3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 

4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 

5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 

6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 

7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 

demand = 6

21July2011.zip (1.83 KB)