shared memory reverse array example

BlahCuda · March 22, 2010, 6:24pm

Greetings,

In the Dr. Dobb's CUDA tutorial, there are two reverse-array GPU kernels: the first does not use shared memory and the second does. Here is the link:

CUDA, Supercomputing for the Masses: Part 5 | Dr Dobb's: http://www.drdobbs.com/high-performance-co...TMY32JVN?pgno=2

I have timed the two kernels and don’t see much difference in performance between the two. Is this normal? What might be the problem (e.g. automatic optimizations, not setting a flag in the compilation command, something wrong with my timing, etc.)? Thanks.
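For anyone wanting to reproduce the comparison, here is a minimal sketch of how the two kernels could be timed with CUDA events. It is not the code from the Dr. Dobb's article: the kernel names `reverseGlobal` and `reverseShared` and the helper `timeReverse` are made up for illustration, and the sketch assumes the array length is a multiple of the block size. Kernel launches are asynchronous, so a host-side timer that does not synchronize first can end up measuring only launch overhead; whether that is the issue here is not known from the post.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Global-memory-only version: each thread reads one element and writes it
// to the mirrored position in the output array.
__global__ void reverseGlobal(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[n - 1 - i] = in[i];
}

// Shared-memory version: reverse each block's tile in shared memory, then
// write the tile to the mirrored block position in ascending address order.
// Assumes n is a multiple of blockDim.x.
__global__ void reverseShared(const int *in, int *out, int n) {
    extern __shared__ int tile[];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[blockDim.x - 1 - threadIdx.x] = in[i];
    __syncthreads();
    int outBase = n - (blockIdx.x + 1) * blockDim.x;
    out[outBase + threadIdx.x] = tile[threadIdx.x];
}

// Hypothetical helper: returns the average kernel time in milliseconds over
// `reps` launches, or -1 if the result is wrong or a CUDA error occurred.
float timeReverse(bool useShared, int n, int threads, int reps) {
    std::vector<int> h(n), r(n);
    for (int i = 0; i < n; ++i) h[i] = i;

    int *dIn = nullptr, *dOut = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(int);
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, h.data(), bytes, cudaMemcpyHostToDevice);

    int blocks = n / threads;
    size_t smem = threads * sizeof(int);
    auto launch = [&]() {
        if (useShared) reverseShared<<<blocks, threads, smem>>>(dIn, dOut, n);
        else           reverseGlobal<<<blocks, threads>>>(dIn, dOut, n);
    };

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    launch();                      // warm-up launch, not timed
    cudaEventRecord(start);
    for (int k = 0; k < reps; ++k) launch();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);    // wait until all timed kernels have finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaMemcpy(r.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    bool ok = (cudaGetLastError() == cudaSuccess);
    for (int i = 0; i < n && ok; ++i) ok = (r[i] == h[n - 1 - i]);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dIn);
    cudaFree(dOut);
    return ok ? ms / reps : -1.0f;
}
```

Calling `timeReverse(false, 1 << 20, 256, 100)` and `timeReverse(true, 1 << 20, 256, 100)` from `main` and printing both values gives a like-for-like comparison that excludes host-to-device copies.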

seibert · March 23, 2010, 4:10pm

What GPU are you using? Compute capability 1.2 and greater devices should be able to reverse an array efficiently without shared memory thanks to the improved memory controller.
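If you are not sure which compute capability your card reports, a small check along these lines should tell you. This is only a sketch: `hasComputeCapability` is a name made up for this example, while `cudaGetDeviceProperties` and the `major`/`minor` fields of `cudaDeviceProp` come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints the device name and compute capability, and
// returns true if device `dev` reports at least compute capability major.minor.
bool hasComputeCapability(int dev, int major, int minor) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) return false;
    std::printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    return prop.major > major || (prop.major == major && prop.minor >= minor);
}
```

For example, `hasComputeCapability(0, 1, 2)` checks whether device 0 is at least a compute capability 1.2 part.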
