simple CUDA program


I need a simple, working CUDA program example that uses threads and shows that a computation can run faster on the GPU than on the CPU.

It would be great if I could also get directions for compiling it.

Thanks in advance…

The CUDA SDK has about 50 full programs. About half of them have a CPU version to verify correctness. You could use those for timing comparisons if you like.

You might try the radixSort, matrixMul, or MersenneTwister projects, to name just three.
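
If you want something smaller than the SDK projects, here is a minimal sketch of the same idea: a SAXPY (y = a*x + y) run once on the CPU and once on the GPU, with the CPU result used to check the GPU result. This is not taken from the SDK; the file name saxpy_timing.cu, the function names, and the problem size are all my own choices for illustration. Note that it times only the kernel, not the host-to-device copies, and whether the GPU actually comes out ahead depends on your hardware and on the problem size.

```cuda
// saxpy_timing.cu : hypothetical file name, not one of the SDK samples.
#include <cstdio>
#include <cmath>
#include <chrono>
#include <vector>
#include <cuda_runtime.h>

// GPU version: each thread updates one element of y.
__global__ void saxpy_gpu(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// CPU reference: the same arithmetic in a plain loop.
void saxpy_cpu(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 24;          // about 16 million elements (arbitrary choice)
    const float a = 2.0f;
    const size_t bytes = n * sizeof(float);
    std::vector<float> x(n, 1.0f), y_cpu(n, 2.0f), y_gpu(n, 2.0f);

    // Time the CPU loop with a wall clock.
    auto t0 = std::chrono::steady_clock::now();
    saxpy_cpu(n, a, x.data(), y_cpu.data());
    auto t1 = std::chrono::steady_clock::now();
    double cpu_ms = std::chrono::duration<double, std::milli>(t1 - t0).count();

    // Copy the inputs to the device (not included in the GPU timing).
    float *d_x = nullptr, *d_y = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y_gpu.data(), bytes, cudaMemcpyHostToDevice);

    // Time the kernel alone with CUDA events.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    cudaEventRecord(start);
    saxpy_gpu<<<blocks, threads>>>(n, a, d_x, d_y);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float gpu_ms = 0.0f;
    cudaEventElapsedTime(&gpu_ms, start, stop);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy the result back and compare it against the CPU reference.
    cudaMemcpy(y_gpu.data(), d_y, bytes, cudaMemcpyDeviceToHost);
    float max_err = 0.0f;
    for (int i = 0; i < n; ++i)
        max_err = std::fmax(max_err, std::fabs(y_gpu[i] - y_cpu[i]));

    std::printf("CPU: %.3f ms, GPU kernel: %.3f ms, max error: %g\n",
                cpu_ms, gpu_ms, max_err);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_x);
    cudaFree(d_y);
    return 0;
}
```

Assuming the CUDA toolkit is installed and nvcc is on your path, something like "nvcc -O2 saxpy_timing.cu -o saxpy_timing" should build it. For a fair comparison, keep in mind that a memory-bound operation like this often loses its advantage once you add the copy times back in; the SDK projects mentioned above do more arithmetic per element and tend to show a clearer difference.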