System:
QX9650 3.0 GHz quad-core
ASUS P5E
8 GB memory
Single EVGA GTX 480, driving X without compositing.
[codebox]
x@desktop:/home/x/fermi/fermi_test$ nvcc -arch sm_20 pdfs.cu
x@desktop:/home/x/fermi/fermi_test$ time ./a.out
Device name: GeForce GTX 480
BogoGFLOPS: 1345.0
Single precision: time = 170.379 ms, efficiency metric = 10.30
Double precision: time = 882.484 ms, efficiency metric = 1.99
Atomic abuse: time = 0.105 ms, events/sec = 1469687.8, events/sec/bogoGFLOP = 1092.74
real 0m2.204s
user 0m2.108s
sys 0m0.092s
x@desktop:/home/x/fermi/fermi_test$ time ./a.out
Device name: GeForce GTX 480
BogoGFLOPS: 1345.0
Single precision: time = 170.401 ms, efficiency metric = 10.29
Double precision: time = 882.486 ms, efficiency metric = 1.99
Atomic abuse: time = 0.105 ms, events/sec = 1467441.1, events/sec/bogoGFLOP = 1091.07
real 0m2.211s
user 0m2.112s
sys 0m0.096s
x@desktop:/home/x/fermi/fermi_test$ time ./a.out
Device name: GeForce GTX 480
BogoGFLOPS: 1345.0
Single precision: time = 170.388 ms, efficiency metric = 10.30
Double precision: time = 882.493 ms, efficiency metric = 1.99
Atomic abuse: time = 0.105 ms, events/sec = 1467441.1, events/sec/bogoGFLOP = 1091.07
real 0m2.207s
user 0m2.116s
sys 0m0.088s
x@desktop:/home/x/fermi/fermi_test$ nvcc -arch sm_20 rayleigh.cu
x@desktop:/home/x/fermi/fermi_test$ time ./a.out
Device name: GeForce GTX 480
BogoGFLOPS: 1345.0
Rayleigh power: time = 948.915 ms, event*freq/sec = 2697817.5
real 0m1.991s
user 0m1.896s
sys 0m0.092s
x@desktop:/home/x/fermi/fermi_test$ time ./a.out
Device name: GeForce GTX 480
BogoGFLOPS: 1345.0
Rayleigh power: time = 948.944 ms, event*freq/sec = 2697736.5
real 0m1.996s
user 0m1.900s
sys 0m0.092s
x@desktop:/home/x/fermi/fermi_test$ time ./a.out
Device name: GeForce GTX 480
BogoGFLOPS: 1345.0
Rayleigh power: time = 948.941 ms, event*freq/sec = 2697743.2
real 0m1.999s
user 0m1.896s
sys 0m0.100s
x@desktop:/home/x/fermi/fermi_test$ nvcc -arch=sm_20 -DCUDA_ARCH=20 -o gpu_binning gpu_binning.cu
./gpu_binning.cu(577): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(573): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(644): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(577): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(573): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(644): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(577): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(573): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(644): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(577): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(573): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(644): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(577): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(573): Advisory: Loop was not unrolled, cannot deduce loop trip count
./gpu_binning.cu(644): Advisory: Loop was not unrolled, cannot deduce loop trip count
x@desktop:/home/x/fermi/fermi_test$ time ./gpu_binning
Running gpu_binning microbenchmark: 64000 3.800000 0.200000
sorting…
done.
Host : 1.805127 ms
Host w/device memcpy: 2.698395 ms
GPU/simple : 0.118869 ms
GPU/simple/sort/ 32 : 0.306458 ms
GPU/simple/sort/ 64 : 0.219647 ms
GPU/simple/sort/128 : 0.243106 ms
GPU/simple/sort/256 : 0.283932 ms
GPU/simple/sort/512 : 0.333723 ms
GPU/update : 1.536448 ms
real 0m58.531s
user 0m58.408s
sys 0m0.108s
x@desktop:/home/x/fermi/fermi_test$ time ./gpu_binning 64000 1.12 0.2
Running gpu_binning microbenchmark: 64000 1.120000 0.200000
sorting…
done.
Host : 2.724192 ms
Host w/device memcpy: 26.939022 ms
GPU/simple : 0.090810 ms
GPU/simple/sort/ 32 : 0.312103 ms
GPU/simple/sort/ 64 : 0.216333 ms
GPU/simple/sort/128 : 0.234767 ms
GPU/simple/sort/256 : 0.279113 ms
GPU/simple/sort/512 : 0.328357 ms
GPU/update : 1.223797 ms
real 1m23.604s
user 1m23.409s
sys 0m0.164s
x@desktop:/home/x/fermi/fermi_test$ time ./gpu_binning
Running gpu_binning microbenchmark: 64000 3.800000 0.200000
sorting…
done.
Host : 1.297337 ms
Host w/device memcpy: 2.168851 ms
GPU/simple : 0.115417 ms
GPU/simple/sort/ 32 : 0.303133 ms
GPU/simple/sort/ 64 : 0.216431 ms
GPU/simple/sort/128 : 0.239582 ms
GPU/simple/sort/256 : 0.280600 ms
GPU/simple/sort/512 : 0.330431 ms
GPU/update : 1.530422 ms
real 0m41.816s
user 0m41.691s
sys 0m0.116s
x@desktop:/home/x/fermi/fermi_test$ time ./gpu_binning 64000 1.12 0.2
Running gpu_binning microbenchmark: 64000 1.120000 0.200000
sorting…
done.
Host : 2.351914 ms
Host w/device memcpy: 27.406826 ms
GPU/simple : 0.090793 ms
GPU/simple/sort/ 32 : 0.312163 ms
GPU/simple/sort/ 64 : 0.216422 ms
GPU/simple/sort/128 : 0.234768 ms
GPU/simple/sort/256 : 0.279076 ms
GPU/simple/sort/512 : 0.326411 ms
GPU/update : 1.222879 ms
real 1m10.587s
user 1m10.372s
sys 0m0.184s
[/codebox]
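A rough sanity check on the atomic test's last column, for anyone comparing these numbers across cards. The sketch assumes the figure is simply events/sec divided by the BogoGFLOPS rating, which is what the label suggests but is not confirmed from the pdfs.cu source. The helper name `events_per_bogogflop` is hypothetical. The printed BogoGFLOPS value is rounded to one decimal, so a small difference from the reported column is expected.

```python
# Hypothetical helper, not taken from pdfs.cu: assumes the normalized figure
# is events/sec divided by the BogoGFLOPS rating.
def events_per_bogogflop(events_per_sec, bogo_gflops):
    return events_per_sec / bogo_gflops

# (events/sec, reported events/sec/bogoGFLOP) copied from the three pdfs.cu runs
runs = [
    (1469687.8, 1092.74),
    (1467441.1, 1091.07),
    (1467441.1, 1091.07),
]

# As printed in the log; the benchmark presumably keeps more digits internally.
BOGO_GFLOPS = 1345.0

for events, reported in runs:
    computed = events_per_bogogflop(events, BOGO_GFLOPS)
    print(f"computed {computed:.2f}  reported {reported:.2f}  "
          f"diff {computed - reported:+.2f}")
```

If the computed and reported columns agree to within a few hundredths, the normalization assumption is at least consistent with the log, and the same division can be applied to results from other cards.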