Need help testing OpenCL program

I’m working on a seminar paper about the potential of GPUs for image-processing algorithms. I’ve written a simple test program, but I don’t have the resources to test it properly. All I have is a laptop with a GeForce 9500M GS, and it’s not enough. I need help from someone with a good GPU. The program is written in C++ for Unix platforms (tested on Mac and Ubuntu). It also requires libpng, png++ and the OpenCL dev files. If someone has those installed (or is willing to install those few libs) and has a little goodwill, I would really appreciate it. All you need to do is run ‘make’, then ‘make run’, and post the results here (with your CPU/GPU specs). Try running it a few times to get average results. Once again, I would really appreciate it if someone could help.

Here is the source (Dropbox link). For more info, consult the README.

I ran your code on my desktop with a GTX 570 and an i5-2500K. The OS is openSUSE 12.1 64-bit and the OpenCL driver is 302.06.03.

============

OpenCL device:

NVIDIA Corporation GeForce GTX 570

Compute units: 15

Global memory size: 1279 MB

Local memory size: 48 KB

Constant memory size: 64 KB

Max work group size: 1024 work-items

Max number of work items per dimension: [1024, 1024, 64]

Data transfer to the OpenCL device memory completed in 0.61 ms

Starting Non-Maximum Suppression algorithm test (n = 3)

Performing operations on the CPU and on the OpenCL device

CPU running time: 1.424 ms

OpenCL device running time: 0.177 ms

Starting Convolution 2D algorithm test

Performing operations on the CPU and on the OpenCL device

CPU running time: 3.854 ms

OpenCL device running time: 0.353 ms

===============

zhaopeng@linux-q1vg:~/Downloads/seminar> make run

Executing: ./bin/seminar ./resources/test_image.png 3

OpenCL device:

NVIDIA Corporation GeForce GTX 570

Compute units: 15

Global memory size: 1279 MB

Local memory size: 48 KB

Constant memory size: 64 KB

Max work group size: 1024 work-items

Max number of work items per dimension: [1024, 1024, 64]

Data transfer to the OpenCL device memory completed in 0.602 ms

Starting Non-Maximum Suppression algorithm test (n = 3)

Performing operations on the CPU and on the OpenCL device

CPU running time: 1.466 ms

OpenCL device running time: 0.18 ms

Starting Convolution 2D algorithm test

Performing operations on the CPU and on the OpenCL device

CPU running time: 5.81 ms

OpenCL device running time: 0.34 ms

================

zhaopeng@linux-q1vg:~/Downloads/seminar> make run

Executing: ./bin/seminar ./resources/test_image.png 3

OpenCL device:

NVIDIA Corporation GeForce GTX 570

Compute units: 15

Global memory size: 1279 MB

Local memory size: 48 KB

Constant memory size: 64 KB

Max work group size: 1024 work-items

Max number of work items per dimension: [1024, 1024, 64]

Data transfer to the OpenCL device memory completed in 0.945 ms

Starting Non-Maximum Suppression algorithm test (n = 3)

Performing operations on the CPU and on the OpenCL device

CPU running time: 1.918 ms

OpenCL device running time: 0.184 ms

Starting Convolution 2D algorithm test

Performing operations on the CPU and on the OpenCL device

CPU running time: 6.92 ms

OpenCL device running time: 0.342 ms

================

zhaopeng@linux-q1vg:~/Downloads/seminar> make run

Executing: ./bin/seminar ./resources/test_image.png 3

OpenCL device:

NVIDIA Corporation GeForce GTX 570

Compute units: 15

Global memory size: 1279 MB

Local memory size: 48 KB

Constant memory size: 64 KB

Max work group size: 1024 work-items

Max number of work items per dimension: [1024, 1024, 64]

Data transfer to the OpenCL device memory completed in 0.607 ms

Starting Non-Maximum Suppression algorithm test (n = 3)

Performing operations on the CPU and on the OpenCL device

CPU running time: 1.356 ms

OpenCL device running time: 0.155 ms

Starting Convolution 2D algorithm test

Performing operations on the CPU and on the OpenCL device

CPU running time: 4.002 ms

OpenCL device running time: 0.342 ms

================

zhaopeng@linux-q1vg:~/Downloads/seminar> make run

Executing: ./bin/seminar ./resources/test_image.png 3

OpenCL device:

NVIDIA Corporation GeForce GTX 570

Compute units: 15

Global memory size: 1279 MB

Local memory size: 48 KB

Constant memory size: 64 KB

Max work group size: 1024 work-items

Max number of work items per dimension: [1024, 1024, 64]

Data transfer to the OpenCL device memory completed in 0.592 ms

Starting Non-Maximum Suppression algorithm test (n = 3)

Performing operations on the CPU and on the OpenCL device

CPU running time: 1.297 ms

OpenCL device running time: 0.188 ms

Starting Convolution 2D algorithm test

Performing operations on the CPU and on the OpenCL device

CPU running time: 4.1 ms

OpenCL device running time: 0.343 ms

================

zhaopeng@linux-q1vg:~/Downloads/seminar> make run

Executing: ./bin/seminar ./resources/test_image.png 3

OpenCL device:

NVIDIA Corporation GeForce GTX 570

Compute units: 15

Global memory size: 1279 MB

Local memory size: 48 KB

Constant memory size: 64 KB

Max work group size: 1024 work-items

Max number of work items per dimension: [1024, 1024, 64]

Data transfer to the OpenCL device memory completed in 0.604 ms

Starting Non-Maximum Suppression algorithm test (n = 3)

Performing operations on the CPU and on the OpenCL device

CPU running time: 1.373 ms

OpenCL device running time: 0.224 ms

Starting Convolution 2D algorithm test

Performing operations on the CPU and on the OpenCL device

CPU running time: 4.066 ms

OpenCL device running time: 0.341 ms

You’ve been of great help. Thank you very much!

I was also able to gather data from a few other cards, so now I have enough data for my paper. The results are quite impressive :)

Again, thanks a lot :)