Hi,
I’ve already posted this in the “OpenCL” forums but haven’t received any feedback. I’m looking for an example implemented in both CUDA and OpenCL, so that I can make a basic performance comparison between the two. Does anybody have such an example? For instance, an implementation of the N-body problem or something similar. The example only needs to output its execution time, so that I can compare performance across different input problem sizes.
Thanks!