Extremely low performance / kernel dies

Hi,

I’ve ported Pyrit’s CUDA kernel to OpenCL. While the new kernel does in fact produce correct results, its performance is about 1/50th of what the CUDA version achieves (8800M GT; CUDA: 1,200 PMKs/s, OpenCL: 28 PMKs/s). Also, the OpenCL kernel seems to die silently, leaving the process hanging, if the kernel runtime exceeds a certain value; I need to SIGKILL the whole process in that case. Maybe someone wants to take a look? :-)

Here is the OpenCL code, which does around 28 PMKs/s: [link, Google Code Archive]
And here is the original CUDA code, which does around 1,200 PMKs/s: http://code.google.com/p/pyrit/source/browse/#svn/trunk/cpyrit_cuda

regards

I just had a quick look at your OpenCL host code. It seems that you launch work-groups of just one work-item in _cpyrit_opencl.c, so most of the GPU's processing units sit idle:

lWorksize[0] = 1;

Try a larger number, e.g. 256 or 512, depending on the register usage of your kernel.
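One thing to watch when raising the work-group size: in OpenCL 1.x the global work size has to be a multiple of the local work size, otherwise clEnqueueNDRangeKernel fails with CL_INVALID_WORK_GROUP_SIZE. A rough sketch of how the padding could be done follows. round_up_global, n_items, queue and kernel are made-up names for illustration, not taken from the Pyrit source, and the OpenCL calls are only shown in comments so the snippet compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from Pyrit): round the number of work-items
 * up to the next multiple of the work-group size. */
static size_t round_up_global(size_t n_items, size_t local)
{
    assert(local > 0);  /* a work-group needs at least one work-item */
    return ((n_items + local - 1) / local) * local;
}

/* Sketch of the host side; queue, kernel and n_items are assumed to exist:
 *
 *   size_t lWorksize[1] = { 128 };
 *   size_t gWorksize[1] = { round_up_global(n_items, lWorksize[0]) };
 *   clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
 *                          gWorksize, lWorksize, 0, NULL, NULL);
 *
 * Because of the padding, the kernel has to skip the extra work-items,
 * for example:
 *
 *   if (get_global_id(0) >= n_items) return;
 */
```

The largest size a given kernel accepts on a given device can be queried with clGetKernelWorkGroupInfo and CL_KERNEL_WORK_GROUP_SIZE, which is probably safer than hard-coding 256 or 512.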

I was under the impression that this is scaled automatically. I’ve changed the code to use work-groups of 128 threads, and it now performs just as well as CUDA does. Thanks :-)

If you want it to scale automatically, you can find the following in the specification:

"local_work_size can also be a NULL value in which case the OpenCL implementation will determine how to be break the global work-items into appropriate work-group instances."
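In host code that could look roughly like the sketch below, where a requested size of 0 stands for "let the implementation decide". launch_dims, make_launch_dims and local_arg are hypothetical names invented for this example. The real OpenCL call appears only in a comment, so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the size arguments of clEnqueueNDRangeKernel. */
typedef struct {
    size_t global[1];  /* global work size actually passed to the API */
    size_t local[1];   /* explicit work-group size, unused if auto_local */
    int auto_local;    /* non-zero: pass NULL as local_work_size */
} launch_dims;

/* requested_local == 0 means "let the OpenCL implementation choose". */
static void make_launch_dims(launch_dims *d, size_t n_items, size_t requested_local)
{
    assert(d != NULL);
    d->auto_local = (requested_local == 0);
    d->local[0] = requested_local;
    if (d->auto_local) {
        /* No explicit work-group size, so no padding is needed. */
        d->global[0] = n_items;
    } else {
        /* Explicit size: pad the global size to a multiple of it. */
        d->global[0] = ((n_items + requested_local - 1) / requested_local)
                       * requested_local;
    }
}

/* Value to pass as the local_work_size argument. */
static const size_t *local_arg(const launch_dims *d)
{
    return d->auto_local ? NULL : d->local;
}

/* Sketch of the call; queue and kernel are assumed to exist:
 *
 *   launch_dims d;
 *   make_launch_dims(&d, n_items, 0);   // 0: implementation picks the size
 *   clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
 *                          d.global, local_arg(&d), 0, NULL, NULL);
 */
```

Whether the automatically chosen size matches a hand-tuned one depends on the implementation, so it is probably worth timing both variants.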