Good parallel sort/search algorithms wanted! :) Applicable on CUDA

Hi All!

Could you please advice good parallel sort/search algorithms for use on 8800GTX ?

Don’t know about the parallel search algorithms.
But for parallel sort, ‘shell sort’ algorithm is a good candidate (as it is easier to implement and the algorithm is also easy to understand :) )
ref. for shell sort:
http://www.inf.fh-flensburg.de/lang/algori…ell/shellen.htm
I’m currently trying to implement shell sort in CUDA.

There is also a radix sort algorithm in the SDK. Another algorithm (or is it the same one?) is implemented in CUDPP.

A search could be done with a standard reduction kernel.

What is “reduction kernel”?

In short, a reduction kernel reduces an entire array to a single value in parallel (i.e. sum up all values). Instead of summing for the reduction, your sort could trickle down the index of the found value.

Check the CUDA SDK again. There is a very nice example code for a sum reduction with a whitepaper describing more details than you will want to know.

modern sorting algorithms are almost all a serialized version of a parallel idea, thus making sort quite easy to run on cuda.

Just learn a bit of the Divide et Impera paradigm.

The merge-sort would be the best for learning purpose…

The quick-sort is your best bet on a random flux of data…

The radix sort is useful, but it works on bits… this could be a problem sometimes…

other sorting algorithms could be useful, if some assumptions are made on the data input.

Thanks a lot!

But now I seek search algorithm rather than sort.
Could you please advice any?

Keep searching… You will succeed. Good luck.

We already did! Use a reduction. Perhaps it is not obvious to you why you need the reduction. Consider searching for the integer 5 in an array of 10 million values. In parallel, just launch 10 million threads that say if (array[idx] == 5) found[idx] = true. The problem is that now you need to find the true value in found, so perform a reduction which percolates the index of the found element up the tree so that you can access it directly.

(if array[idx] == 5){ found = idx}; – Wont this work?

I suppose if you could guarantee that there was only one element in the array you were looking for.

Right… No wonder, I dont understand reduction either.

I need to search word in a text - how may I use a reduction to int (4 bytes) if the word length != 4 bytes ???

I need to search a word in a text.

How may I use a reduction if the word length != sizeof(int) ???

I do not understand, reduction return only one value. What happen if we have more than one values that equal to the query number. Do you suggest segmented reduction, it sounds a little bit over kill to me