fastest parallel sorting algorithm

Hey,

Can anyone tell me which of the available parallel sorting algorithms is the fastest?

Regards,
sagar

This paper was recently posted by someone else in the forum:

Distributed Computing and Systems Research Group

It describes a CUDA-specific version of quicksort, which the authors show performs very favorably compared to other parallel sorting algorithms.
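
To give a rough idea of how quicksort can be arranged for a GPU, here is a minimal sketch of a partition step in the usual data-parallel style (count, prefix sum, scatter). It is not the code from the paper, and it is plain C++ rather than CUDA: each "chunk" stands in for a thread block, and the chunk loops run one after another here. The names `partition_chunked`, `quicksort_range`, `chunked_quicksort` and the `chunk` parameter are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One partition pass over in[lo, hi), written in three phases.
// Assumption: on a GPU each chunk would be handled by its own thread block;
// here the chunks are simply processed sequentially.
// Returns the range [first, second) in `out` that holds the copies of the pivot.
std::pair<std::size_t, std::size_t> partition_chunked(
    const std::vector<int>& in, std::vector<int>& out,
    std::size_t lo, std::size_t hi, int pivot, std::size_t chunk) {
    assert(chunk > 0 && lo < hi);
    const std::size_t n = hi - lo;
    const std::size_t chunks = (n + chunk - 1) / chunk;
    std::vector<std::size_t> less(chunks, 0), greater(chunks, 0);

    // Phase 1: every chunk counts its own smaller/larger elements independently.
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::size_t b = lo + c * chunk, e = std::min(b + chunk, hi);
        for (std::size_t i = b; i < e; ++i) {
            if (in[i] < pivot) ++less[c];
            else if (in[i] > pivot) ++greater[c];
        }
    }

    // Phase 2: exclusive prefix sums turn the counts into write offsets.
    std::size_t total_less = 0, total_greater = 0;
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::size_t l = less[c], g = greater[c];
        less[c] = total_less;
        greater[c] = total_greater;
        total_less += l;
        total_greater += g;
    }
    const std::size_t left_end = lo + total_less;
    const std::size_t right_begin = hi - total_greater;

    // Phase 3: every chunk scatters into its own disjoint output slots.
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::size_t b = lo + c * chunk, e = std::min(b + chunk, hi);
        std::size_t l = lo + less[c];
        std::size_t g = right_begin + greater[c];
        for (std::size_t i = b; i < e; ++i) {
            if (in[i] < pivot) out[l++] = in[i];
            else if (in[i] > pivot) out[g++] = in[i];
        }
    }

    // Elements equal to the pivot fill the gap in the middle.
    std::fill(out.begin() + left_end, out.begin() + right_begin, pivot);
    return {left_end, right_begin};
}

// Recursive driver: partition into the buffer, copy back, recurse on both sides.
void quicksort_range(std::vector<int>& a, std::vector<int>& buf,
                     std::size_t lo, std::size_t hi, std::size_t chunk) {
    if (hi - lo < 2) return;
    const int pivot = a[lo + (hi - lo) / 2];
    const auto mid = partition_chunked(a, buf, lo, hi, pivot, chunk);
    std::copy(buf.begin() + lo, buf.begin() + hi, a.begin() + lo);
    quicksort_range(a, buf, lo, mid.first, chunk);
    quicksort_range(a, buf, mid.second, hi, chunk);
}

void chunked_quicksort(std::vector<int>& a, std::size_t chunk = 4) {
    std::vector<int> buf(a.size());
    quicksort_range(a, buf, 0, a.size(), chunk);
}
```

Calling `chunked_quicksort(v)` on a `std::vector<int>` sorts it in place. The point of splitting the partition into three phases is that no chunk ever writes to a slot owned by another chunk, which is what makes this layout a reasonable fit for many threads. A real CUDA version would still need kernels, synchronization, and a fallback sort for small ranges, none of which is shown here.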