Help with sorting


I was hoping to modify an SDK existing SDK example to take care of my sorting problem. However, the bitonic and merge sort examples in the SDK require a power of 2 input. I could pad the data but it makes the memory requirement quite large.

I spend the whole day trying to modify the radix sort example but it is too complicated for me to try and understand and tweak…

Does anyone know of a sorting algorithm on CUDA that will work for arrays of arbitrary inputs? It does not have to be the fastest, just competitive enough.

I basically have a pair of floating point keys (all positive) and an array of 3D points. The x, y, z coordinates are laid in sequence.

Any ideas on how to sort such a setup?


You should look at CUDPP - it comes with all the source and includes radix sort.

I am trying to to use the radix sort that comes with the CUDA SDK (it uses CUDPP). However, it only seems to handle unsigned int values… I can understand it only being able to handle unsigned int keys but it does make sense to me to only be able to sort a value array with unsigned int. Such a pain…

No, radix sort can work with floats. It sorts floats by doing a monotonic mapping between float and int. Look at the options.

That is what is strange. The CUDPP implementation seems to handle float keys but not float values… which sounds weird to me.


P.S: What do you mean by monotonic mapping?

Sort shouldn’t care about the extra data type, only the key type.

cudppSort(CUDPPHandle planHandle, void *d_keys, void *d_values, int keybits, size_t numElements)

probably will work with any 4 byte element used for d_values. I guess the main way d_values is used is to sort indices [0 1 2 3 4 5 6…] to get the permutation that the sort produced, so that other auxillary arrays can be permuted.

It’s clear to sort floats as ints, the mapping between float & int has to preserve relative values:

A mapping such that for integer a <==> float b, integer c <==> float d,

  1. if a < c, then b < d

  2. if a > c, then b > d

  3. if a = c, then b = c

Fortunately, it’s easy to do this with IEEE floating point. If you convert float to int by directly taking the binary representation, it’s clear all positive floats are monotonic with respect to its integer bit representation:

if you increase float x, its integer representation increases too. To make it work for negative #s, you need to flip the sign bit & do a subtract.

Here’s how you would do it with Thrust


#include <thrust/sort.h>

int main(void)


int N = ... // number of values to sort

float * keys = ...

float3 * values = ...

thrust::device_ptr key_ptr(keys);

thrust::device_ptr<float3> val_ptr(values);

thrust::sort_by_key(key_ptr, key_ptr + N, val_ptr);

return 0;



I don’t think it gets much simpler than that :)

I’m trying to figure out… I have a list of photon-structs… having a couple of float 3’s and some more…

is it possible to sort this list… I have to sort by [photonlist].position.x . .then find the middle, put the photon in a new list… … split it up, then sort the 2 new lists by [photon].position.y … split and then z. and so on (for at kd tree build) ?