 # Quicksort problem

Hi,

I have a matrix (float*) of size width*height.

I’d like to sort each column independently.

So I’d like to use quicksort, but a modified version.

I use a 1D grid where each thread correspons to one column.

Each thread use the modified quicksort.

The quicksort algorithm use recursive function.

I paste the code bellow.

The problem is that I cannot compile this code.

Vince

``````__device__ int partition(float* a, int w, int l, int r) {

float pivot, t;

int i, j;

pivot = a[l*w];

i = l; j = r+1;

while(1)

{

do ++i; while( a[i*w] <= pivot && i <= r );

do --j; while( a[j*w] > pivot );

if( i >= j ) break;

t = a[i*w]; a[i*w] = a[j*w]; a[j*w] = t;

}

t = a[l*w]; a[l*w] = a[j*w]; a[j*w] = t;

return j;

}

__device__ void quickSort(float* a, int w, int l, int r)

{

int j;

if( l < r )

{

// divide and conquer

j = partition( a, w, l, r);

quickSort( a, w, l, j-1);

quickSort( a, w, j+1, r);

}

}

// Sort the column of matric tab.

__global__ void cuParallelQuickSort(float *tab, int width, int height, int pitch){

unsigned int xIndex = blockIdx.x * BLOCK_DIM + threadIdx.x;

if (xIndex<width){

quickSort(&tab[xIndex], width, 0, height-1);

}

}
``````

I think i can remember that device functions does not support recursions.

Stefan

Yes, CUDA does not support recursion (all functions are inlined).

There is a bitonic sort example in the SDK.

I am curious though. In CUDA programming manual it is mentioned that it is possible, though not always, to force a function not to be inlined via noinline construct (or something like that). Would it be possible the have recursive call of this function. I suppose not, as this is described in CUDA manual; nevertheless, I am curious of limitations which do not allows this. Stack depth? …

Evghenii