bubble sort in CUDA

I need an implementation of bubble sort in CUDA. Could you help me, please?
Your help will be appreciated.

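OK, for bubble sort I expect you have a small array, i.e. a lot fewer than 1000 elements.
I suggest…
Supposing your array is 64 elements long, assign 32 threads.
For each thread, N = 2 * its_thread_number.
On each pass round a loop, each thread should do this:
if ( A[N] < A[N+1] ) swap( A[N], A[N+1] )
if ( N+2 < length_of_array && A[N+1] < A[N+2] ) swap( A[N+1], A[N+2] )
(As written this sorts largest-first; use > instead of < for ascending order. The N+2 check keeps the last thread from reading past the end of the array.)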

Why have N = 2 * threadNumber?
The reason is that in CUDA all threads in a warp execute the same instruction at the same time,
so thread 0 will be swapping A[0] with A[1] at the same time as thread 1 swaps A[2] with A[3].
Using 2 * threadNumber stops two threads from trying to update the contents of the same cell at the same time.

It is probably simplest to stop after length_of_array iterations, or is it length_of_array/2? Up to you to work that out.

You will also have to work out which __syncthreads() calls are needed if the array can be longer than 64 elements,
and handle the case where the array does not have an even number of elements, e.g. what if the array is 13 long?
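For reference, here is a minimal sketch of the scheme described above (it is usually called odd-even transposition sort, the parallel form of bubble sort). It is not the poster's code, just one way it could look. The names odd_even_sort_kernel and odd_even_sort are made up for illustration, and the sketch assumes int data, ascending order, and an array small enough for a single block (at most 2048 elements with 1024 threads). It puts a __syncthreads() after each phase rather than relying on warp lockstep, since newer GPUs do not guarantee that threads in a warp run in lockstep. It runs (n + 1) / 2 loop iterations of two phases each, because n phases are enough for this kind of sort, and the bounds checks cover odd lengths such as 13.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical kernel: one block, one thread per adjacent pair.
// Sorts a[0..n) ascending by alternating even and odd compare-swap phases.
__global__ void odd_even_sort_kernel(int* a, int n)
{
    int i = 2 * threadIdx.x;  // left index of this thread's even pair

    // Two phases per iteration, so (n + 1) / 2 iterations give at least n phases.
    for (int pass = 0; pass < (n + 1) / 2; ++pass) {
        // Even phase: pairs (0,1), (2,3), ...
        if (i + 1 < n && a[i] > a[i + 1]) {
            int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
        }
        __syncthreads();  // every even swap must finish before the odd phase

        // Odd phase: pairs (1,2), (3,4), ...
        if (i + 2 < n && a[i + 1] > a[i + 2]) {
            int t = a[i + 1]; a[i + 1] = a[i + 2]; a[i + 2] = t;
        }
        __syncthreads();  // every odd swap must finish before the next pass
    }
}

// Hypothetical host helper: copy in, sort in a single block, copy back.
// Returns false on a CUDA error or if n is too large for one block.
bool odd_even_sort(int* host, int n)
{
    if (n <= 1) return true;             // nothing to sort
    int threads = (n + 1) / 2;           // one thread per even pair
    if (threads > 1024) return false;    // assumption: single block only

    int* dev = nullptr;
    std::size_t bytes = static_cast<std::size_t>(n) * sizeof(int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        odd_even_sort_kernel<<<1, threads>>>(dev, n);
        // This blocking copy also reports any error from the kernel launch.
        ok = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

Calling odd_even_sort(v, 13) on a 13-element host array should leave it in ascending order. The kernel works directly on global memory to keep the sketch short; copying the array into shared memory first would likely be faster, but that is an optimization left out here.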

And there may be better ways.

Shouldn’t bitonic sort be faster, even for very small inputs?

Would you provide the program, please?

I'm doing a little research on implementing sorting algorithms in CUDA.

It doesn’t have to be fast, it just has to give a meaningful result, that’s all.

thanks ^^,

The purpose of this forum is to help with CUDA problems, not to do your assignments for you. ;) Most of the regulars here have their own projects to work on.

I just need it for reference.
I’ve got my own stuff going on already.
Code optimization is what I need for now.
A little help for reference would be greatly appreciated.

thanks anyway =)

There is plenty of CUDA sorting code available. Take a look at Thrust, for example.
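As a rough illustration of the Thrust route, sorting on the GPU takes only a few lines. The helper name thrust_sorted is made up for this sketch, and it assumes int data already sitting in a std::vector on the host.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <vector>

// Hypothetical helper: returns an ascending-sorted copy of input,
// with the sort itself done on the GPU by Thrust.
std::vector<int> thrust_sorted(const std::vector<int>& input)
{
    // Copy the host data into device memory.
    thrust::device_vector<int> d(input.begin(), input.end());

    // Sort on the device (ascending by default).
    thrust::sort(d.begin(), d.end());

    // Copy the result back to the host.
    std::vector<int> out(input.size());
    thrust::copy(d.begin(), d.end(), out.begin());
    return out;
}
```

This is handy as a known-good reference to check a hand-written kernel against, even if the goal is to study the sorting algorithm itself.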

thanks for the tips bro