Swap causing problems - new to CUDA

I am working on a program where, at some point, I need to swap two numbers. Both numbers belong to an array declared locally in the kernel. Here is the declaration:

__global__ void dosomework(int *p)
{
        int store[10];
        . .
        // populate store :: verified works correctly
        . . .
        work(store);
        // access store value in CPU after copying it to p
}

__device__ void work(int *a)
{
        ...
        ...
        ...
        if (a[j+1] < a[j])
        {
                swap(a[j+1], a[j]);
        }
}

// copied this from the sdk
__device__ void swap(int & a, int & b)
{
        int tmp = a;
        a = b;
        b = tmp;
}

     

The problem is that the swap does not work: I get wrong values whenever it runs. However, if I comment out the call to swap, the array elements are in order (I have confirmed this).

Can anyone help me out?

My assumptions:

  1. int store[10] is local to each thread, as are all the variables I use in the function, so there should be no race condition between threads

  2. calling work() with store as the argument should also not cause problems
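
For anyone who wants to reproduce this, below is a minimal self-contained sketch of the pattern described above. It is not the original code: the bubble-sort loops, the test data, the constant N, and the names sort_local and swap_int are assumptions standing in for the elided parts. The swap helper is renamed and defined before its first use so that the call cannot resolve to any other swap that happens to be in scope, and the guard at the top lets a plain C++ compiler check the sorting logic on the host.

```cpp
// Sketch only: N, the loop bounds, the test data, and the names
// sort_local / swap_int are assumptions, not the original code.
#ifndef __CUDACC__
// Not compiling with nvcc: define the qualifiers away so the
// helper functions can be checked on the host.
#define __host__
#define __device__
#endif

const int N = 10;  // assumed length, matching "int store[10]"

// Defined before its first use, under a name that cannot clash
// with any other swap overload.
__host__ __device__ inline void swap_int(int &a, int &b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// Bubble sort over a thread-private array (assumed stand-in for work()).
__host__ __device__ inline void sort_local(int *a, int n)
{
    for (int i = 0; i < n - 1; ++i)
        for (int j = 0; j < n - 1 - i; ++j)
            if (a[j + 1] < a[j])
                swap_int(a[j + 1], a[j]);
}

#ifdef __CUDACC__
// p must have room for N ints per launched thread.
__global__ void dosomework(int *p)
{
    int store[N];  // private to each thread
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    // Assumed test data: descending values, offset per thread.
    for (int i = 0; i < N; ++i)
        store[i] = (N - i) + tid;

    sort_local(store, N);

    // Each thread writes its result to its own slice of p.
    for (int i = 0; i < N; ++i)
        p[tid * N + i] = store[i];
}
#endif
```

With nvcc, this could be launched as dosomework<<<1, 32>>>(d_p), where d_p is a hypothetical device buffer of 32 * N ints allocated with cudaMalloc and copied back with cudaMemcpy. Giving every thread its own slice of p is a deliberate choice here: if all threads wrote to the same ten elements of p, the values read back on the CPU could look wrong even when each thread's local array is sorted correctly.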

Thanks a lot,