Hi everybody,

I am using the following scheme to perform bit reversal (this is how I have it working on the CPU):

```
for (.... x ....)
{
    reverse[x] = bit_reverse(x, bits_required_for_storage);
}
for (.... x ....)
{
    if ( x < reverse[x] )
        swap( data[x], data[reverse[x]] );
}
```

The details of bit_reverse() are not relevant here; I have it running on the GPU, and after execution it gives me the correct sequence of indices to swap with.
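For anyone who wants to reproduce the reverse table on the CPU, a minimal C sketch of such a function might look like the following. This is an assumed stand-in, not the GPU version mentioned above, and `build_reverse_table` is a hypothetical helper name introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for bit_reverse(): reverse the lowest `bits` bits of x. */
static unsigned bit_reverse(unsigned x, unsigned bits)
{
    assert(bits <= sizeof(unsigned) * 8);
    unsigned r = 0;
    for (unsigned i = 0; i < bits; ++i) {
        r = (r << 1) | (x & 1u);  /* append the lowest bit of x to r */
        x >>= 1;                  /* then drop that bit from x */
    }
    return r;
}

/* Hypothetical helper: fill reverse[0..n-1], assuming n == 2^bits. */
static void build_reverse_table(unsigned *reverse, size_t n, unsigned bits)
{
    for (size_t x = 0; x < n; ++x)
        reverse[x] = bit_reverse((unsigned)x, bits);
}
```

With n = 4 and bits = 2, this fills the table with 0 2 1 3, which matches the reverse sequence used in the example further down.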

I am having problems with the swap kernel.

Here is my code for the swap:

```
__kernel void swap(__global double *data, const int sizeX, const int sizeY, __global int *reverse)
{
    int idX = get_global_id(0);   // column index within the row
    int idY = get_global_id(1);   // row index
    int BASE = idY * sizeX;       // offset of the first element of this row
    if (idX < reverse[idX])
        swap(data[BASE+idX], data[BASE+reverse[idX]]);
}
```

I launched it with the following sizes (a single work-group covering the whole array, which is fine for the small sizes I am dealing with at the moment):

globalSize[0] = size of X, globalSize[1] = size of Y, globalSize[2] = 1;

localSize[0] = size of X, localSize[1] = size of Y, localSize[2] = 1;

Since that did not work, I also tried one work-item per row, with the loop over x inside the kernel:

globalSize[0] = 1, globalSize[1] = size of Y;

localSize[0] = 1, localSize[1] = size of Y;

```
__kernel void swap(__global double *data, const int sizeX, const int sizeY, __global int *reverse)
{
    int idX = get_global_id(0);   // always 0 with globalSize[0] = 1
    int idY = get_global_id(1);   // row index
    int BASE = idY * sizeX;       // offset of the first element of this row
    __private int x;
    for (.... x ....)
    {
        if (x < reverse[x])
            swap(data[BASE+x], data[BASE+reverse[x]]);
    }
}
```

For example, given this data:

5 6 7 8

1 2 3 4

with reverse:

0 2 1 3

the result should be:

5 7 6 8

1 3 2 4
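As a cross-check for the expected values above, here is a small CPU reference written as a sketch in plain C. It assumes row-major storage with `size_x` elements per row, the same layout the kernels use through `BASE`. The function name `swap_rows_reference` is hypothetical, and the element exchange is written out explicitly instead of going through a `swap()` helper.

```c
#include <assert.h>

/* CPU reference for the swap pass: apply the reverse[] permutation to each row. */
static void swap_rows_reference(double *data, int size_x, int size_y,
                                const int *reverse)
{
    for (int y = 0; y < size_y; ++y) {
        int base = y * size_x;               /* offset of the first element of row y */
        for (int x = 0; x < size_x; ++x) {
            int r = reverse[x];
            assert(r >= 0 && r < size_x);    /* the table must stay inside the row */
            if (x < r) {                     /* exchange each pair only once */
                double tmp = data[base + x];
                data[base + x] = data[base + r];
                data[base + r] = tmp;
            }
        }
    }
}
```

Calling it on the 2 x 4 example with reverse = 0 2 1 3 exchanges only columns 1 and 2 of each row, giving the expected rows 5 7 6 8 and 1 3 2 4.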

But instead, I am getting values like:

11046 9090 9090 9130

28208 27020 27020 27044

Can anybody explain why this is happening?