Hi everybody,
I am using the following scheme to perform bit reversal (this is how I have it working on the CPU):
for (.... x ....)
{
    reverse[x] = bit_reverse(x, bits_required_for_storage);
}
for (.... x ....)
{
    if ( x < reverse[x] )
        swap( data[x], data[reverse[x]] );
}
The bit_reverse() function itself is not relevant here. I have it running on the GPU, and after execution it gives me the correct sequence of indices to swap with.
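For reference, the CPU scheme above can be sketched in plain C roughly as follows. This is only a minimal sketch: the body of bit_reverse() is a hypothetical stand-in (the real one is not shown in this post), and build_reverse() and permute() are names made up for illustration. It assumes the array length is a power of two, n = 2^bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for bit_reverse(): reverses the lowest `bits` bits of x. */
unsigned bit_reverse(unsigned x, unsigned bits)
{
    unsigned r = 0;
    for (unsigned i = 0; i < bits; ++i) {
        r = (r << 1) | (x & 1u);   /* move the lowest bit of x onto the end of r */
        x >>= 1;
    }
    return r;
}

/* First loop of the scheme: fill reverse[0..n-1]. Assumes n == 2^bits. */
void build_reverse(unsigned *reverse, size_t n, unsigned bits)
{
    assert(n == ((size_t)1 << bits));
    for (size_t x = 0; x < n; ++x)
        reverse[x] = bit_reverse((unsigned)x, bits);
}

/* Second loop of the scheme: the x < reverse[x] test makes sure each
   pair is swapped exactly once (only the lower index does the swap). */
void permute(double *data, const unsigned *reverse, size_t n)
{
    for (size_t x = 0; x < n; ++x) {
        if (x < reverse[x]) {
            double tmp = data[x];
            data[x] = data[reverse[x]];
            data[reverse[x]] = tmp;
        }
    }
}
```

Calling build_reverse(reverse, 4, 2) with this stand-in yields the table 0 2 1 3.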
The problem I am having is in the swap step.
Here is my kernel for the swap:
__kernel void swap(__global double *data, const int sizeX, const int sizeY, __global int *reverse)
{
    int idX = get_global_id(0);
    int idY = get_global_id(1);
    int BASE = idY * sizeX;
    if (idX < reverse[idX])
        swap(data[BASE+idX], data[BASE+reverse[idX]]);
}
I launched it with the following sizes (this is fine for the small sizes I am dealing with at the moment):
globalSize[0] = size of X, globalSize[1] = size of Y, globalSize[2] = 1;
localSize[0] = size of X, localSize[1] = size of Y, localSize[2] = 1;
Since that gave wrong results, I also tried one work-item per row, looping over x inside the kernel:
globalSize[0] = 1, globalSize[1] = size of Y;
localSize[0] = 1, localSize[1] = size of Y;
__kernel void swap(__global double *data, const int sizeX, const int sizeY, __global int *reverse)
{
    int idX = get_global_id(0);
    int idY = get_global_id(1);
    int BASE = idY * sizeX;
    __private int x;
    for (.... x ....)
    {
        if (x < reverse[x])
            swap(data[BASE+x], data[BASE+reverse[x]]);
    }
}
For the following data:
5 6 7 8
1 2 3 4
with reverse:
0 2 1 3
the result should be:
5 7 6 8
1 3 2 4
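The expected result above can be checked on the CPU with a small reference routine. This is a minimal sketch only: swap_rows() is a hypothetical helper name, its arguments mirror the kernel's (data, sizeX, sizeY, reverse), and it assumes the data is stored row by row so that row idY starts at idY * sizeX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for the row-wise swap.
   Each row of length sizeX is permuted with the same reverse[] table. */
void swap_rows(double *data, int sizeX, int sizeY, const int *reverse)
{
    assert(sizeX > 0 && sizeY > 0);
    for (int idY = 0; idY < sizeY; ++idY) {
        int base = idY * sizeX;          /* start of this row */
        for (int idX = 0; idX < sizeX; ++idX) {
            int r = reverse[idX];
            if (idX < r) {               /* swap each pair only once */
                double tmp = data[base + idX];
                data[base + idX] = data[base + r];
                data[base + r] = tmp;
            }
        }
    }
}
```

With data = 5 6 7 8 1 2 3 4, sizeX = 4, sizeY = 2 and reverse = 0 2 1 3, this produces 5 7 6 8 1 3 2 4, which is the output the kernel is expected to match.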
But instead, I am getting values like:
11046 9090 9090 9130
28208 27020 27020 27044
Can anybody explain why this is happening?