Histogram implementation

Hello, I wanted to ask:

If I have a histogram computed in C code like this:

const unsigned int myS = 256;

for ( unsigned int k = 0; k < AN; k++ ) {
    for ( unsigned int y = 0; y < myHeight; y++ ) {
        for ( unsigned int x = 0; x < myWidth; x++ ) {
            histogram[ (k * myS) + myImage[ (k * myHeight * myWidth) + (y * myWidth) + x ] ] += 1;
        }
    }
}

will the right implementation in CUDA be:

int y = threadIdx.y + blockIdx.y * blockDim.y;
int x = threadIdx.x + blockIdx.x * blockDim.x;
int Idx = x + y * myWidth;
if ( x < myWidth && y < myHeight ) {
	for ( unsigned int k = 0; k < AN; k++ ) {
		int theIdx = devImage[ Idx + (k * myWidth * myHeight) ];
		atomicAdd( &( devHistogram[ (k * myS) + theIdx ] ), 1 );
	}
}

Is this right?

Also, is there an easy way to avoid using atomicAdd?

Using shared memory?


I would suggest using a histogramming function from CUB or Thrust rather than writing your own.
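For reference, a minimal sketch of what the CUB route looks like with `cub::DeviceHistogram::HistogramEven` (untested here; the 256-bin setup and the `d_samples`/`d_histogram` pointer names are assumptions for an 8-bit image):

```cuda
#include <cub/cub.cuh>

// Sketch: 256-bin histogram over num_samples unsigned chars in d_samples.
// d_samples and d_histogram are assumed device pointers, already
// allocated and (for d_samples) filled by the caller.
int   num_levels  = 257;     // 257 bin boundaries -> 256 bins
float lower_level = 0.0f;
float upper_level = 256.0f;

// First call with d_temp_storage == NULL only computes temp_storage_bytes
void  *d_temp_storage     = NULL;
size_t temp_storage_bytes = 0;
cub::DeviceHistogram::HistogramEven( d_temp_storage, temp_storage_bytes,
                                     d_samples, d_histogram,
                                     num_levels, lower_level, upper_level,
                                     num_samples );
cudaMalloc( &d_temp_storage, temp_storage_bytes );

// Second call actually builds the histogram
cub::DeviceHistogram::HistogramEven( d_temp_storage, temp_storage_bytes,
                                     d_samples, d_histogram,
                                     num_levels, lower_level, upper_level,
                                     num_samples );
cudaFree( d_temp_storage );
```

For multiple images (the `AN` loop above), you would call this once per image with the pointers offset accordingly.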

OK, thanks, but what if I don’t want to use libraries?

Is the above implementation right? The problem is with the indices.

Or, is there an easy way using shared memory?

I tried to run the code using a simple data array and I am not getting the same results as the serial code…

What is strange is that every time I run the executable, it increases some values by 1!

I also tried this approach:

if ( x < myWidth && y < myHeight ) {
    for ( unsigned int k = 0; k < AN; k++ ) {
        atomicAdd( &( devHistogram[ (k * myS) + devImage[ (k * myWidth * myHeight) + (y * myWidth) + x ] ] ), 1 );
    }
}

but still the same…

Any ideas?


Your global memory also needs to be set to zero (of course these are all obvious things).


  1. reduce in shared memory
  2. reduce to global memory

I am not using shared memory.
What should I do with global memory?
I have allocated memory and then freed it.

Memory allocation does not clear the memory.

OK, in every code I write, I allocate memory, use it, and then free it.
I don’t know what else I should do!

They ask you to initialize the devHistogram array with 0. After allocation the array contains “random” data (not actually random, but whatever values were in that memory before the allocation). Then your code adds values to these leftover numbers, and that might be the source of your problem.

If you run the code again and the memory allocation gives you the same space in memory, you will start from the values of the previous run.
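A minimal sketch of that initialization (assuming devHistogram holds AN * myS counters, with AN and myS as in the earlier posts):

```cuda
// Assumed sizes from the posts above: AN images, myS = 256 bins each
unsigned int *devHistogram;
cudaMalloc( (void**)&devHistogram, AN * myS * sizeof(unsigned int) );

// cudaMalloc does NOT clear the memory; zero it before accumulating
cudaMemset( devHistogram, 0, AN * myS * sizeof(unsigned int) );
```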

OK, I understand now, but the problem was that after each compilation and execution of the code the values kept increasing!
Whether I initialize the histogram or not doesn’t matter (I mean, I still get this error).

I found the error. It seems that it was in the “AN” variable, in its size.

Can someone help with the use of shared memory?


  1. Let each block produce a histogram in shared memory
  2. add said shared memory histogram to global memory histogram.
    NOTE: don’t forget to initialize global memory

Something along the lines of:

// I give no warranties WHATSOEVER that this untested code would work :-)
__global__ void histrogram_stub_code_kernel( uint8_t* d_in_ptr, int* d_histogram, int N )
{
	const int B_DIM_X = 256;
	// Shared memory histogram (blockDim.x must equal B_DIM_X)
	__shared__ int s_histogram[B_DIM_X];
	int tx = threadIdx.x + blockIdx.x * B_DIM_X;

	// Initialize the shared bins, one bin per thread
	s_histogram[threadIdx.x] = 0;
	__syncthreads();

	if( tx < N ) {
		// Read data into a register
		uint8_t reg_val = d_in_ptr[tx];
		// Atomic add to the shared memory buffer
		atomicAdd( &s_histogram[reg_val], 1 );
	}
	__syncthreads();

	// Every thread flushes its shared bin to the global histogram,
	// including threads with tx >= N in a partially filled last block,
	// since their bin may hold counts from other threads.
	atomicAdd( &d_histogram[threadIdx.x], s_histogram[threadIdx.x] );
}

WARRANTY: Completely untested piece of code that I just threw together :-)

OK, thank you for the idea.
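For completeness, a hedged sketch of host-side setup and launch for a shared-memory kernel like the stub above (untested; d_in_ptr, d_histogram, and N are assumed to be allocated and filled by the caller):

```cuda
// Hypothetical launch for the shared-memory histogram kernel above.
const int threads = 256;                         // must match B_DIM_X
const int blocks  = (N + threads - 1) / threads; // ceil(N / threads)

// The global histogram must start at zero (see the earlier posts)
cudaMemset( d_histogram, 0, 256 * sizeof(int) );

histrogram_stub_code_kernel<<<blocks, threads>>>( d_in_ptr, d_histogram, N );
cudaDeviceSynchronize();
```

The same pattern extends to the multi-image case by looping the launch over k (or folding k into the grid) and offsetting the image and histogram pointers by k * myWidth * myHeight and k * myS.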