Hi all, I am seeking help to understand this:
doRsltIdx ends up with a wrong value when (pRegionInfo[blockIdx.x].pixelIndexE - pRegionInfo[blockIdx.x].pixelIndexB) is a random value in (0, 40000).
__global__ void RegionCal(REGION_INFO* pRegionInfo, int* diRsltMaxNum,
                          int* doRsltNum, int* doRsltIdx)
{
    int blockPixelNum = pRegionInfo[blockIdx.x].pixelIndexE - pRegionInfo[blockIdx.x].pixelIndexB + 1;
    int threadPixelNum = (blockPixelNum - 1) / blockDim.x + 1;
    for (int i = 0; i < threadPixelNum; ++i)
    {
        if (atomicAdd(doRsltIdx, 1) < *diRsltMaxNum)
        {
            atomicAdd(doRsltNum, 1);
        }
    }
}
void CalRegion(REGION_INFO* pRegionInfo, int* iRegionNum, int* diRsltMaxNum, int* doRsltNum, int* doRsltIdx)
{
    rsltNum = 0;
    dim3 blockSize(1024);
    dim3 gridSize(iRegionNum);
    RegionCal<<<gridSize, blockSize>>>(pRegionInfo, diRsltMaxNum, doRsltNum, doRsltIdx);
}
Please format your code properly. One way to do that is as follows:
- edit your post by clicking the pencil icon at the bottom of your post
- select the code
- click the </> button at the top of the edit pane
- save your edits
It's not really clear what the intent of your code is, but this:
    if (atomicAdd(doRsltIdx, 1) < *diRsltMaxNum)
    {
        atomicAdd(doRsltNum, 1);
    }
is unlikely to be correct, in my view. For example, if the intent is to limit the value in the doRsltIdx location to no more than *diRsltMaxNum, the code won't do that. The key reason is that the atomic op in the if condition itself always executes whenever a thread reaches that point, so doRsltIdx is incremented regardless of how the comparison turns out. As a result, there is no limiting.
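To make the point above concrete, here is a minimal sketch that isolates the same if/atomicAdd pattern. The kernel name countAttempts and the host helper runCountAttempts are hypothetical names made up for this illustration, and the limit is passed by value instead of through a pointer. It assumes both counters start at zero.

```cuda
#include <cuda_runtime.h>

// Same pattern as in the question: the atomicAdd inside the condition
// runs for every thread, whether or not the comparison succeeds.
__global__ void countAttempts(int* idx, int* num, int maxNum)
{
    if (atomicAdd(idx, 1) < maxNum)
    {
        atomicAdd(num, 1);
    }
}

// Hypothetical host helper: launches blocks * threads attempts and
// copies both counters back to the host.
void runCountAttempts(int blocks, int threads, int maxNum, int* hIdx, int* hNum)
{
    int* dIdx = nullptr;
    int* dNum = nullptr;
    cudaMalloc(&dIdx, sizeof(int));
    cudaMalloc(&dNum, sizeof(int));
    cudaMemset(dIdx, 0, sizeof(int));
    cudaMemset(dNum, 0, sizeof(int));

    countAttempts<<<blocks, threads>>>(dIdx, dNum, maxNum);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(hIdx, dIdx, sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(hNum, dNum, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIdx);
    cudaFree(dNum);
}
```

With 4 blocks of 256 threads and a limit of 100, the index counter should come back as 1024 (one increment per thread), while the second counter stops at 100. In other words, the index counts attempts, not accepted results.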
Thank you so much for your reply! My intention is to limit the value in the doRsltNum location to no more than *diRsltMaxNum. In my project, I may get a large number of calculation results on the GPU, but I want to deliver no more than 10000 results to the CPU.
    if (atomicAdd(doRsltIdx, 1) < *diRsltMaxNum)
    {
        doRslt[atomicAdd(doRsltNum, 1)] = rslt;
    }
I am new to CUDA. Can you show me the proper way…
Please format your post properly.
A simple solution would be to call atomicAdd unconditionally and only perform the assignment if the value it returns does not exceed the limit.
    int position = atomicAdd(doRsltIdx, 1);
    if (position < *diRsltMaxNum) {
        doRslt[position] = rslt;
    }
This will never write more than *diRsltMaxNum elements. The counter itself keeps growing past the limit, so to find the total number of values actually written you need to clamp it manually at the end:
    resultsize = min(*doRsltIdx, *diRsltMaxNum);
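For completeness, here is one way the whole round trip could look, as a sketch under some assumptions. The names appendBounded and runAppendBounded are hypothetical, each thread's result is just its global thread index as a stand-in for a real calculation, the limit is passed by value, and total and maxNum are assumed to be positive. Error checking is omitted for brevity.

```cuda
#include <algorithm>
#include <vector>
#include <cuda_runtime.h>

// Each thread reserves a slot unconditionally and writes only if the
// reserved slot lies inside the output buffer.
__global__ void appendBounded(int* results, int* counter, int maxNum, int total)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= total) return;

    int rslt = tid;  // stand-in for a real calculation result (assumption)
    int position = atomicAdd(counter, 1);
    if (position < maxNum)
    {
        results[position] = rslt;
    }
}

// Hypothetical host helper: runs `total` threads and returns at most
// `maxNum` results. Assumes total > 0 and maxNum > 0.
std::vector<int> runAppendBounded(int total, int maxNum)
{
    int* dResults = nullptr;
    int* dCounter = nullptr;
    cudaMalloc(&dResults, maxNum * sizeof(int));
    cudaMalloc(&dCounter, sizeof(int));
    cudaMemset(dCounter, 0, sizeof(int));

    int threads = 256;
    int blocks = (total + threads - 1) / threads;
    appendBounded<<<blocks, threads>>>(dResults, dCounter, maxNum, total);

    // The counter holds the number of attempts, so clamp it on the host.
    int attempts = 0;
    cudaMemcpy(&attempts, dCounter, sizeof(int), cudaMemcpyDeviceToHost);
    int resultSize = std::min(attempts, maxNum);

    std::vector<int> host(resultSize);
    cudaMemcpy(host.data(), dResults, resultSize * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dResults);
    cudaFree(dCounter);
    return host;
}
```

Note that the order of the stored results is not deterministic, and which results make it into the buffer when there are more than maxNum of them depends on thread scheduling. If the kernel is launched repeatedly, the counter has to be reset to zero before each launch.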