A question on data size

Hi all.

Can I get some hints on my CUDA program?

My computer shuts down when I run the program below.

int left, right;
int idx = blockIdx.x * blockDim.x + threadIdx.x;

for (left = idx; left < dataCount; left += threadCount) {
    for (right = left + 1; right < dataCount; right++) {
        if (dominantLeft(data, left, right, dataCount, dimension)) {
            flag = DOMINATED;
        }
        else if (dominantRight(data, left, right, dataCount, dimension)) {
            flag = DOMINATED;
        }
        else {
        }
    }
}

The code works fine only when dataCount is less than 102400.

The code seems too simple to cause problems, but I don’t know why it fails.

I am running the program on Windows 7 + VS2010 + CUDA 4.0.

Has anyone had a similar experience?

My guess is that the compiler is applying some optimizations and ends up using too many registers. Try the -maxrregcount flag or __launch_bounds__. What about the launch configuration? Are you using an appropriate number of blocks and threads per block?
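To make those suggestions concrete, here is a rough sketch rather than a fix for your exact program. Since dominantLeft and dominantRight were not posted, the dominates helper below is a hypothetical stand-in, and the kernel name markDominated, the per-point flags array, the block size of 256, and the small data sizes are all assumptions for illustration. It shows a kernel annotated with __launch_bounds__, a launch whose block count is derived from dataCount, and a status check after the launch so that a failure is reported instead of going unnoticed.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 256   // assumed block size, not from the original post
#define DOMINATED 1

// Hypothetical stand-in for dominantLeft/dominantRight (not shown in the post):
// returns true when point a is >= point b in every dimension.
__device__ bool dominates(const float *data, int a, int b, int dimension)
{
    for (int d = 0; d < dimension; ++d)
        if (data[a * dimension + d] < data[b * dimension + d]) return false;
    return true;
}

// __launch_bounds__ promises the compiler that this kernel is never launched
// with more than THREADS_PER_BLOCK threads per block, which lets it limit
// register usage accordingly.
__global__ void __launch_bounds__(THREADS_PER_BLOCK)
markDominated(const float *data, int *flags, int dataCount, int dimension)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    int threadCount = gridDim.x * blockDim.x;   // total threads in the grid

    // Each thread writes only flags[left], so no two threads touch the same entry.
    for (int left = idx; left < dataCount; left += threadCount) {
        for (int right = 0; right < dataCount; ++right) {
            if (right != left && dominates(data, right, left, dimension))
                flags[left] = DOMINATED;
        }
    }
}

// Number of blocks needed to cover n items, rounding up.
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int dataCount = 4096, dimension = 2;   // small illustrative sizes
    size_t dataBytes = (size_t)dataCount * dimension * sizeof(float);

    float *hData = (float *)malloc(dataBytes);
    if (hData == NULL) return 1;
    for (int i = 0; i < dataCount * dimension; ++i)
        hData[i] = (float)(rand() % 1000);

    float *dData = NULL;
    int *dFlags = NULL;
    cudaMalloc(&dData, dataBytes);
    cudaMalloc(&dFlags, dataCount * sizeof(int));
    cudaMemcpy(dData, hData, dataBytes, cudaMemcpyHostToDevice);
    cudaMemset(dFlags, 0, dataCount * sizeof(int));

    int blocks = blocksFor(dataCount, THREADS_PER_BLOCK);
    markDominated<<<blocks, THREADS_PER_BLOCK>>>(dData, dFlags, dataCount, dimension);

    // Check both the launch itself and the kernel's execution.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(dFlags);
    cudaFree(dData);
    free(hData);
    return err == cudaSuccess ? 0 : 1;
}
```

The register cap can also be set for the whole file at compile time, for example nvcc -maxrregcount=32 sketch.cu -o sketch, where 32 and the file name are just illustrative values. If the status line reports an error for large dataCount, that message is usually a better starting point than guessing.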