I’m using a GTX 1060 (compute capability 6.1), Visual Studio 2015, and CUDA 8.0.
I have read all the topics in this section and on other sites, but nothing has helped.
In my project properties, under CUDA C/C++ → Device → Code Generation, I set compute_61,sm_61.
The following partial code doesn’t compile if I uncomment the atomicAdd() instruction:
// PHASE 2: perform iterative kogge_stone_scan of the last element of each subsection of XY, loaded first into AUS
AUS[threadIdx.x] = XY[threadIdx.x * (SECTION_SIZE / blockDim.x) + (SECTION_SIZE / blockDim.x) - 1];
for (unsigned int stride = 1; stride < blockDim.x; stride *= 2) {
    if (threadIdx.x >= stride) {
        //atomicAdd(AUS[threadIdx.x], AUS[threadIdx.x - stride]);
    }
}
__syncthreads();
If I uncomment it, I get the following error: identifier “atomicAdd” not defined
And if I then recompile, I get the following error: no instance of overloaded function “atomicAdd” matches the argument list work-efficient_parallel_scan
In general, when asking for debugging help, it is best to post a minimal but complete example that other people can compile themselves.
Think about it this way: You are able to study the entire code, compile it, instrument it, etc, yet you have not been able to determine what is wrong. Logic says that it is unlikely that other people who have access to only a minimal snippet of that code and can’t compile it (i.e. have significantly less information than you) can tell you what is going on.
Note that this will NOT get rid of the red underline (“identifier “atomicAdd” not defined”) under atomicAdd; that is an IntelliSense issue. You should still be able to compile a properly crafted atomicAdd statement, even though there is a red underline.
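For reference, here is a minimal but complete sketch of what a properly crafted atomicAdd statement can look like. It is not the kernel from the question: the kernel name bin_sums, the host helper run_bin_sums, and the choice of float for AUS are all assumptions made for illustration, since the thread does not show how AUS is declared. The point it illustrates is that the first argument of atomicAdd is an address (for example &AUS[i]), not a value; passing a value is one common way to get the "no instance of overloaded function" error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread adds its index into one of two
// shared-memory bins (even or odd thread index).
__global__ void bin_sums(float *out)
{
    __shared__ float AUS[2];          // assumed element type: float
    if (threadIdx.x < 2) AUS[threadIdx.x] = 0.0f;
    __syncthreads();

    // First argument is a POINTER to the destination, hence the '&'.
    atomicAdd(&AUS[threadIdx.x & 1], (float)threadIdx.x);
    __syncthreads();

    if (threadIdx.x < 2) out[threadIdx.x] = AUS[threadIdx.x];
}

// Hypothetical host helper: launches one block and copies both bins back.
// Returns true on success.
bool run_bin_sums(int threads, float *even, float *odd)
{
    float *d_out = nullptr;
    float h_out[2] = {-1.0f, -1.0f};
    if (cudaMalloc(&d_out, 2 * sizeof(float)) != cudaSuccess) return false;

    bin_sums<<<1, threads>>>(d_out);
    // cudaMemcpy waits for the kernel and reports any launch/run error.
    cudaError_t err = cudaMemcpy(h_out, d_out, 2 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    *even = h_out[0];
    *odd  = h_out[1];
    return true;
}

int main()
{
    float even = 0.0f, odd = 0.0f;
    if (!run_bin_sums(32, &even, &odd)) {
        printf("CUDA error\n");
        return 1;
    }
    printf("even = %.0f, odd = %.0f\n", even, odd);
    return 0;
}
```

With 32 threads, the even bin should hold 0 + 2 + ... + 30 = 240 and the odd bin 1 + 3 + ... + 31 = 256. Note that atomicAdd on float needs compute capability 2.0 or later and on double needs 6.0 or later, so either should be available with compute_61,sm_61.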
Sorry njuffa, you’re right, but I thought the problem was with the configuration of the Visual Studio project, so I didn’t paste the entire kernel code.