Atomic operation: getting atomicAdd support


I am using CUDA 1.0 and I want to perform an atomic operation on a location in global memory. My graphics card is an 8800 GTS. I changed the custom build step to:

$(CUDA_BIN_PATH)\nvcc.exe -arch sm_11 -ccbin "$(VCInstallDir)bin" -c -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/O2,/Zi,/MT -I"$(CUDA_INC_PATH)" -I./ -I../../common/inc -o $(ConfigurationName)\template.obj

But the program produces different output from what I expect.

__global__ void
testKernel( int* g_odata)
{
    // Block index
    int bx = blockIdx.x;
    int by = blockIdx.y;

    // Thread index
    int tx = threadIdx.x;
    int ty = threadIdx.y;
    int nBlocksize = 16;
    // First index handled by this thread: 4096 elements per block, 256 per thread
    int nStart = bx * ceil((float)65536/nBlocksize)  + tx * ceil((float)((65536/nBlocksize)/nBlocksize));
    for( int i = nStart; i <= nStart+ceil((float)((65536/nBlocksize)/nBlocksize)); i=i+1 )
        g_odata[0] = 1.0f;
}


void
runTest( int argc, char** argv)
{
    int* pCpuOutData = (int*)malloc( 256 * 256 * sizeof(int));
    int* pOutData;
    CUDA_SAFE_CALL( cudaMalloc( (void**) &pOutData, 256 * 256 * sizeof(int)));
    dim3 grid(16,1);
    dim3 thread(16,1);

    // Launch the kernel on 16 blocks of 16 threads
    testKernel<<< grid, thread >>>( pOutData );

    CUDA_SAFE_CALL( cudaMemcpy( pCpuOutData, pOutData, 256 * 256 * sizeof(int),
                    cudaMemcpyDeviceToHost) );

    CUDA_SAFE_CALL( cudaFree(pOutData));
    free( pCpuOutData );
}


The output is a junk value such as 11731320…

Please help me.
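
For readers following the index arithmetic in the kernel above, here is a minimal host-side sketch of the same calculation in plain C++ (no CUDA required). The names `Range`, `threadRange` and `iterations` are hypothetical. The element count of 65536 and the block size of 16 are taken from the post. It only illustrates which indices each thread's loop visits and is not a diagnosis of the junk output.

```cpp
#include <cassert>
#include <cmath>

// Inclusive index bounds, mirroring the kernel's `i <= ...` loop condition.
struct Range { int first; int last; };

// Hypothetical helper: reproduces the kernel's nStart and loop bound
// for block index bx and thread index tx.
Range threadRange(int bx, int tx, int blockSize = 16, int n = 65536) {
    assert(blockSize > 0);
    int perBlock  = (int)std::ceil((float)n / blockSize);                  // 4096
    int perThread = (int)std::ceil((float)((n / blockSize) / blockSize));  // 256
    int start = bx * perBlock + tx * perThread;
    return { start, start + perThread };
}

// Number of loop iterations a thread executes over its inclusive range.
int iterations(const Range& r) { return r.last - r.first + 1; }
```

With the inclusive bound, `threadRange(0, 0)` spans indices 0 through 256, so each thread runs 257 iterations and its last index coincides with the next thread's first. If the intent was for every thread to cover a disjoint slice of 256 elements, a `<` comparison in the loop would do that.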

Alas, the 8800 GTS doesn't support atomic operations. You need a card with compute capability 1.1 for that, such as the 8600 (G86) or the 8800 GT (G92).
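
The rule in this reply can be written as a small host-side check. This is only a sketch: `supportsGlobalAtomics` is a hypothetical helper, not part of the CUDA API. It assumes the caller already knows the device's compute capability as a major/minor pair.

```cpp
#include <cassert>

// Hypothetical helper (not a CUDA API function): returns true when a device
// with compute capability major.minor is at or above 1.1, the level the reply
// names as the minimum for atomic operations on global memory.
bool supportsGlobalAtomics(int major, int minor) {
    assert(major >= 1 && minor >= 0);  // compute capabilities start at 1.0
    return major > 1 || (major == 1 && minor >= 1);
}
```

In a .cu file, the two arguments would typically come from the `major` and `minor` fields of the `cudaDeviceProp` structure filled in by `cudaGetDeviceProperties`. A program could use the result to fall back to a non-atomic code path on 1.0 hardware.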

Is there a table that lists each card's compute capability, such as 1.1 or 1.0? If so, where can I find it?

The CUDA FAQ includes a list of supported GPUs and their compute version:

We will be updating this for CUDA 1.1 shortly.