Volatile keyword?

OK, so I have this kernel function:

[codebox]__kernel void CardanoTriple( __global volatile ulong * count )
{
	uint a = get_global_id(0) * 3 + 2;
	uint b = get_global_id(1);
	if ( a + b <= 1000 ) {
		ulong top = a + 1;
		top *= top * ( 8 * a - 1 );
		ulong down = b * b * 27;
		if ( top % down == 0 ) {
			if ( (top / down) + a + b <= 1000 ) {
				(*count)++;
			}
		}
	}
	return;
}[/codebox]

I run it with a global NDRange of 333 x 1000.

It calculates Cardano triples. Every time I run the program I get a different result, usually somewhere between 100 and 130. Now, I don’t know if I am misunderstanding the definition of volatile, or it’s a bug, or what. Oh, I’m also using the C++ wrapper, not that it should matter, but who knows. I should try with a C version just to be sure. Now that I think about it, I don’t think ++ and volatile work together, because ++ has to make two references: one to get and one to set … hmm. Maybe…
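The "one to get and one to set" suspicion can be made concrete with a small host-side C sketch. It does not run anything in parallel; it just replays by hand one possible ordering of two work-items doing `(*count)++`. The names `work_item`, `load_step`, `store_step`, `interleaved_increments` and `serialized_increments` are made up for illustration and are not part of OpenCL.

```c
#include <stdint.h>

/* One work-item's private state while it executes (*count)++ . */
typedef struct {
    uint64_t loaded; /* private copy of the counter read from memory */
} work_item;

/* First half of ++ : read the shared counter into a private copy. */
static void load_step(work_item *w, const volatile uint64_t *count) {
    w->loaded = *count;
}

/* Second half of ++ : add one to the private copy and write it back. */
static void store_step(const work_item *w, volatile uint64_t *count) {
    *count = w->loaded + 1;
}

/* Two increments where both loads happen before either store.
 * volatile forces every access to go to memory, but it does not
 * glue the load and the store together, so one update is lost. */
uint64_t interleaved_increments(uint64_t start) {
    volatile uint64_t count = start;
    work_item w0, w1;
    load_step(&w0, &count);
    load_step(&w1, &count);
    store_step(&w0, &count);
    store_step(&w1, &count);
    return count; /* start + 1, not start + 2 */
}

/* The same two increments, each finishing before the next one starts. */
uint64_t serialized_increments(uint64_t start) {
    volatile uint64_t count = start;
    work_item w0, w1;
    load_step(&w0, &count);
    store_step(&w0, &count);
    load_step(&w1, &count);
    store_step(&w1, &count);
    return count; /* start + 2 */
}
```

On a GPU the ordering differs from run to run, which would explain a total that comes out too low and changes every time.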

OK, after realizing that ++ is not an atomic instruction … any way to get what I want? (Found the atomic extensions >.> … silly me.)

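[codebox]// atom_inc on a __global pointer comes from the *global* int32 atomics extension
#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable

__kernel void CardanoTriple( __global uint * count )
{
	uint a = get_global_id(0) * 3 + 2;	// a = 2, 5, 8, ...
	uint b = get_global_id(1);
	// b == 0 is skipped: it would make 'down' zero and divide by zero below
	if ( b > 0 && a + b <= 1000 ) {
		ulong top = a + 1;
		top *= top * ( 8 * a - 1 );	// (a + 1)^2 * (8a - 1)
		ulong down = b * b * 27;
		if ( top % down == 0 ) {
			if ( (top / down) + a + b <= 1000 ) {
				atom_inc( count );	// atomic read-modify-write
			}
		}
	}
	return;
}[/codebox]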
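To check whatever count the kernel reports, the same test can be run serially on the host. This is only a sketch: `count_cardano_triples` is a hypothetical helper name, it assumes the kernel's 333 x 1000 range and its (a + 1)^2 * (8a - 1) = 27 * b^2 * c test, and it skips b = 0 because that would divide by zero.

```c
#include <stdint.h>

/* Hypothetical host-side reference: visits the same (a, b) pairs as the
 * kernel one at a time, so there is no race on the counter.
 * Meant for small limits such as 1000; 'top' would overflow 64 bits
 * for very large a. */
uint32_t count_cardano_triples(uint32_t limit) {
    uint32_t count = 0;
    for (uint32_t a = 2; a <= limit; a += 3) {      /* a = 3 * i + 2 */
        uint64_t top = (uint64_t)(a + 1) * (a + 1) * (8ull * a - 1);
        for (uint32_t b = 1; a + b <= limit; ++b) { /* b = 0 is skipped */
            uint64_t down = (uint64_t)b * b * 27;
            if (down > top) {
                break; /* top is positive, so a larger 'down' cannot divide it */
            }
            if (top % down == 0 && top / down + a + b <= limit) {
                ++count; /* c = top / down completes a triple */
            }
        }
    }
    return count;
}
```

Calling `count_cardano_triples(1000)` covers the same range as the 333 x 1000 launch, so the two totals should agree once the increment is atomic.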

OK, I ditched the 64-bit counter because I don’t know if a 64-bit version of the atomics even exists. BUT now I get

ptxas ptx input, line 150; error : Instruction ‘atom’ requires .target sm_11 or higher

Does this mean my drivers are out of date or that my graphics card is too old? … I’ll be very displeased if it’s the second.

http://www.nvidia.com/content/cudazone/dow…ammingGuide.pdf

Your video card has to have Compute Capability at least 1.1 to be able to process “atom_inc” instructions.

oh poo >.> Just for the record is there an easy way to tell the features of a video card for things like Compute Capability version? (not going to matter now but you know…everyone gets a new computer eventually)

THANKS btw

When you run the deviceQuery SDK sample, it prints the CUDA capability major/minor revision numbers. The same values can be obtained programmatically through the cudaGetDeviceProperties() function: the “major” and “minor” fields of the cudaDeviceProp structure are set accordingly. Finally, the CUDA Programming Guide lists (in Appendix A) the compute capability of a number of devices (although it would indeed be good if NVIDIA displayed these values prominently, along with the number of processors and, say, the supported OpenGL/DirectX versions, in the corresponding device specifications on their web pages).
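From the OpenCL side, a related check is to look for the atomics extension in the device's extension string, which clGetDeviceInfo returns for CL_DEVICE_EXTENSIONS as a space-separated list of names. The sketch below covers only the string check, so it compiles without the OpenCL headers; `has_extension` is a hypothetical helper name, and fetching the string is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if 'name' appears as a whole
 * space-separated token in 'ext_list'. A bare strstr() is not enough,
 * because one extension name can be a prefix of another. */
bool has_extension(const char *ext_list, const char *name) {
    size_t len = strlen(name);
    if (len == 0) {
        return false;
    }
    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) {
            return true;
        }
        p += 1; /* keep searching after this partial match */
    }
    return false;
}
```

With the string from the device in hand, `has_extension(extensions, "cl_khr_global_int32_base_atomics")` says whether `atom_inc` on a `__global` pointer is expected to build for that device.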

At least there is some information about it.