Volatile - when to use? (Regarding registers)

phw_89 · April 13, 2011, 2:51pm

I’m aware of the ‘volatile trick’ to reduce register usage in certain situations.

However, I’m slightly confused as to when using the volatile keyword will increase performance, and when using it will decrease performance. (Assume we have 100% occupancy already).

My understanding is the volatile keyword forces the variable to be stored in a register and then every time the variable is used in the code it is fetched from the register. If a variable isn’t declared volatile then it may or may not be inlined or stored in a register. Is this correct?

Basically, I’ve seen people saying that using volatile on every variable will result in a performance hit. How do I know which variables will benefit from being volatile?

Is it the ones that are used the most? If so, how many times does the variable have to be used to gain any benefit from being volatile?

Or will variables that are computed from a global read (i.e. float tmp = A[i]; where A is a global array) always benefit from being volatile?

Lots of questions, I know. There just doesn’t seem to be much concrete info on this…

wlangdon · April 13, 2011, 3:25pm

Hmm, that’s not how I read it. I think volatile tells the compiler not to optimise
variables (especially shared memory) by placing them in registers, because another thread
may update the variable. (The update would be ignored if the register were used instead.)
I ended up using volatile on all shared memory because I could never be sure that replacing
it with a register was safe (after all, I was using shared memory in order to communicate between
threads). In my view, any small performance gain from risking the compiler doing the wrong thing is not worth the debugging effort. Use volatile on all pointers to shared memory.
Bill
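
A minimal sketch of the situation described above, not taken from the thread: one warp publishes a value through shared memory while another warp polls a flag. The names handoffKernel, runHandoff, payload and ready are hypothetical, and the sketch assumes a launch of exactly one block of 64 threads. Without the volatile pointers the compiler would be allowed to load the flag once into a register and spin on that stale copy.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: warp 0 publishes a value, warp 1 polls a shared flag.
// Assumes a launch of exactly one block of 64 threads (two warps).
__global__ void handoffKernel(int *out)
{
    __shared__ int payload;
    __shared__ int ready;

    // Volatile views: every access through these is a real shared-memory access.
    volatile int *vPayload = &payload;
    volatile int *vReady   = &ready;

    if (threadIdx.x == 0) *vReady = 0;
    __syncthreads();                   // flag is initialised before anyone polls

    if (threadIdx.x == 0) {            // producer, in warp 0
        *vPayload = 42;
        __threadfence_block();         // order the payload before the flag
        *vReady = 1;
    } else if (threadIdx.x == 32) {    // consumer, in warp 1
        while (*vReady == 0) { }       // re-reads shared memory on every pass
        __threadfence_block();
        *out = *vPayload;
    }
}

// Host helper (hypothetical): runs the kernel and returns what the consumer saw.
int runHandoff()
{
    int *dOut = nullptr;
    int hOut = -1;
    cudaMalloc(&dOut, sizeof(int));
    cudaMemcpy(dOut, &hOut, sizeof(int), cudaMemcpyHostToDevice);
    handoffKernel<<<1, 64>>>(dOut);
    cudaMemcpy(&hOut, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return hOut;
}
```

The producer and consumer sit in different warps on purpose: spinning on a flag that a thread of the same warp is supposed to set can hang on older architectures.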

sidxavier · March 15, 2012, 5:59pm

Suppose I am implementing a global worklist: I add to the worklist at the list tail and read (delete) from it at the list head.

None of my threads re-read any particular location of the list; however, I do re-read the tail and head values repeatedly. Should I define these global variables (tail and head) as volatile?

In that case, on the CPU side, how do I use cudaMemcpy or atomics on them? By type casting?

Sid.

silbmarks · March 17, 2012, 8:46pm

volatile should be used when the data can be changed outside the current thread without memory fences (for writes) or synchronization (for reads and writes). Otherwise the compiler is free to optimize the reads/writes to the variable by caching the data in a local register. In particular, if you access your shared-memory data only after __syncthreads, there is no need to use volatile.
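
To illustrate the __syncthreads case, here is a small sketch (my own, with hypothetical names blockSumKernel, runBlockSum and kThreads) of a block-level sum where every shared-memory read is separated from the matching write by a barrier. Under that assumption the shared array is declared without volatile.

```cuda
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // hypothetical block size, must be a power of two

// Sums kThreads integers within one block. The shared array is NOT volatile:
// each read of another thread's element happens only after a __syncthreads().
__global__ void blockSumKernel(const int *in, int *out)
{
    __shared__ int partial[kThreads];
    partial[threadIdx.x] = in[threadIdx.x];
    __syncthreads();

    for (int stride = kThreads / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();   // barrier sits outside the branch, all threads reach it
    }
    if (threadIdx.x == 0) *out = partial[0];
}

// Host helper (hypothetical): sums the values 1..kThreads on the device.
int runBlockSum()
{
    int hIn[kThreads];
    for (int i = 0; i < kThreads; ++i) hIn[i] = i + 1;

    int *dIn = nullptr, *dOut = nullptr;
    int hOut = 0;
    cudaMalloc(&dIn, sizeof(hIn));
    cudaMalloc(&dOut, sizeof(int));
    cudaMemcpy(dIn, hIn, sizeof(hIn), cudaMemcpyHostToDevice);
    blockSumKernel<<<1, kThreads>>>(dIn, dOut);
    cudaMemcpy(&hOut, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return hOut;   // 1 + 2 + ... + 256 = 32896
}
```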

In your example, if you synchronize the accesses to the list via some global atomic variable, and then change the tail/head from different threads without __syncthreads, you MUST declare head/tail as volatile, since otherwise the updates will not be properly read by other threads.
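
A rough sketch of one way this could look, under assumptions that are mine and not from the thread: the names gItems, gTail, worklistKernel and runWorklist are hypothetical, the kernel is launched with exactly two single-thread blocks, and both blocks are assumed to be resident on the GPU at the same time (otherwise the polling loop could wait forever). Instead of declaring the tail itself volatile, the sketch keeps it a plain int, so atomicAdd and cudaMemcpyToSymbol need no casts, and polls it through a volatile pointer.

```cuda
#include <cuda_runtime.h>

constexpr int kItems = 8;        // hypothetical worklist size

__device__ int gItems[kItems];   // worklist storage
__device__ int gTail;            // number of items published so far

// Assumes a launch of exactly two blocks of one thread each, both resident.
__global__ void worklistKernel(int *sumOut)
{
    if (blockIdx.x == 0) {                    // producer block
        for (int i = 0; i < kItems; ++i) {
            gItems[i] = i + 1;
            __threadfence();                  // item is visible before the tail moves
            atomicAdd(&gTail, 1);             // plain int*, so no cast is needed
        }
    } else {                                  // consumer block
        volatile int *vTail  = &gTail;        // volatile view used only for polling
        volatile int *vItems = gItems;
        while (*vTail < kItems) { }           // fresh load on every pass
        __threadfence();
        int sum = 0;
        for (int i = 0; i < kItems; ++i) sum += vItems[i];
        *sumOut = sum;
    }
}

// Host helper (hypothetical): resets the tail, runs the kernel, returns the sum.
int runWorklist()
{
    int zero = 0;
    int hSum = -1;
    int *dSum = nullptr;
    cudaMemcpyToSymbol(gTail, &zero, sizeof(int));   // host-side reset of the tail
    cudaMalloc(&dSum, sizeof(int));
    worklistKernel<<<2, 1>>>(dSum);
    cudaMemcpy(&hSum, dSum, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dSum);
    return hSum;   // 1 + 2 + ... + 8 = 36
}
```

If the tail were left as an ordinary read in the consumer loop, the compiler could hoist the load out of the loop, which is exactly the stale-register problem described above.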

If your worklist is on the GPU and you modify the items from the CPU, then it’s a whole different story: volatile is insufficient, and you have to use PTX assembly to read through and bypass the cache.
