AtomicExch Double values

Is this the correct way to use atomicExch on double values?

atomicExch( (unsigned long long int*)&device_largestElements[blockIdx.x], __double_as_longlong(shared_buff[threadIdx.x]) );

device_largestElements → array allocated on global memory
shared_buff → array allocated on shared memory
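
For reference, the cast pattern above can be wrapped in a small helper. This is only a sketch: atomicExchDouble, exchKernel and exchangeOnDevice are hypothetical names (not part of CUDA), error checking is omitted, and it relies on atomicExch having an unsigned long long int overload and on double being 64 bits wide.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA built-in): atomically stores `val` at
// `addr` and returns the double that was stored there before.
__device__ double atomicExchDouble(double* addr, double val)
{
    // Reinterpret the 64-bit double as a 64-bit integer bit pattern,
    // exchange it, then reinterpret the old bit pattern back to a double.
    unsigned long long int old = atomicExch(
        reinterpret_cast<unsigned long long int*>(addr),
        static_cast<unsigned long long int>(__double_as_longlong(val)));
    return __longlong_as_double(static_cast<long long int>(old));
}

// Single-thread kernel used only to exercise the helper.
__global__ void exchKernel(double* slot, double val, double* previous)
{
    *previous = atomicExchDouble(slot, val);
}

// Host wrapper (hypothetical): puts `initial` in a device slot, exchanges it
// with `val`, and returns the slot's final content. The value that was
// replaced is written to *previousOut.
double exchangeOnDevice(double initial, double val, double* previousOut)
{
    double* d_buf = nullptr;  // d_buf[0] = slot, d_buf[1] = previous value
    cudaMalloc(&d_buf, 2 * sizeof(double));
    cudaMemcpy(d_buf, &initial, sizeof(double), cudaMemcpyHostToDevice);

    exchKernel<<<1, 1>>>(d_buf, val, d_buf + 1);

    double h_buf[2];
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_buf, d_buf, 2 * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_buf);

    *previousOut = h_buf[1];
    return h_buf[0];
}
```

The exchange only swaps bit patterns, so it does not compare values; it cannot be used on its own to keep a running maximum.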

Seems ok.
May I ask what you are trying to accomplish?

I am trying to implement the PageRank algorithm in CUDA. The function that uses this call is a module that picks the largest elements from an array.

Wouldn’t a reduction algorithm be a better solution for that (unless, of course, you use those atomics just for bare control)?
Atomic operations serialise your code and may significantly reduce performance, and you most likely won’t get much performance improvement from running it on multiple processors.

I am using a reduction algorithm, but its last step writes the result to global memory, and I am using atomic operations there just for correctness' sake.
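
A minimal sketch of that last step, under assumptions not stated in the thread: each block reduces its slice to one maximum in shared memory, and only thread 0 writes it to that block's own slot in global memory. In that case no two threads ever write the same address, so a plain store should be enough and the atomic is not needed for correctness. The names blockMax, largestElement and BLOCK_SIZE are hypothetical, the block size is assumed to be a power of two, the input is assumed non-empty, and error checking is omitted.

```cuda
#include <algorithm>
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256  // assumed power of two

// Each block writes the maximum of its slice to blockMaxOut[blockIdx.x].
__global__ void blockMax(const double* in, double* blockMaxOut, int n)
{
    __shared__ double buf[BLOCK_SIZE];

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Out-of-range threads contribute the lowest finite double.
    buf[threadIdx.x] = (i < n) ? in[i] : -DBL_MAX;
    __syncthreads();

    // Tree reduction in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            buf[threadIdx.x] = fmax(buf[threadIdx.x], buf[threadIdx.x + stride]);
        __syncthreads();
    }

    // One writer per block, one slot per block: a plain store, no atomic.
    if (threadIdx.x == 0)
        blockMaxOut[blockIdx.x] = buf[0];
}

// Host wrapper (hypothetical): returns the largest element of a non-empty vector.
double largestElement(const std::vector<double>& h_in)
{
    int n = static_cast<int>(h_in.size());
    int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;

    double* d_in = nullptr;
    double* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(double));
    cudaMalloc(&d_out, blocks * sizeof(double));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    blockMax<<<blocks, BLOCK_SIZE>>>(d_in, d_out, n);

    // Copy the per-block maxima back and finish the reduction on the host.
    std::vector<double> h_out(blocks);
    cudaMemcpy(h_out.data(), d_out, blocks * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    return *std::max_element(h_out.begin(), h_out.end());
}
```

If several threads or blocks could write the same location, an exchange would still not pick the largest value; a compare-and-swap loop or a second reduction pass over the per-block results would be the usual alternatives.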