operations on 64-bit unsigned long long int arrays

I’m new to CUDA, and I need to use it to sort an array of 64-bit unsigned long long int values.
I found that cudpp only supports 32-bit types. I also found the sentence "64-bit words are only
supported for global memory" in the programming guide.
Does this mean that some operations on 64-bit words in shared or register memory are not supported
inside a kernel function? And could you please give me some advice on sorting 64-bit words in CUDA?
Thanks very much!