I’m having trouble compiling atomicCAS with unsigned short int arguments.
The online documentation (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#atomiccas) clearly shows that a 16-bit overload should exist (which surprised me, because as far as I can tell it would be the only atomic function that works with 16-bit types), but it doesn’t compile. The error is: no instance of overloaded function “atomicCAS” matches the argument list. The 32-bit and 64-bit versions compile without problems.
I also noticed that the 16-bit variant is a fairly recent addition (it doesn’t appear in the CUDA 8 manual), so I’m wondering whether this is a mistake in the documentation or whether it is only available on later compute capabilities. However, there is nothing about it in the compatibility table.
I’m compiling with CUDA 10.1 and VS2015, targeting a P5000 (compute capability 6.1).
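For reference, here is a stripped-down sketch of the kind of code involved. It is not my actual project code: the kernel names `cas32` and `cas16` and the `TRY_16BIT` macro are placeholders I made up for this post, and it leaves out the dynamic parallelism part. The 32-bit kernel builds as-is; defining `TRY_16BIT` adds the 16-bit call that produces the error on my setup.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// 32-bit variant: compiles without problems.
__global__ void cas32(unsigned int *p)
{
    // Swap in 1 if the current value is 0.
    atomicCAS(p, 0u, 1u);
}

#ifdef TRY_16BIT
// 16-bit variant (placeholder kernel): with this enabled I get
// "no instance of overloaded function "atomicCAS" matches the argument list".
__global__ void cas16(unsigned short int *p)
{
    unsigned short int expected = 0;
    unsigned short int desired  = 1;
    atomicCAS(p, expected, desired);
}
#endif

int main()
{
    unsigned int *d = nullptr;
    unsigned int h = 0;

    cudaMalloc(&d, sizeof(unsigned int));
    cudaMemset(d, 0, sizeof(unsigned int));

    // Launch only the 32-bit kernel; the 16-bit one never gets past compilation for me.
    cas32<<<1, 32>>>(d);
    cudaMemcpy(&h, d, sizeof(h), cudaMemcpyDeviceToHost);

    printf("value after 32-bit CAS: %u\n", h);
    cudaFree(d);
    return 0;
}
```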
Any help is appreciated :)
PS: The method is also not listed in device_atomic_functions.hpp.
I should also mention that the kernel is launched using dynamic parallelism.