Sorry to state the obvious: don’t use dynamic initialization for `__device__`, `__constant__`, and `__shared__` variables. Just pre-compute the desired constants and put them into your header file. I would suggest pre-computing the constants to higher than the target precision, so that each value is rounded only once, when the compiler converts the literal.
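A minimal sketch of what that could look like, assuming `float` is the target precision. The names `kTwoPi`, `kInvSqrt2`, `scale`, and `run_demo` are made up for illustration and are not from any real header:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical pre-computed constants, as they might sit in a header file.
// The literals carry more digits than float can hold, so the only rounding
// is the one the compiler performs when it converts the literal.
__constant__ float kTwoPi    = 6.283185307179586476925f;  // 2 * pi
__constant__ float kInvSqrt2 = 0.707106781186547524401f;  // 1 / sqrt(2)

// Dynamic initialization, by contrast, is expected to be rejected by nvcc:
// __constant__ float kBad = sqrtf(2.0f);

// Each thread writes kTwoPi * kInvSqrt2 * i into out[i].
__global__ void scale(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = kTwoPi * kInvSqrt2 * static_cast<float>(i);
    }
}

// Runs the kernel for n elements and copies the result into host.
// Returns 0 (cudaSuccess) on success, otherwise the CUDA error code.
int run_demo(float* host, int n) {
    float* dev = nullptr;
    cudaError_t err = cudaMalloc(&dev, n * sizeof(float));
    if (err != cudaSuccess) return static_cast<int>(err);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(dev, n);

    err = cudaGetLastError();
    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return static_cast<int>(err);
}

int main() {
    float result[4];
    int status = run_demo(result, 4);
    if (status != 0) {
        std::printf("CUDA error code %d\n", status);
        return 1;
    }
    for (int i = 0; i < 4; ++i) {
        std::printf("out[%d] = %f\n", i, result[i]);
    }
    return 0;
}
```

Note that `__shared__` variables can’t take an initializer at all, so for those you would assign the pre-computed value inside the kernel instead.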
CUDA is not C++; it’s a subset of C++. A pretty substantial subset, but a subset nonetheless.