[SOLVED] Cost of intrinsics for type conversion

I have some type conversions to do in a kernel function, and one of them is inside a loop.
For this I use the type-casting intrinsics (https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__CAST.html#group__CUDA__MATH__INTRINSIC__CAST), but I haven't found any information on their estimated cost. On a CPU, for example, a conversion from float to int is expensive.
Does anyone have a hint on that, or should I just assume that it is no different here?

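Check out the bottom of the arithmetic instruction throughput table in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions. The last rows list the throughput of type conversions for each compute capability.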
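
For reference, here is a minimal sketch of a conversion intrinsic sitting inside a loop, which is the situation described in the question. The kernel name, launch configuration, and test data are made up for illustration; only `__float2int_rn` comes from the intrinsics page linked in the question. A kernel this simple is most likely limited by memory bandwidth, so it shows usage and checks correctness rather than isolating the cost of the conversion itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread walks the array in a grid-stride loop
// and converts float to int with round-to-nearest-even.
__global__ void convert_kernel(const float* in, int* out, int n)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        out[i] = __float2int_rn(in[i]);
    }
}

int main()
{
    const int n = 1 << 20;
    float* in = nullptr;
    int* out = nullptr;

    // Managed memory keeps the sketch short; explicit copies work as well.
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess ||
        cudaMallocManaged(&out, n * sizeof(int)) != cudaSuccess) {
        std::printf("allocation failed\n");
        return 1;
    }

    // i + 0.25f is exactly representable for i < 2^20 and rounds back to i.
    for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i) + 0.25f;

    convert_kernel<<<256, 256>>>(in, out, n);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        std::printf("kernel failed\n");
        return 1;
    }

    // Verify every converted value on the host.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] != i) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);

    cudaFree(in);
    cudaFree(out);
    return mismatches == 0 ? 0 : 1;
}
```

To judge whether the conversion matters in a real kernel, compare its row in the table with the rows for the other instructions in the loop body, then confirm with a profiler on the actual code.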

Thank you!