I recall that if one writes `abs(x)`, the CUDA compiler will insert the proper absolute value function for the data type of `x`; e.g. for `int x` the result will be `iabs(x)`. Is the same true for `sqrt(x)` and `log(x)` if I am supplying `float` (`float32_t`) or `double` (`float64_t`) arguments, or am I required to write `sqrtf(x)` and `logf(x)` in order to get the `float32_t` equivalents of these functions? I was working through some old code and noticed that it uses `sqrt` and `log` even though the arguments and the variables that will hold the results are all `float32_t`.

CUDA is derived from C++, so `sqrt()` is overloaded. Depending on whether a `float` or a `double` is passed, this results in either a single-precision square root or a double-precision square root.
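A minimal sketch of that overload behaviour, written as plain host-side C++ so it can be checked with any C++11 compiler. It assumes the same overload rules carry over to device code under nvcc, which is worth confirming against your toolkit's documentation. The helper names `root_single`, `root_double` and `check_roots` are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// The result type follows the argument type: float in, float out.
static_assert(std::is_same<decltype(std::sqrt(2.0f)), float>::value,
              "float argument selects the single-precision overload");
static_assert(std::is_same<decltype(std::log(2.0f)), float>::value,
              "float argument selects the single-precision overload");

// double in, double out.
static_assert(std::is_same<decltype(std::sqrt(2.0)), double>::value,
              "double argument selects the double-precision overload");

// Hypothetical helpers: the same call spelling, two different precisions.
inline float  root_single(float x)  { return std::sqrt(x); }
inline double root_double(double x) { return std::sqrt(x); }

// Small runtime sanity check; perfect squares have exact roots.
inline void check_roots() {
    assert(root_single(16.0f) == 4.0f);
    assert(root_double(16.0) == 4.0);
}
```

If the `static_assert` lines compile, a `sqrt` or `log` call on `float` data is not silently running in double precision.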

If you pass an `int` to `sqrt()` you *may* get a compilation error that no matching version can be found among available overloads. I am saying *may* because I forget whether (and if so, when) C++ added `int` overloads for standard math functions. If `int` arguments are supported (e.g. `sqrt(5)`), you would want to double-check whether `sqrt(i)` is defined to be equivalent to `sqrt((double)i)` or `sqrt((float)i)` if you need to keep code “`float` clean”.
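One way to do that double-check is to ask the compiler directly. The sketch below is host-side C++ only. In standard C++11 and later, `<cmath>` accepts integer arguments and treats them as `double`; whether a particular CUDA toolkit does the same in device code is an assumption to verify. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// With a C++11 (or later) <cmath>, an int argument is handled as double.
static_assert(std::is_same<decltype(std::sqrt(5)), double>::value,
              "sqrt(int) yields double, i.e. it behaves like sqrt((double)i)");

// Hypothetical helper: default behaviour, computed in double precision.
inline double root_of_int_default(int i) { return std::sqrt(i); }

// Hypothetical helper: an explicit cast keeps the computation in float.
inline float root_of_int_float(int i) {
    return std::sqrt(static_cast<float>(i));
}

// Small runtime sanity check; perfect squares have exact roots.
inline void check_int_roots() {
    assert(root_of_int_default(9) == 3.0);
    assert(root_of_int_float(9) == 3.0f);
}
```

So for “`float` clean” code, an integer argument needs an explicit cast to `float` (or an `f`-suffixed function) to avoid the double-precision path.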

C++ adopted the C99 standard math library wholesale with C++11. If you check the ISO C++ standard document you won’t find any details about standard math library functions in it; it defers to the C standard for those. So `sqrtf()` still exists, and you can pass `int` arguments to it without issues: `sqrtf(i)` is equivalent to `sqrtf((float)i)`.
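As a quick illustration of the `sqrtf()` point, here is a host-side sketch using the C header `<math.h>`. The helper names `root_f` and `root_f_explicit` are made up for this example, and the behaviour of device code is assumed to match.

```cpp
#include <cassert>
#include <math.h>
#include <type_traits>

// sqrtf takes a float and returns a float; an int argument is implicitly
// converted to float at the call site.
static_assert(std::is_same<decltype(sqrtf(5)), float>::value,
              "sqrtf always returns float, whatever the argument type");

// Hypothetical helpers: implicit and explicit conversion give the same call.
inline float root_f(int i)          { return sqrtf(i); }
inline float root_f_explicit(int i) { return sqrtf((float)i); }

// Small runtime sanity check.
inline void check_sqrtf() {
    assert(root_f(25) == 5.0f);
    assert(root_f(7) == root_f_explicit(7));
}
```

Because `sqrtf` has a single signature, there is no overload resolution to reason about, which is part of the appeal of the suffixed names.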

Using the overloaded plain-name math functions is probably generally preferred in CUDA code, just as it is in general C++ code. Personally, I still prefer to explicitly use the `f`-suffixed versions of standard math library functions when I write “`float` clean” code.

Yes, I also prefer to use the explicit `f` when writing all-float code. It may be numbers, not functions, but all you need is one `1.234` without specifying it as a float to have your whole right-hand side go `double`! Thanks for the clarification. I can now rest easy that the old code is indeed a lot slower than the new one, and not because it’s accidentally using a lot of `float64_t` intrinsics.
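The literal pitfall can be made visible the same way. This is a host-side sketch with hypothetical helper names; the usual arithmetic conversions it relies on are part of the language, so device code should follow the same rule.

```cpp
#include <cassert>
#include <type_traits>

// An unsuffixed literal is a double, so the whole product becomes double.
static_assert(std::is_same<decltype(1.0f * 1.234), double>::value,
              "float * double literal is evaluated in double");

// With the f suffix the product stays in single precision.
static_assert(std::is_same<decltype(1.0f * 1.234f), float>::value,
              "float * float literal is evaluated in float");

// Hypothetical helper: multiplies in double, then narrows back to float.
inline float scale_via_double(float x) {
    return static_cast<float>(x * 1.234);
}

// Hypothetical helper: stays in float throughout.
inline float scale_in_float(float x) { return x * 1.234f; }

// Small runtime sanity check; multiplying by 1 is exact.
inline void check_literals() {
    assert(scale_in_float(1.0f) == 1.234f);
    assert(scale_via_double(1.0f) == static_cast<float>(1.234));
}
```

Both helpers return a `float`, so the extra double-precision work in the first one is easy to miss when reading the code.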
