I have a simple kernel that converts an integer array to a float array:

__global__ void
IntToReal( float* fsig,
           short* isig,
           unsigned int max)
{
    unsigned int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (id < max)
    {
        fsig[id] = (float)isig[id];
    }
} // end
It works fine, but then I found __int2float_rz() in the programming manual and tried this:

__global__ void
IntToReal( float* fsig,
           short* isig,
           unsigned int max)
{
    unsigned int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (id < max)
    {
        fsig[id] = __int2float_rz(isig[id]);
    }
} // end
which also works. Why would I need __int2float_rz()? Is it better to use it? I get the same results, and there doesn't appear to be any difference in speed.
A C-style cast may not round to the same value on every compiler: when an int-to-float conversion is inexact, the direction of rounding is implementation-defined in C. The CUDA intrinsic __int2float_rz() is defined to always round toward zero. If you want round-to-zero behaviour guaranteed, use __int2float_rz().
The fact that nvcc compiles all your kernels is not really the point; it is a more general, philosophical issue. If you use, say, GCC for some CPU-side code and you care about how rounding behaves, you use GCC's options to control the rounding behaviour. Say what you mean in the code; do not count on the compiler to DWIM.