Am I correct in understanding that __fdiv_ru(x,y) returns the value of (x/y) rounded up to the next larger integer? It talks about rounding, but one can 'round' to different targets. For example, I assume that it will return 5.0f for values between 4.0 (exclusive) and 5.0 (inclusive), and -5.0f for values between -6.0 (exclusive) and -5.0 (inclusive).
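No, it does not round to an integer. The result is rounded up (towards positive infinity) to the next representable floating-point number, so it differs from the default round-to-nearest result by at most one unit in the last place. CUDA offers intrinsics for basic arithmetic with all four IEEE-754 rounding modes, selected by suffix: _rn (to nearest, ties to even), _rz (towards zero), _ru (towards positive infinity), and _rd (towards negative infinity). This is useful for people doing interval arithmetic, for example. For a brief introduction to IEEE-754 rounding modes, see Wikipedia, for example.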
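To make the difference concrete, here is a minimal sketch that evaluates the same quotient with each of the four division intrinsics. The kernel name fdiv_modes_kernel and the host helper fdiv_modes are made up for this illustration; only __fdiv_rn, __fdiv_rz, __fdiv_ru, and __fdiv_rd come from CUDA itself. It assumes a CUDA-capable device is present and keeps error handling to a minimum.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel (name is hypothetical): stores x / y under each rounding mode.
// out[0]: to nearest even, out[1]: towards zero,
// out[2]: towards +infinity, out[3]: towards -infinity.
__global__ void fdiv_modes_kernel(float x, float y, float *out)
{
    out[0] = __fdiv_rn(x, y);
    out[1] = __fdiv_rz(x, y);
    out[2] = __fdiv_ru(x, y);
    out[3] = __fdiv_rd(x, y);
}

// Hypothetical host helper: runs the kernel once and copies the four results back.
// Returns false if any CUDA call reports an error.
bool fdiv_modes(float x, float y, float out[4])
{
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, 4 * sizeof(float)) != cudaSuccess) {
        return false;
    }
    fdiv_modes_kernel<<<1, 1>>>(x, y, d_out);
    // The blocking copy also surfaces errors from the kernel launch above.
    cudaError_t err = cudaMemcpy(out, d_out, 4 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}

int main()
{
    float r[4];

    // 9 / 2 = 4.5 is exactly representable, so every mode returns 4.5;
    // __fdiv_ru does not move the result up to 5.0.
    if (fdiv_modes(9.0f, 2.0f, r)) {
        printf("9/2: rn=%.9g rz=%.9g ru=%.9g rd=%.9g\n", r[0], r[1], r[2], r[3]);
    }

    // 1 / 3 is not representable, so the up and down results are the two
    // neighbouring floats that bracket the true quotient.
    if (fdiv_modes(1.0f, 3.0f, r)) {
        printf("1/3: rn=%a rz=%a ru=%a rd=%a\n", r[0], r[1], r[2], r[3]);
    }
    return 0;
}
```

If the integer ceiling of the quotient is what you actually want, something like ceilf(x / y) would be the usual approach, not __fdiv_ru.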