I am working on a ray-tracing-in-grid-space code. for every step, I need to propagate my particle to the first intersection to the enclosing voxel, and ensure that the particle is out-side of the voxel (so, when taking floorf(x/y/z) it will be a different integer).
I am having difficulty for the second goal, i.e., make sure the coordinate is across integer boundaries.
here is my code snip
#define EPS 1e-6f
// float3 p is the particle position, float3 v is the velocity vector,
// id is the index of the first intersection plane id={0,1,2}: passing grid in {x,y,z}-dir
// at this point, p is already updated and should be located ON the boundary
// the next line ensures it is across the boundary
((float*)p)[id]=__float2int_rn(((float*)p)[id])+((((float*)v)[id] > 0.f)-0.5f)*2.f*EPS;
I found that, when I set the scaling factor before EPS to 2.f as above, the code may fail, but if I set it to 4.0, it works for most particles I tested.
I am wondering how to do this rigorously? is there an FLT_EPSILON for float in CUDA?
For all platforms supported by CUDA, both GPUs and CPUs use IEEE-754 floating-point formats. Therefore you should be able to simply
#include <float.h>
and use the FLT_EPSILON defined by the host compiler’s header file. That is the mechanical part. I am not exactly sure what your computation is trying to accomplish, so I cannot judge whether the above computation when used with FLT_EPSILON will provide the desired results.
Maybe you could give a more detailed specification of the desired functionality. I understand that the boundaries of each cell are on integers. Can points ever lie on a cell boundary? Looking at the one-dimensional case, what should the result of the computation be for the following three basic cases (where ‘x’ and ‘delta’ are floating-point numbers and ‘n’ is an integer representing the lower bound of the interval defining the cell in that dimension:
[1] x == n // point coincides with lower bound
[2] x == (n+1) // point coincides with upper bound
[3] x == (n+delta) // point somewhere inside the interval, that is ‘delta’ is in (0,1)
To increment a floating-point number by the smallest possible finite quantum, I would suggest using the C/C++ standard function nextafter(x,y) provided by CUDA. Starting at ‘x’, this returns the next representable floating-point number in the direction of ‘y’. Commonly used choices for ‘y’ are zero, negative infinity, positive infinity.
this is the puzzling part. the IEEE-754 defined FLT_EPSILON is around 1.19209290e-7F, but the EPSILON found through trial-and-error in my cuda code was 2e-6. I can’t explain why they are different. is that because I multiplied it by -1.f or 1.f computed through the following term?
here I have a simple example: let’s set p={20.5,20.5,0.5} with a dir vector v={-1,0,0}. To perform a ray-grid ray-tracing by comparing the distances to x/y/z planes, I found the first intersection point at pn={20,20.5,0.5}.
if I return pn directly without changes, my next ray-tracing test will be a problem, because it will hit the exactly the same location {20,20.5,0.5} again because its distance along v is the shortest, i.e. 0.0.
to avoid infinite loop, I’d like to move pn slightly over the intersection so that next time, it will find {19,20.5,0.5}, and so on. The problem I am struggling is by how much I should change pn.