float number error

I’m intersecting a ray with a cylinder. For the same code,
Python finds
t1= -0.00044514510532 t2= 0.0694807350249
t3= -1179.14408098 t4= 127.893641106
tmin= -0.00044514510532 tmax= 0.0694807350249
but OptiX gives

The results differ mainly because I copied ray.origin and ray.direction from an OptiX printout (%.4e format) to use as the Python input.
I did the math by hand and know that Python gives the correct answer.
Any idea how to make OptiX work correctly as well?
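
To illustrate the point about the %.4e printout: the format keeps only 5 significant digits, while a 32-bit float needs up to 9 to survive a round trip through text. The following is a minimal sketch, not the poster's code; the helper to_f32 and the value of origin_x are made up for illustration.

```python
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Hypothetical ray.origin component, as it would be stored on the device.
origin_x = to_f32(0.123456789)

# Value recovered from a "%.4e" printout (5 significant digits).
from_short = to_f32(float("%.4e" % origin_x))
# Value recovered from a "%.8e" printout (9 significant digits).
from_full = to_f32(float("%.8e" % origin_x))

print("device value:", repr(origin_x))
print("via 4 decimals:", repr(from_short), "identical:", from_short == origin_x)
print("via 8 decimals:", repr(from_full), "identical:", from_full == origin_x)
```

Printing with %.8e (or %.9g) on the OptiX side should therefore give Python exactly the same inputs as the kernel, which removes one source of disagreement before comparing the t values.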

Are you using fast math when compiling your OptiX kernels?

I switched between using and not using fast math. The results don’t change significantly.
I just tried moving the intersection program into an independent C program. Then I could get acceptable results at least t1 is negative.

The only difference is that I didn’t use the vector functions such as dot() and normalize() provided in sutil.c by the OptiX SDK examples. However, those functions in sutil.c look very similar to what I wrote for the “independent” program. Kind of weird…

Another question out of curiosity: the vector functions provided in sutil.c are not declared as __device__. How can that work? I wrote the makefile myself and didn’t compile sutil.c any differently from usual.

The file sutil.c contains utility functions for the host. The device versions of dot(), normalize(), etc. are in the .h files that have the prefix “optix_”.

Another weird thing:
Though the independent C program works well, incorporating it into OptiX fails again.
I stored the float3 ray.origin and ray.direction in float[3] arrays and used the same vector functions I wrote for the “independent” C program.

Python floats are always 64-bit (i.e. double-precision) values. It’s not surprising to see a difference between C code using single-precision floats and Python code using double-precision floats.
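
As a rough illustration of how large that difference can get, here is a sketch that solves the same quadratic once in double precision and once with every intermediate result rounded to 32 bits. The coefficients are hypothetical, chosen so that the root near zero suffers from cancellation; they are not taken from the cylinder in this thread. The rounding helper only emulates single precision and assumes no fused multiply-add, so a real GPU kernel may behave slightly differently.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def small_root_f64(b, c):
    """Root of t*t + b*t + c = 0 nearest zero, in double precision."""
    return (-b + math.sqrt(b * b - 4.0 * c)) / 2.0

def small_root_f32(b, c):
    """Same formula, rounding every intermediate result to 32 bits."""
    b, c = to_f32(b), to_f32(c)
    disc = to_f32(to_f32(b * b) - to_f32(4.0 * c))
    return to_f32(to_f32(-b + to_f32(math.sqrt(disc))) / 2.0)

# Hypothetical coefficients with one root very close to zero.
b, c = 1000.0, 1.0e-3
t64 = small_root_f64(b, c)
t32 = small_root_f32(b, c)

print("double precision:", t64)
print("single precision:", t32)
```

In this sketch the small negative root disappears entirely in single precision, because 4*c is smaller than the spacing between adjacent 32-bit floats near b*b. A root that is only slightly negative, like t1 above, is exactly the kind of value that can change noticeably between the two precisions.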

Is there actually a difference between your program with use_fast_math and without it? Also, what are your OS, compiler and OptiX versions?
I’m asking because I once had a problem caused by Visual Studio failing to interpret the compilation options correctly, so that use_fast_math was not actually applied when requested.

By the way, what kind of functions are involved in your intersection code? Only arithmetic ones or trigonometric functions too?

I started the thread with the difference between Python and C. As you can see from the 3rd post, I used an “independent” C program that does the same thing as the Python code and gives the same result as Python.

My intersection program is for a cylinder and involves sqrt(). But the cylinder position is calculated from sin() and cos().
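
For readers following along, this is roughly what such a test looks like in its simplest form. It is only a sketch under strong assumptions: an infinite cylinder centred on the z axis, with no caps and no sin()/cos() placement, so it is not the intersection program discussed here. The function name intersect_cylinder_z is made up.

```python
import math

def intersect_cylinder_z(origin, direction, radius):
    """Return (tmin, tmax) for a ray against an infinite cylinder around
    the z axis, or None if the ray misses or runs parallel to the axis."""
    ox, oy, _ = origin
    dx, dy, _ = direction
    # Substitute the ray o + t*d into x*x + y*y = radius*radius.
    a = dx * dx + dy * dy
    b = 2.0 * (ox * dx + oy * dy)
    c = ox * ox + oy * oy - radius * radius
    if a == 0.0:
        return None  # direction has no x/y component
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # no real roots: the ray misses the cylinder
    root = math.sqrt(disc)
    return ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))

hit = intersect_cylinder_z((-2.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.0)
print(hit)
```

The sqrt() of the discriminant and the subtraction that follows are the steps where single and double precision are most likely to diverge.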

Currently I’m not using use_fast_math. When I was “switching”, I just added the use_fast_math flag. Should I change sqrt() to sqrtf() as well?

OS:Linux 3.5.0-51-generic #76-Ubuntu 2014 x86_64 x86_64 x86_64 GNU/Linux
nvcc: Cuda compilation tools, release 6.0, V6.0.1

From the sentence “Then I could get acceptable results at least t1 is negative.”, I assumed that the results of the independent C program were still different, which I would have found normal if you were using single-precision floats in the C code.

I generally use sqrtf, cosf and sinf when working with single-precision floats. As mentioned in the OptiX programming guide, this avoids needless conversions from double to float and vice versa.

The use_fast_math option will, for example, use __cosf instead of cosf, which is faster but less accurate. For the moment, the only way I have found to get Visual Studio to really apply this option is to specify use_fast_math manually.
But anyway, that’s definitely not what you’re looking for here.