Porting an app from OptiX 3.8 (32-bit) to OptiX 6.5 (64-bit): need some help, please

I took it a step further by replacing the “selector” with a “group”.
However, I am now stuck on another problem.
The part of the program that calculates the shadows raises exceptions (see below), but only when the pixel is not in shadow (miss).
If I don’t calculate the shadows, or if I calculate them but the pixel is in shadow, everything works perfectly.
Reflections work; I only have problems with shadows.
For the “SHADOW_RAY_TYPE” I don’t define any program except the one set with “rtMaterialSetAnyHitProgram”, and I never had problems with that in OptiX 3.8.0. Should something be corrected or added for OptiX 6?

Should I also define the “rtMaterialSetClosestHitProgram” for shadow rays?

I attach an example image. The points in red are the ones that receive the exception.
Thank you very much

Unknown error (Details: Function “_rtContextLaunch2D” caught exception: Encountered a CUDA error: cudaDriver().CuEventSynchronize( m_event ) returned (700): Illegal address, file: , line: 0

Attached images: OpenGL render; OptiX 6 render; OptiX 6 render (no shadows)

One more piece of information…

If I define this “rtMaterialSetAnyHitProgram” program, all pixels turn red:

RT_PROGRAM void any_hit_shadow()
{    
  rtIgnoreIntersection(); // Continue the search
}

If I define this “rtMaterialSetAnyHitProgram” program, I get the image below:

RT_PROGRAM void any_hit_shadow()
{    
  // This material is totally opaque, so it fully attenuates all shadow rays
  ShadowRayData.Shadowed = true;
  ShadowRayData.ShadowColor = optix::make_float3(0.0f); 
  
  rtTerminateRay(); // Occlusion found
}


If you receive exceptions from OptiX, you should implement an exception program which captures these to find out exactly what the exceptions are about.
Exception programs in OptiX 6 are set per launch entry point. That means you need as many exception programs as you have ray generation programs in your shader pipeline.
Here is an exception program example:
https://github.com/nvpro-samples/optix_advanced_samples/blob/master/src/optixIntroduction/optixIntro_07/shaders/exception.cu
and how to enable it:
https://github.com/nvpro-samples/optix_advanced_samples/blob/master/src/optixIntroduction/optixIntro_07/src/Application.cpp#L435

If the exceptions you’re seeing are stack overflows, then that is most likely due to a change in the new API that you missed: since OptiX 6.0.0 the stack size is no longer set in bytes but is calculated automatically from a maximum number of recursions, which you must set.
Please read this OptiX 6.5.0 Programming Guide section:
https://raytracing-docs.nvidia.com/optix6/guide_6_5/index.html#host#3128
and these API references:
https://raytracing-docs.nvidia.com/optix6/api_6_5/html/group___context.html#ga1da5629dbb8d0090e1ea5590d1e67206
https://raytracing-docs.nvidia.com/optix6/api_6_5/html/group___context.html#ga90c475595c0fc945a651a1de04a3d81d

When changing OptiX SDK versions please read all SDK Release Notes of the versions you have skipped as well to find the important changes.

The exception I get is this
Thanks

Unknown error (Details: Function “_rtContextLaunch2D” caught exception: Encountered a CUDA error: cudaDriver().CuEventSynchronize( m_event ) returned (700): Illegal address, file: , line: 0

It was not a stack overflow; those are displayed in fuchsia. In my program the red color means “generic exception”.

This is the exception callback

RT_PROGRAM void exception()
{
  const unsigned int ExceptionCode = rtGetExceptionCode();
  
  rtPrintf( "Caught exception 0x%X at launch index (%d,%d)\n", ExceptionCode, launch_index.x, launch_index.y );
  
  if(ExceptionCode==RT_EXCEPTION_STACK_OVERFLOW)
    output_buffer[launch_index] = make_color( GlobalSettings[0].BadColor1 );
  else
    output_buffer[launch_index] = make_color( GlobalSettings[0].BadColor2 );
}

Encountered a CUDA error: cudaDriver().CuEventSynchronize( m_event ) returned (700): Illegal address, file: , line: 0

Yes, that is a crash in CUDA which can happen for many reasons, including errors in your shader code, and cannot be analyzed by looking at images.

You should still set the maximum number of recursions correctly first and implement an exception program for debugging things.

Please always provide the following system configuration information when asking about OptiX issues:
OS version, installed GPU(s), VRAM amount, display driver version, OptiX (major.minor.micro) version, CUDA toolkit version (major.minor) used to generate the input PTX, host compiler version.

Do the OptiX SDK 6.5.0 examples run on your system?

Should I also define the “rtMaterialSetClosestHitProgram” for shadow rays?

If the shadow rays only test visibility, you don’t need a closest hit program for them. The anyhit program is all you need.
When you do not use cutout opacity in your scene, a faster method is to use only a miss program and the RTrayflags RT_RAY_FLAG_DISABLE_ANYHIT | RT_RAY_FLAG_DISABLE_CLOSESTHIT | RT_RAY_FLAG_TERMINATE_ON_FIRST_HIT.
Explained here with OptiX 7 terms: https://forums.developer.nvidia.com/t/anyhit-program-as-shadow-ray-with-optix-7-2/181312/2

Note again that your use case of selectors can be replaced with ray visibility masks alone and you should be able to convert your program to OptiX 6 or better OptiX 7 without loss of features.

So what is the exact RTexception code in the failing cases then?
Anything other than a stack overflow can still have 11 other causes.

I use OptiX in a Delphi application (not C++).
I cannot add a console to my app.
Is there a way to get the strings back without a console? (I have read the documentation but found no mention of how to do that.)
Thanks

Is there a way to get back the strings without the console?

Not that I know of. The rtPrintf goes to the standard out stream.
You can also use CUDA’s native printf function instead which goes to some host-stream, but I don’t know how to set that stream either.

The pragmatic approach then would be to implement a switch-case for all RTexception enum values and either write a different color or use an additional integer output buffer and directly write the exception code per pixel.
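For an application without a console, the integer-buffer variant can be summarized on the host after the launch. The following is only a minimal C++ sketch under stated assumptions: the exception program has written rtGetExceptionCode() into an unsigned int buffer at the failing launch index, the buffer was cleared to 0 before the launch (so 0 means "no exception"), and its contents have already been copied into a std::vector. The names ExceptionSummary and summarizeExceptionCodes are hypothetical and not part of OptiX.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper types and names; nothing here is part of the OptiX API.
// Assumption: 0 means "no exception was raised at this pixel".
struct ExceptionSummary
{
    std::map<unsigned int, std::size_t> countPerCode; // exception code -> number of pixels
    std::size_t failingPixels = 0;                    // pixels holding a non-zero code
};

// Counts how often each non-zero exception code occurs in the copied buffer.
inline ExceptionSummary summarizeExceptionCodes(const std::vector<unsigned int>& codes)
{
    ExceptionSummary summary;
    for (unsigned int code : codes)
    {
        if (code == 0u)
            continue; // this pixel rendered without an exception
        ++summary.countPerCode[code];
        ++summary.failingPixels;
    }
    return summary;
}
```

The resulting map can be written to a log file or shown in a dialog, and each key can then be compared against the RTexception values in the SDK headers.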

I’ll do more tests
Thanks
AD

…and the winner is…

RT_EXCEPTION_PAYLOAD_ACCESS_OUT_OF_BOUNDS

:-)

I was able to pinpoint the instruction that generates the exceptions…
If I comment out the “rtTrace” call there are no exceptions. Whether I then return false (not shadowed) or true (shadowed), the program works (no red dots).

static __device__ __inline__ bool TraceShadowRay(
        optix::float3 Position,      // Ray starting point
        optix::float3 Direction,     // Ray shooting direction
               float  MaxDistance,   // Ray max allowed distance
        optix::float3 &ShadowColor)  // Output shadow color (1.0,1.0,1.0) => No shadow, (0.0,0.0,0.0) => Complete shadow            
{		
    float  SceneEpsilon = GlobalSettings[0].SceneEpsilon;
   
    TShadowRayData ShadowRayData;
    // Initialize values
    ShadowRayData.Shadowed = false;
    ShadowRayData.ShadowColor = make_float3(1.0f);
    // Create Optix Ray
    optix::Ray ShadowRay = optix::make_Ray( Position, Direction, SHADOW_RAY_TYPE, SceneEpsilon, MaxDistance + SceneEpsilon );
    // Launch ray
    //rtTrace(top_object, ShadowRay, ShadowRayData);
    // Get results
    ShadowColor = ShadowRayData.ShadowColor;
    // done	
    return(ShadowRayData.Shadowed);
}

That’s most likely a known issue with bool types in payload structures.
We’ve seen this before:
https://forums.developer.nvidia.com/t/problem-with-turning-off-rtx-mode-on-gtx-1080/76550
https://forums.developer.nvidia.com/t/freeze-on-sync-after-launch/159425/4

Please try changing your TShadowRayData Shadowed member from bool to int or unsigned int and use 0 and 1 to set it instead.

No change… (red dots again)

Nothing I can do about that with the given information.

I neither know your system configuration, nor do I have any means of reproducing or experimenting with this.
Again, please always provide the following system configuration information when asking about OptiX issues:
OS version, installed GPU(s), VRAM amount, display driver version, OptiX (major.minor.micro) version, CUDA toolkit version (major.minor) used to generate the input PTX, host compiler version.

The PTX code generation depends on the CUDA toolkit. I would recommend CUDA 10.1 or 10.2 for OptiX 6.5.0.
The microcode generation is driver dependent.
That means all of the system configuration information above is required to reduce the turnaround time on OptiX questions.

Did you set the recursion depth to let OptiX calculate the correct stack size?
Do you have any other payload structures with bool types in them? Replace them as well.
As said in the second link above, I would also recommend checking that the members of all your structures are placed at offsets matching their native CUDA alignment requirements, to avoid unnecessary padding or bugs.
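The layout of a payload structure can be checked at compile time. This is only a sketch in plain C++: Float3 is a hypothetical stand-in for CUDA's float3 (three floats, 4-byte alignment) so that it compiles without CUDA headers, and the member order is an assumption, because the real TShadowRayData definition is not shown in this thread.

```cpp
#include <cstddef>

// Hypothetical stand-in for CUDA's float3, used only to keep the sketch self-contained.
struct Float3 { float x, y, z; };

// Assumed payload layout; the real TShadowRayData is not shown in the thread.
// The flag is an int (0 or 1) instead of a bool, as suggested above.
struct TShadowRayData
{
    Float3 ShadowColor; // offset 0, 12 bytes
    int    Shadowed;    // offset 12, 4 bytes
};

// Compile-time checks: every member sits at a multiple of its alignment
// and the compiler inserted no padding.
static_assert(offsetof(TShadowRayData, ShadowColor) == 0, "ShadowColor should start the struct");
static_assert(offsetof(TShadowRayData, Shadowed) == 12, "no padding expected before Shadowed");
static_assert(offsetof(TShadowRayData, Shadowed) % alignof(int) == 0, "Shadowed must be aligned");
static_assert(sizeof(TShadowRayData) == 16, "no tail padding expected");
```

If one of the assertions fails after a member is added or reordered, the compiler reports it before any PTX is generated.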

Cuda version: 11.4
Optix library version: 6.8.1
Device name: NVIDIA GeForce RTX 3060 Laptop GPU
Windows 11 Pro (21H2)
NVIDIA System Information.txt (3.8 KB)
Thanks

rtContextSetStackSize(Self.FContextHandle,8192);

Your OptiX version is from the driver query, I assume the CUDA version as well?

I meant the OptiX SDK version and that is 6.5.0.
If you’re actually using the CUDA 11.4 version to translate your *.cu files to *.ptx files, please try using the CUDA Toolkit 10.1 or 10.2 for OptiX 6.5.0 instead.
Always read the OptiX Release Notes before setting up a development system.

rtContextSetStackSize(Self.FContextHandle,8192);

As explained before with links to the respective programming guide chapter, that function does nothing in the OptiX 6.5.0 RTX execution strategy.
You must use rtContextSetMaxTraceDepth instead (and rtContextSetMaxCallableProgramDepth when using callable programs).
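As a rough illustration of how such a trace depth could be counted, here is a small C++ sketch. It assumes a renderer structured like the one described in this thread: one primary ray from the ray generation program, up to maxBounces recursive reflection or transparency rays, and one shadow ray traced from each hit. The function requiredTraceDepth is hypothetical and not part of OptiX.

```cpp
// Hypothetical helper, not part of OptiX. Assumptions about the renderer:
//  - the ray generation program traces one primary ray (depth 1),
//  - closest-hit programs recurse up to maxBounces times for reflection or transparency,
//  - every hit, including the deepest one, may trace one shadow ray.
inline unsigned int requiredTraceDepth(unsigned int maxBounces)
{
    const unsigned int primaryRay = 1u;
    const unsigned int shadowRay  = 1u; // traced from inside the deepest closest-hit program
    return primaryRay + maxBounces + shadowRay;
}
```

The result would be passed to rtContextSetMaxTraceDepth in place of the old rtContextSetStackSize call; a renderer with a different recursion scheme needs its own count.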

There are also 11 newer display driver versions available for your system if you’re not getting this to work.

Yes


I’ll try…

P.S. The “transparency rays” work well; only shadow rays fail…

Here is an example (shadows off)

I finally solved it (sort of…)

The problem is the call “rtContextSetEntryPointCount(Self.FContextHandle,CAMERA_COUNT)”

In my program I use two cameras (“pinhole_camera.cu” and “radiosity_camera.cu”). The first is the standard OptiX implementation; the second is a “buffer based” implementation that I use to calculate “per vertex” radiosity.

If I call rtContextSetEntryPointCount(ctx,2) I get the error (red points)
If I call rtContextSetEntryPointCount(ctx,1) I get no error (no red points)

Strange…

AD