Too many shadow rays generate null pointer execution on CPU in OptiX 6.0

Dear All,

I am trying to implement a simple tracer for evaluating Phong materials with shadows.
Execution runs fine if only directional and spotlight shadows are enabled. It also works with only directional and point light shadows enabled.

However, if shadow rays are enabled for directional, point and spot lights at the same time (all rtTrace code lines uncommented), the OptiX code runs into a null pointer execution on the CPU during rtContextLaunch2D(…):
Exception thrown at 0x0000000000000000 in optixtest.exe: 0xC0000005: Access violation executing location 0x0000000000000000.

Commenting out the rtTrace call for either the point lights or the spot lights results in a successful trace.

I am working on Windows 10 with CUDA 10.1 (tried 10.0 as well), OptiX 6.0, NVIDIA driver 430.86 (tried several other drivers) and an NVIDIA RTX 2070 GPU.

I have the following in my code.

In closest hit program:

#include <optix_world.h>
#include <optix_math.h>
//...


#define EPSILON 0.00001f

struct PerRayData_shadow
{
        bool inShadow;
};

rtDeclareVariable(PerRayData_shadow, currentPerRayDataShadow, rtPayload, );

RT_PROGRAM void collisionAnyHitProgram()
{
	currentPerRayDataShadow.inShadow = true;
	rtTerminateRay();
}


rtDeclareVariable(rtObject, top_object, , );

rtDeclareVariable(uint4, lightCounts, , ); //Directional, spot, point.

rtDeclareVariable(float3, positionResult, attribute positionResultVector, );

//...
//Other declarations.
//...

__device__ void calculateDirectionalLightContribution(
	uint lightIndex,
	float3& ambient,
	float4& diffuse,
	float3& specular,
	float& shininess,
	float3& normal)
{
	//... fetch directional
	float3 VP = -directionLight.direction;
	float shadow = 1.0f;

	PerRayData_shadow shadowPrd;
	shadowPrd.inShadow = false;

	Ray shadowRay = make_Ray(positionResult, VP, 1, EPSILON, 3.402823466e+38F);
	
	rtTrace(top_object, shadowRay, shadowPrd);

	if (shadowPrd.inShadow)
	{
		shadow = 0.0f;
	}
        //...
}


__device__ void calculateSpotLightContribution(
	uint lightIndex,
	float3& ambient,
	float4& diffuse,
	float3& specular,
	float& shininess,
	float3& normal)
{
//... fetch spotlight from buffer, etc.
        float3 surfaceToLight = spotLight.position-positionResult;
	
	float3 VP = normalize(surfaceToLight);
	float shadow = 1.0f;

	
	float surfaceToLightDistance = length(surfaceToLight);

	PerRayData_shadow shadowPrdPoint;
	
	shadowPrdPoint.inShadow = false;

	Ray shadowRay = make_Ray(positionResult, VP, 1, EPSILON, surfaceToLightDistance - EPSILON);

	//TODO: Optix bug, if both point and spot shadow trace is enabled!
	rtTrace(top_object, shadowRay, shadowPrdPoint);
	if (shadowPrdPoint.inShadow)
	{
	     shadow = 0.0f;
	}
//...
}


__device__ void calculatePointLightContribution(
	uint lightIndex,
	float3& ambient,
	float4& diffuse,
	float3& specular,
	float& shininess,
	float3& normal)
{
//... Fetch point light from buffer, etc.
	float3 surfaceToLight = pointLight.position-positionResult;
	
	float3 VP = normalize(surfaceToLight);
        float shadow = 1.0f;

	
	float surfaceToLightDistance = length(surfaceToLight);

	PerRayData_shadow shadowPrdPoint;
	
	shadowPrdPoint.inShadow = false;

	Ray shadowRay = make_Ray(positionResult, VP, 1, EPSILON, surfaceToLightDistance - EPSILON);

	//TODO: Optix bug, if both point and spot shadow trace is enabled!
	rtTrace(top_object, shadowRay, shadowPrdPoint);
	if (shadowPrdPoint.inShadow)
	{
	     shadow = 0.0f;
	}
//...
}


RT_PROGRAM void collisionClosestHitProgram()
{
	//...
	//Material fetch here, not important.
	//...
	for (uint i = 0; i < lightCounts.x; ++i)
	{
		calculateDirectionalLightContribution(i, ambient, diffuse, specular, shininess, normal);
	}
	for (uint i = 0; i < lightCounts.y; ++i)
	{
		calculateSpotLightContribution(i, ambient, diffuse, specular, shininess, normal);
	}
	for (uint i = 0; i < lightCounts.z; ++i)
	{
		calculatePointLightContribution(i, ambient, diffuse, specular, shininess, normal);
	}
	//...
	//Write results, not important.
	//...
}

Is this known behaviour? Is there anything I can do about it?
Please let me know if you need more information. Thanks.

Best regards,
Attila Barsi

It’s not possible to analyze this with the given information. That would require a minimal complete reproducer in a failing state.

It’s kind of unusual to run into an access violation on a nullptr in OptiX host code.
I would start by defining all the device functions with __forceinline__ __device__.
Make sure you have an exception program at each entry point and enable exceptions during debug to see if there are any OptiX exceptions thrown.

It’s generally bad for compilation and runtime performance to have many rtTrace calls inside the code.
It’s easily possible to handle different light sampling and evaluation routines separately while only using one rtTrace call for the visibility test.

Have a look at the OptiX Introduction examples on GitHub, where that is done with a bindless callable program per light type and a single array of “LightDefinition” structures to define all lights in the scene.

That path tracer is only sampling one of many lights for each diffuse hit with multiple importance sampling and is not implementing singular light types. Just take it as an example how to reduce the number of shadow ray rtTrace calls inside the code to a single one.

All necessary links here: https://devtalk.nvidia.com/default/topic/998546/optix/optix-advanced-samples-on-github/

This is the light sampling per light type via a bindless callable program:
optix_advanced_samples/closesthit.cu at master · nvpro-samples/optix_advanced_samples · GitHub
This is the single rtTrace used for all light types:
optix_advanced_samples/closesthit.cu at master · nvpro-samples/optix_advanced_samples · GitHub
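
As a rough illustration of the single-rtTrace idea, here is a host-side C++ sketch. It is not OptiX device code and not the code from the linked samples. It assumes a hypothetical LightDefinition record and a hypothetical makeShadowRay helper that reduce every light type to the same three ray parameters, so the visibility test needs only one call site. The OcclusionTest callback is a stand-in for rtTrace plus the any hit program, and Vec3 replaces float3 so the sketch compiles without CUDA.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Stand-in for CUDA's float3 so the sketch compiles on the host.
struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Hypothetical unified light record: one array instead of three per-type buffers.
enum LightType { LIGHT_DIRECTIONAL, LIGHT_SPOT, LIGHT_POINT };

struct LightDefinition {
    LightType type;
    Vec3 position;   // used by spot and point lights
    Vec3 direction;  // normalized; the direction the light travels
};

// Everything the shadow ray needs, independent of the light type.
struct ShadowRayParams {
    Vec3 direction;
    float tmin;
    float tmax;
};

const float EPSILON = 0.00001f;
const float RAY_MAX = 3.402823466e+38f;  // same unbounded tmax as in the question

// Per-type logic only fills in ray parameters; it never traces.
ShadowRayParams makeShadowRay(const LightDefinition& light, Vec3 surfacePoint) {
    if (light.type == LIGHT_DIRECTIONAL) {
        return {scale(light.direction, -1.0f), EPSILON, RAY_MAX};
    }
    const Vec3 toLight = sub(light.position, surfacePoint);
    const float distance = length(toLight);
    assert(distance > 0.0f);  // the light must not sit exactly on the surface point
    return {scale(toLight, 1.0f / distance), EPSILON, distance - EPSILON};
}

// Stand-in for rtTrace: returns true when the shadow ray is blocked.
using OcclusionTest = std::function<bool(Vec3, const ShadowRayParams&)>;

// Returns 0.0f (shadowed) or 1.0f (lit) per light, using a single trace call site.
std::vector<float> shadowFactors(const std::vector<LightDefinition>& lights,
                                 Vec3 surfacePoint,
                                 const OcclusionTest& occluded) {
    std::vector<float> factors;
    factors.reserve(lights.size());
    for (const LightDefinition& light : lights) {
        const ShadowRayParams ray = makeShadowRay(light, surfacePoint);
        const bool blocked = occluded(surfacePoint, ray);  // the only trace call
        factors.push_back(blocked ? 0.0f : 1.0f);
    }
    return factors;
}
```

On the device, the loop body would issue the one rtTrace with a ray built from make_Ray(positionResult, ray.direction, 1, ray.tmin, ray.tmax), and the per-type evaluation could live in bindless callable programs as in the linked samples.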

Dear Mr. Röttger,

thank you for your answer. Is there any way to supply you with a minimal complete reproducer? Is it OK if I PM you an archived (zipped) release binary + test scene?

I will try your suggestions. I think I have supplied exception programs, but I will double-check. I have not used Optix in a while.

Also, the point of the exercise here is to reproduce the current rasterizer rendering pipeline for non-linearly generated eye ray sets (e.g. for complex camera optics). This is a local lighting model, so currently I cannot use a path tracer approach.
I have 3 arrays of floats for the lights, but I can push them into the same buffer. However, doing so will make it harder to update the buffers between frames when the light counts change at runtime (e.g. for flickering lights or lights out of the range of influence), even if I pin the whole thing on the host side.
Also, the trace parameters for directional lights differ from those for spot and point lights. I could add an arbitrary distance that measures the farthest point of the AABB along the light direction, but that seems a bit contrived.
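
For reference, the AABB-based distance mentioned here could be computed roughly as follows. This is only a host-side C++ sketch under assumptions: Vec3, Aabb and maxDistanceAlong are hypothetical names, and the direction is assumed to be normalized. Note that the directional light code in the first post already passes 3.402823466e+38F as tmax, so an unbounded tmax may well be sufficient and this bound would be optional.

```cpp
#include <algorithm>
#include <cassert>

// Stand-ins for float3 and a scene bounding box (hypothetical types).
struct Vec3 { float x, y, z; };
struct Aabb { Vec3 lower, upper; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Conservative upper bound on how far a ray starting at `origin` can travel
// along the unit vector `dir` before it has passed every corner of the box.
// Any tmax at or beyond this value covers all geometry inside the box.
float maxDistanceAlong(const Aabb& box, Vec3 origin, Vec3 dir) {
    assert(box.lower.x <= box.upper.x &&
           box.lower.y <= box.upper.y &&
           box.lower.z <= box.upper.z);
    float farthest = 0.0f;
    for (int i = 0; i < 8; ++i) {
        // Bits 0..2 of i select the lower or upper bound on each axis.
        const Vec3 corner = {(i & 1) ? box.upper.x : box.lower.x,
                             (i & 2) ? box.upper.y : box.lower.y,
                             (i & 4) ? box.upper.z : box.lower.z};
        const Vec3 toCorner = {corner.x - origin.x,
                               corner.y - origin.y,
                               corner.z - origin.z};
        // Projection of the corner onto the ray direction.
        farthest = std::max(farthest, dot(toCorner, dir));
    }
    return farthest;
}
```

For a directional light, `dir` would be the negated light direction (VP in the posted code) and `origin` the hit position.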

Best regards,
Attila Barsi

If the reproducer is small (< 10 MB) you could attach it to an e-mail to OptiX-Help (at) nvidia.com.
Though do not name it *.zip or it gets blocked by our e-mail servers. Renaming to *.zi_ or using *.7z will do.

One other thing to try would be to not use bool inside any rtPayload. There was some issue with that reported recently:
https://devtalk.nvidia.com/default/topic/1055743/optix/problem-with-turning-off-rtx-mode-on-gtx-1080/

Dear Mr. Röttger,

thank you for the tip. I have changed the payload to uint and now it works fine. I still have the original version (~17 MB compressed); if you need it for debugging the library, I can put it on some cloud storage and send you the link. Please let me know if this is the case.
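
The modified code was not posted, so the following is only a guess at the shape of the change, written as a host-side C++ sketch. The struct name matches the first post; markOccluded and shadowFactor are hypothetical helpers standing in for the body of collisionAnyHitProgram (without rtTerminateRay) and for the check after rtTrace.

```cpp
#include <cassert>
#include <type_traits>

// Shadow payload using an unsigned int flag instead of bool.
struct PerRayData_shadow {
    unsigned int inShadow;  // 0 = visible, 1 = occluded
};

// The payload stays a plain 32-bit value with no padding.
static_assert(std::is_trivially_copyable<PerRayData_shadow>::value,
              "payload must be trivially copyable");
static_assert(sizeof(PerRayData_shadow) == sizeof(unsigned int),
              "payload holds exactly one unsigned int");

// Stand-in for what the any hit program writes before terminating the ray.
void markOccluded(PerRayData_shadow& prd) { prd.inShadow = 1u; }

// Converts the flag into the multiplier used by the lighting functions.
float shadowFactor(const PerRayData_shadow& prd) {
    assert(prd.inShadow <= 1u);  // only 0 and 1 are expected
    return prd.inShadow != 0u ? 0.0f : 1.0f;
}
```

In the device code the corresponding edits would be the member type, the initialization (shadowPrd.inShadow = 0) and the assignment in the any hit program (currentPerRayDataShadow.inShadow = 1).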
Thanks again.

Best regards,
Attila Barsi

We’re not able to access all file sharing sites from within the office due to IT rules.
I’ve sent the details of a temporary FTP account, which allows you to upload files, to the e-mail address registered with your forum account.

Dear Mr. Röttger,

thank you for the FTP account. I have uploaded a zipped archive renamed to NVIDIA_Bugreport.zi_. You will find a readme.txt in the archive describing how to run the example. If you run into any trouble, please let me know. You will need to have the Visual Studio 2017 runtime and CUDA 10.1 installed to run the binary. Unfortunately, I don’t have the time to backport this into an SDK example anytime soon, as I will be on holiday until the 22nd of July. I sincerely hope that it will help you debug the issue. Should I fail to do the backport, you should be able to do it on your own from the CUDA files (materialforcollision.cu) included in the package.
Best of luck to you on this adventure in hemipterology. :-)

Best regards,
Attila Barsi

Thanks a lot. I attached it to the OptiX bug report for investigation.