[Resolved] What is in the stack?

Yauda · October 1, 2015, 5:56pm

Hello,

I’m trying to understand what is stored in the stack in OptiX.

As I understand it, we set the stack size per context, and one stack is attached to each thread in the ray generation program. When a ray is launched, the thread carries with it the stack, which stores the ray’s payload.

I thought that, in a recursive ray tracer for example, a stack overflow would occur because there are too many payloads to keep in memory. But right now I have a program with a radiance ray whose payload is a float plus 3 uints, and a shadow ray whose payload is a single float, and there is only one bounce. Even so, my stack needs to be bigger than 1024 to avoid a stack overflow. Surely that is far more than just my two payloads.

So I wonder, what else is in the stack?
(I mean in general, not in my particular case. What is stored in the stack besides the ray payloads, if they are stored there at all? For example, do we also store information about the hits? About the scene tree? Do we keep track of which program launched the current ray?)

Thanks for your help!

droettger · October 2, 2015, 7:59am

The stack is also used to save and restore live variables around function calls (e.g. rtTrace or callable programs).
That’s the background for one of the performance tips in the OptiX Programming Guide, which starts with “Try to minimize live state across calls to rtTrace in programs.”

Yauda · October 2, 2015, 1:00pm

It makes more sense now, I see how it’s linked to the advice to use a minimal stack too.
Thank you!

Ziqi · June 28, 2018, 5:29pm

I am also having this stack issue. In my application, I want a new ray to be cast every time the original ray hits the geometry, until the accumulated distance reaches 100 meters. For a point in a scene with a large open area, reaching the 100 meter criterion is not hard. However, a stack overflow happens in narrow corridors, where each bounce adds only a limited distance. The maximum number of recursions I can achieve right now with my recursive ray casting scheme (which I don't think can be reduced any further in size) is 38. I am currently using a GTX 1070 and am wondering whether this recursion depth can be increased with a better GPU, like a 1080 Ti. I sincerely appreciate any answers to this question. Thanks!

droettger · June 29, 2018, 9:30am

If you only continue a single new ray for every hit point, there is no need to do this recursively at all.
You can do that much more easily with an iterative path tracer if all you need is the accumulated distance along a path.

Please have a look at the OptiX Advanced Samples on github.com.
The new OptiX Introduction examples show how to implement a small and elegant iterative path tracer step by step.
Links here: https://devtalk.nvidia.com/default/topic/998546/optix/optix-advanced-samples-on-github/

For your use case, optixIntro_04 is already enough to accumulate the traveled distance along a path using a brute force path tracer.
The stack size requirement of that implementation is minimal. There is no recursive rtTrace call in that program.
You would just need to change the rtPayload structure to contain the distance and remove the fields you don’t need.
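As a rough host-side illustration of the iterative structure described above (plain C++, not OptiX device code): the loop below stands in for the ray generation program, and the names PathPayload, traceSegment and followPath are hypothetical. traceSegment replaces the real rtTrace call and simply assumes every segment hits after a fixed length, as in a narrow corridor.

```cpp
// Hypothetical per-ray data, standing in for the rtPayload struct.
struct PathPayload {
    float distance;  // accumulated distance along the path, in meters
    bool  done;      // set when the path ends (e.g. the ray misses)
};

// Hypothetical stand-in for one rtTrace call. Assumption: every segment
// hits after segmentLength meters and the path never misses.
void traceSegment(PathPayload& prd, float segmentLength) {
    prd.distance += segmentLength;
    prd.done = false;
}

// Iterative version: a single loop issues one trace per bounce, so the
// number of bounces does not add call depth. Returns the bounce count.
int followPath(float maxDistance, float segmentLength, int maxBounces) {
    PathPayload prd{0.0f, false};
    int bounces = 0;
    while (!prd.done && prd.distance < maxDistance && bounces < maxBounces) {
        traceSegment(prd, segmentLength);
        ++bounces;
    }
    return bounces;
}
```

With 2.5 m segments, followPath(100.0f, 2.5f, 1000) runs 40 iterations, which is already more bounces than the 38 recursions mentioned earlier in the thread, without nesting any calls.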

The accumulation done in that application might not even be required, depending on the reflection properties used in your algorithm.

Please work through the rest of the introduction tutorials as well for additional information.

BTW, this also answers your questions about the random number generation in the CUDA forum. All Monte Carlo sampling examples have an implementation of a simple random number generator. There is no need to seed a buffer with cuRAND.
Also if you need to have a relative time inside the device code you can use clock().
Search for the TIME_VIEW define inside the OptiX SDK example source code for examples generating a heat map view with that.
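For the random number part, a minimal sketch of what a simple generator with per-ray state can look like is shown below, written as plain C++. It is a linear congruential generator using the Numerical Recipes constants; the function names and the seeding scheme are assumptions for illustration and are not claimed to match the code in the samples.

```cpp
#include <cstdint>

// Advance the generator state and return its top 24 bits.
// Multiplier and increment are the Numerical Recipes LCG constants;
// unsigned overflow wraps modulo 2^32 by definition.
inline std::uint32_t lcgNext(std::uint32_t& state) {
    state = 1664525u * state + 1013904223u;
    return state >> 8;
}

// Uniform float in [0, 1): a 24-bit integer divided by 2^24.
inline float rnd(std::uint32_t& state) {
    return static_cast<float>(lcgNext(state)) / 16777216.0f;
}

// Hypothetical seeding: combine the pixel index and the launch iteration
// so each ray starts from its own state. The mixing constants are arbitrary.
inline std::uint32_t makeSeed(std::uint32_t pixelIndex, std::uint32_t iteration) {
    return pixelIndex * 9781u + iteration * 6271u + 1u;
}
```

The state is a single 32-bit integer, so it can live in the per-ray payload and needs no buffer filled in advance on the host.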

Ziqi · June 29, 2018, 8:45pm

Thank you for replying, Detlef. Your solution is great, and I am trying to follow your suggestion. However, I originally used cuRAND to generate random numbers. When I changed the structure to iterative, the cuRAND functions could no longer be used in the program. I am wondering why that happens and how to avoid it. Thanks.

droettger · July 2, 2018, 7:45am

I haven’t used cuRAND before.
What exactly wouldn’t work anymore when changing an algorithm from recursive to iterative on device side?
You would have the same number of launches and shoot the same number of rays.

Ziqi · July 2, 2018, 4:47pm

Thank you, Detlef, for following up. I managed to run the iterative ray casting. Previously, to shoot the maximum number of recursive rays, I had set the stack size to be very big. The large stack size was the exact reason why cuRAND was not functioning. Once I reduced the very big stack size to a small one (because it is iterative now, rather than recursive), the program ran with much better performance, and I am now able to continue my task.

Although the problem is solved, I am still curious about how OptiX is implemented. The RT_PROGRAM macro expands to __global__. I am wondering how OptiX manages these kernel functions. Does it combine them into one big kernel function, using the PTX code, or does it launch several kernels at different times? And how is the stack structure implemented on the GPU? I appreciate all your answers to these questions.

droettger · July 3, 2018, 8:12am

Here is a paper describing how OptiX worked in 2010.
http://research.nvidia.com/publication/optix-general-purpose-ray-tracing-engine
