I am trying to debug my program with “--device-debug” and the new OptiX-IR format. However, it uses all of my CPU memory before launching and crashes the program.
For context, compiling with “--generate-line-info” uses about 4.8GB, and with “--device-debug” it maxes out my system’s 16GB. I realize that’s not a lot of system memory (relatively speaking), but it goes up quite quickly. Is this to be expected?
Is there any way I can reduce the memory usage? My kernel is quite large, and running a reduced version seemed to go through better.
This extra CPU memory is consumed while running your app, not compiling, right? Which OS is this, Windows or Linux?
Probably the first thing I would suggest doing is tallying the total memory of the input buffers you’re passing to optixModuleCreate(), to verify that all of the memory consumption is coming from the intermediate compiled code.
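As a rough illustration of that tally, here is a minimal host-side sketch. It assumes each module input is loaded from a file into a byte buffer before being handed to optixModuleCreate(); the helper names readModuleInput and totalInputBytes are made up for this example and are not part of the OptiX API.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read a whole OptiX-IR (or PTX) file into memory, the
// same bytes that would later be passed to optixModuleCreate() as the input.
// Returns an empty buffer if the file cannot be opened.
std::vector<char> readModuleInput(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
        return {};
    return std::vector<char>(std::istreambuf_iterator<char>(file),
                             std::istreambuf_iterator<char>());
}

// Hypothetical helper: sum the sizes of all input buffers, in bytes.
std::size_t totalInputBytes(const std::vector<std::vector<char>>& inputs)
{
    std::size_t total = 0;
    for (const auto& buffer : inputs)
        total += buffer.size();
    return total;
}
```

Comparing this total against the process's peak memory should show how much of the usage the inputs themselves account for, and how much appears only once the modules are being created.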
Assuming that all the memory is coming from debug information in your compiled code, it seems likely that inlining in your code (combined with verbose debug info) is causing the unexpected memory bloat. If so, there are a couple of options for reducing inlining. You can use OptiX callables to prevent inlining of functions that call OptiX device functions, or you can use the __noinline__ declaration to prevent inlining of functions that have no OptiX calls in them. My guess is that you are more likely to need callables, because this is happening in debug builds, where OptiX is more aggressive about inlining functions that call into OptiX than about inlining helper functions that make no OptiX calls.
We’d also be happy to look at your OptiX-IR privately and see if there are bugs or data in it that could be removed, if you’d like. And if the memory usage is not coming from the intermediate code but from something else, then that could be a bug and we should get more details so we can fix it.