Options for sharing CUDA code between program groups / pipelines

From the topic titled “How does optix code compilation work?”:

I had a follow up question on this point. If we take C++ compilation as an example, we can generate .obj files from .cpp files, which can then be linked together with symbol resolution to generate a final executable. A similar pattern for an optix pipeline might be to generate an optix module containing various shared utility functions and just supply that module when generating the final pipeline. Your comment leads me to believe that this may be possible but I’m not seeing how to do it using the current APIs. Hopefully I’m overlooking something. The issue is that a pipeline only takes ProgramGroups (not modules) as inputs, and for each ProgramGroup entry point it seems like you can only specify a single module.

Up until now, I’ve been inlining all shared functions so they get (conceptually) duplicated in each program group. I’m wondering if there is a way to do this without inlining, using a form of symbol resolution, or is inlining the recommended approach for code sharing between OptiX program groups / pipelines?

(I also noted that another potential way to share code is via direct/continuation callables. I’m ignoring that option for now since it seems to require additional programmatic setup code for each function you want to share.)

Hi Mark,

I’m seeing a few different issues in your question that could be answered separately.

First, code that calls any OptiX device functions (prefixed with “optix”) is always going to be inlined. If your utility code has no dependencies and doesn’t call OptiX device functions, then it can probably be shared, but otherwise inlining is the only choice.
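For the inlining route, a minimal sketch of a shared utility header might look like the following. The file name, the macro `UTIL_INLINE`, and the helpers `clampf` and `lerpf` are all hypothetical and only serve as illustration. The assumption is that every .cu file includes this header, so each module gets its own inlined copy of the helpers. The qualifiers are guarded by `__CUDACC__` so the same code can also be compiled and unit tested with a plain host compiler.

```cpp
// shared_utils.h (hypothetical file name): include from every .cu file that needs it.
#ifndef SHARED_UTILS_H
#define SHARED_UTILS_H

// Under nvcc the helpers are force-inlined host/device functions.
// Under a plain host compiler the CUDA qualifiers are dropped.
#ifdef __CUDACC__
#define UTIL_INLINE static __forceinline__ __host__ __device__
#else
#define UTIL_INLINE static inline
#endif

// Hypothetical helper: clamp x to the range [lo, hi].
UTIL_INLINE float clampf(float x, float lo, float hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

// Hypothetical helper: linear interpolation from a (t = 0) to b (t = 1).
UTIL_INLINE float lerpf(float a, float b, float t)
{
    return a + t * (b - a);
}

#endif // SHARED_UTILS_H
```

Because the functions are `static` and inlined, no symbol needs to be resolved between modules; the cost is that each module compiles its own copy.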

Callables are the primary mechanism we offer for calling shared code without inlining it, and you’re right they do require additional setup. This might not be appropriate for small utility functions, but it does address the question you raised about how to include the shared code in the OptiX Pipeline.
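To give a feel for the extra setup callables involve, here is a host-only C++ analogy. It is not OptiX code. In real device code the shared function would be an `extern "C" __device__` function whose name starts with `__direct_callable__`, it would be invoked with `optixDirectCall`, and it would be registered on the host through a program group of kind `OPTIX_PROGRAM_GROUP_KIND_CALLABLES` plus a callables record in the shader binding table. In the sketch below, the table `kCallableTable`, the enum `CallableIndex`, and the function `direct_call` are hypothetical stand-ins for those pieces.

```cpp
// Host-only analogy of callable dispatch; every name here is hypothetical.

// All callables in this sketch share one signature.
using CallableFn = float (*)(float);

// Two shared functions that callers never reference by name.
static float square_fn(float x) { return x * x; }
static float negate_fn(float x) { return -x; }

// Stand-in for the callables section of the shader binding table:
// the host fills it once, and each entry needs its own setup line.
static const CallableFn kCallableTable[] = { square_fn, negate_fn };

// Indices that callers use to select a callable.
enum CallableIndex : unsigned int
{
    CALLABLE_SQUARE = 0,
    CALLABLE_NEGATE = 1
};

// Stand-in for a direct call: look up the function by index, then call it.
static float direct_call(unsigned int index, float x)
{
    return kCallableTable[index](x);
}
```

The point of the analogy is that the caller only knows an index, so the shared code is compiled once and selected at run time, at the price of one registration step per function.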

Sharing and inlining are two potentially different things, and you might want both, which is the main reason our SDK examples share utility code by duplicating the definitions. Inlining usually benefits code execution speed at the expense of compile time, so we tend to recommend inlining unless compile times are problematic, even for shared utility code.

While this next suggestion doesn’t address the issue of how to include shared code outside of a ProgramGroup, I believe you can declare device functions you’d like to share as extern, and refer to them across modules by putting a forward extern declaration in each module that does not contain the function’s definition. You could elect to compile the definitions of shared code in one of the modules needed by your program groups, and refer to them from the other modules.

But be really careful with this, since it could slow down your code dramatically. It’s not typically a great solution for sharing a lot of small utility functions. Instead of being inlined, the extern-declared functions will each turn into a function call (even in the same module) that will use extra registers and can lead to register spilling. If utility functions call other utility functions, you could accidentally end up with an expensive recursive call chain. On top of this, preventing inlining also prevents optimization across the call boundary, which can do ugly things like force constants into memory that used to be compiled out.

Getting the behavior you want, and finding the right balance between inlined and not-inlined code, might actually be easier with the explicit control that callable programs offer, despite the extra work it takes to set them up.
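The extern approach mentioned above could be sketched roughly as follows. This is untested against OptiX, and whether the symbol actually resolves across modules is an assumption that likely depends on the OptiX version and on compiling with relocatable device code enabled. The macro names, `shared_luminance`, and `brightness_scaled` are hypothetical. For illustration the two "modules" are shown in one listing; in practice the declaration and the definition would live in different .cu files.

```cpp
// Under nvcc: a non-inlined device function with C linkage.
// Under a plain host compiler: an ordinary C-linkage function.
#ifdef __CUDACC__
#define SHARED_FN extern "C" __device__ __noinline__
#define CALLER_FN static __device__
#else
#define SHARED_FN extern "C"
#define CALLER_FN static
#endif

// ---- Module B (hypothetical): uses the function, does not define it ----

// Forward declaration only; the body lives in module A.
SHARED_FN float shared_luminance(float r, float g, float b);

// This becomes a real function call, not an inlined expression.
CALLER_FN float brightness_scaled(float r, float g, float b, float scale)
{
    return scale * shared_luminance(r, g, b);
}

// ---- Module A (hypothetical): holds the single definition ----

SHARED_FN float shared_luminance(float r, float g, float b)
{
    // Rec. 709 luma weights.
    return 0.2126f * r + 0.7152f * g + 0.0722f * b;
}
```

Every use of `shared_luminance` now pays for a call and loses optimization across the call boundary, which is the cost described above.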


David.

Thanks for the detailed explanation David! That clears things up. I mainly wanted to check I wasn’t missing out on another mechanism for code sharing that I wasn’t aware of. I hadn’t explored callables much before, but it’s time to take a closer look since they do sound like they fit the bill for our use case.

Mark