optixModuleCreate throws compile error only in Debug

We have a larger project and recently switched from OptiX 7 / PTX to OptiX 8 / OptiX-IR.
Ever since, we cannot run the project in Debug mode in MSVC anymore. The optixModuleCreate function fails at runtime with this error:

COMPILE ERROR: Malformed input. See compile details for more information.
Error: Taking the address of functions is illegal because indirect function calls are illegal.

The stack trace it gives does not take the address of any function anywhere. The function it complains about is a utility class constructor, similar to this:

#include <cstddef>
#include <initializer_list>

template<typename T, int N>
class container {
public:
    container() = default;

    // Copies the initializer list into the fixed-size array.
    __host__ __device__
    container(std::initializer_list<T> v) {
        std::size_t i = 0;
        for (const T &val : v) {
            values[i++] = val;
        }
    }

private:
    T values[N] = {};
};

It is used by a defaulted constructor in a different class:

class foo {
public:
    foo() = default;

// contains members and other ctors for member setting, but compile error points to foo() = default; line
};

On top of the stack there is just an object creation in an OptiX file:

struct ClostestHitContext {
// ... stuff that doesn't matter
    __device__
    Ray SpawnRay(const Point3f &p) const {
        foo f(p1, p2); // erroneous line, goes to the foo() = default; ctor at some point
        // ... other code
    }
};

I cannot give out more code for this, but according to the error that is all that is failing.
Again, this happens only in Debug mode in Visual Studio (we use 2022). In both Release and RelWithDebInfo, it works fine.
On Debug we use

    moduleCompileOptions.optLevel = OPTIX_COMPILE_OPTIMIZATION_LEVEL_0;
    moduleCompileOptions.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL;

and on Release

    moduleCompileOptions.optLevel = OPTIX_COMPILE_OPTIMIZATION_LEVEL_3;
    moduleCompileOptions.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_NONE;

I cannot easily try a Debug build with the optimization settings (i.e. C++ Debug, but OptiX optimized) due to our CMake setup.

CMAKE_CUDA_STANDARD is set to 20. The error happens with different CUDA versions (I tried 12.2, 12.4 and 12.5), on different graphics cards (RTX A1000 Laptop, RTX 4070) and with different driver versions (555.99, 552.74, 552.12).

Is there anything (obviously) wrong with this setup?

Thanks for the report.

There is at least one known problem inside the currently released drivers which could result in that behavior with debug device code generation (nvcc option -G) and OptiX-IR module input.

That should not happen without the -G option, so only using -lineinfo should work, but then the debugLevel might complain when using OPTIX_COMPILE_DEBUG_LEVEL_FULL.

It should be solved inside one of the upcoming R560 driver releases. (I don’t know the exact version number containing the fix at this time.)

I cannot easily try a Debug build with the Optimization settings (i.e. C++ Debug, but OptiX optimized) due to our cmake setup.

Depending on how you’re setting the nvcc command line options for your OptiX device code translation, that shouldn’t be impossible.
I’m doing that all the time inside my OptiX examples.

I either use custom build rules for each OptiX *.cu file, where I can hardcode the command line parameters to be the same for all build targets. There I usually never use the debug device code option -G because that is super slow.
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo12/CMakeLists.txt#L210
Using this CMake macro:
https://github.com/NVIDIA/OptiX_Apps/blob/master/3rdparty/CMake/nvcuda_compile_module.cmake

Or I use standalone CMake projects with the CMake native LANGUAGES CUDA feature where the OptiX device code is built as a CMake “Object Library” which is just translating *.cu files to PTX or OptiX-IR and can have its own nvcc command line options, even per build target in a multi-target build system (MSVS), shown here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/GLTF_renderer/CMakeLists.txt#L269

Also see links in this post for these CMake setup explanations for OptiX application frameworks.

Thanks for the input!

We found one “issue” in our code: You see that SpawnRay uses Point3f. We also have a Point2 variant, which contained a trivial (but not default) constructor:

__host__ __device__
Point2() { x = 0; y = 0; }

Our Point3 variant used a default constructor instead, without host or device annotations.
Once the same is applied to Point2, the error disappears in Debug mode:

Point2() = default;

I have no idea why this is; it makes even less sense with the error message than before. Point2 was mentioned somewhere in the linker error, but it was obscured by the typical C++ linker error messages with lots of random characters…
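For readers wondering what that constructor change alters at the language level, here is a minimal sketch. The struct names Point2UserProvided and Point2Defaulted and the helper sumOfValueInitialized are hypothetical stand-ins, not the project's real types. The sketch only illustrates the C++ distinction between a user-provided and an explicitly defaulted constructor. It does not reproduce the driver issue, and it is an assumption that this distinction is what matters to the OptiX compiler.

```cpp
#include <type_traits>

// __host__ / __device__ only exist under nvcc; define them away for a plain C++ compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical "before": a user-provided constructor, which is a real function
// the compiler has to emit and call in unoptimized code.
struct Point2UserProvided {
    float x, y;
    __host__ __device__ Point2UserProvided() { x = 0; y = 0; }
};

// Hypothetical "after": explicitly defaulted, so default construction is trivial.
struct Point2Defaulted {
    float x, y;
    Point2Defaulted() = default;
};

static_assert(!std::is_trivially_default_constructible<Point2UserProvided>::value,
              "user-provided constructor is not trivial");
static_assert(std::is_trivially_default_constructible<Point2Defaulted>::value,
              "defaulted constructor is trivial");

// Caveat: `Point2Defaulted p;` leaves x and y indeterminate.
// Value-initialization with `{}` still zeroes both members.
inline float sumOfValueInitialized() {
    Point2Defaulted p{};
    return p.x + p.y;
}
```

Note the behavioral difference: code that relied on the old constructor zeroing x and y needs to value-initialize (Point2 p{};) after the change.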

That being said, optixModuleCreate then crashed with an invalid memory access (it looked like uninitialized memory, with the address containing lots of 'F's), but I could not see that any of the inputs were incorrect.

I went on and removed the -G flag and used the optimized / non-debug OptiX module like you suggested. Now it doesn’t crash anymore. Can this be related to “your” driver bug as well?

The fact that changing a constructor to the explicit default fixes the problem for you matches exactly the problem description by the engineer who worked on the issue. In that case, the unoptimized device code contained some function code which was incorrectly rejected by the OptiX compiler.

I went on and removed the -G flag, used the optimized / non-debug OptiX module like you suggested. Now it doesn’t crash anymore. Can this be related to “your” driver bug as well?

Yes, the issue is only happening with unoptimized debug device code.

That being said, optixModuleCreate crashed with an invalid memory access (looked like some uninitialized memory, with the address having lots of 'F's), but I could not see that any of the inputs were incorrect.

Not sure what’s going on there now with the given information.
When using OptiX-IR binary input to optixModuleCreate, make sure to hand it exactly the right data size.
Do not append any trailing zero data etc. especially when embedding module input data into the application data.
See https://forums.developer.nvidia.com/t/embedding-optix-ir/273199/8
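As a rough illustration of the size advice above, the following sketch loads a binary file so that the buffer size equals the file size on disk. The helper name readBinaryFile and the file name module.optixir are assumptions made for this example. The optixModuleCreate call appears only in a comment because it needs optix.h and an initialized context.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns exactly the bytes stored in the file,
// with no terminator or padding appended.
inline std::vector<char> readBinaryFile(const std::string& path) {
    // Open in binary mode, positioned at the end so tellg() yields the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    const std::streamsize size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    file.seekg(0, std::ios::beg);

    std::vector<char> data(static_cast<std::size_t>(size));
    if (size > 0 && !file.read(data.data(), size)) {
        throw std::runtime_error("cannot read " + path);
    }
    return data;
}

// Usage sketch (not compiled here, requires optix.h):
//   const std::vector<char> ir = readBinaryFile("module.optixir"); // hypothetical file name
//   optixModuleCreate(context, &moduleCompileOptions, &pipelineCompileOptions,
//                     ir.data(), ir.size(), log, &logSize, &module);
```

Passing ir.size() directly avoids the off-by-one that can creep in when code written for null-terminated PTX text is reused for binary input.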


Right now I do not have the time to check the size. The same loading code worked earlier in the project (i.e. when we still used PTX code), so I assume the size is still passed correctly.

But since we can avoid the crash and still have C++ Debug mode, we will wait for the driver fix and retest afterwards. Thank you for your help.
