Upgrade from Optix 7.2 to 7.3 or 7.4 modules trouble

Hi folks. I’m on CentOS 8 with the 495.44 driver. I’m attempting to upgrade from 7.2 to 7.4, and have also tried 7.3 along the way. I’m getting the following error

Check failed: resultCode == OPTIX_SUCCESS (7204 vs. 0) Invalid use of optix device function

when calling optixModuleCreateFromPTX. For the 7.3 upgrade I do not have to change any code for my renderer to compile, and for 7.4, I only had to change OptixModuleCompileOptions::debugLevel.

Looking at the release notes I saw that some fields were added to OptixModuleCompileOptions and OptixPipelineCompileOptions, and it was recommended to zero-init the structures, which I was already doing.

Can anyone provide a hint for what might be going wrong between Optix 7.2 and Optix 7.3 that might result in the error mentioned above?

Interesting. I haven’t seen that error before.

That error can happen for multiple reasons which are too many to list.

Isn’t there any additional compilation error information?
Have you set an OptixDeviceContextOptions log callback and set its level to the maximum of 4?
Example code: https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Application.cpp#L762
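
For reference, here is a minimal sketch of such a callback. It has the same parameter list as the OptixLogCallback typedef and writes every message straight to a stream, so nothing is filtered by an application logger's own threshold. The names formatOptixLog and optixLogToStream are made up for this sketch, and passing a std::ostream through cbdata is just one possible convention.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: formats one log record as "[level][tag] message".
std::string formatOptixLog(unsigned int level, const char* tag, const char* message)
{
    assert(tag != nullptr && message != nullptr); // assumption made by this sketch
    std::ostringstream line;
    line << "[" << level << "][" << tag << "] " << message;
    return line.str();
}

// Same parameter list as the OptixLogCallback typedef in the OptiX headers.
// cbdata is assumed to hold a std::ostream* (not a pointer to a derived stream
// type); std::cerr is used when it is null.
void optixLogToStream(unsigned int level, const char* tag, const char* message, void* cbdata)
{
    std::ostream& out = cbdata ? *static_cast<std::ostream*>(cbdata) : std::cerr;
    out << formatOptixLog(level, tag, message) << std::endl; // flush right away
}

// Installing it needs the OptiX headers, so it is shown as a comment only:
//   OptixDeviceContextOptions options = {};
//   options.logCallbackFunction = &optixLogToStream;
//   options.logCallbackData     = nullptr; // falls back to std::cerr above
//   options.logCallbackLevel    = 4;       // most verbose level
```

If the callback forwards to another logging library instead, that library's own severity threshold also has to let the messages through, otherwise level 4 on the OptiX side will not help.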

Bummer. I’m already setting this to level 4.

After recompiling the shaders, I get warnings like this

.../third-party/glm/glm/glm/detail/type_vec2.hpp(94): warning: __device__ annotation is ignored on a function("vec") that is explicitly defaulted on its first declaration

.../third-party/glm/glm/glm/detail/type_vec2.hpp(94): warning: __host__ annotation is ignored on a function("vec") that is explicitly defaulted on its first declaration

But otherwise there are no shader compilation errors and no C++ compilation errors or warnings.

I’m curious what approach folks would use to identify the problem here. There is really not a lot of information, and I would have expected something fairly obvious since my code worked (and still works) with 7.2.

Hi Brian,
since those warnings are all we can go on, I’d follow that track. Are all the errors you’re getting warnings about vec constructors from the glm library? Have you checked if those also happen with 7.3 or 7.2?
Given that this is a (device code/CUDA) compile issue in the GLM library, are you on the latest known good version of that lib?
I would probably try to comment out the parts of your device programs that use those constructors, or more aggressively that use GLM and try to chase down the offending code.

Thanks, I may look into that, though I see the exact same warnings when I compile for use with OptiX 7.2 (cuda is exactly the same). My colleague successfully got version 7.3 working with our code but against 470 drivers, so I may down-adjust my drivers and just use 7.3 for now if I can replicate his success.

Would you be able to provide the failing PTX source code?

That would allow us to see if we can reproduce this in-house. If confidentiality is required, you could attach it to a personal message or send it to the OptiX-Help e-mail address listed on the OptiX download page. (Note that our e-mail servers will block some attachment extensions like *.zip and all kinds of executables, which would need to be renamed.)

The question is which OptiX device function optixModuleCreateFromPTX is complaining about.
When I said this can happen for many cases, the first thing would be to check if any OptiX device function is used inside a program domain where it’s not supported.
This table lists that: https://raytracing-docs.nvidia.com/optix7/guide/index.html#device_side_functions#device-side-functions
Such cases should have triggered that error in all versions though and even if not, that couldn’t have worked.

But there are also some more obscure cases like calling optixThrowException without having OPTIX_EXCEPTION_FLAG_USER enabled.

Thanks. Email sent with attached PTX files

Thanks, they arrived.

I’ve used one of my OptiX 7 examples compiled against OptiX 7.4.0 and current Windows 10 R495 display drivers 496.49 and called optixModuleCreateFromPTX with the two PTX files.

They are incompatible with OptiX 7.3.0 (ABI 47) and 7.4.0 (ABI 55) because they have been compiled with OptiX 7.2.0 (ABI 41) headers.

With an OptixDeviceContextOptions log callback set and its level set to 4, I get the respective errors inside the log output.
It complains about optixTrace with one and two payload arguments and optixSetPayload_0 and optixSetPayload_1.
The error message is always
PTX: (function_name) is not supported with an ABI version higher than 45 (current: 55).
Please recompile your PTX with a newer version of OptiX.(function_name).

This means that when porting your application to OptiX 7.3.0 or 7.4.0, you must recompile all PTX input code with the headers of the matching OptiX SDK version, so that the device function signatures match the OPTIX_ABI_VERSION for which the OptiX entry point function table is initialized.
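
One way to catch such a mismatch before it reaches optixModuleCreateFromPTX is to have your own build stamp each PTX file with the ABI version of the headers it was compiled against, and compare that stamp on load. The sketch below assumes a hypothetical first-line comment of the form "// OPTIX_ABI_VERSION <n>" written by your build scripts. OptiX itself neither writes nor reads such a stamp, and readAbiStamp and ptxMatchesHostAbi are illustrative names only.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the ABI number from a hypothetical stamp on the first line of the PTX,
// e.g. "// OPTIX_ABI_VERSION 41", or -1 if the stamp is missing or malformed.
int readAbiStamp(const std::string& ptx)
{
    const std::string key = "// OPTIX_ABI_VERSION ";
    const std::string firstLine = ptx.substr(0, ptx.find('\n'));
    if (firstLine.compare(0, key.size(), key) != 0)
        return -1;

    std::istringstream rest(firstLine.substr(key.size()));
    int abi = -1;
    if (!(rest >> abi))
        return -1;
    return abi;
}

// hostAbi would be the OPTIX_ABI_VERSION the host code was compiled with
// (41 for 7.2.0, 47 for 7.3.0 and 55 for 7.4.0, per the numbers in this thread).
bool ptxMatchesHostAbi(const std::string& ptx, int hostAbi)
{
    assert(hostAbi > 0);
    return readAbiStamp(ptx) == hostAbi;
}
```

With that in place, a loader could refuse stale PTX with a clear message, for example when ptxMatchesHostAbi(ptxSource, 55) returns false for a file that was stamped with 41.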

Oh man, now I feel dumb. I had my logs going to glog and didn’t have the error level set high enough, so it wasn’t displaying that to the screen. Thanks!

One strange thing I just encountered while trying to revert my PTX was the following:

  1. old PTX, failed
  2. new PTX, worked
  3. old PTX, worked

I think there may be something fishy going on with PTX caching. If I modify the cpp so that it rebuilds, the old PTX will fail again.

I rebuilt the PTX several times during my investigation, but not in a very directed way, and I must have never done it against the correct OptiX headers :(

Thanks again for the help Detlef.

Not sure what is going on there. The caching of OptiX and CUDA programs shouldn’t match for PTX from different OptiX versions, esp. not if the assembly of the mismatching functions has changed. That sounds more like some issue with the build environment not updating to the correct include folder when switching OptiX SDK versions.

I’m handling that with CMake and custom build rules for the *.cu files in my examples and haven’t seen that happening when changing the OptiX versions. That’s effectively a “rebuild solution” step anyway.
That is handled inside the CMakeLists.txt and *.cmake files of these examples: https://forums.developer.nvidia.com/t/optix-advanced-samples-on-github/48410/4

To check whether programs are being picked up from the cache of previous runs (they shouldn’t be), delete the cached data, run with the new and then the old PTX code, and keep the OptixDeviceContextOptions log callback running at the maximum level 4.
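
To take the disk cache out of the picture entirely while testing, the OPTIX_CACHE_MAXSIZE environment variable can be set to 0, which as far as I know disables the cache without touching the existing cache files. A minimal sketch for Linux, assuming it runs before the OptiX device context is created (the helper name is made up):

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: disables the OptiX disk cache for this process only.
// Assumed to be called before optixDeviceContextCreate. Uses POSIX setenv.
bool disableOptixDiskCacheForThisProcess()
{
    return setenv("OPTIX_CACHE_MAXSIZE", "0", /*overwrite=*/1) == 0;
}

// Alternative on an existing context (needs the OptiX headers, comment only):
//   optixDeviceContextSetCacheEnabled(context, 0);
```

If the old PTX still loads with the cache disabled, the cache is not the cause and the build environment is the more likely suspect.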
