Optix-IR seems to fail for me with vector types (Updated with reproduction)

bjorn24 · December 5, 2025, 10:10am

This function compiles fine with PTX but fails with Optix-IR. Specifically it fails on this line:

float inv_dim_x = 1.0f / (float)dim.x;

The only thing I changed is -optix-ir instead of -ptx.

I am using sm_86 with CUDA 13.1 and OptiX 9.0.

It fails when I call optixModuleCreate.

My raygen program that calls computeRay ends up with 0 instructions.

static __forceinline__ __device__ void
computeRay(uint3 idx, uint3 dim, float3 &origin, float3 &direction) {
   const float3 U = params.cam_u;
   const float3 V = params.cam_v;
   const float3 W = params.cam_w;

   // Use reciprocal multiplication instead of division
   float inv_dim_x = 1.0f / (float)dim.x;
   float inv_dim_y = 1.0f / (float)dim.y;

   float u = 2.0f * (float)idx.x * inv_dim_x - 1.0f;
   float v = 2.0f * (float)idx.y * inv_dim_y - 1.0f;

   origin = params.cam_eye;

   // Compute direction
   float3 dir;
   dir.x = u * U.x + v * V.x + W.x;
   dir.y = u * U.y + v * V.y + W.y;
   dir.z = u * U.z + v * V.z + W.z;

   // Normalize
   float len = sqrtf(dir.x * dir.x + dir.y * dir.y + dir.z * dir.z);
   float inv_len = 1.0f / len;
   direction = make_float3(dir.x * inv_len, dir.y * inv_len, dir.z * inv_len);
}

EDIT:

Here is a reproduction, please check the README for build instructions

optix_ir_bug_repro.zip (8.8 KB)

You can actually toggle the reproduction step with this minimal raygen

this will not error:

static __forceinline__ __device__ void
computeRay(uint3 idx, uint3 dim, float3 &origin, float3 &direction) {
   const float3 U = params.cam_u;
   const float3 V = params.cam_v;
   const float3 W = params.cam_w;

   // BUG TRIGGER: These division operations cause OptiX IR JIT to fail
   // Use reciprocal multiplication instead of division
   float inv_dim_x = 1.0f; // / (float)dim.x;

   direction = make_float3(inv_dim_x, 0.f, 0.f);
}

this will:

static __forceinline__ __device__ void
computeRay(uint3 idx, uint3 dim, float3 &origin, float3 &direction) {
   const float3 U = params.cam_u;
   const float3 V = params.cam_v;
   const float3 W = params.cam_w;

   // BUG TRIGGER: These division operations cause OptiX IR JIT to fail
   // Use reciprocal multiplication instead of division
   float inv_dim_x = 1.0f / (float)dim.x;

   direction = make_float3(inv_dim_x, 0.f, 0.f);
}

build steps

mkdir build && cd build
cmake .. -G "Visual Studio 17 2022" -DUSE_OPTIX_IR=ON
cmake --build . --config Release
Release\optix_ir_bug_repro.exe

^^ fails

cmake .. -G "Visual Studio 17 2022" -DUSE_OPTIX_IR=OFF
cmake --build . --config Release
Release\optix_ir_bug_repro.exe

^^ works

my system info:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.44                 Driver Version: 591.44         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060 …  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   62C    P0            752W /   60W |       0MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Windows version

systeminfo | findstr /B /C:"OS Name" /C:"OS Version"
OS Name:                       Microsoft Windows 11 Pro
OS Version:                    10.0.26200 N/A Build 26200

CUDA version

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Nov__7_19:25:04_Pacific_Standard_Time_2025
Cuda compilation tools, release 13.1, V13.1.80
Build cuda_13.1.r13.1/compiler.36836380_0
-- The CXX compiler identification is MSVC 19.44.35221.0

bjorn24 · December 6, 2025, 5:15am

I have edited the original post to include a way to reproduce the issue!

lspano · December 8, 2025, 9:00am

Hello @bjorn24, thank you for the reproducer! I can reproduce the issue with driver 591.44, but not with 581.80.

The error I see from your reproducer is Error: Pipeline link error (code 7251), preceded by Warning: Requested debug level "OPTIX_COMPILE_DEBUG_LEVEL_FULL", but input module does not include full debug information.

I have tested the OptiX 9.0 SDK and the OptiX_Apps too, no issue there. It probably depends on the CMakeLists.txt.

lspano · December 8, 2025, 3:10pm

This works for me:

nvcc -optix-ir -arch=sm_86 --use_fast_math -I"C:\ProgramData\NVIDIA Corporation\OptiX SDK 9.0.0\include" ..\triangle.cu

Without the fast math option I get the crash.

Note that the SDK 9.0 compiles the samples with:

nvcc -optix-ir -arch=sm_50 --use_fast_math -lineinfo -Wno-deprecated-gpu-targets --use-local-env -I%sdkinclude% file.cu

bjorn24 · December 9, 2025, 5:19am

Thank you for taking a look! Yes, using --use_fast_math worked for me!

Thank you, I am now using optix-ir.

Is it a requirement to run with --use_fast_math? With the clang compiler I have found -ffast-math to be problematic in the past. I generally avoid making those compiler optimization trade-offs, but I suppose this option in nvcc is a bit different?

lspano · December 9, 2025, 8:53am

It is a workaround for this particular driver version. If you don't need high precision (which is the case in many scenarios), fast math will speed up your code without noticeable numerical issues. For more details, see section 5.5, "Floating-Point Computation", of the CUDA Programming Guide, this forum, and, if you want to dig deep, "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
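
One way to judge whether reduced precision matters for the ray setup in this thread is to evaluate the same arithmetic on the host in float and in double and look at the largest difference. The sketch below does that. It does not emulate --use_fast_math; it only provides a reference that GPU output could be compared against. Vec3, Camera, rayDirection and maxDirectionError are hypothetical names introduced for illustration, and the camera basis is an assumption because the real values in params are not shown in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical host-side stand-in for CUDA's float3 (T = float) and a
// double-precision twin (T = double).
template <typename T>
struct Vec3 { T x, y, z; };

// Hypothetical camera basis; in the thread these come from params.cam_u/v/w.
struct Camera { Vec3<float> u, v, w; };

// Same arithmetic as computeRay in the thread, evaluated in precision T.
template <typename T>
Vec3<T> rayDirection(unsigned ix, unsigned iy, unsigned dimX, unsigned dimY,
                     const Camera& cam) {
    assert(dimX > 0 && dimY > 0);  // the reciprocals below need non-zero dims
    const T invDimX = T(1) / T(dimX);
    const T invDimY = T(1) / T(dimY);
    const T u = T(2) * T(ix) * invDimX - T(1);
    const T v = T(2) * T(iy) * invDimY - T(1);

    const T dx = u * T(cam.u.x) + v * T(cam.v.x) + T(cam.w.x);
    const T dy = u * T(cam.u.y) + v * T(cam.v.y) + T(cam.w.y);
    const T dz = u * T(cam.u.z) + v * T(cam.v.z) + T(cam.w.z);

    const T invLen = T(1) / std::sqrt(dx * dx + dy * dy + dz * dz);
    return Vec3<T>{dx * invLen, dy * invLen, dz * invLen};
}

// Largest per-component difference between the float and double results
// over every pixel of a dimX x dimY launch.
double maxDirectionError(unsigned dimX, unsigned dimY, const Camera& cam) {
    double worst = 0.0;
    for (unsigned iy = 0; iy < dimY; ++iy) {
        for (unsigned ix = 0; ix < dimX; ++ix) {
            const Vec3<float> f = rayDirection<float>(ix, iy, dimX, dimY, cam);
            const Vec3<double> d = rayDirection<double>(ix, iy, dimX, dimY, cam);
            worst = std::max({worst,
                              std::fabs(static_cast<double>(f.x) - d.x),
                              std::fabs(static_cast<double>(f.y) - d.y),
                              std::fabs(static_cast<double>(f.z) - d.z)});
        }
    }
    return worst;
}
```

Calling maxDirectionError with the launch dimensions and the actual camera vectors gives a baseline for plain single-precision error. Directions read back from a build with --use_fast_math can then be compared against rayDirection<double> in the same way to see how much extra error the option adds for a given scene.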

bjorn24 · December 9, 2025, 6:01pm

Yes this is fine for me. I am just learning still, thank you!
