NVRTC: Missing stdint.h

Hello,
I have been trying for days now to enable essential RT capabilities on an
M1000M GPU. After some searching I found a way using the NVIDIA driver 451.77 for Win10, CUDA 11.0.2, OptiX 7.1, Visual Studio 2019, and restricting myself to sm_50 compute capabilities.

I also managed to get most of the examples running (all but Raycasting; I assume >sm_50 is required).

However, I have obtained a more complex project that runs perfectly fine on other machines (Titan RTX on Win10, 1080 Ti on Win10), but always throws a runtime compilation error on my machine.

"
Caught exception: NVRTC Compilation failed.
SomePath/foo.h(30): catastrophic error: cannot open source file ‘stdint.h’

1 catastrophic error detected in the compilation of SomePath/geometry.cu
"

geometry.cu includes foo.h, which on line 30 contains a regular include of stdint.h:
"#pragma once
#include <stdint.h>
"

From what I gathered, it seems that the include paths for NVRTC are not set up correctly.

I tried reinstalling everything multiple times on a clean system, tried adjusting the CUDA compilation settings in the CUDA Visual Studio integration, and tried to find any information on how to set the include paths for the NVRTC compiler.

Does anyone know how to resolve this problem?
Please let me know if you need any more information.

The NVIDIA CUDA Runtime Compiler (NVRTC) compiles CUDA device code to PTX at runtime. The idea of using it with OptiX is to be able to compile runtime-generated CUDA shader programs (e.g. material shader networks) on end-user machines without a host compiler installed.

This means that including any host compiler header, such as stdint.h, in OptiX device code should be strictly avoided.

The recommended solution would be to remove the host compiler's stdint.h include from the device code and see which variable types the compiler complains about. Then either replace these with their native C++ types or, if that would be too much work, lift the necessary typedefs from the stdint.h header locally.

If you’re not actually generating CUDA device programs at runtime, you could either use the NVIDIA CUDA Compiler (NVCC) to translate the CUDA files to PTX at build time of your project, or add the necessary host compiler include directories to the NVRTC compile options.
https://forums.developer.nvidia.com/t/a-supported-nvidia-gpu-could-not-be-found/144180/8

I’ve generated material shaders as CUDA source code at runtime like this and translated them with NVRTC in the past using OptiX 5 and 6.
https://forums.developer.nvidia.com/t/optix-compile-error-in-ptxstring/70636/2

These NVRTC compile options are given to the nvrtcCompileProgram(prog, numOptions, options) call which should be somewhere in your executable’s source code.
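The include directories are passed as plain `-I` option strings. Here is a minimal sketch of assembling such an option list; `buildNvrtcOptions()` is a hypothetical helper (not part of NVRTC), and the include paths used below are placeholders for your actual CUDA/OptiX install locations.

```cpp
#include <string>
#include <vector>

// Assemble the option list that would be handed to nvrtcCompileProgram().
std::vector<std::string> buildNvrtcOptions(const std::string& cudaIncludeDir,
                                           const std::string& optixIncludeDir)
{
    std::vector<std::string> options;
    options.push_back("-I" + cudaIncludeDir);            // CUDA headers
    options.push_back("-I" + optixIncludeDir);           // OptiX headers
    options.push_back("--gpu-architecture=compute_50");  // PTX target architecture
    options.push_back("--use_fast_math");
    return options;
}

// nvrtcCompileProgram() wants an array of const char*, e.g.:
//   std::vector<const char*> opts;
//   for (const std::string& o : options) opts.push_back(o.c_str());
//   nvrtcCompileProgram(prog, static_cast<int>(opts.size()), opts.data());
```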

My guess for your specific case is that the program actually requires the host compiler on the end-user target machine and hard-codes a specific host compiler include path, which exists on the machines that work but not on yours.
This is generally the wrong thing to do, because end-user machines under Windows can’t be expected to have a host compiler installed at all. Instead, you need to remove the requirement for a host compiler completely when using NVRTC to translate runtime-generated shader code.

Also note that CUDA 11 has already deprecated the SM 5.0 streaming multiprocessor version. I would recommend CUDA 10.x when still targeting SM 5.0 to avoid the deprecation warnings.

Thank you for this swift and detailed response.

The compilation settings of the project I was working on were based on optixWhitted. I ran a grep over the entire source code of the OptiX SDK and didn’t find a single reference to an nvrtc call. Thus, I disabled it in sampleConfig.h and later as described in the first link you provided.

The problem of missing libraries is gone now, so thank you again. But I still don’t feel I really understand why NVRTC didn’t work, and I also don’t know where/how to check the host compiler configuration.

Also, I am now faced with a different problem: sutil::samplePTXFilePath couldn’t locate geometry.cu for said example project.

However, I:

  1. I ensured that the folder of the project, a subdirectory of Optix…/SDK/, is properly added to Optix/SDK/CMakeLists.txt.
  2. geometry.cu is placed in the project’s root folder as in optixWhitted, and the call to sutil::getPtxString is identical to optixWhitted’s.
  3. The project’s CMakeLists.txt has the same structure as optixWhitted’s.

If both projects were generated starting from the same Optix…/SDK/CMakeLists.txt, how come sutil::getPtxString finds geometry.cu in one case and not in the other? While debugging, I found that g_ptxSourceCache is actually empty, which leads me to believe that NVCC is not being called properly during compilation or that no PTX files are generated. How could I verify/modify that?

I realized that I didn’t know enough about the NVRTC vs. NVCC capabilities and the entire dependency chain of OptiX. Is there any documentation on the OptiX 7.1 compilation process, basically a “who calls whom” with their respective capabilities?

For the time being I use CUDA 11 only because I want to make sure that the high-end machines are feature complete.

With best regards

I ran a grep over the entire source code of the OptiX SDK and didn’t find a single reference to an nvrtc call.

You’re doing it wrong. ;-)
Searching over all *.h and *.cpp files inside the OptiX SDK 7.1.0 turns up 40 hits for nvrtc.
The NVRTC compilation happens inside the OptiX SDK 7.1.0\SDK\sutil\sutil.cpp file in function getPtxFromCuString().

Also, I am now faced with a different problem, that sutil::samplePTXFilePath couldn’t locate geometry.cu for said example project.

From your description of adding your own example to the OptiX SDK examples, that should have worked.

Did you run it from within the debugger? If yes, is the debugger’s working directory the same as the module directory?
If not, does it work when starting the application without the debugger from the location it was built to?

I’m asking because the OptiX SDK examples compile hard-coded paths to the PTX and data folders into the executable and won’t work when run from another location unless two environment variables are set. You can find them inside the code. Look in sutil.cpp again for getSampleDir() and samplePTXFilePath().
https://forums.developer.nvidia.com/t/compile-optix-7-0-as-exe/124791/12

I’m not using any of that in my own OptiX 7 application framework. (Find github links in the sticky posts.)
See this post about possible solutions. (That post also linked to the NVRTC-related link above, two posts earlier in that thread.)
https://forums.developer.nvidia.com/t/sdk-samples-sutil-getptxstring-file-path/70963/2
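A sketch of the relative-path alternative discussed in those posts: resolve the PTX filename against a base directory (for example, the directory containing the executable) instead of a path hard-coded at build time. `loadPtx()` is a hypothetical helper, not part of the sutil API, and assumes C++17 for std::filesystem.

```cpp
#include <filesystem>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Load a PTX file located relative to a base directory, instead of
// relying on an absolute path compiled into the executable.
std::string loadPtx(const std::filesystem::path& baseDir,
                    const std::string& ptxName)
{
    const std::filesystem::path ptxPath = baseDir / ptxName;
    std::ifstream file(ptxPath, std::ios::binary);
    if (!file)
        throw std::runtime_error("Cannot open PTX file: " + ptxPath.string());
    std::ostringstream source;
    source << file.rdbuf();  // slurp the whole PTX source into a string
    return source.str();
}
```

At startup you would derive baseDir from the executable location (e.g. from argv[0] or a platform API) and pass the returned string to optixModuleCreateFromPTX.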

I realized that I didn’t know enough about the NVRTC vs. NVCC capabilities and the entire dependency chain from Optix. Is there any documentation on the Optix 7.1 compilation process basically a “who calls who” with their respective capabilities?

The NVRTC compilation is isolated to the getPtxFromCuString() function. That explains it all.
The CUDA to PTX compilation using NVCC inside the example projects is done with custom build rules per *.cu file generated by CMake. Please see this thread for more information on what the actual NVCC command line arguments are and how to print them in CMake.
https://forums.developer.nvidia.com/t/cmake-dont-compile-cuda-kernels/140937
https://forums.developer.nvidia.com/t/optix-7-and-msvs-2017-hello-world-on-windows-10/141711
https://forums.developer.nvidia.com/t/how-would-you-generally-compile-and-run-a-program-which-has-optix-without-make-or-cmake/75302/2

As for the problem of missing PTX files, I noticed that I had made a change in the CMakeLists.txt of the SDK examples from C++11 to C++17 by editing “set(CMAKE_CXX_STANDARD 11)”, which broke part of the path in another place. If I instead run CMake with C++11 and manually set C++17 for my project in VS, everything finally compiles when NVRTC is disabled.

Reading the “SDK Samples: sutil::getPtxString file path” thread was helpful in this context, as I then knew where to look for PTX files.

I think, after all, once I find appropriate literature or proper examples, I will move away from the OptiX examples’ build chain entirely and do as you suggested, placing the PTX files in a path relative to the binary.

So all compilation problems seem to be resolved. Instead, the entire program just crashes at some point now, after opening a GLFW window in “view” mode. So after all it seems my HW is incompatible with the RT code.

Best
T.

So all compilation problems seem to be resolved. Instead, the entire program just crashes at some point now, after opening a GLFW window in “view” mode. So after all it seems my HW is incompatible with the RT code.

Wait, but the other OptiX SDK 7.1.0 examples work?
Because I have a Quadro M1000M in my laptop and could run OptiX 7.0.0 programs. Though since that is a rather slow board for GPU ray tracing, my programs try to keep the interactivity well beyond the infamous Windows Timeout Detection and Recovery (TDR) limit of 2 seconds per kernel invocation.

(I haven’t updated to R450 drivers and OptiX 7.1.0 on that yet, but wouldn’t expect a different result. I develop and test on two beefy remote desktop systems with Quadro Pascal and dual Quadro RTX NVLINK configurations.)

Not all examples work. All but optixRaycasting do. I haven’t figured out yet why this specific one does not work.

As for my project, the crashes seemed quite severe. That’s why I thought it might be a problem with the feature set. I haven’t checked out the differences in compute capabilities of sm_50-sm_75 yet.

But after some debugging I finally found out that in view mode it crashed at a “CUDA_SYNC_CHECK()” instruction because it ran out of memory and had an out-of-bounds memory access. Changing the size of the viewing window from 768x768 to 512x512 and removing some big meshes from the scene resolved the issue. Neither measure alone sufficed.

I don’t fully understand the memory management yet, as Windows and VS report only ~256MB of 2GB to be in use.

We needed OptiX 7.1, and I saw that R450 is then a requirement. R450 existed only for Ubuntu and Win10, and converting the examples’ CMake files to Ubuntu was painful. So for now I went with this debugging platform of Win10 + R450. If anyone gets the 7.1 examples working on Ubuntu, I would be interested to hear about it.

The optixRaycasting example is using some native CUDA kernels to generate the primary rays and do the shading.
The OptiX SDK 7.1.0 examples compile for SM 6.0 (Pascal) by default to avoid having deprecation warning for SM 5.0 (Maxwell) when using CUDA 11.
Maybe not all places have been set to SM 5.0 in your CMake scripts to work on Maxwell.
https://forums.developer.nvidia.com/t/a-supported-nvidia-gpu-could-not-be-found/144180/6
https://forums.developer.nvidia.com/t/optix-7-1-issue-with-running-samples-on-a-maxwell-card/140118/2

I haven’t checked out the differences in compute capabilities of sm_50-sm_75 yet.

That is mostly irrelevant. You can simply use SM 5.0 as target for the PTX compilation to be able to run your OptiX application on all supported GPU architectures.
https://forums.developer.nvidia.com/t/support-multiple-compute-capabilities/126076/2

We needed OptiX 7.1, and I saw that R450 is then a requirement. R450 existed only for Ubuntu and Win10, and converting the examples’ CMake files to Ubuntu was painful. So for now I went with this debugging platform of Win10 + R450. If anyone gets the 7.1 examples working on Ubuntu, I would be interested to hear about it.

Which Ubuntu, CMake, GCC and CUDA version have you been using?

The OptiX SDK examples should work out-of-the-box without changing the CMakeLists.txt files on any supported OS.
If that is not the case, you would need to add some more information about what exactly you changed and why.