Problem with running OptiX 6.5 program. "invalid value for --gpu-architecture"

Hi all,

I am new to OptiX. I am trying to write a simple program to test my understanding of OptiX.

However, I ran into problems compiling my OptiX program. I compile my main.cpp file with nvcc using the following command:

nvcc -m64 -gencode arch=compute_75,code=sm_75 -I${OPTIX_PATH}/include -I${OPTIX_PATH}/SDK -o main.exe main.cpp -L${OPTIX_PATH}/lib64 -loptix -loptixu -lsutil_sdk

There were no errors during compilation, but when I tried running the program, I got:

OptiX Error: 'NVRTC Compilation failed.
nvrtc: error: invalid value for --gpu-architecture (-arch)

Here is my system information:
System: Ubuntu 20.04
GPU: RTX 2060
NVIDIA Driver Version: 450.66
CUDA Version: 11.0
OptiX Version: 6.5

I have uploaded my program to GitHub: Adiolph/OptiX-test (A simple test project for learning OptiX)

Can someone help me figure out which part is wrong?

Thanks in advance!
Fan


EDIT:
I tried building the OptiX SDK programs with CMake, but I get the same error.
It seems like there is something wrong with my CUDA installation or NVIDIA driver, but I can’t figure out what.


EDIT2:
After updating my driver to version 455.23.04, I am able to run the OptiX 7.2 SDK samples. But I still get the error when running the OptiX 6.5 SDK samples.

First of all, if you’re starting new with OptiX and you can run the OptiX SDK 7.2.0 examples, there is little reason to look at earlier versions.
OptiX 6.5.0 and OptiX 7 have a completely different API and OptiX 7 is the future.
Please read this post: https://forums.developer.nvidia.com/t/optix-7-breaking-changes/156801

For more resources about OptiX 7, follow the links in the sticky posts of this sub-forum.
https://forums.developer.nvidia.com/t/optix-7-2-release/156619

About your issues:

1.) OptiX SDK 6.5.0 does not officially support CUDA 11 versions.
Please always refer to the OptiX Release Notes before setting up a development environment. (The link is directly beneath each OptiX version’s download button.)
I would recommend using CUDA 10.1 for OptiX 6.5.0, because that’s the version with which it was built.

2.) Seems only the Linux makefile does anything.

"I compile my main.cpp file with nvcc by following command:"

This is not necessary.
In an OptiX program you would normally only compile the CUDA files which contain the device code with the OptiX programs from *.cu to *.ptx source.
The OptiX SDK examples use a custom build rule for each *.cu file to accomplish that.
For a different version of such custom build rule generation with CMake look into my OptiX 7 applications:
https://github.com/NVIDIA/OptiX_Apps/blob/master/3rdparty/CMake/nvcuda_compile_module.cmake
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo10/CMakeLists.txt#L195

These PTX source files get loaded via the rtProgramCreateFromPTXString() or rtProgramCreateFromPTXFile() functions in OptiX versions 1 to 6, or with optixModuleCreateFromPTX() in OptiX 7 versions.

3.) There is no need to compile to the latest/current GPU’s streaming multi-processor version.
There have actually been cases where the OptiX PTX parser did not yet support all of the latest PTX instructions.
If you want to support all GPU architectures with only a single version of the PTX files, you should compile against the lowest supported streaming multiprocessor version which would be SM 5.0 (Maxwell) for both OptiX 6 and 7 versions.
See these posts and the links in them: https://forums.developer.nvidia.com/t/support-multiple-compute-capabilities/126076/2

Note that CUDA 11 removed support for SM 3.0 and 3.2 and deprecated support for SM 3.5, 3.7 (Kepler), and 5.0 (Maxwell).
That’s why there is a -Wno-deprecated-gpu-targets in my NVCC_OPTIONS.
The OptiX SDK 7 examples build against SM 6.0 (Pascal) by default to avoid the deprecation warnings.
https://forums.developer.nvidia.com/t/optix-7-1-issue-with-running-samples-on-a-maxwell-card/140118

4.) CUDA 11 changed some defines for the 64-bit detection in NVRTC. It requires a check for #if defined(__x86_64) instead of #if defined(_WIN64). The shipping OptiX SDK 6.5.0 does not do that. As said, it does not officially support CUDA 11 versions.

Inside the OptiX SDK CMake build, you can disable the use of NVRTC by disabling the CMake variable CUDA_NVRTC_ENABLED. The *.ptx files are then compiled during build-time with the custom rules using NVCC.

Thanks for your detailed reply! I have one more small question about compilation.

"This is not necessary.
In an OptiX program you would normally only compile the CUDA files which contain the device code with the OptiX programs from *.cu to *.ptx source."

Do you mean that I only need to use nvcc to compile *.cu files to *.ptx files? It looks like the NVCUDA_COMPILE_PTX function in your CMake files only compiles *.cu files. I am not familiar with CMake and have some difficulty reading the CMake file.

Then what should I do with the *.cpp files? Just use normal gcc, or use nvcc?
I guess it should be nvcc. Are the pre-compiled *.ptx files then sources that should be compiled (or linked) along with the *.cpp files? Or do I only need to compile the *.cpp files, and the executable will automatically find the *.ptx files at runtime?

It seems like you have added the ${PTX_SOURCES} files to the add_executable CMake function, so the *.ptx files should be compiled along with the *.cpp files? I am not sure…

Thanks,
Fan

1.) The *.cu files with the OptiX device programs only need to be translated from CUDA to PTX source.
That can either be done with NVCC at compile time or with NVRTC at runtime.

(Note that your application could also contain native CUDA kernels (apart from OptiX device programs) which you can handle as you need.
That is actually not uncommon in OptiX 7 applications, since these use CUDA runtime or driver host API calls to manage most resources used in OptiX 7.)

The resulting *.ptx source files are neither linked, nor automatically found.
Again, it’s the developer’s responsibility to provide that PTX input source to OptiX at application runtime.
In OptiX 7 that’s done with optixModuleCreateFromPTX().
(In OptiX 1 to 6 that is done with rtProgramCreateFromPTXString() or rtProgramCreateFromPTXFile().)

Please read the OptiX Programming Guide(s): https://raytracing-docs.nvidia.com/

2.) All host code only needs to be translated by the host compiler, like for example GCC under Linux or Visual Studio under Windows.
The CUDA toolkit sets the main requirements for this. Check the CUDA_Installation_Guide_Linux.pdf or CUDA_Installation_Guide_Windows.pdf file in your CUDA toolkit installation’s documentation folder. Each contains a table with the compatible host compiler versions at the beginning.

3.) In my OptiX 7 examples that works like this:
The NVCUDA_COMPILE_PTX script (EDIT: Now replaced with the nvcuda_compile_module.cmake inside 3rdparty/CMake) called here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/CMakeLists.txt#L122
generates one custom build rule per *.cu file listed in SHADERS and uses the list of files in SHADER_HEADERS as dependencies, so that when I change a file in either list, the project recompiles the affected *.cu files.

If you want to see the individual NVCC command lines which are used per *.cu file, you would only need to add this message command into line 42 of my nvcuda_compile_ptx.cmake script.
message("${CUDA_NVCC_EXECUTABLE} " "--machine=64;" "--ptx;" "${NVCUDA_COMPILE_PTX_NVCC_OPTIONS} " "${input} " "-o " "${output}")
These get printed when configuring the CMake project.
This means you could even put these individual commands in a script and translate your CUDA files to PTX outside the project.
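
As a rough illustration of what such a per-file command looks like, here is a minimal sketch that assembles one in C++. The helper name buildNvccPtxCommand and its parameters are hypothetical and not part of my CMake scripts; only the --machine=64, --ptx and -o options are taken from the message() line above, and everything else is passed in by the caller.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the CMake scripts discussed above):
// assembles one NVCC command line which translates a single *.cu file to *.ptx.
std::string buildNvccPtxCommand(const std::string& nvcc,                  // path to the nvcc executable
                                const std::vector<std::string>& options,  // e.g. architecture and include options
                                const std::string& input,                 // the *.cu source file
                                const std::string& output)                // the *.ptx file to write
{
  // --machine=64 and --ptx are the fixed options printed by the message() command.
  std::string command = nvcc + " --machine=64 --ptx";

  // Append the caller-supplied options in order, separated by single spaces.
  for (const std::string& option : options)
  {
    command += " " + option;
  }

  command += " " + input + " -o " + output;
  return command;
}
```

Calling it once per *.cu file and writing the returned strings into a shell script would give you the same translation step outside of CMake. Note that it does no quoting, so paths containing spaces would need extra care.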

The resulting *.ptx filenames land in PTX_SOURCES and that is added to the add_executable() list to make sure the custom build rules are called on these when building the executable. They aren’t linked or added to the executable binary in any way.

I also put them into the files of the solution (I’m working under Windows) in source_group( "ptx" FILES ${PTX_SOURCES}) to be able to open them quickly in case I want to see the effect of a change in the *.cu files in source_group( "shaders" FILES ${SHADERS} )

The OptiX SDK 7 examples do that similarly but it’s harder to see. Follow the OPTIX_add_sample_executable definition and look for the CUDA_WRAP_SRCS function in FindCUDA.cmake.

Dear NVIDIA Developer,

I have tried using your NVCUDA_COMPILE_PTX macro in my CMake file. Everything looks good, but I get a Parse error when calling rtProgramCreateFromPTXString.
I don’t know what this error means. Does it mean I have a syntax error in my *.cu files? Or am I using the wrong NVCC options when compiling?

I have uploaded my code here: GitHub - Adiolph/OptiX-test2

The NVCUDA_COMPILE_PTX macro expands to:

cd /home/lab110/optix-test && /usr/local/cuda/bin/nvcc --machine=64 --ptx -gencode=arch=compute_30,code=sm_30 --use_fast_math --relocatable-device-code=true --generate-line-info -Wno-deprecated-gpu-targets -I/usr/local/optix/include -I/home/lab110/optix-test/cuda -I/usr/local/cuda/include cuda/sphere.cu -o /home/lab110/optix-test/build/ptx/sphere.ptx

Here is my system information:
System: Ubuntu 20.04
GPU: RTX 2060
NVIDIA Driver Version: 455.23.04
CUDA Version: 10.1
OptiX Version: 6.5

Thanks in advance!
Fan

Your error is that your read_file_ptx() routine is actually not reading the PTX file contents!
It only generates the path to the *.ptx file and that is the incorrect input to rtProgramCreateFromPTXString().
That should have been used with rtProgramCreateFromPTXFile().

Here’s a routine which actually reads the PTX (or OptiX IR) file contents into a std::string and, unlike your code, doesn’t leak the strdup() memory: readData
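
For illustration only, a minimal sketch of the same idea could look like the following. This is not the readData implementation linked above. The names readPtxFile and looksLikePtxSource are hypothetical, and the second function is just a cheap sanity check based on the fact that PTX modules contain .version and .target directives.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads the complete file contents into a std::string.
// The string owns its memory, so nothing leaks as it would with strdup().
std::string readPtxFile(const std::string& filename)
{
  std::ifstream file(filename, std::ios::binary);
  if (!file)
  {
    throw std::runtime_error("Cannot open PTX file: " + filename);
  }

  std::ostringstream contents;
  contents << file.rdbuf(); // Copy the whole stream buffer in one go.
  return contents.str();
}

// Hypothetical sanity check: real PTX source contains .version and .target
// directives, while a file path passed by mistake normally contains neither.
bool looksLikePtxSource(const std::string& text)
{
  return text.find(".version") != std::string::npos &&
         text.find(".target")  != std::string::npos;
}
```

The c_str() of the returned string is then what goes into the PTX argument of rtProgramCreateFromPTXString(). Running the check before that call would have caught the path-instead-of-contents mistake right away.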

Sidenotes:
If you have issues with input PTX source, you should have provided that failing PTX source. It could have been corrupted.

You changed the original NVCC command line argument from --gpu-architecture=compute_30 (resp. compute_50 because Kepler GPUs aren’t supported anymore since OptiX 6.0.0) to "-gencode=arch=compute_${COMPUTE_CAPABILITY},code=sm_${COMPUTE_CAPABILITY}"
Why? My examples build under Windows and Linux as they are.

In your CMake output, check why the OptiX_INCLUDE_DIR prints an empty string.
Which FindOptiX.cmake are you using?
That is, where is that -I/usr/local/optix/include inside the NVCC options coming from?
You’re using OptiX_INCLUDE and not OptiX_INCLUDE_DIR so there is some potential mixup of FindOptiX.cmake scripts.
Is that /usr/local/optix/include containing the expected OptiX 6.5.0 headers?

You have a duplicated debug output Create ptx source: ... before the parse error. One from within read_ptx_file(), one in main.cc.

Thank you so much for the rapid reply!
I can finally run my simple test file now! :D

I am not familiar with C-style string processing. Thanks for pointing out the memory leak, I will read your example code.

When I learned CUDA before, I used this "-gencode=arch=compute_${COMPUTE_CAPABILITY},code=sm_${COMPUTE_CAPABILITY}" argument every time I compiled .cu files. When I got the Parse error, I thought this argument might fix it, so I changed to it.
Do you mean that --gpu-architecture=compute_30 serves the same purpose and works on both Linux and Windows, so I should always use it?

OptiX_INCLUDE_DIR is a mistake, I will change it to OptiX_INCLUDE.

The Create ptx source: ... in main.cc was used to test whether the program actually reads the file name (in case of some error in my read_file_ptx() function). I don’t need it now. :D

Cheers,
Fan