Generating ptx code using OptiX 7.4 and CMake

Good afternoon,

I am trying to generate PTX code from CUDA code for a ray generation program, and all goes well until I call optixTrace in the code. When I do, I get the following ptxas error:
ptxas ptxcode.ptx, line 30; error : Call to '_optix_get_launch_index_x' requires call prototype

My CMakeLists.txt targets arch=sm_60 with the C++17 standard and includes the embbed_optix.cmake file from the tools/ directory. It follows:

cmake_minimum_required(VERSION 3.15 FATAL_ERROR)
project(RTX_PROJECT VERSION 0.1 LANGUAGES CXX CUDA)

set(CMAKE_CUDA_ARCHITECTURES 60)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED True)

# OptiX 7.4 headers
set(OPTIX_DIR /path_to_optix_740)
set(OPTIX_INCLUDE ${OPTIX_DIR}/include)

# RF Scene generator headers
set(RTX_INCLUDE ${PROJECT_SOURCE_DIR}/src
                ${PROJECT_SOURCE_DIR}/src/shaders)

set(CMAKE_CUDA_SEPARABLE_COMPILATION ON)

# Create PTX from shaders
include(tools/embbed_optix.cmake)
embbed_optix_shader(ptxcode src/shaders/ptxcode.cu)

# Files for building executable including the PTX
# CUDA code
set(SOURCES src/driver.cpp 
            src/shaders/params.hpp
            ${ptxcode})

add_executable(xRTX ${SOURCES})

# Creates library rtx
add_subdirectory(src/ray_trace)

target_link_directories(xRTX PRIVATE ${OPTIX_DIR})
target_link_libraries(xRTX rtx)
target_include_directories(xRTX PUBLIC ${RTX_INCLUDE}
                                       ${OPTIX_INCLUDE})

The following is the cmake macro embbed_optix_shader in the tools/ directory that generates the PTX code during compilation:

set(BIN2C /usr/local/cuda-11/bin/bin2c)
set(OPTIX_INCLUDE /path_to_optix740/include)

macro(embbed_optix_shader output_var shader_file)
    add_library(${output_var}_lib OBJECT ${shader_file})
    set_target_properties(${output_var}_lib PROPERTIES CUDA_PTX_COMPILATION ON)
    target_compile_options(${output_var}_lib PRIVATE --generate-line-info -use_fast_math -arch=sm_60 --keep)
    target_include_directories(${output_var}_lib PUBLIC ${OPTIX_INCLUDE})

    add_custom_command(
        COMMAND ${BIN2C} --padd 0 --type char --name ${output_var} ${output_var}.ptx > ${output_var}.cpp
        OUTPUT ${output_var}.cpp
        DEPENDS ${output_var}_lib
    )

    set(${output_var} ${output_var}.cpp)
endmacro()

I am hoping it is something simple in the above code that is causing the issue - a flag maybe? A line I have neglected to include perhaps?

The CUDA code being compiled via the above cmake macro follows:

#include <cuda_runtime.h>
#include <optix.h>
#include <optix_device.h>
#include <optix_stubs.h>
#include "params.hpp"            // launch parameters
#include "sbt.hpp"               // base Shader Binding Table
#include "sbt_raygeneration.hpp" // extended SBT for raygeneration
#include "vec_math.hpp"          // vector operations

extern "C" {
    /**
     * @brief Defines OptiX launch parameters that EVERY program
     * has access to
     */
    __constant__ Params params;

    /** 
     * @brief CUDA raygeneration program for OptiX, generates rays
     */
    __global__ void __raygen__render() {
        const uint3 idx = optixGetLaunchIndex();
        const uint3 dim = optixGetLaunchDimensions();

        // Cast SBT to raygeneration SBT
        const SbtRaygeneration* rtData = reinterpret_cast<SbtRaygeneration*>(optixGetSbtDataPointer());
        const float3 U = rtData->camera_u;
        const float3 V = rtData->camera_v;
        const float3 W = rtData->camera_w;
        const float2 d = 2.0f * make_float2(static_cast<float>(idx.x) / static_cast<float>(dim.x),
                                            static_cast<float>(idx.y) / static_cast<float>(dim.y)) - 1.0f;

        const float3 origin = rtData->camera_eye;
        const float3 direction = normalize(d.x * U + d.y * V + W);

        float3 prd = make_float3(0.5f, 0.5f, 0.5f);
        uint32_t p0, p1, p2;
        p0 = __float_as_int(prd.x);
        p1 = __float_as_int(prd.y);
        p2 = __float_as_int(prd.z);

        // NOTE: The following appears to generate the problem in PTX code
        optixTrace(params.handle, origin, direction, 0.0f, 1e16f, 0.0f, OptixVisibilityMask(1),
                   OPTIX_RAY_FLAG_NONE, 0, 0, 0, p0, p1, p2);
        prd.x = __int_as_float(p0);
        prd.y = __int_as_float(p1);
        prd.z = __int_as_float(p2);

        params.image[idx.y * params.image_width + idx.x] = make_color(prd);
    }
} // extern "C"

Thank you to anyone with any hints or ideas as to what I am doing wrong.

Hey @picard1969,

Is part of the problem that ptxas is being executed? If you’re trying to generate PTX code, then perhaps allowing ptxas to run is going one step too far?

BTW, I doubt you want optix_stubs.h in your device side code, since it only defines host-side stubs.

I assume that your SDK samples, like optixPathTracer, build & run without error? (This would rule out version problems of any tools involved, since the path tracer’s raygen program is roughly the same structure as yours.)


David.

Thank you for the fast response @dhart

I will remove the optix_stubs.h since it is pointless in device side code.

I really only want to generate the PTX code, not necessarily execute it. Do you have any suggestions as to the best way to do this from within CMake? Something like a CMake macro, or is that overkill?

Thanks again.

My cmake fu isn’t great. ;) I’m guessing the main issue here is piggy-backing on a cmake build pipeline that’s just doing slightly more than you really want? One option perhaps is to define another custom command for your nvcc step, instead of defining your PTX as a ‘target’. That way you might have a bit more direct control over the steps & inputs & outputs.
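A minimal sketch of that custom-command idea, not a tested build rule. It assumes the CUDA language is already enabled by project() so that CMAKE_CUDA_COMPILER points at nvcc, and it reuses OPTIX_INCLUDE, the shader path, and the compile flags from the question. The SHADER_CU and SHADER_PTX variables and the ptxcode_ptx target name are hypothetical.

```cmake
# Sketch only: compile one .cu file to PTX with an explicit nvcc call.
# SHADER_CU, SHADER_PTX and ptxcode_ptx are made-up names for illustration.
set(SHADER_CU  ${CMAKE_CURRENT_SOURCE_DIR}/src/shaders/ptxcode.cu)
set(SHADER_PTX ${CMAKE_CURRENT_BINARY_DIR}/ptxcode.ptx)

add_custom_command(
    OUTPUT  ${SHADER_PTX}
    COMMAND ${CMAKE_CUDA_COMPILER}
            --ptx                   # stop after PTX generation, so ptxas is not run
            -arch=sm_60
            --generate-line-info
            --use_fast_math
            -I${OPTIX_INCLUDE}
            -I${CMAKE_CURRENT_SOURCE_DIR}/src
            -I${CMAKE_CURRENT_SOURCE_DIR}/src/shaders
            -o ${SHADER_PTX}
            ${SHADER_CU}
    DEPENDS ${SHADER_CU}            # rebuild the PTX when the shader source changes
    COMMENT "Compiling ${SHADER_CU} to PTX"
    VERBATIM
)

# A target that other targets can depend on to make sure the PTX exists.
add_custom_target(ptxcode_ptx DEPENDS ${SHADER_PTX})
```

The point of spelling the command out is that every input, flag, and output is visible in one place, so nothing beyond the .cu to .ptx translation can sneak into the step.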


David.

target_compile_options(${output_var}_lib PRIVATE --generate-line-info -use_fast_math -arch=sm_60 --keep)

If these are all your NVCC command line options, you’re missing the --ptx flag, which tells the compiler to translate only from *.cu to *.ptx code. You’ve most likely compiled to cubins. Simply look at your *.ptx output files in a text editor. If the content is not human-readable PTX assembly text, you’re doing it wrong.
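The same check can be scripted instead of opening the file by hand. This is a rough sketch under the assumption that looking for the .version and .target directives near the top of the file is enough to tell PTX text from a binary; check_ptx.cmake and PTX_FILE are hypothetical names.

```cmake
# check_ptx.cmake (hypothetical helper script)
# Usage: cmake -DPTX_FILE=/path/to/ptxcode.ptx -P check_ptx.cmake
if(NOT DEFINED PTX_FILE OR NOT EXISTS "${PTX_FILE}")
    message(FATAL_ERROR "Pass an existing file with -DPTX_FILE=<path>")
endif()

# Read only the start of the file. PTX text begins with comments followed by
# the .version, .target and .address_size directives.
file(READ "${PTX_FILE}" ptx_head LIMIT 4096)

if(ptx_head MATCHES "\\.version" AND ptx_head MATCHES "\\.target")
    message(STATUS "${PTX_FILE} looks like PTX text")
else()
    message(FATAL_ERROR "${PTX_FILE} does not look like PTX text")
endif()
```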

In my OptiX applications I’m using a custom build rule for each *.cu file which is implemented here. That predates the newer native CMake support for CUDA which I haven’t looked at.
Note the commented message instruction in line 45 which will print the actual NVCC command line to the CMake output window. Compare that to the command lines you’re using.
https://github.com/NVIDIA/OptiX_Apps/blob/master/3rdparty/CMake/nvcuda_compile_ptx.cmake
Usage is shown here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/nvlink_shared/CMakeLists.txt#L168
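With the native CMake CUDA support used in the question there is no such message line to uncomment. One possible way to see the NVCC command lines there, assuming a Makefile or Ninja generator, is to turn on the standard CMake switches below near the top of CMakeLists.txt. This is a sketch and not part of the linked script.

```cmake
# Assumption: a Makefile or Ninja generator is in use.
# Echo each compiler command line during the build.
set(CMAKE_VERBOSE_MAKEFILE ON)

# Also write the command lines to compile_commands.json in the build directory.
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
```

Running `cmake --build . --verbose` afterwards should show the same information without editing the file.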

We had that discussion before: https://forums.developer.nvidia.com/t/simple-ptx-shader-optix-7/165303

(Mind that payload and attribute registers are of type unsigned int. For cleaner code, use __uint_as_float and __float_as_uint for the reinterpret casts. The next OptiX SDK release is going to correct that in its examples.)

Duh. You are correct, we have had this discussion before. I apologize.

Thanks,