COMPILE ERROR: failed to create pipeline OptiX with no further information in logs

I’m writing an OpenGL app with raytraced shadows using OptiX, so the idea is to use only raygen and miss programs (reference), but I got into a pickle and can’t progress further. I’m using NVRTC to compile the .cu file into a PTX string, which works fine. The main module is built, and the program groups for the miss and raygen programs are built with no problem as well. But optixPipelineCreate fails with

[ 2][COMPILE FEEDBACK]: COMPILE ERROR: failed to create pipeline
Info: Pipeline has 1 module(s), 2 entry function(s), 1 trace call(s), 0 continuation callable call(s), 0 direct callable call(s), 2 basic block(s) in entry functions, 8 instruction(s) in entry functions, 7 non-entry function(s), 53 basic block(s) in non-entry functions, 627 instruction(s) in non-entry functions

My .cu file is very minimalistic so far; I just want to know it works before writing all the fragment-to-light traces and light source logic. (Pipeline creation fails even with empty functions.)

#include <optix.h>
#include "optixSettings.h"

extern "C" {
	__constant__ Params params;
}

static __forceinline__ __device__ void computeOrAndDir(uint3 idx, uint3 dim, float3& origin, float3& direction)
{
	int fa = 50;
	fa += 10;
}

//raygen function to generate new rays
extern "C" __global__ void __raygen__rg() {
	//get thread indices
	const uint3 idx = optixGetLaunchIndex();
	const uint3 dim = optixGetLaunchDimensions();

	//get ray origin from the input matrix and direction from lights input.
	float3 ray_origin, ray_direction;
	computeOrAndDir(idx, dim, ray_origin, ray_direction);
	//for each light sample call optixTrace
	unsigned int isVisible = 0;
	float distance = 0; //distance of origin and light sample pos
	float epsilon = 0.01; //offset so we don't hit the same triangle
	optixTrace(
		/* ... */
		distance - epsilon,
		/* ... */
		0, 0,
		0, //missSBTindex
		isVisible //payload
	);
	if (isVisible) {
		//set output matrix to 1.0
	}
	else {
		//set output matrix to 0.0
	}
}

extern "C" __global__ void __miss__ms() {
}


I was trying to follow the programming guide while writing the program creation code, with some modifications. The scene GAS is built from the OpenGL VBO and EBO; ray origins will be taken from the OpenGL geometry pass matrix.

void Raytracer::setUpPrograms() {
	OptixModuleCompileOptions module_compile_options = {};
	module_compile_options.maxRegisterCount = OPTIX_COMPILE_DEFAULT_MAX_REGISTER_COUNT;

	OptixPipelineCompileOptions pipeline_compile_options = {};
	pipeline_compile_options.usesMotionBlur = false;
	pipeline_compile_options.numPayloadValues = 1; // only one int in payload (visibility)
	//pipeline_compile_options.numAttributeValues = 1;
	pipeline_compile_options.traversableGraphFlags = OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_GAS; // check if preTransform is applied
	pipeline_compile_options.pipelineLaunchParamsVariableName = "params";
	pipeline_compile_options.usesPrimitiveTypeFlags = OPTIX_PRIMITIVE_TYPE_FLAGS_TRIANGLE;

	char log[2048]; // For error reporting from OptiX creation functions
	OptixModule module = nullptr;
	size_t ptxSize = 0;
	const char* ptxData = readPtxFile(ptxSize);

	size_t sizeof_log = sizeof(log);
	optixModuleCreateFromPTX(
		/* ... */
		ptxData, ptxSize,
		log, &sizeof_log,
		/* ... */);

	//create program groups
	OptixProgramGroup raygen_prog_group = nullptr;
	OptixProgramGroup miss_prog_group = nullptr;

	OptixProgramGroupOptions program_group_options = {}; // Initialize to zeros
	OptixProgramGroupDesc raygen_prog_group_desc = {}; //
	raygen_prog_group_desc.kind = OPTIX_PROGRAM_GROUP_KIND_RAYGEN;
	raygen_prog_group_desc.raygen.module = module;
	raygen_prog_group_desc.raygen.entryFunctionName = "__raygen__rg";

	optixProgramGroupCreate(
		/* ... */
		1,   // num program groups
		/* ... */);

	OptixProgramGroupDesc miss_prog_group_desc = {};
	miss_prog_group_desc.kind = OPTIX_PROGRAM_GROUP_KIND_MISS;
	miss_prog_group_desc.miss.module = module;
	miss_prog_group_desc.miss.entryFunctionName = "__miss__ms";

	optixProgramGroupCreate(
		/* ... */
		1,   // num program groups
		/* ... */);

 // Link pipeline
	OptixPipeline  pipeline = nullptr;
	const uint32_t max_trace_depth = 1;
	OptixProgramGroup program_groups[] = { raygen_prog_group, miss_prog_group };

	OptixPipelineLinkOptions pipeline_link_options = {};
	pipeline_link_options.maxTraceDepth = max_trace_depth;
	pipeline_link_options.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL;

	optixPipelineCreate(
		/* ... */
		sizeof(program_groups) / sizeof(program_groups[0]),
		/* ... */);
}

Thinking the issue was in the .cu compilation, I moved from compiling the .cu file with nvcc in Visual Studio to using NVRTC, as the sample code does, but the issue persisted. I also have a tendency to make stupid mistakes. Any helpful comment is greatly appreciated.

Hard to say what’s going on without a minimal and complete reproducer.

First of all, please provide the following system configuration information when asking about OptiX issues:
OS version, installed GPU(s), VRAM amount, display driver version, OptiX (major.minor.micro) version, CUDA toolkit version (major.minor) used to generate the input PTX, host compiler version.

Was there any additional information inside your log string?

What were the NVRTC options, or the NVCC command line, used when compiling the *.cu to *.ptx code?

Maybe provide the resulting PTX code as well. (You can attach files to your forum posts.)

I would have set the pipeline_compile_options.numAttributeValues = 2; explicitly which is the minimum default. OptiX should clamp that to 2 for smaller values though.
(Not important, but pipeline_compile_options.usesMotionBlur is an int and not a boolean. The initialization should use 0 for better style.)
Your float3 ray_origin and ray_direction are uninitialized by the current code.
optixTrace won’t work with the current t_min and t_max settings because it’s not fulfilling the condition 0.0 <= t_min < t_max.
I would recommend implementing the device code from a working empty version step-by-step to see if any added construct breaks it.
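
To illustrate the t_min/t_max point above, here is a minimal host-side sketch. It assumes the raygen program passes epsilon as t_min and distance - epsilon as t_max; the t_min argument is not visible in the posted code, so that part is an assumption. RayInterval and shadowRayInterval are hypothetical names, not part of the OptiX API.

```cpp
#include <cassert>

// Hypothetical helper (not an OptiX function): the ray interval for a shadow
// ray towards a light sample at the given distance from the ray origin.
struct RayInterval {
    float tmin;
    float tmax;
    bool  valid; // true when 0 <= tmin < tmax, the condition quoted above
};

inline RayInterval shadowRayInterval(float distanceToLight, float epsilon)
{
    assert(epsilon >= 0.0f);               // a negative offset makes no sense here
    RayInterval r;
    r.tmin  = epsilon;                     // assumed: start slightly off the surface
    r.tmax  = distanceToLight - epsilon;   // as posted: stop slightly before the light
    r.valid = (0.0f <= r.tmin) && (r.tmin < r.tmax);
    return r;
}
```

With the posted values (distance = 0, epsilon = 0.01) this yields a negative t_max, so valid is false; once distance holds a real origin-to-light distance larger than 2 * epsilon, the interval becomes valid.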

I would try with the module and pipeline compile and link options set to full optimization and no debug as well.

Did you try enabling the OptixDeviceContextValidationMode validationMode inside the OptixDeviceContextOptions to see if there is additional information? It probably doesn’t get that far.

I’m running Windows 11 Pro on a desktop NVIDIA RTX 2070 Super with 8 GB of memory, GeForce Game Ready driver 511.79, OptiX 7.4.0, CUDA v11.6.
I’ve used the default compile options:

  "-std=c++11", \
  "-arch", \
  "compute_50", \
  "-use_fast_math", \
  "-lineinfo", \
  "-default-device", \
  "-rdc", \
  "true", \

plus the includes:

  "C:/ProgramData/NVIDIA Corporation/OptiX SDK 7.4.0/include", \
  "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/include", 

I’ve tried turning on validation mode, and CUDA error 700 emerged:

[ 2][COMPILE FEEDBACK]: COMPILE ERROR: failed to create pipeline
Error releasing namedConstant's internal resources (CUDA error string: an illegal memory access was encountered, CUDA error code: 700)
Info: Pipeline has 1 module(s), 2 entry function(s), 0 trace call(s), 0 continuation callable call(s), 0 direct callable call(s), 2 basic block(s) in entry functions, 2 instruction(s) in entry functions, 7 non-entry function(s), 53 basic block(s) in non-entry functions, 627 instruction(s) in non-entry functions

Might’ve guessed there’s an issue with my sloppy memory management, but why is it emerging only when creating the pipeline? I’ve reduced the .cu code to

#include <optix.h>
#include "optixSettings.h"

extern "C" {
	__constant__ Params params;
}

//raygen function to generate new rays
extern "C" __global__ void __raygen__rg() {
}

extern "C" __global__ void __miss__ms() {
}

The resulting .ptx is
programs.ptx (1.1 KB)
I’m also including my whole optix code class
raytracerC.cpp (14.7 KB)
raytracerC.h (901 Bytes)
Still the issue persists. The program pipeline creation should reference only the OptiX context and the PTX file, so there’s no way it could fail because of the previous GAS build or registering OpenGL buffers with CUDA, right?

I loaded your programs.ptx into one of my OptiX 7 programs and setup the compile and link options the same and could not reproduce any error on 511.09 or 511.79 drivers on either a Turing (RTX 6000) or Ampere (RTX A6000) board.

It’s basically this code with changes to match your arguments:

  OptixModuleCompileOptions mco = {};

#if 1 // Debug settings. Use 0 for the optimized settings.
  mco.optLevel   = OPTIX_COMPILE_OPTIMIZATION_LEVEL_0; // No optimizations.
  mco.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL;     // Full debug. Never profile kernels with this setting!
#else
  mco.optLevel   = OPTIX_COMPILE_OPTIMIZATION_LEVEL_3; // All optimizations, is the default.
  // Keep generated line info for Nsight Compute profiling. (NVCC_OPTIONS use --generate-line-info in CMakeLists.txt)
#if (OPTIX_VERSION >= 70400)
  mco.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_MINIMAL;
#else
  mco.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO;
#endif
#endif

  OptixPipelineCompileOptions pco = {};

  pco.usesMotionBlur        = 0;
  pco.numPayloadValues      = 1;
  pco.numAttributeValues    = 2;  // The minimum is two for the barycentrics.
#if 1 // Debug settings. Use 0 to disable all exceptions.
  pco.exceptionFlags = OPTIX_EXCEPTION_FLAG_STACK_OVERFLOW |
                       OPTIX_EXCEPTION_FLAG_TRACE_DEPTH |
                       OPTIX_EXCEPTION_FLAG_USER |
                       OPTIX_EXCEPTION_FLAG_DEBUG;
#else
  pco.exceptionFlags = OPTIX_EXCEPTION_FLAG_NONE;
#endif
  pco.pipelineLaunchParamsVariableName = "params";

#if (OPTIX_VERSION != 70000)
  // Only using built-in Triangles in this renderer. 
  // This is the recommended setting for best performance then.
  pco.usesPrimitiveTypeFlags = OPTIX_PRIMITIVE_TYPE_FLAGS_TRIANGLE; // New in OptiX 7.1.0.
#endif

  OptixModule module = nullptr;
  std::string ptx = readPTX("./programs.ptx"); // Your programs.ptx file.
  OPTIX_CHECK( m_api.optixModuleCreateFromPTX(m_optixContext, &mco, &pco, ptx.c_str(), ptx.size(), nullptr, nullptr, &module) );

  std::vector<OptixProgramGroupDesc> programGroupDescriptions(2); // Hardcoded two programs, 0 = raygen, 1 = miss
  memset(programGroupDescriptions.data(), 0, sizeof(OptixProgramGroupDesc) * programGroupDescriptions.size());
  OptixProgramGroupDesc* pgd;

  // All of these first because they are SbtRecordHeader and put into a single CUDA memory block.
  pgd = &programGroupDescriptions[0]; // raygen
  pgd->kind = OPTIX_PROGRAM_GROUP_KIND_RAYGEN;
  pgd->raygen.module = module;
  pgd->raygen.entryFunctionName = "__raygen__rg";

  pgd = &programGroupDescriptions[1]; // miss
  pgd->kind = OPTIX_PROGRAM_GROUP_KIND_MISS;
  pgd->miss.module = module;
  pgd->miss.entryFunctionName = "__miss__ms";

  OptixProgramGroupOptions pgo = {}; // This is just a placeholder.

  std::vector<OptixProgramGroup> programGroups(programGroupDescriptions.size());
  OPTIX_CHECK( m_api.optixProgramGroupCreate(m_optixContext, programGroupDescriptions.data(), (unsigned int) programGroupDescriptions.size(), &pgo, nullptr, nullptr, programGroups.data()) );

  OptixPipelineLinkOptions plo = {};

  plo.maxTraceDepth = 1;
#if 1 // Debug settings. Use 0 for the optimized settings.
  plo.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_FULL; // Full debug. Never profile kernels with this setting!
#else
  // Keep generated line info for Nsight Compute profiling. (NVCC_OPTIONS use --generate-line-info in CMakeLists.txt)
#if (OPTIX_VERSION >= 70400)
  plo.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_MINIMAL;
#else
  plo.debugLevel = OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO;
#endif
#endif
#if (OPTIX_VERSION == 70000)
  plo.overrideUsesMotionBlur = 0; // Does not exist in OptiX 7.1.0.
#endif

  OPTIX_CHECK( m_api.optixPipelineCreate(m_optixContext, &pco, &plo, programGroups.data(), (unsigned int) programGroups.size(), nullptr, nullptr, &m_pipeline) );

Your usage of the sizeof_log variable is incorrect. That is an input-output argument and you need to reset the available buffer size to the sizeof(log) before every call using it.
While this method can help with multi-threaded logging, it’s simpler to use nullptr for the log and size arguments and set up a global callback and level inside the OptixDeviceContextOptions. Example in the same file linked above:

Also note that optixProgramGroupCreate() doesn’t need to be called for each program group individually but can generate all program groups in one call with vectors. Mind that its numProgramGroups argument is an unsigned int, not a size_t.
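
The sizeof_log point above can be sketched without the OptiX headers. fakeCreateCall below is a hypothetical stand-in, not an OptiX function; it only mimics the documented in/out behaviour of the log size argument as I understand it (capacity on input, required length including the terminator on output, message truncated to fit). secondLogLength is likewise a made-up helper for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for an OptiX create call. *logSize is read as the
// buffer capacity and then overwritten with the length the message needs
// (including the terminating NUL); the message is truncated to fit.
inline void fakeCreateCall(const char* message, char* log, std::size_t* logSize)
{
    assert(logSize != nullptr);
    const std::size_t needed = std::strlen(message) + 1;
    if (log != nullptr && *logSize > 0) {
        const std::size_t n = (needed < *logSize) ? needed : *logSize;
        std::memcpy(log, message, n);
        log[n - 1] = '\0'; // always terminate, even when truncated
    }
    *logSize = needed;
}

// Makes two calls sharing one log buffer and returns the length of the
// second message as it ends up in the buffer.
inline std::size_t secondLogLength(bool resetBeforeEachCall)
{
    char log[64];
    std::size_t sizeofLog = sizeof(log);
    fakeCreateCall("ok", log, &sizeofLog); // sizeofLog now holds 3, not 64
    if (resetBeforeEachCall) {
        sizeofLog = sizeof(log);           // restore the real capacity
    }
    fakeCreateCall("failed to create pipeline", log, &sizeofLog);
    return std::strlen(log);
}
```

Without the reset, the second call believes the buffer is only as large as the previous message, so its own message is cut off; with the reset, the full text fits.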

I cannot help further without a minimal and complete reproducer in failing state.

Hi droettger,
thank you for your advice. I’ve finally managed to fix the issue and everything is working perfectly now. I messed up the dptrVBO pointer while creating the build inputs, as seen below. dptrVBO is declared inside the loop, so it always ended up at the same address, and buildInput.vertexBuffers stores that address rather than the device pointer itself. On every iteration the vertexBuffers of all previous build inputs were therefore overwritten. The build passed, but the memory was corrupted, so it failed later on.

	for (Model* mod : mSc->mModels) {
		for (Mesh mesh : mod->meshes) {
			//map VBO and EBO
			CUDA_CHECK(cudaGraphicsMapResources(1, &(mCudaVBOs[idxMesh]), 0));
			CUDA_CHECK(cudaGraphicsMapResources(1, &(mCudaEBOs[idxMesh]), 0));

			CUdeviceptr dptrVBO;
			size_t num_bytesVBO;
			CUDA_CHECK(cudaGraphicsResourceGetMappedPointer((void**)&dptrVBO, &num_bytesVBO, mCudaVBOs[idxMesh]));

			CUdeviceptr dptrEBO;
			size_t num_bytesEBO;
			CUDA_CHECK(cudaGraphicsResourceGetMappedPointer((void**)&dptrEBO, &num_bytesEBO, mCudaEBOs[idxMesh]));

			//setup build inputs
			OptixBuildInputTriangleArray buildInput = {};

			buildInput.vertexFormat = OPTIX_VERTEX_FORMAT_FLOAT3;
			buildInput.vertexStrideInBytes = sizeof(Vertex);
			buildInput.numVertices = static_cast<uint32_t>(mesh.vertices.size());
			buildInput.vertexBuffers = &dptrVBO;

			buildInput.indexBuffer = dptrEBO;
			buildInput.numIndexTriplets = mesh.indices.size() / 3;
			buildInput.indexFormat = OPTIX_INDICES_FORMAT_UNSIGNED_INT3;
			buildInput.indexStrideInBytes = sizeof(int3);
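
One possible way to avoid this kind of bug, not necessarily the fix used here, is to keep the device pointers in storage that outlives the loop and give every build input its own slot. The sketch below uses stand-in types so it compiles without the CUDA and OptiX headers: DevicePtr plays the role of CUdeviceptr, TriangleInput the role of OptixBuildInputTriangleArray, and makeInputs is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins (assumptions) so the sketch is self-contained.
using DevicePtr = unsigned long long;          // role of CUdeviceptr
struct TriangleInput {                         // role of OptixBuildInputTriangleArray
    const DevicePtr* vertexBuffers = nullptr;  // host pointer to an array of device pointers
};

// Builds one input per mapped mesh buffer. The device pointers are copied into
// vertexPtrs, which is sized once and not resized afterwards, so each
// &vertexPtrs[i] is a distinct address that stays valid for as long as the
// caller keeps vertexPtrs alive (at least until the acceleration build is done).
inline std::vector<TriangleInput> makeInputs(const std::vector<DevicePtr>& mapped,
                                             std::vector<DevicePtr>& vertexPtrs)
{
    vertexPtrs = mapped;                       // one stable slot per mesh
    std::vector<TriangleInput> inputs(mapped.size());
    for (std::size_t i = 0; i < mapped.size(); ++i) {
        inputs[i].vertexBuffers = &vertexPtrs[i]; // not the address of a loop-local
    }
    assert(inputs.size() == vertexPtrs.size());
    return inputs;
}
```

The design choice is simply that the array vertexBuffers points to is owned by the caller rather than by a single loop iteration.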

Once again, thanks for your help.
