Transformation Matrices from OpenGL reusable in OptiX pass

Hello,

I’ve been using OptiX to generate ray-traced shadows for an OpenGL scene, which sort of works so far:

What doesn’t work so far is the rotation slider, which, of course, results in this behaviour:

So I was wondering: is it somehow possible to rotate the specific geometry branch in the acceleration structure, or do I have to rebuild the whole structure with the transformed geometry every time the rotation value changes?

I also checked whether the optixMotionGeometry example might offer what I’m looking for, but I couldn’t verify it, as the binary fails with a misaligned address error.

Thank you

Hi @Gummel!

I don’t understand what’s going on in the 2nd image. What is happening, and why? Are some of your shadows turning into geometry that doesn’t get rotated? Or are the shadows in a 2nd screen buffer that is composited? Or something else? What parts of the BVH are out of sync, and how are you expecting a BVH update to resolve the situation?

I’m not sure, but it sounds like the question is whether there is a faster way to rebuild an acceleration structure when you only want to move geometry around. If so, the answer is yes. You can insert your geometry into the BVH as an instance with a matrix transform for translating & rotating your geometry, and then you can perform a dynamic update on your acceleration structure when you change only the instance transform and not any of the rest of the geometry. This is much faster than rebuilding your acceleration structure. When using instances, you will have two acceleration structures: a top level Instance Acceleration Structure (IAS) that contains instances and their transforms, and a bottom level Geometry Acceleration Structure (GAS) which contains the geometry. When updating instance transforms, you will only need to update the IAS, which in your example here will be very small. Does that help? Hopefully I haven’t totally misunderstood your question.
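As a concrete sketch of what such an instance might look like (names like gas_handle and make_instance are placeholders, not from your code): the 12-float transform field holds the upper three rows of a 4x4 object-to-world matrix in row-major order.

#include <optix.h>
#include <cstring>

// Sketch: one instance that references an already-built GAS. Only this small
// struct (and the tiny IAS built from it) has to change when the dragon rotates.
OptixInstance make_instance(OptixTraversableHandle gas_handle /* assumption: your existing GAS */)
{
	OptixInstance instance = {};                 // zero-initialize all fields

	const float identity[12] = {                 // row-major 3x4 transform
		1.f, 0.f, 0.f, 0.f,
		0.f, 1.f, 0.f, 0.f,
		0.f, 0.f, 1.f, 0.f };
	std::memcpy(instance.transform, identity, sizeof(identity));

	instance.instanceId        = 0;
	instance.visibilityMask    = 255;
	instance.sbtOffset         = 0;              // first hitgroup record of this GAS
	instance.flags             = OPTIX_INSTANCE_FLAG_NONE;
	instance.traversableHandle = gas_handle;
	return instance;
}

The instances go into a device buffer referenced by an OPTIX_BUILD_INPUT_TYPE_INSTANCES build input, and optixAccelBuild on that input returns the IAS handle you then trace against.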

For the bug report on optixMotionGeometry, which OS, driver, GPU, and OptiX versions are you using?


David.

Hello David,

I’ve got a slider in the GUI for the object rotation which simply creates a rotation matrix that transforms the OpenGL geometry (red dragon) each frame. There’s also a model selector in the GUI, which replaces the geometry with the new model (and also updates the acceleration structure of the OptixRenderer). So OptiX is basically generating a screen-space texture containing the shadow factors (by the way, still using the inverted projection and view matrices with large far clipping plane values for now, if you remember, as I had to make progress with the renderer for an upcoming paper), and in OpenGL the shader uses this texture to shade the images correctly.

However, I have no clue where to fill in the transformation in my modified tutorial code (the tutorial unfortunately doesn’t cover transformations at all), other than regenerating the acceleration structure with geometry that has already been rotated on the CPU by the amount given in the GUI, which is what I already do, without the transformation, every time I change the model geometry (cube, dragon, tree). As a result, the images generated by OpenGL and OptiX diverge when the model rotation is changed. So far, to init the renderer, I generate the initial acceleration structure:

OptixTraversableHandle cgbv::optix::OptixRenderer::build_accelleration_structure()
{
	std::cout << "Building Acceleration Structure...";

	for (auto& buffer : vertex_buffer)
		buffer.free();

	for (auto& buffer : normal_buffer)
		buffer.free();

	for (auto& buffer : index_buffer)
		buffer.free();

	vertex_buffer.resize(meshes.size());
	normal_buffer.resize(meshes.size());
	index_buffer.resize(meshes.size());

	OptixTraversableHandle accel_structure_handle = 0ull;

	// Triangle Inputs
	// -----------------------------------------------------------
	std::vector<OptixBuildInput> triangle_input(meshes.size());
	std::vector<CUdeviceptr> device_vertices(meshes.size());
	std::vector<CUdeviceptr> device_indices(meshes.size());
	std::vector<uint32_t> triangle_input_flags(meshes.size());

	for (int mesh_id = 0; mesh_id < static_cast<int>(meshes.size()); ++mesh_id)
	{
		auto& model = meshes[mesh_id];
		vertex_buffer[mesh_id].alloc_and_upload(model.vertex);
		index_buffer[mesh_id].alloc_and_upload(model.index);
		
		if(!model.normal.empty())
			normal_buffer[mesh_id].alloc_and_upload(model.normal);

		triangle_input[mesh_id] = {};
		triangle_input[mesh_id].type = OPTIX_BUILD_INPUT_TYPE_TRIANGLES;

		device_vertices[mesh_id] = vertex_buffer[mesh_id].get_device_pointer();
		device_indices[mesh_id] = index_buffer[mesh_id].get_device_pointer();

		triangle_input[mesh_id].triangleArray.vertexFormat = OPTIX_VERTEX_FORMAT_FLOAT3;
		triangle_input[mesh_id].triangleArray.vertexStrideInBytes = sizeof(glm::vec3);
		triangle_input[mesh_id].triangleArray.numVertices = static_cast<int>(model.vertex.size());
		triangle_input[mesh_id].triangleArray.vertexBuffers = &device_vertices[mesh_id];
					  
		triangle_input[mesh_id].triangleArray.indexFormat = OPTIX_INDICES_FORMAT_UNSIGNED_INT3;
		triangle_input[mesh_id].triangleArray.indexStrideInBytes = sizeof(glm::ivec3);
		triangle_input[mesh_id].triangleArray.numIndexTriplets = static_cast<int>(model.index.size());
		triangle_input[mesh_id].triangleArray.indexBuffer = device_indices[mesh_id];

		triangle_input_flags[mesh_id] = 0;

		triangle_input[mesh_id].triangleArray.flags = &triangle_input_flags[mesh_id];
		triangle_input[mesh_id].triangleArray.numSbtRecords = 1;
		triangle_input[mesh_id].triangleArray.sbtIndexOffsetBuffer = 0;
		triangle_input[mesh_id].triangleArray.sbtIndexOffsetSizeInBytes = 0;
		triangle_input[mesh_id].triangleArray.sbtIndexOffsetStrideInBytes = 0;
	}
	// -----------------------------------------------------------

	// BLAS setup
	// -----------------------------------------------------------
	OptixAccelBuildOptions accel_options = {};
	accel_options.buildFlags = OPTIX_BUILD_FLAG_NONE | OPTIX_BUILD_FLAG_ALLOW_COMPACTION;

	accel_options.motionOptions.numKeys = 1;
	accel_options.operation = OPTIX_BUILD_OPERATION_BUILD;

	OptixAccelBufferSizes blas_buffer_sizes;
	optix::error::check(optixAccelComputeMemoryUsage(optix.context, &accel_options, triangle_input.data(), static_cast<int>(meshes.size()), &blas_buffer_sizes));
	// -----------------------------------------------------------

	// Prepare Compaction
	// -----------------------------------------------------------
	CUDABuffer compacted_size_buffer;
	compacted_size_buffer.alloc(sizeof(uint64_t));

	OptixAccelEmitDesc emit_descriptor;
	emit_descriptor.type = OPTIX_PROPERTY_TYPE_COMPACTED_SIZE;
	emit_descriptor.result = compacted_size_buffer.get_device_pointer();
	// -----------------------------------------------------------

	// execute build (main stage)
	// -----------------------------------------------------------
	CUDABuffer temp_buffer;
	temp_buffer.alloc(blas_buffer_sizes.tempSizeInBytes);

	CUDABuffer output_buffer;
	output_buffer.alloc(blas_buffer_sizes.outputSizeInBytes);

	optix::error::check(optixAccelBuild(optix.context, 0, &accel_options, triangle_input.data(), static_cast<int>(meshes.size()), temp_buffer.get_device_pointer(), temp_buffer.get_size_in_bytes(), output_buffer.get_device_pointer(), output_buffer.get_size_in_bytes(), &accel_structure_handle, &emit_descriptor, 1));
	cuda::error::cuda_sync_check();
	// -----------------------------------------------------------

	// perform compaction
	// -----------------------------------------------------------
	uint64_t compacted_size;
	compacted_size_buffer.download(&compacted_size, 1);

	//acceleration_structure_buffer.alloc(compacted_size);
	acceleration_structure_buffer.resize(compacted_size);
	optix::error::check(optixAccelCompact(optix.context, 0, accel_structure_handle, acceleration_structure_buffer.get_device_pointer(), acceleration_structure_buffer.get_size_in_bytes(), &accel_structure_handle));
	cuda::error::cuda_sync_check();
	// -----------------------------------------------------------

	// Clean Up
	// -----------------------------------------------------------
	output_buffer.free();
	temp_buffer.free();
	compacted_size_buffer.free();
	// -----------------------------------------------------------

	meshes_touched = false;

	std::cout << "done" << std::endl;

	return accel_structure_handle;
}

and then I generate the Shader Binding Table (SBT)

void cgbv::optix::OptixRenderer::build_shader_binary_table()
{
	std::cout << "Building Shader Binary Table...";

	// Raygen Records
	// ----------------------------------------------------------------------------------
	std::vector<optix::RaygenRecord> raygen_records;
	for (int i = 0; i < optix.raygen_pg.size(); ++i)
	{
		optix::RaygenRecord record;
		optix::error::check(optixSbtRecordPackHeader(optix.raygen_pg[i], &record));
		record.data = nullptr;
		raygen_records.push_back(record);
	}
	optix.raygen_records_buffer.alloc_and_upload(raygen_records);
	optix.shader_binding_table.raygenRecord = optix.raygen_records_buffer.get_device_pointer();
	// ----------------------------------------------------------------------------------

	// Miss Records
	// ----------------------------------------------------------------------------------
	std::vector<optix::MissRecord> miss_records;
	for (int i = 0; i < optix.miss_pg.size(); ++i)
	{
		optix::MissRecord record;
		optix::error::check(optixSbtRecordPackHeader(optix.miss_pg[i], &record));
		record.data = nullptr;
		miss_records.push_back(record);
	}
	optix.miss_records_buffer.alloc_and_upload(miss_records);
	optix.shader_binding_table.missRecordBase = optix.miss_records_buffer.get_device_pointer();
	optix.shader_binding_table.missRecordStrideInBytes = sizeof(optix::MissRecord);
	optix.shader_binding_table.missRecordCount = static_cast<int>(miss_records.size());
	// ----------------------------------------------------------------------------------
	
	// Hitgroup Records 
	// ----------------------------------------------------------------------------------
	update_hitgroup_pg_for_sbt();
	// ----------------------------------------------------------------------------------

	std::cout << "done" << std::endl;
}

and to update the hitgroup records of the SBT I call

void cgbv::optix::OptixRenderer::update_hitgroup_pg_for_sbt()
{
	if (optix.hitgroup_records_buffer.get_device_pointer())
		optix.hitgroup_records_buffer.free();

	int num_objects = static_cast<int>(meshes.size());
	std::vector<optix::HitgroupRecord> hitgroup_records;
	for (int mesh_id = 0; mesh_id < num_objects; ++mesh_id)
	{
		for (int ray_id = 0; ray_id < static_cast<int>(optix::RayType::Count); ++ray_id)
		{
			optix::HitgroupRecord record;

			// all meshes use the same code, so all same hit group
			optix::error::check(optixSbtRecordPackHeader(optix.hitgroup_pg[ray_id], &record));

			record.data.vertex = reinterpret_cast<glm::vec3*>(vertex_buffer[mesh_id].get_device_pointer());
			record.data.normal = reinterpret_cast<glm::vec3*>(normal_buffer[mesh_id].get_device_pointer());
			record.data.index = reinterpret_cast<glm::ivec3*>(index_buffer[mesh_id].get_device_pointer());
			record.data.colour = meshes[mesh_id].colour;

			hitgroup_records.push_back(record);
		}
	}
	optix.hitgroup_records_buffer.alloc_and_upload(hitgroup_records);
	optix.shader_binding_table.hitgroupRecordBase = optix.hitgroup_records_buffer.get_device_pointer();
	optix.shader_binding_table.hitgroupRecordStrideInBytes = sizeof(optix::HitgroupRecord);
	optix.shader_binding_table.hitgroupRecordCount = static_cast<int>(hitgroup_records.size());
}

So, when I change the model geometry (to cube, tree or back to dragon), I basically call

void cgbv::optix::OptixRenderer::update_geometry_structure()
{
	launch_params.traversable = build_accelleration_structure();
	update_hitgroup_pg_for_sbt();
}

and create everything new from scratch (there was a warning that the acceleration structure may degenerate quickly when frequently changed). So, from the first glimpse at the links you provided, I probably have to change quite a bit there, eh?

Regarding optixMotionGeometry, I’m using Win11, Driver is 528.02, GPU is a 3090 and the OptiX Version is 7.6.0

Thank you very much for your help!

Markus

there was a warning that the acceleration structure may degenerate quickly when frequently changed

This warning is referring to changing the vertex locations of geometry when updating a GAS, or when moving many instances in a top-level IAS. This isn’t a concern if you only have 1 or 2 instances in your IAS, and you only update their instance matrix transforms.

Do keep in mind that in your example with only 1 or 2 instances, your IAS build is going to be trivial and extremely fast. It probably won’t matter whether you update it, or rebuild it from scratch. Setting your scene up so that you have an IAS and you only need to rebuild your IAS (and not your GAS) when the dragon rotates is going to make the updates very fast. Even doing a full rebuild of the IAS will be much faster than using the UPDATE operation on the dragon GAS. Once you have an IAS, you will be able to completely rebuild the IAS every frame, if you want, and have no problem keeping it fast enough for real-time frame rates.
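If you do want the refit path anyway, here is a rough sketch (buffer and variable names are placeholders, and the build options must keep the same flags for the update): build the IAS once with OPTIX_BUILD_FLAG_ALLOW_UPDATE, and after re-uploading the modified OptixInstance array, run optixAccelBuild again with OPTIX_BUILD_OPERATION_UPDATE into the same output buffer.

// Initial IAS build: allow later refits.
OptixAccelBuildOptions ias_options = {};
ias_options.buildFlags = OPTIX_BUILD_FLAG_ALLOW_UPDATE;
ias_options.operation  = OPTIX_BUILD_OPERATION_BUILD;
// ... optixAccelComputeMemoryUsage + optixAccelBuild as usual; keep the output
//     buffer and a temp buffer of at least tempUpdateSizeInBytes alive.

// Per change (or per frame), after uploading the instance array with the new transform:
ias_options.operation = OPTIX_BUILD_OPERATION_UPDATE;      // refit instead of full build
optixAccelBuild(context, /*stream=*/0, &ias_options,
                &instance_input, 1,
                d_temp_update, temp_update_size_in_bytes,  // >= tempUpdateSizeInBytes
                d_ias_output,  ias_output_size_in_bytes,   // same buffer as the initial build
                &ias_handle, nullptr, 0);                  // handle stays valid

For a one- or two-instance IAS, a full rebuild every time is just as fine, as mentioned above.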

So, from the first glimpse at the links you provided, I probably have to change quite a bit there, eh?

I don’t think the change is very big, actually. All you need is to build an IAS (Instance Accel), which will point to your GAS (geometry accel, aka BLAS like in your comments). Then you pass the IAS handle to raygen instead of the GAS handle. OptiX takes care of traversing through the IAS and GAS together when you call optixTrace().

There isn’t much if anything that needs to change with your Program Groups or Shader Binding Table, though if you end up with more than 1 GAS then you might need to tweak your SBT offsets.

Have a look at the optixHair SDK sample, it makes use of an IAS. At the top of optixHair.cpp you’ll find first a function to build the GAS (makeHairGAS()) followed by another function to build the IAS (makeInstanceAccelerationStructure()). Take a look at how they’re called, and how the returned results get used, and I think it will make sense.
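In pseudocode the call order ends up being roughly this (a sketch; the function names are hypothetical stand-ins, not the sample’s exact API):

OptixTraversableHandle gas = makeGAS();       // triangles -> GAS (BLAS)
OptixTraversableHandle ias = makeIAS(gas);    // instance(s) with transforms -> IAS (TLAS)
launch_params.traversable = ias;              // raygen traces the IAS, which leads into the GAS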


David.

Another example which is only rebuilding the root IAS (and an SRT motion matrix) can be found inside my intro_motion_blur example.

This code is doing the IAS update when any animation parameter is changed inside the GUI:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_motion_blur/src/Application.cpp#L2475

This is the initial IAS build which allows updates:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_motion_blur/src/Application.cpp#L1812

Note that I keep track of the IAS data including the temporary device allocation to avoid allocating these every time:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_motion_blur/inc/Application.h#L463

Thank you both for answering :-)

I’ll have a look and hopefully can get it to work quickly.

Hello again,

I kept studying the acceleration structures a bit and ended up with a couple of questions:

  • so, if I have a geometry-AS handle, I can attach this to an instance-AS to transform the entire geometry-AS, right? So, if my geometry-AS contains a ground surface model (like the plane in the images) and a center model (like the dragon), both models will probably get rotated if I specify a rotation matrix, right? If I only want the dragon to be rotated, I probably need two geometry-AS, one structure gA containing the dragon and one structure gB with the ground surface, plus an instance-AS iA to which gA is attached, right? But this somehow requires a mechanism to combine gB and iA (with gA attached), right? Is there a way to do this, or did I once again end up with a wrong mental model?

  • Considering the 3x4 matrix that an instance-AS expects, why is OptiX using 3x4 matrices instead of the common 4x4 matrices used in computer graphics and how are they related (if at all)? How does OptiX perform the vertex transformation with that 3x4 matrix?

Thank you
Markus

Hi Markus,

Your first three sentences are fully correct. It sounds like the missing link might be that an IAS is normally built with multiple instances - there is a straightforward mechanism for putting links to both gA and gB into your single IAS named iA. Each instance gets its own transform matrix, so you control their placement and orientation separately.

Both 3x4 and 4x4 matrices are very common in computer graphics & games. The 3x4 is sometimes used, for example, when you want an affine transform with translation, rotation, and scale, but without the rarely-used bottom row of a 4x4 that adds projective components to the transform. Saving the 16 bytes of memory per transform is very useful for cases where people need to have millions of transforms, and saving one row of dot products is nice for reducing computation while rays are bouncing around your scene. The 3x4 is applied the same way a 4x4 would be; the 3x4 just has an implicit last row (0, 0, 0, 1) taken from the 4x4 identity matrix.
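A small sketch of the relationship, assuming a column-major glm::mat4 on the OpenGL side (OptixInstance::transform expects the upper three rows of the object-to-world matrix in row-major order, i.e. the transpose of the first three columns):

#include <glm/glm.hpp>

// Copy the upper 3x4 part of a column-major glm::mat4 into the row-major
// 12-float layout used by OptixInstance::transform.
inline void to_optix_transform(const glm::mat4& m, float out[12])
{
	for (int row = 0; row < 3; ++row)
		for (int col = 0; col < 4; ++col)
			out[row * 4 + col] = m[col][row];   // glm indexing is m[column][row]
}

// Applying the 3x4 is the same as applying a full 4x4 with an implicit last row (0, 0, 0, 1):
// x' = dot(row0, (x, y, z, 1)), y' = dot(row1, (x, y, z, 1)), z' = dot(row2, (x, y, z, 1)).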


David.

Example code again.

This is manually building a number of GAS and putting each under an OptixInstance with a different transform
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Application.cpp#L1598
which are then built into an IAS.
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/src/Application.cpp#L1705

The more advanced examples in that repository traverse an arbitrarily deep host side scene graph and flatten it to a render graph with a two-level acceleration structure (IAS->GAS) which is fully hardware accelerated on RTX GPUs.
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo9/src/Raytracer.cpp#L523
Because these examples support multi-GPU, the IAS gets built per device.
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo9/src/Device.cpp#L1519

EDIT: Ok, nevermind, I just found out what was wrong. I had to increase the maxTraversableGraphDepth parameter to two when calling optixPipelineSetStackSize and add the OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING flag to the traversableGraphFlags of the pipeline compile options.
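For reference, those two changes look roughly like this (a sketch; variable names are placeholders, not taken from the project above):

// 1. Allow one IAS level above the GAS in the pipeline compile options.
OptixPipelineCompileOptions pipeline_compile_options = {};
pipeline_compile_options.traversableGraphFlags = OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING;

// 2. Account for the extra level: maxTraversableGraphDepth is now 2 (IAS -> GAS).
optixPipelineSetStackSize(pipeline,
                          direct_callable_stack_size_from_traversal,
                          direct_callable_stack_size_from_state,
                          continuation_stack_size,
                          2 /* maxTraversableGraphDepth */);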

Hello,

so, I’ve created (or at least tried to create) an instance acceleration structure with this function:

OptixTraversableHandle cgbv::optix::OptixRenderer::build_instance_acceleration_structure()
{
	OptixTraversableHandle accel_structure_handle = 0ull;

	std::vector<OptixInstance> instances(1);

	for (int i = 0; i < instances.size(); ++i)
	{
		auto identity = glm::mat3x4(1.f);
		std::copy(glm::value_ptr(identity), glm::value_ptr(identity) + sizeof(glm::mat3x4) / sizeof(float), instances[i].transform);
		instances[i].instanceId = i;
		instances[i].visibilityMask = 255;
		instances[i].sbtOffset = 0;
		instances[i].flags = OPTIX_INSTANCE_FLAG_NONE;
		instances[i].traversableHandle = geometry_acceleration_structure;
	}

	CUDABuffer instance_buffer;
	instance_buffer.resize_and_upload(instances);


	// instance acceleration structure setup
	// -----------------------------------------------------------
	std::vector<OptixBuildInput> instanceInput(1);

	instanceInput[0].type = OPTIX_BUILD_INPUT_TYPE_INSTANCES;
	instanceInput[0].instanceArray.instances = instance_buffer.get_device_pointer();
	instanceInput[0].instanceArray.numInstances = static_cast<int>(instances.size());

	OptixAccelBuildOptions accel_build_options = {};

	accel_build_options.buildFlags = OPTIX_BUILD_FLAG_NONE | OPTIX_BUILD_FLAG_ALLOW_COMPACTION;
	accel_build_options.operation = OPTIX_BUILD_OPERATION_BUILD;

	OptixAccelBufferSizes ias_buffer_sizes = {};

	optix::error::check(optixAccelComputeMemoryUsage(optix.context, &accel_build_options, instanceInput.data(), 1, &ias_buffer_sizes));
	// -----------------------------------------------------------


	// Prepare Compaction
	// -----------------------------------------------------------
	CUDABuffer compacted_size_buffer;
	compacted_size_buffer.alloc(sizeof(uint64_t));

	OptixAccelEmitDesc emit_descriptor;
	emit_descriptor.type = OPTIX_PROPERTY_TYPE_COMPACTED_SIZE;
	emit_descriptor.result = compacted_size_buffer.get_device_pointer();
	// -----------------------------------------------------------


	// execute build (main stage)
	// -----------------------------------------------------------
	CUDABuffer temp_buffer;
	temp_buffer.alloc(ias_buffer_sizes.tempSizeInBytes);

	CUDABuffer output_buffer;
	output_buffer.resize(ias_buffer_sizes.outputSizeInBytes);

	optix::error::check(optixAccelBuild(optix.context, 0, &accel_build_options, instanceInput.data(), static_cast<int>(instanceInput.size()), temp_buffer.get_device_pointer(), temp_buffer.get_size_in_bytes(), output_buffer.get_device_pointer(), output_buffer.get_size_in_bytes(), &accel_structure_handle, &emit_descriptor, 1));
	cuda::error::cuda_sync_check();
	// -----------------------------------------------------------

	// perform compaction
	// -----------------------------------------------------------
	uint64_t compacted_size;
	compacted_size_buffer.download(&compacted_size, 1);

	instance_acceleration_structure_buffer.resize(compacted_size);
	optix::error::check(optixAccelCompact(optix.context, 0, accel_structure_handle, instance_acceleration_structure_buffer.get_device_pointer(), instance_acceleration_structure_buffer.get_size_in_bytes(), &accel_structure_handle));
	cuda::error::cuda_sync_check();
	// -----------------------------------------------------------

	// Clean Up
	// -----------------------------------------------------------
	temp_buffer.free();
	// -----------------------------------------------------------


	return accel_structure_handle;
}

geometry_acceleration_structure is what was returned by the build_accelleration_structure() function further up in this thread. I’ve also set the traversable handle in the optixLaunchParams to the new instance acceleration structure, but either the OptiX error check or the CUDA sync check after calling optixLaunch(...) reports an invalid traversable:

[02][ERROR       ]: Validation mode caught builtin exception OPTIX_EXCEPTION_CODE_TRAVERSAL_INVALID_TRAVERSABLE
Error recording resource event on user stream (CUDA error string: unspecified launch failure, CUDA error code: 719)

I also tried to create the acceleration structure without compaction and to use the call to optixAccelBuild as shown in OptiX_Apps/Application.cpp at master · NVIDIA/OptiX_Apps · GitHub. That code passes a nullptr as the emittedProperties argument, which I tried as well, but that resulted in an error that optixAccelBuild wants to write emitted properties, which it can’t because of the nullptr.

So, I was wondering if a compacted instance acceleration structure is valid at all.

The call to `optixTrace` looks something like this, but I haven’t changed anything in the OptiX shaders:

optixTrace(optixLaunchParams.traversable,
	cu_camera_position, cu_ray_dir,
	t_min, t_max,
	0.0f /* rayTime */,
	OptixVisibilityMask(255),
	OPTIX_RAY_FLAG_DISABLE_ANYHIT /*OPTIX_RAY_FLAG_NONE*/,
	static_cast<unsigned int>(optix::RayType::Radiance) /*SBT offset*/,
	static_cast<unsigned int>(optix::RayType::Count) /*SBT stride*/,
	static_cast<unsigned int>(optix::RayType::Radiance) /*missSBTIndex*/,
	u0, u1);

Do you have an idea or a hint, why the instance acceleration structure might be invalid?

Thank you very much!

Hello again,

so I got the two geometry acceleration structures for the ground and the center model running, each referenced by one instance of the instance acceleration structure. The ground is transformed with an identity matrix and the center model with the transposed rotation matrix used in OpenGL (transposed because of the column-major to row-major conversion). Combined with the OpenGL rendering, everything looks fine (1st image is ray traced, 2nd is shadow mapping):



However, if I rotate the center object, its self-shadowing goes completely wrong. The shadow on the ground surface, on the other hand, is as I would expect, which suggests that the center acceleration structure is rotated correctly (first image: rotated ray-traced shadows; second image: how the shadow should appear - the shadow on the ground surface is fine):


So, I assume the transformation via the instance acceleration structure does not just transform the geometry vertices, but does something additional that might not be part of my mental model yet. Why does the self-shadowing on the transformed center model go wrong? I mean, the light position hasn’t changed and is located at 45 degrees azimuth and 45 degrees elevation (so it’s coming from the upper left). Is the light source getting transformed for some reason, too?

Thank you

I assume you’ve correctly updated the instance acceleration structure with an optixAccelBuild, or things wouldn’t look correct inside the ray-traced image.

I’m also assuming the raytracing code is using the correct transform list to calculate the vertex attributes in world space inside the closest hit program.
With OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING that would be just a single matrix at the instance.
Positions get transformed as point, normals as normal applying the inverse transpose matrix, and optional tangents and bitangents as vectors.

So if that is all correct, the question is how you generate the shadow map for the OpenGL rasterizer.

If the shadow projection is correct with an identity transform and looks wrong when applying a rotation, you’re most likely not taking the instance transformation into account when rendering the shadow map and are working in the wrong coordinate space.

I’m also assuming the raytracing code is using the correct transform list to calculate the vertex attributes in world space inside the closest hit program.
With OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING that would be just a single matrix at the instance.
Positions get transformed as point, normals as normal applying the inverse transpose matrix, and optional tangents and bitangents as vectors.

Well, I assumed the acceleration structure hierarchy to be working a bit like scene graphs, and so transformations would be applied to “everything that is attached” to a transformation node. That is probably true for the acceleration structure itself, but then you mentioned the transformed attributes and I remembered that there’s basically a second set of data in the SBT. In the meantime, the “OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING” setup has also changed, as I now have an instance acceleration structure with two instances, each holding one geometry acceleration structure (floor and center object) and a suitable transformation matrix (identity for the floor and a rotation around the y-axis for the center object).

So, I started changing my closest hit program using some snippets from https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/shaders/closesthit.cu and just ended up with an

[02][ERROR       ]: Validation mode caught builtin exception OPTIX_EXCEPTION_CODE_INVALID_RAY
Error recording resource event on user stream (CUDA error string: unspecified launch failure, CUDA error code: 719)

So, I was wondering what the function optixGetTransformListHandle assumes as index value (unfortunately no further description in the API documentation). Is it 0 for the transformation matrix of the first instance and 1 for the matrix of the second? And is there a preferred way to query this value? Right now, I added a value containing an index of the current model (either 0 for the ground or 1 for the center object) to the sbt_data.

EDIT:
I found out that the trouble-causing part of the code is the computation of the ray origin, previously computed as surface_position:

const glm::vec3 surface_position = (1.f - u - v) * sbt_data.vertex[index.x] + u * sbt_data.vertex[index.y] + v * sbt_data.vertex[index.z];

so, as you mentioned that positions get transformed as points, I used this code to transform surface_position:

__forceinline__ __device__ void get_transforms(int active_instance, float4* mat_w, float4* mat_q)
{
	OptixTraversableHandle handle = optixGetTransformListHandle(active_instance);

	const float4* tmp_W = optixGetInstanceTransformFromHandle(handle);
	const float4* tmp_Q = optixGetInstanceInverseTransformFromHandle(handle);

	for (int i = 0; i < 2; ++i)
	{
		mat_w[i] = tmp_W[i];
		mat_q[i] = tmp_Q[i];
	}
}

__forceinline__ __device__ float3 transform_point(const float4* m, const float3& v)
{
	return make_float3(
		m[0].x * v.x + m[0].y * v.y + m[0].z * v.z + m[0].w,
		m[1].x * v.x + m[1].y * v.y + m[1].z * v.z + m[1].w,
		m[2].x * v.x + m[2].y * v.y + m[2].z * v.z + m[2].w);
}

// --- how I get the transform matrices ---
float4 object_to_world[3];
float4 world_to_object[3];
get_transforms(sbt_data.mesh_id, object_to_world, world_to_object);
// ----------------------------------------

// --- how I try to convert the surface_position ---
float3 _a = transform_point(object_to_world, make_flt3(sbt_data.vertex[index.x]));
float3 _b = transform_point(object_to_world, make_flt3(sbt_data.vertex[index.y]));
float3 _c = transform_point(object_to_world, make_flt3(sbt_data.vertex[index.z]));

const glm::vec3 surface_position = (1.f - u - v) * make_vec3(_a) + u * make_vec3(_b) + v * make_vec3(_c);
// -------------------------------------------------

make_vec3 and make_flt3 are just functions to convert between float3 and glm::vec3, as I use glm for all the math so far… The issue seems to be that surface_position.z ends up as NaN.

Thank you very much for your help
Markus

Well, I assumed the acceleration structure hierarchy to be working a bit like scene graphs, and so transformations would be applied to “everything that is attached” to a transformation node

No, the transform list, which is the ordered list of all transform matrices (instance, motion, static) along the path from the traversable handle used inside the optixTrace call down to the leaf GAS node inside the render graph, is used to inverse-transform the ray into object coordinate space to do the ray-primitive intersection.

Note that the ray is in different coordinate spaces in the different program domains.
Raygen, closest-hit and miss programs have it in world space; any-hit and intersection programs have it in object space.

The “world space” coordinate system is defined by the current transformation list.

The transformations can’t be applied automatically to your vertex attributes, because OptiX doesn’t know anything about the device code like the closest hit programs you’re providing.
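As a side note, the OptiX device API also provides built-in helpers that apply the whole transform list for you, so a closest-hit program can do the object-to-world transform without fetching the matrices manually. A minimal sketch (program name and attribute values are placeholders, not the shader from this thread):

#include <optix.h>

extern "C" __global__ void __closesthit__sketch()
{
	// ... interpolate position and normal from the SBT vertex buffers (object space) ...
	float3 position_object = make_float3(0.f, 0.f, 0.f);  // placeholder
	float3 normal_object   = make_float3(0.f, 1.f, 0.f);  // placeholder

	// Points are transformed as points, normals via the inverse transpose:
	const float3 position_world = optixTransformPointFromObjectToWorldSpace(position_object);
	float3       normal_world   = optixTransformNormalFromObjectToWorldSpace(normal_object);
	// Use position_world as the shadow ray origin and re-normalize normal_world before shading.
}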

So, I was wondering what the function optixGetTransformListHandle assumes as index value (unfortunately no further description in the API documentation).

https://raytracing-docs.nvidia.com/optix7/guide/index.html#device_side_functions#transform-list

Is it 0 for the transformation matrix of the first instance and 1 for the matrix of the second? And is there a preferred way to query this value? Right now, I added a value containing an index of the current model (either 0 for the ground or 1 for the center object) to the sbt_data.

No, there are two errors in your get_transforms() function.
You’re using the wrong transform list index and you didn’t copy the third row of each matrix, which is the reason for your NaN results.

OptixTraversableHandle handle = optixGetTransformListHandle(active_instance); // BUG

for (int i = 0; i < 2; ++i) // BUG

Maybe just use these:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_runtime/shaders/closesthit.cu#L46
or these which work with the handle:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo10/shaders/transform.h
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo10/shaders/brdf_diffuse.cu#L78

Each object will report its current transform list from root traversable (the handle used inside the optixTrace call) to the leaf node containing the hit primitive.
In a render graph using OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING there is only one instance matrix above each GAS, and that is the index 0 in the transform list.
As you see in my introduction example code, I have multiple instances with different transforms and always use the hardcoded index 0 to get the current transformation matrix and its inverse inside the closest hit program.

If you want to see how to concatenate the transform matrices inside a deeper hierarchy (when using multiple levels of IAS or motion or static transforms) in the OptiX render graph, have a look at the OptiX helper functions provided inside the optix_7_device_impl_transformations.h header.

I’m using that inside the intro_motion_blur example here:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/intro_motion_blur/shaders/closesthit.cu#L72

Of course that could also be used for the single instance case but would be overkill.
(If you wonder, the transform functions I use in my other examples are older than that helper header, which also implements the individual transforms for points, normals and vectors.)

Right, invalid ray exceptions normally happen when any of the ray origin, direction, t_min, t_max or time components are NaN, or if t_min >= t_max or t_min < 0.0f.

Hello,

thank you for your help.

I was testing around with the index value for optixGetTransformListHandle(...) yesterday and already had it set to 0, so all I had to change today was, as you pointed out: for (int i = 0; i < 2; ++i) // BUG. Yes, it’s a bug. I missed the forest for the trees when I had an earlier look at the exact closesthit.cu#L46 file you pointed out. Everything immediately worked when I changed the upper limit to 3:

Also, thank you for the link to the documentation of the transform list. When I googled the function name, I ended up at https://raytracing-docs.nvidia.com - OptiX 7.6 - optixGetTransformListHandle and was a bit helpless from there.

Thank you again for your help. That solved my issue!
