Instance pointer build input

droettger · July 19, 2021, 12:52pm

Except for internal unit tests, I haven’t found a public example using OPTIX_BUILD_INPUT_TYPE_INSTANCE_POINTERS either.

This API reference explains the alignment requirements for OptixInstances and the arrays inside the build input:
https://raytracing-docs.nvidia.com/optix7/api/html/struct_optix_build_input_instance_array.html

CUdeviceptr OptixBuildInputInstanceArray::instances

If OptixBuildInput::type is OPTIX_BUILD_INPUT_TYPE_INSTANCE_POINTERS instances and aabbs should be interpreted as arrays of pointers instead of arrays of structs.
This pointer must be a multiple of OPTIX_INSTANCE_BYTE_ALIGNMENT if OptixBuildInput::type is OPTIX_BUILD_INPUT_TYPE_INSTANCES.
The array elements must be a multiple of OPTIX_INSTANCE_BYTE_ALIGNMENT if OptixBuildInput::type is OPTIX_BUILD_INPUT_TYPE_INSTANCE_POINTERS.

That is pretty clear about the alignment requirements (with OPTIX_INSTANCE_BYTE_ALIGNMENT == 16ull):

When using an array of OptixInstances, the device pointer to the array needs to be 16 byte aligned.
Since the OptixInstance struct is manually padded to a size of 80 bytes, which is a multiple of 16, all OptixInstance elements in that array are then 16 byte aligned as well.
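To make that arithmetic concrete, here is a minimal host-side sketch. InstanceStandIn, kInstanceAlignment, elementAddress() and isAligned() are hypothetical names for illustration only. The struct is a stand-in whose field layout is my assumption about an 80 byte OptixInstance, so the code compiles without the OptiX headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for OptixInstance (assumed layout, NOT taken from the OptiX headers).
struct InstanceStandIn {
    float         transform[12];     // 48 bytes
    unsigned int  instanceId;        //  4 bytes
    unsigned int  sbtOffset;         //  4 bytes
    unsigned int  visibilityMask;    //  4 bytes
    unsigned int  flags;             //  4 bytes
    std::uint64_t traversableHandle; //  8 bytes
    unsigned int  pad[2];            //  8 bytes of manual padding
};

// Value of OPTIX_INSTANCE_BYTE_ALIGNMENT as stated in the post.
constexpr std::uint64_t kInstanceAlignment = 16ull;

static_assert(sizeof(InstanceStandIn) == 80, "stand-in should be 80 bytes");
static_assert(sizeof(InstanceStandIn) % kInstanceAlignment == 0,
              "element stride must be a multiple of the alignment");

// Address of element i in a tightly packed array starting at 'base'.
constexpr std::uint64_t elementAddress(std::uint64_t base, std::size_t i) {
    return base + i * sizeof(InstanceStandIn);
}

// True if 'address' is a multiple of 'alignment'.
constexpr bool isAligned(std::uint64_t address, std::uint64_t alignment) {
    return (address % alignment) == 0;
}
```

With a 16 byte aligned base, every elementAddress(base, i) stays aligned because the stride is 80 = 5 * 16. With a base that is only 8 byte aligned, every element is misaligned.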

If you’re using an array of pointers to OptixInstances, then each pointer in that array must point to a 16 byte aligned device address because the OptixInstance needs to be 16 byte aligned.

A CUdeviceptr itself is 64 bit and needs to be 8 byte aligned.

The alignment of the build input array itself, whether it holds instances or instance pointers, shouldn’t be a problem when allocating the memory with cudaMalloc() or cuMemAlloc(), which return addresses that are at least 256 byte aligned.

So in your case you first need to make sure that the individual pointers to the OptixInstances are all aligned to 16 bytes.
Just add an assert((device_pointer & 15ull) == 0) to all your individual OptixInstance pointers in your build input array.
If that fires inside the debugger, you need to place the OptixInstance field in your own structures at a properly aligned offset and potentially pad your structure’s size.
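A minimal sketch of that check, run on a host-side copy of the pointer array before uploading it. DevicePtr is a stand-in for CUdeviceptr so this compiles without the CUDA headers, and firstMisalignedInstance() is a hypothetical helper name, not an OptiX or CUDA function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for CUdeviceptr (assumed to be a 64 bit unsigned integer).
using DevicePtr = std::uint64_t;

// Low four bits must be zero for a 16 byte aligned address.
constexpr DevicePtr kInstanceAlignmentMask = 15ull;

// Returns the index of the first pointer that is not 16 byte aligned,
// or -1 if all pointers in the array are properly aligned.
inline std::ptrdiff_t firstMisalignedInstance(const std::vector<DevicePtr>& pointers) {
    for (std::size_t i = 0; i < pointers.size(); ++i) {
        if ((pointers[i] & kInstanceAlignmentMask) != 0) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

Calling assert(firstMisalignedInstance(instancePointers) == -1) right before copying the array to the device gives the same effect as the per-pointer assert, and the returned index tells you which of your structures holds the misaligned OptixInstance.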

Since the OptixInstance struct itself isn’t declared with __align__(OPTIX_INSTANCE_BYTE_ALIGNMENT) (which I think should have been added inside the OptiX SDK), the compiler might have placed it at a misaligned offset in your structure, for the first or for later elements.
You can use __align__ on that field to let the compiler place it at a properly aligned offset in your own structures automatically, but beware of the additional padding this can add inside the struct.
Many of the OptiX SDK examples use __align__ this way for their Shader Binding Table record structures.
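A small sketch of the effect, using the standard C++ alignas specifier as a host-only equivalent of CUDA's __align__. InstanceLike is a stand-in with an assumed 80 byte layout. PackedNode and AlignedNode are hypothetical user structures, not anything from the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for OptixInstance (assumed layout, natural alignment below 16 bytes).
struct InstanceLike {
    float         transform[12];
    unsigned int  ids[4];
    std::uint64_t handle;
    unsigned int  pad[2];
};

// Without an alignment specifier the instance lands right after meshId
// at the struct's natural alignment (offset 8 on a typical 64 bit ABI).
struct PackedNode {
    unsigned int meshId;
    InstanceLike instance;
};

// With alignas(16) the compiler inserts padding so that the instance starts
// at a multiple of 16 and rounds the struct size up to a multiple of 16.
struct AlignedNode {
    unsigned int meshId;
    alignas(16) InstanceLike instance;
};

constexpr std::size_t kPackedOffset  = offsetof(PackedNode, instance);
constexpr std::size_t kAlignedOffset = offsetof(AlignedNode, instance);
```

Note that the extra padding in AlignedNode is the "additional padding" mentioned above: the structure grows, and that matters if you compute offsets or strides by hand.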

My approach for device side structures is to order their fields by CUDA alignment restrictions from big to small and pad them manually to the largest alignment needed in a struct.
https://forums.developer.nvidia.com/t/preferred-alignment-for-buffers/107532/2
https://forums.developer.nvidia.com/t/optic-7-passing-multiple-ray-data-to-closesthit-program/160005/4
The compilers will normally handle the alignment of built-in types, but ordering the fields this way also makes sure no inadvertent padding is added between fields inside the structure, which keeps the structures as small as possible.
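As a rough illustration of that ordering rule, here is a host-only sketch. Vec4 is a stand-in for a 16 byte aligned vector type such as CUDA's float4, and UnorderedParams and OrderedParams are made-up structures for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a 16 byte aligned vector type (assumption: behaves like float4).
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// Fields in arbitrary order: the compiler inserts hidden padding
// after 'count' so that 'color' starts at a multiple of 16.
struct UnorderedParams {
    int           count;
    Vec4          color;
    std::uint64_t buffer;
};

// Fields ordered from largest to smallest alignment, then padded manually
// to a multiple of the largest alignment in the struct (16 bytes).
struct OrderedParams {
    Vec4          color;  // 16 byte alignment, offset  0
    std::uint64_t buffer; //  8 byte alignment, offset 16
    int           count;  //  4 byte alignment, offset 24
    unsigned int  pad0;   //  explicit padding, offset 28
};

static_assert(sizeof(OrderedParams) == 32, "no hidden padding expected");
static_assert(sizeof(OrderedParams) % alignof(OrderedParams) == 0,
              "size is a multiple of the largest alignment");
```

The ordered version holds the same data in 32 bytes with all padding visible in the declaration, while the unordered version is larger because of padding the compiler added silently.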
