Thanks for your clear explanation!
I understand where the confusion came from. What I'm actually trying to do is free buffer #4 after building the GAS or IAS. It seems that the data in buffer #2 is packed and copied into the output GAS or IAS buffer.
void buildAabbGas() {
    CUdeviceptr d_aabb_buffer; // Buffer #AABB_2
    OptixBuildInput buildinput = {};
    // Set up build inputs
    ...
    buildinput.aabbArray.aabbBuffers = &d_aabb_buffer;
    // Allocate buffers for the GAS build
    CUdeviceptr output_buffer; // Buffer #AABB_4
    CUdeviceptr temp_buffer;   // Buffer #AABB_3
    // The handle is an output parameter, so it is passed by address
    optixAccelBuild(context, 0, &accel_options, &buildinput, 1, temp_buffer, temp_buffer_SizeInBytes,
                    output_buffer, output_buffer_SizeInBytes, &aabb_handle, nullptr, 0);
    cudaFree(reinterpret_cast<void*>(d_aabb_buffer));
    cudaFree(reinterpret_cast<void*>(output_buffer)); // Release buffer #AABB_4
    cudaFree(reinterpret_cast<void*>(temp_buffer));
}
void buildTriangleGas() {
    CUdeviceptr d_tri_buffer; // Buffer #TRI_1
    OptixBuildInput buildinput = {};
    // Set up the build input data
    buildinput.triangleArray.vertexBuffers = &d_tri_buffer;
    // Allocate buffers for the GAS build
    CUdeviceptr output_buffer; // Buffer #TRI_4
    CUdeviceptr temp_buffer;   // Buffer #TRI_3
    // The handle is an output parameter, so it is passed by address
    optixAccelBuild(context, 0, &accel_options, &buildinput, 1, temp_buffer, temp_buffer_SizeInBytes,
                    output_buffer, output_buffer_SizeInBytes, &tri_handle, nullptr, 0);
    cudaFree(reinterpret_cast<void*>(d_tri_buffer));
    cudaFree(reinterpret_cast<void*>(output_buffer)); // Release buffer #TRI_4
    cudaFree(reinterpret_cast<void*>(temp_buffer));
}
void buildIas() {
    // Build the GASes
    buildAabbGas();
    buildTriangleGas();
    // Set up the host-side OptixInstance array
    OptixInstance optix_instance[2];
    optix_instance[0].traversableHandle = tri_handle;
    optix_instance[1].traversableHandle = aabb_handle;
    CUdeviceptr instances_buffer; // Buffer #IAS_2
    // Copy the host OptixInstance array to the device pointer
    ...
    // Set up build inputs
    OptixBuildInput instance_input = {};
    instance_input.instanceArray.instances = instances_buffer;
    // Allocate buffers for the IAS build
    CUdeviceptr output_buffer; // Buffer #IAS_4
    CUdeviceptr temp_buffer;   // Buffer #IAS_3
    // The handle is an output parameter, so it is passed by address
    optixAccelBuild(context, 0, &accel_options, &instance_input, 1, temp_buffer, temp_buffer_SizeInBytes,
                    output_buffer, output_buffer_SizeInBytes, &ias_handle, nullptr, 0);
    cudaFree(reinterpret_cast<void*>(instances_buffer));
    cudaFree(reinterpret_cast<void*>(output_buffer)); // Release buffer #IAS_4
    cudaFree(reinterpret_cast<void*>(temp_buffer));
}
Building an IAS now involves quite a few buffers. The temp buffers and input data buffers can be released once the corresponding GAS or IAS has been built, but I'm not sure when the output buffers can be released. Here is what I have observed in my code so far:
- The output buffer of the AABB GAS can apparently be released safely after the IAS has been built in buildIas().
- The output buffer of the IAS can apparently be released after the IAS has been built in buildIas().
- The output buffer of the AABB GAS cannot be released immediately after the GAS has been built in buildAabbGas(). Otherwise the AABB primitive disappears.
- Releasing the output buffer of the triangle GAS at the end of buildTriangleGas() or buildIas() is risky. The app works with a small number of vertices, but I get an illegal memory access once I increase the vertex count.
optixCutouts_exp takes the approach of releasing all the output buffers in its cleanupState() function. I'm wondering whether they can be released earlier to reduce device memory usage.
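To make the question concrete, here is a minimal sketch of the ownership scheme I have in mind. It assumes (as I understand the OptiX programming guide) that a traversable handle keeps pointing into its output buffer, so buffer #4 has to stay alive for as long as the handle is used, including by an IAS that references the GAS, while the input and temp buffers can go as soon as the build has finished. ScopedBuffer, OwnedAccel, buildAccel and g_live_bytes are hypothetical names, and std::malloc/std::free stand in for cudaMalloc/cudaFree so the sketch runs without a GPU. It only models the lifetimes and is not real OptiX code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <utility>

// Tracks how much stand-in "device" memory is currently allocated.
static std::size_t g_live_bytes = 0;

// Hypothetical RAII wrapper: frees its allocation when it goes out of scope.
// std::malloc/std::free are placeholders for cudaMalloc/cudaFree.
class ScopedBuffer {
public:
    ScopedBuffer() = default;
    explicit ScopedBuffer(std::size_t bytes)
        : ptr_(std::malloc(bytes)), bytes_(bytes) {
        g_live_bytes += bytes_;
    }
    ScopedBuffer(ScopedBuffer&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)),
          bytes_(std::exchange(other.bytes_, 0)) {}
    ScopedBuffer& operator=(ScopedBuffer&& other) noexcept {
        if (this != &other) {
            release();
            ptr_ = std::exchange(other.ptr_, nullptr);
            bytes_ = std::exchange(other.bytes_, 0);
        }
        return *this;
    }
    ScopedBuffer(const ScopedBuffer&) = delete;
    ScopedBuffer& operator=(const ScopedBuffer&) = delete;
    ~ScopedBuffer() { release(); }

    void* get() const { return ptr_; }

private:
    void release() {
        if (ptr_) {
            std::free(ptr_);
            g_live_bytes -= bytes_;
        }
        ptr_ = nullptr;
        bytes_ = 0;
    }
    void* ptr_ = nullptr;
    std::size_t bytes_ = 0;
};

// Hypothetical pairing of a handle with the output buffer it points into,
// so the buffer cannot be freed while the handle is still around.
struct OwnedAccel {
    std::uintptr_t handle = 0;  // stands in for OptixTraversableHandle
    ScopedBuffer output;        // buffer #4
};

// Models one build: the input (#1/#2) and temp (#3) buffers are freed when
// the function returns; the output (#4) travels with the handle.
OwnedAccel buildAccel(std::size_t input_bytes, std::size_t temp_bytes,
                      std::size_t output_bytes) {
    ScopedBuffer input(input_bytes);
    ScopedBuffer temp(temp_bytes);
    OwnedAccel accel;
    accel.output = ScopedBuffer(output_bytes);
    // A real build would call optixAccelBuild(...) here and wait for the
    // stream to finish before input and temp are released.
    accel.handle = reinterpret_cast<std::uintptr_t>(accel.output.get());
    return accel;
}
```

With this layout, the two GAS objects and the IAS object would be kept as members of the application state and destroyed only after the last launch. The only memory released early is the input and temp buffers.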