Hello everyone, I am trying to bake ambient occlusion with OptiX on a triangle mesh.
To do this, I generate numberOfRays = numberOfSamples * numberOfVertices rays.
My hardware is an AMD Ryzen 5 3600 and a GTX 1070.
I use OptiX 7.2, NVCC 11.2, and GCC 10.2.0.
The following code launches the pipeline:
void launchAoPipeline(AoState& aoState)
{
    // Upload the launch parameters to the device
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&aoState.d_params), sizeof(Params)));
    CUDA_CHECK(cudaMemcpy(reinterpret_cast<void*>(aoState.d_params), &aoState.params, sizeof(Params),
                          cudaMemcpyHostToDevice));
    // One launch index per (vertex, sample) pair
    unsigned int numberOfRays = aoState.params.mesh.numVertices * aoState.params.aoSamples;
    std::cout << "\n\n\n\n" << numberOfRays << "\n\n\n\n";
    // Launch pipeline
    OPTIX_CHECK(optixLaunch(aoState.pipeline, nullptr /*default stream*/,
                            reinterpret_cast<CUdeviceptr>(aoState.d_params), sizeof(Params), &aoState.sbt,
                            numberOfRays, 1, 1));
    CUDA_SYNC_CHECK();
}
The following are my device programs:
extern "C"
{
    __constant__ Params params;
}
extern "C" __global__ void __raygen__ao()
{
    // Lookup location in the launch grid
    const unsigned int vertexId = optixGetLaunchIndex().x / params.aoSamples;
    const unsigned int rayId = optixGetLaunchIndex().x % params.aoSamples;
    // The origin of the ray is the location of the current vertex
    float3& rayOrigin = params.mesh.vertices[vertexId];
    float3& normal = params.mesh.normals[vertexId];
    float3 rayDirection = params.rayDirections[rayId];
    if (dot(rayDirection, normal) < 0)
    {
        // Reverse the ray so it points into the hemisphere around the normal
        rayDirection *= -1;
    }
    // Cast ray
    optixTrace(params.gasHandle, rayOrigin, rayDirection, 0.0f, 1e16f, 0.0f, OptixVisibilityMask(255),
               OPTIX_RAY_FLAG_TERMINATE_ON_FIRST_HIT, 0, 0, 0);
}
extern "C" __global__ void __closesthit__ao()
{
    // Count the hit against the vertex this ray was cast from
    const unsigned int vertexId = optixGetLaunchIndex().x / params.aoSamples;
    atomicAdd(&params.mesh.rayHits[vertexId], 1);
}
The surface that I am testing has:
Surface vertices: 17,567,820
Surface triangles: 35,148,720
With 32 samples, numberOfRays = 562,170,240 and everything runs just fine.
With 64 samples, numberOfRays = 1,124,340,480 and I get the following error:
[ 2][ ERROR]: Error launching work to RTX
terminate called after throwing an instance of 'sutil::Exception'
what(): OPTIX_ERROR_LAUNCH_FAILURE: Optix call 'optixLaunch(aoState.pipeline, nullptr , reinterpret_cast<CUdeviceptr>(aoState.d_params), sizeof(Params), &aoState.sbt, numberOfRays, 1, 1)'
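To rule out a plain integer overflow in numberOfRays, the two ray counts can be checked on the host. This is only a minimal sketch of that arithmetic: rayCount and fitsInUnsignedInt are hypothetical helper names, not part of my project or of OptiX, and the check assumes a 32-bit unsigned int.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: ray count computed in 64 bits so the product cannot wrap.
constexpr std::uint64_t rayCount(std::uint64_t numVertices, std::uint64_t aoSamples)
{
    return numVertices * aoSamples;
}

// Hypothetical helper: true if n is representable in the unsigned int
// that is passed as the launch width.
constexpr bool fitsInUnsignedInt(std::uint64_t n)
{
    return n <= std::numeric_limits<unsigned int>::max();
}

// Vertex count and sample counts taken from the numbers in this post.
static_assert(rayCount(17567820, 32) == 562170240, "32 samples");
static_assert(rayCount(17567820, 64) == 1124340480, "64 samples");
static_assert(fitsInUnsignedInt(rayCount(17567820, 64)), "no 32-bit overflow");
```

Both counts match the values printed by launchAoPipeline and both fit in an unsigned int, so the failing launch width is not the result of a wrapped multiplication.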
Any ideas why this might be happening? Is there some kind of limit on the launch size? If so, do I need to split the work into several smaller launches to stay under that limit?
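If splitting is the way to go, this is roughly what I have in mind on the host side. It is a sketch only: RayBatch, splitIntoBatches, and maxRaysPerLaunch are hypothetical names of mine, and the cap is whatever value the caller picks, not a constant taken from OptiX.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of one sub-launch:
// rays [firstRay, firstRay + numRays) of the full job.
struct RayBatch
{
    std::uint64_t firstRay;
    unsigned int  numRays;
};

// Split totalRays into consecutive batches of at most maxRaysPerLaunch rays.
// maxRaysPerLaunch is a caller-chosen cap, not an OptiX constant.
std::vector<RayBatch> splitIntoBatches(std::uint64_t totalRays, unsigned int maxRaysPerLaunch)
{
    std::vector<RayBatch> batches;
    if (maxRaysPerLaunch == 0)
    {
        return batches; // nothing sensible to do with a zero cap
    }
    for (std::uint64_t first = 0; first < totalRays; first += maxRaysPerLaunch)
    {
        const std::uint64_t remaining = totalRays - first;
        const unsigned int count =
            static_cast<unsigned int>(std::min<std::uint64_t>(remaining, maxRaysPerLaunch));
        batches.push_back({first, count});
    }
    return batches;
}
```

Each batch would then be one optixLaunch call with width numRays, and the raygen program would have to add the batch's firstRay (passed through an extra field in Params, which does not exist in my code yet) to optixGetLaunchIndex().x before computing vertexId and rayId. With an arbitrary cap of 1u << 29, the 64-sample case would split into three launches.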