Hello,
I am working on setting up an OptiX program that will render an array of depth values given some camera eye locations and an elevation map.
I have followed along with Ingo Wald’s optix7 course as a starting point. I have posted previously about some issues I was seeing in the depth image, but I think those have been resolved. I wanted to go ahead and try to set up multiple cameras.
Really what I want is to generate a bunch of rays from far above the surface and “sample” the depths at different locations. From my understanding of what I have read so far, I can do a 1D, 2D, or 3D launch as long as the launch indices are handled properly. I opted to try this in 3D, as that is ultimately the shape of the array I will be working with for camera positions.
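To show what I mean by handling the launch index properly, here is a minimal host-side sketch of the linearization I am using in the RayGen program. Index3 and linearIndex are hypothetical names for illustration only (Index3 stands in for CUDA's uint3); they are not part of OptiX or the course code.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's uint3 so this compiles on the host.
struct Index3 { uint32_t x, y, z; };

// Linearize a 3D launch index: x varies fastest, then y, then z.
inline uint32_t linearIndex(Index3 idx, Index3 dim)
{
    return idx.z * dim.y * dim.x + idx.y * dim.x + idx.x;
}
```

With a launch size of (2,2,2), index (1,1,1) maps to element 7, so I believe the camera buffer needs 8 contiguous entries stored with the x index changing fastest.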
I added this struct inside of my launchParams:
struct
{
    vec3f vertical;
    vec3f direction;
    vec3f* cameraLocationsBuffer;
    uint32_t* depthBuffer;
    vec3i size;
} cameras;
I modified my RayGen as:
extern "C" __global__ void __raygen__renderTrajectoryElevations()
{
    const uint3 idx = optixGetLaunchIndex();
    const uint3 dim = optixGetLaunchDimensions();
    const uint32_t camIndex = idx.z * dim.y * dim.x + idx.y * dim.x + idx.x;
    // depth payload
    uint32_t u0;
    vec3f camPosition = optixLaunchParams.cameras.cameraLocationsBuffer[camIndex];
    vec3f rayDir = optixLaunchParams.cameras.direction;
    optixTrace(optixLaunchParams.traversable,
               camPosition,
               rayDir,
               0.f,    // t_min
               1e3f,   // t_max
               0.0f,   // rayTime
               OptixVisibilityMask( 255 ),
               OPTIX_RAY_FLAG_DISABLE_ANYHIT, // OPTIX_RAY_FLAG_NONE,
               SURFACE_RAY_TYPE,              // SBT Offset
               RAY_TYPE_COUNT,                // SBT Stride
               SURFACE_RAY_TYPE,              // miss SBT Index
               u0);
    // get depth from payload
    const float depth = __uint_as_float( u0 );
    // and write to frame buffer ...
    optixLaunchParams.cameras.depthBuffer[camIndex] = depth;
}
And my closest hit as:
extern "C" __global__ void __closesthit__radiance()
{
    // get depth
    const float t = optixGetRayTmax();
    optixSetPayload_0( __float_as_uint( t ) );
}
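As I understand it, the payload register only carries raw 32-bit values, so the depth travels from the closest hit program to RayGen by reinterpreting the float's bits rather than converting its value. Below is a host-only sketch of that round trip. floatAsUint and uintAsFloat are hypothetical stand-ins that I wrote with memcpy to mimic the CUDA intrinsics __float_as_uint and __uint_as_float; they are not OptiX functions.

```cpp
#include <cstdint>
#include <cstring>

// Host-side stand-ins for the CUDA intrinsics __float_as_uint / __uint_as_float:
// they copy the 32 bits unchanged rather than converting the numeric value.
inline uint32_t floatAsUint(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

inline float uintAsFloat(uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}
```

By contrast, a plain assignment of a float to a uint32_t, like the final write to depthBuffer in my RayGen program, converts the value (12.5f would be stored as 12), so I suspect I may want a float* buffer there, or to store u0 directly.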
In my render() function I call launch like this:
if (launchParams.trajectory_cameras.size.x == 0) return;
launchParamsBuffer.upload(&launchParams,1);
OPTIX_CHECK(optixLaunch(pipeline,
                        stream,
                        launchParamsBuffer.d_pointer(),
                        launchParamsBuffer.sizeInBytes,
                        &sbt,
                        launchParams.cameras.size.x,
                        launchParams.cameras.size.y,
                        launchParams.cameras.size.z));
CUDA_SYNC_CHECK();
There is also this resize function, which I have modified. I am not sure whether it is important:
void SampleRenderer::resize_elevationDepthBuffer(const vec3i &newSize) {
    trajDepthBuffer.resize(newSize.x*newSize.y*newSize.z*sizeof(uint32_t));
    launchParams.cameras.size = newSize;
    launchParams.cameras.depthBuffer = (uint32_t*)depthBuffer.d_pointer();
}
And in my header I have added some new CUDABuffers:
CUDABuffer cameraLocationsBuffer;
CUDABuffer depthBuffer;
Finally, in main I call my render program and get an illegal memory access:
Model *model = loadOBJ("../models/myscene_smooth.obj");
SampleRenderer sample(model);
// should I use BOOST here? vector<vector<double>>(4, vector<double>(5)));
std::vector<std::vector<std::vector<vec3f>>> camLocations(2,
    std::vector<std::vector<vec3f>> (2,std::vector<vec3f> (2)));
const int K = camLocations.size();
const int M = camLocations[0].size();
const int N = camLocations[0][0].size();
// add some random camera locations to test with
for (int k=0; k<K; k++) {
    for (int m=0; m<M; m++) {
        for (int n=0; n<N; n++) {
            camLocations[k][m][n] = vec3f(-23.0f*float(k+1),4.6f*float(m+1),30.7f*float(n+1));
        }
    }
}
// get number of cameras
const int numPoints = K*M*N;
// all cameras look down(at surface)
const vec3f camDirection = (0.0,0.0,-1.0f);
const vec3f camUp = (0.0,0.0,1.0f);
sample.launchParams.cameras.direction = camDirection;
sample.launchParams.cameras.vertical = camUp;
// allocate memory
sample.cameraLocationsBuffer.resize(numPoints*sizeof(vec3f));
sample.cameraLocationsBuffer.alloc_and_upload(camLocations);
sample.resize_elevationDepthBuffer(vec3i(2,2,2));
sample.render_trajectory_elevations(); // fails on this call
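To make my concern about the allocation concrete: a nested std::vector is not one contiguous block of vec3f values, since each inner vector owns its own heap allocation, so I suspect the upload above copies the vector bookkeeping rather than the camera positions. A minimal host-only sketch of flattening the data first is below. Vec3 is a hypothetical stand-in for the course's vec3f, and Grid3 and flattenCameras are names I made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the course's vec3f (three packed floats).
struct Vec3 { float x, y, z; };

using Grid3 = std::vector<std::vector<std::vector<Vec3>>>;

// Copy a K x M x N nested vector into one contiguous array,
// with the innermost index n changing fastest.
inline std::vector<Vec3> flattenCameras(const Grid3& grid)
{
    std::vector<Vec3> flat;
    for (const auto& plane : grid)
        for (const auto& row : plane)
            flat.insert(flat.end(), row.begin(), row.end());
    return flat;
}
```

The flat vector could then be handed to alloc_and_upload, which, as far as I can tell from the course code, expects a std::vector of plain elements. With this layout the innermost index n would line up with the launch x index, so the launch size would presumably need to be (N, M, K).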
Any thoughts on where I have gone wrong? I am almost certain it is the way I am allocating the memory, but I guess it could also be the way I am using it in the RayGen program. The program fails in the call to CUDA_SYNC_CHECK() right after the call to optixLaunch. Are there any glaring issues with my approach? Can you provide any code snippets or pointers on how I can improve? I appreciate any and all feedback, as I am really hoping to understand the hows and whys as well as best practices with OptiX/CUDA.
Thank you!
Benjamin