This will take longer.
Let’s first analyze what you’ve programmed so far.
I’m only quoting the questionable code lines plus my new comments.
1.)
// NV What is going on here?!
// NV Why are you setting all ray origins to the same hardcoded point?
// NV Where inside the scene geometry is that?
// NV Note that I do not call these "camera positions" because then you wouldn't need a 3D array of them.
*p++ = vec3f(23.0f, 4.6f, 30.7f);
2.)
// NV Where inside the model is that relative to the hardcoded ray origins above?
const vec3f camAt = model->bounds.center();
3.)
// NV And what is going on here?
// NV Now you set the camera direction to the normalized vector from the hardcoded camera ray origin to the model bounds center for all rays.
// NV Since you set all ray origins to the same point anyway, you only reference the first to set a ray direction for all cameras. Then you wouldn't need the 3D ray origins and could put this into the launch parameters.
vec3f camDirection = normalize(camAt - camLocations[0]);
4.)
// NV Now where does the ray origin lie relative to the terrain?
// NV Where does the direction go relative to the terrain?
// NV And for good measure you normalize the already normalized vector again.
sample.launchParams.cameras.direction = normalize(camDirection);
5.)
// NV I'm assuming this resizes the float* distanceBuffer to CAMERA_XRES * CAMERA_YRES * NUM_CAMERAS elements receiving the distance results?
sample.resize(vec3i(CAMERA_XRES, CAMERA_YRES, NUM_CAMERAS));
Possible reasons for why this hardcoded ray doesn’t hit the terrain mesh:
a) The center of the model bounds lies above the ray origin, and if that origin is above the terrain, all your identical(!) rays are going upwards.
b) Similarly, when the ray origin and the model bounds center are at similar heights, the ray goes horizontally over the terrain.
c) The ray origin is below the terrain and the direction goes further downward.
I was doing this previously in a mesh grid using interp2D.
If your elevation data is a regular grid of height values, then the height at each point on that grid can be determined with a linearly interpolated texture lookup in hardware.
This only works for the height-over-ground result you currently get when shooting rays straight down.
This does not work when the elevation data is not a regular grid but randomly sized triangles. (That would require rasterizing the terrain as triangles.)
This does not work when you want to render a camera projection from some single camera position, because then most rays wouldn’t shoot straight down to the terrain but at an angle and you need the closest hit which the ray tracer provides easily.
There are two things I was attempting to do. The depth/distance to surface was just the first part.
The second thing I want to do is, just as you mention, to render images from poses around the elevation map. In this first attempt I was trying to
a) see if I could use Optix to get the same results I was getting with interp2D,
There is no problem in implementing that with OptiX.
I could implement what you need in about an hour in my own example framework.
b) get a better understanding of allocating memory/buffers/etc… on the GPU.
Note that the example framework you’re looking at is actually trying to abstract all of the details from the user.
That is not what I would recommend when trying to learn the fundamentals of the OptiX and CUDA APIs required to implement arbitrary OptiX applications.
Eventually I want these camera poses to come from another application running on the GPU so I would hopefully pass them directly from that program and avoid the upload/download overhead.
This would include both the “interpolated distances” as well as the rendered images from various poses; I would do as much processing on the GPU as possible before downloading the final result.
Woah, that opens a completely different can of worms!
Read this thread describing some limitations and issues:
https://forums.developer.nvidia.com/t/long-shot-access-mesh-data-in-different-program-but-already-loaded-in-the-gpu/157117/2
https://forums.developer.nvidia.com/t/optixaccelrelocationinfo-data/167114/6
I would recommend trying other communication methods between processes on the host first.
Can you provide a little more information about this? You mean to say that we can’t render an image looking straight down?
No, of course you can render images with arbitrary camera definitions.
I’m assuming you have read all the links I provided before, explaining how a pinhole camera is set up with a position P and U, V, W vectors spanning a left-handed view frustum.
When defining the pinhole camera directly with the P,U,V,W data, any positioning and orientation is possible.
But there are different ways to define such a camera, with a position, lookat point, up-vector, and a field of view angle.
When using that second method, the up-vector is used to make the camera V vector upright. (This prevents rolling the camera around its forward axis during orbit operations, for example.)
For that to work, the direction vector (lookat - position) and the up-vector must not be collinear, because collinear vectors do not span a plane, and then there exist infinitely many V vectors perpendicular to the forward direction. That problem, resp. the loss of one degree of freedom from the up-vector, is often described as gimbal lock and results in erratic camera behavior.
So in your case, when your ray directions are all (0, 0, -1), the up-vector must not be (0, 0, 1). Just pick a different vector which is not collinear to the ray direction, like (0, 1, 0) or (1, 0, 0) or any other non-collinear vector. It depends on how you want the camera plane to be oriented above your terrain.
Again, if you specify the P, U, V, W data of a standard pinhole camera directly, there should be no issue selecting the right vectors for a downward direction with W along (0, 0, -1) and the U and V vectors orthogonal to that.
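A self-contained sketch of that lookat-style construction, including the collinearity check. The vector helpers are mine, and the U = W × up convention follows the common OptiX SDK pinhole examples; verify it against the camera code you actually use:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; }; // stand-in for CUDA's float3

static Vec3  sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3  cross(Vec3 a, Vec3 b) { return { a.y * b.z - a.z * b.y,
                                              a.z * b.x - a.x * b.z,
                                              a.x * b.y - a.y * b.x }; }
static float dot(Vec3 a, Vec3 b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }
static float length(Vec3 a)        { return std::sqrt(dot(a, a)); }

// Build the U, V, W frustum vectors from position, lookat and up-vector.
// Returns false when the view direction and the up-vector are collinear,
// i.e. the cross product degenerates and no unique orientation exists.
// This is exactly the straight-down case with up == (0, 0, 1).
bool lookatToUVW(Vec3 pos, Vec3 lookat, Vec3 up, Vec3& U, Vec3& V, Vec3& W)
{
  W = sub(lookat, pos); // forward direction (not normalized here)
  U = cross(W, up);
  if (length(U) < 1.0e-6f)
    return false;       // collinear: infinitely many possible V vectors
  V = cross(U, W);      // perpendicular to both U and W
  return true;
}
```

Scaling U and V by the tangents of the half field-of-view angles would complete the projection; that part is omitted here.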
Let’s assume the goal is to get a number of 2D single channel floating point data containing the distance to a camera position above some terrain.
Like when flying with a plane over some landscape and doing aerial photos.
Define a camera struct which fully describes the position and projection, as simple as this:
struct PinholeCamera
{
  float3 P;
  float3 U;
  float3 V;
  float3 W;
};
Define a 1D vector of these camera structs and initialize them with the proper location and projection vectors.
Allocate and upload that to the GPU device and store the pointer to that device data and the number of entries inside the launch parameters.
(If all projection vectors U, V, W are the same for all cameras, there is no need to store them inside the camera struct but they could be put into the launch parameters and only the camera positions would need to be allocated and uploaded as 1D array.)
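Those setup steps could be sketched like this on the host. Everything beyond the PinholeCamera struct itself (the helper function, the shared straight-down frustum vectors, the commented upload) is an assumption for illustration:

```cpp
#include <cassert>
#include <vector>

struct float3 { float x, y, z; }; // host-side stand-in for CUDA's float3

struct PinholeCamera
{
  float3 P;
  float3 U;
  float3 V;
  float3 W;
};

// Build one camera per aerial position. In this sketch all cameras share
// the same straight-down projection; adapt U, V, W to your field of view.
std::vector<PinholeCamera> buildCameras(const std::vector<float3>& positions)
{
  const float3 U = { 1.0f, 0.0f,  0.0f };
  const float3 V = { 0.0f, 1.0f,  0.0f };
  const float3 W = { 0.0f, 0.0f, -1.0f };

  std::vector<PinholeCamera> cameras;
  cameras.reserve(positions.size());
  for (const float3& p : positions)
    cameras.push_back({ p, U, V, W });

  // The upload would then be:
  //   cudaMalloc(&d_cameras, cameras.size() * sizeof(PinholeCamera));
  //   cudaMemcpy(d_cameras, cameras.data(),
  //              cameras.size() * sizeof(PinholeCamera), cudaMemcpyHostToDevice);
  // and d_cameras plus cameras.size() go into the launch parameters.
  return cameras;
}
```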
Allocate the target distance result buffer. If all 2D distance images should have the same resolution you can allocate a 3D array with XRES * YRES * number_of_cameras size, one float per element.
Addressing of the element as linear offset is the same as in your code already.
const uint3 idx = optixGetLaunchIndex();
const uint3 dim = optixGetLaunchDimensions();
const uint32_t distanceIndex = idx.z * dim.y * dim.x +
                               idx.y * dim.x +
                               idx.x;
Note that idx.z (which is the respective xy-2D image slice inside the 3D output buffer) is also the index into the camera array.
So inside the raygen program, use idx.z to get the PinholeCamera from the array, and calculate the resulting projection (ray direction) from the idx.x and idx.y values, as shown in many OptiX examples, for each pixel inside the 2D xy-slice of the output distances.
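As a host-side illustration of that per-pixel calculation (the mapping of the pixel center to [-1, 1] follows the usual OptiX pinhole examples; the helper itself is mine):

```cpp
#include <cassert>
#include <cmath>

struct float3 { float x, y, z; }; // stand-in for CUDA's float3

// Map pixel (x, y) of a width * height image to a normalized ray direction
// through the frustum spanned by U, V, W: the pixel center is transformed
// to d in [-1, 1]^2 on the image plane, then dir = d.x * U + d.y * V + W.
float3 pixelToRayDirection(unsigned x, unsigned y, unsigned width, unsigned height,
                           float3 U, float3 V, float3 W)
{
  const float dx = 2.0f * ((x + 0.5f) / width)  - 1.0f;
  const float dy = 2.0f * ((y + 0.5f) / height) - 1.0f;
  float3 dir = { dx * U.x + dy * V.x + W.x,
                 dx * U.y + dy * V.y + W.y,
                 dx * U.z + dy * V.z + W.z };
  const float len = std::sqrt(dir.x * dir.x + dir.y * dir.y + dir.z * dir.z);
  return { dir.x / len, dir.y / len, dir.z / len };
}
```

Inside the actual raygen program the same math runs with idx.x and idx.y from optixGetLaunchIndex() and the camera fetched via idx.z.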
Setup everything else for the pipeline and shader binding table as before.
Call optixLaunch with the dimension width = XRES, height = YRES, depth = number_of_cameras.
Download the distance data in xy-slices from the device and store them into images on disk or what you want to do with them.
A word of warning: As mentioned in the posted links before, the OptiX launch dimension is limited to 2^30.
So if you want to render, for example, 1024x1024 sized images per camera, you can only render 1024 cameras at once (1024 x 1024 x 1024 == 2^30). That is one billion rays in a single launch!
When you want to render bigger images, you can only render fewer cameras.
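That budget calculation as a tiny sketch (the function is just illustrative arithmetic, not an OptiX API):

```cpp
#include <cassert>
#include <cstdint>

// The product width * height * depth of an optixLaunch must not exceed 2^30.
// Given the per-camera image resolution, this returns how many cameras
// still fit into a single launch.
uint32_t maxCamerasPerLaunch(uint32_t xres, uint32_t yres)
{
  const uint64_t limit = 1ull << 30;
  return static_cast<uint32_t>(limit / (static_cast<uint64_t>(xres) * yres));
}
```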
If your image resolutions are too small to saturate a modern GPU (everything below 256x256 for example) then grouping multiple cameras into one launch like that is actually good.
It’s also totally reasonable to render only one camera first. When num_cameras == 1 the optixLaunch is actually a 2D launch automatically with no changes.
If you have additional coding questions, please attach the full source code as a file instead of posting only code excerpts from now on. It’s much simpler and faster to identify problems that way.