OptiX Prime Program Not Working

I have been working on a simple OptiX Prime application for the past few days that is very similar to the simplePrime example provided with the SDK. In fact, all of the actual OptiX Prime API calls are essentially the same. All of the API calls return RTP_SUCCESS. However, I’m not getting back any results in my hit buffers. Are there any caveats I should look out for in the OptiX Prime 3.5.1 SDK?

Some more information would be required to assess this.

  • Most fundamental thing first: Make sure the rays you’re shooting hit the model.
  • Do the OptiX Prime examples work on your configuration?
  • Do they work if you built them yourself?
  • If yes, I would recommend single-stepping in the debugger through the OptiX Prime example that most closely matches yours and checking that your code makes exactly the same API calls.
  • If that’s not helping, what exactly is your system configuration:
    OS version, bitness, installed GPU(s), display driver version, CUDA Toolkit version?
  1. Yes, I purposely set up a control test just now which had a single triangle and a ray going right through the center, and it didn’t work. Also, I have my ray generation/casting set up in another program/middleware, and it works with the expected results. There is little I can think of that would be different between OptiX and this middleware. Perhaps the way triangles are constructed (indices) and possibly transformations. However, I’m not transforming the model in the OptiX-based program, so world space and object space would match up, I’m assuming.

The program, as of now, just makes use of some simple cosine-weighted hemispheres on the surface of a triangle (oriented along the normal). I did a straight port of my existing code that generates rays on the triangle in the hemisphere. OptiX even provides a utility method which is very similar to my cosine-weighted hemisphere ray implementation, so it’s nothing too extravagant. It definitely shouldn’t affect the raycasting.

  2. Yes, the pre-built examples work on my configuration.

  3. I have not tried building them myself. I will take a look at doing just that when I get the chance.

  4. Not exactly what you recommend, but I have gone through manually with the debugger to make sure all the rays/vertices/sample points are being created correctly, and they are.

  5. My main development machine doesn’t have an Nvidia GPU in it, sadly. It is running Windows 7 x64. However, I have a personal rig (from which I am away at the moment) with a GTX 770 running the same OS. Unfortunately, I will not have access to that machine for the next few weeks, so I will not be able to run most of the OptiX examples that require a CUDA-enabled device.

At this point, I’m assuming that one of two things is wrong. Firstly, I am possibly not structuring my data (buffers) correctly, leading to bad results. If this is the case, the data is probably breaking at the point where I build my mesh from a custom file format, as the rays and hits are simply optix::Ray and float values, respectively. Do you have any documentation on how I should represent a mesh in OptiX? I’m using an array of integers that map to vertex indices in the vertices list, in groups of 3, e.g. vertices[triangles[0]] is the first triangle’s first vertex. This seems to be pretty standard for meshes. Perhaps I should try converting the mesh to a flat triangle “soup”? That is, removing all indexed vertices and passing just vertices. This would be less than optimal, however.

Secondly, it could be that I’m not configuring my build settings correctly. However, my code is compiling and running just fine, so I doubt this is the case.

I appreciate the help very much!

In #5, your development machine is also the target machine? That is, you’re running the OptiX Prime examples on the CPU fallback?

If you are in doubt whether your input buffers with the rays and the output buffers with the hit queries have the proper format, please have a look at the OptiX API Reference and search for the RTP_BUFFER_FORMAT_* defines.

There are only two ray input formats (directly from the OptiX API reference):
RTP_BUFFER_FORMAT_RAY_ORIGIN_DIRECTION float3:origin, float3:direction
RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX float3:origin, float:tmin, float3:direction, float:tmax

and four possible query results:
RTP_BUFFER_FORMAT_HIT_BITMASK one bit per ray 0=miss, 1=hit
RTP_BUFFER_FORMAT_HIT_T float:ray distance (t < 0 for miss)
RTP_BUFFER_FORMAT_HIT_T_TRIID float:ray distance (t < 0 for miss), int:triangle id
RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V float:ray distance (t < 0 for miss), int:triangle id, float2:barycentric coordinates u,v (w=1-u-v)

Note that the input buffer is NOT taking optix::Ray types!
optix::Ray has an additional raytype field which doesn’t exist in OptiX Prime, so if you really did that, that would be your problem.
The simplePrime examples build their own “Ray” struct in simplePrimeCommon.h matching the RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX type.
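
To make the layouts concrete, here is a minimal sketch of structs that would match the formats listed above. Float3 is a hypothetical stand-in for the SDK's float3 (assumed to be three tightly packed floats), and the struct and function names are made up for illustration; they are not taken from simplePrimeCommon.h. The static_asserts only check that the compiler lays the members out in the order and size the format names suggest.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the SDK's float3 (assumption: three packed floats).
struct Float3 { float x, y, z; };

// Matches RTP_BUFFER_FORMAT_RAY_ORIGIN_DIRECTION: origin first, then direction.
struct RayOriginDirection {
    Float3 origin;
    Float3 dir;
};

// Matches RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX.
struct RayOriginTminDirectionTmax {
    Float3 origin;
    float  tmin;
    Float3 dir;
    float  tmax;
};

// Matches RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V.
struct HitTTriIdUV {
    float t;      // t < 0 means the ray missed
    int   triId;
    float u, v;   // barycentric coordinates, w = 1 - u - v
};

// Compile-time checks that member order and padding are as expected.
static_assert(sizeof(RayOriginDirection) == 24, "unexpected padding");
static_assert(offsetof(RayOriginDirection, dir) == 12, "origin must come first");
static_assert(sizeof(RayOriginTminDirectionTmax) == 32, "unexpected padding");
static_assert(offsetof(RayOriginTminDirectionTmax, dir) == 16, "tmin must precede dir");
static_assert(sizeof(HitTTriIdUV) == 16, "unexpected padding");

// Counts hits in an RTP_BUFFER_FORMAT_HIT_T style array (t < 0 is a miss).
std::size_t countHits(const float* t, std::size_t n)
{
    assert(t != nullptr || n == 0);
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (t[i] >= 0.0f) ++hits;
    return hits;
}
```

Note that member order matters: a struct with the direction declared before the origin has the same size but would be read with the two fields swapped.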

OptiX Prime supports both triangle arrays and indexed triangles.
If you mean vertices[triangles[1]] and vertices[triangles[2]] are the second and third vertex of one triangle, that maps exactly to what you would send to OptiX Prime: one int3 of indices per triangle, plus the whole vertex pool as float3 data.
Either method you described should work, so keep it indexed if your vertex reuse is good.
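
If you ever want to compare the two representations, expanding an indexed mesh into a flat triangle array is straightforward. The following is only a sketch: Float3 is a hypothetical stand-in for float3 and toTriangleSoup is a made-up helper name, not an OptiX Prime function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the SDK's float3.
struct Float3 { float x, y, z; };

// Expands an indexed mesh (three indices per triangle) into a flat list in
// which every three consecutive vertices form one triangle.
std::vector<Float3> toTriangleSoup(const std::vector<Float3>& vertices,
                                   const std::vector<int>& indices)
{
    assert(indices.size() % 3 == 0);  // whole triangles only
    std::vector<Float3> soup;
    soup.reserve(indices.size());
    for (int idx : indices) {
        assert(idx >= 0 && static_cast<std::size_t>(idx) < vertices.size());
        soup.push_back(vertices[static_cast<std::size_t>(idx)]);
    }
    return soup;
}
```

The soup holds three vertices per triangle regardless of sharing, which is why the indexed form is preferable when vertices are reused heavily.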

That is probably the issue: I’m using the optix::Ray type for my rays! And my development machine isn’t the sole target. I plan on using this with some fairly powerful Nvidia GPUs. I just don’t have them available for my use at the moment. So, the CPU fallback is what I’m using for primary development at this point, yes.

Okay, I switched all my rays to a struct matching RTP_BUFFER_FORMAT_RAY_ORIGIN_DIRECTION:

struct Ray
{
	optix::float3 dir;
	optix::float3 origin;
};

Unfortunately, still no hits are being returned in the hits buffer. Here are the relevant bits:

...
	RTPcontext context;

	rtpContextCreate(RTP_CONTEXT_TYPE_CPU, &context);
	rtpContextSetCpuThreads(context, 8); // My dev machine has 8 cores
...
	RTPbufferdesc vertsBD;

	// "vertices" is an array of float3's.
	CHK_PRIME(rtpBufferDescCreate(context, RTP_BUFFER_FORMAT_VERTEX_FLOAT3, RTP_BUFFER_TYPE_HOST, vertices, &vertsBD));

	RTPbufferdesc trianglesBD;
	CHK_PRIME(rtpBufferDescCreate(context, RTP_BUFFER_FORMAT_INDICES_INT3, RTP_BUFFER_TYPE_HOST, triangles, &trianglesBD));
...
	RTPmodel sceneModel;
	CHK_PRIME(rtpModelCreate(context, &sceneModel));
	CHK_PRIME(rtpModelSetTriangles(sceneModel, trianglesBD, vertsBD));
	CHK_PRIME(rtpModelUpdate(sceneModel, RTP_MODEL_HINT_ASYNC));
...
	CHK_PRIME(rtpModelFinish(sceneModel)); 
...
	Ray *raysToSample = new Ray[numRays];
	float *hits = new float[numRays];

	// Populate raysToSample here... I also manually added a ray that is guaranteed to always hit the model

	RTPbufferdesc raysBD;
	RTPbufferdesc hitsBD;

	CHK_PRIME(rtpBufferDescCreate(context, RTP_BUFFER_FORMAT_RAY_ORIGIN_DIRECTION, RTP_BUFFER_TYPE_HOST, raysToSample, &raysBD));
	CHK_PRIME(rtpBufferDescCreate(context, RTP_BUFFER_FORMAT_HIT_T, RTP_BUFFER_TYPE_HOST, hits, &hitsBD));
...
	RTPquery query;
	CHK_PRIME(rtpQueryCreate(sceneModel, RTP_QUERY_TYPE_CLOSEST, &query));

	CHK_PRIME(rtpQuerySetRays(query, raysBD));
	CHK_PRIME(rtpQuerySetHits(query, hitsBD));

	CHK_PRIME(rtpQueryExecute(query, RTP_QUERY_HINT_ASYNC));
...
	CHK_PRIME(rtpQueryFinish(query));
...
	for (int h = 0; h < numRays; h++)
	{
		float hit = hits[h];

		if (hit > 0.0f)
		{
			// Ray hit
		}
	}

I would say those code excerpts should work, but it’s not possible to say without a full reproducer.

If you say the simplePrime example works and yours doesn’t, we’re back at the steps:

  • Do they work if you built them yourself?
  • If yes, I would recommend single-stepping in the debugger through the OptiX Prime example that most closely matches yours and checking that your code makes exactly the same API calls.

Just glossing over the example code, the differences are that you set the number of CPU cores, do not call the buffer description range functions (defaults should do), call model build and query with ASYNC hints, and use different formats for the rays and queries.

Again, if simplePrime works when you build it yourself, I would do exactly the same things and just exchange the model data (with a single triangle and a single ray). Only then would I start changing other things one by one. If any of these isolated changes makes the application stop producing hits and everything else is still correct, you could file a bug with a reproducer against OptiX Prime.

I think I may have found the issue… it might have something to do with Visual Studio changing the directory and not loading the binaries (DLLs) for OptiX. It’s surprising Visual Studio would launch the debugger without the binaries present though, as that doesn’t work in a standalone build… I’ll look further into this.

However, I managed to make a new project from scratch, hard coded a few triangles and rays, and finally got some hits returned back in the hit buffer.

Okay, I did a lot of work over the past couple of days, and it seems like OptiX Prime is simply returning bad hits. I wrote a small baseline test with hardcoded values for a single triangle and two rays (one hit and one miss). However, this baseline test is not returning the expected values. Here’s the code (sorry for the messy code):

void BaselineTest(const RTPcontext &context, RTPbuffertype bufferType)
{
	float3 vertices[3];

	vertices[0] = make_float3(0.0f, 0.0f, 0.0f);
	vertices[1] = make_float3(2.0f, 2.0f, 0.0f);
	vertices[2] = make_float3(2.0f, 0.0f, 0.0f);

	RTPbufferdesc verticesDesc;

	CHK_PRIME( rtpBufferDescCreate(
		context,
		RTP_BUFFER_FORMAT_VERTEX_FLOAT3,
		bufferType,
		vertices, 
		&verticesDesc )
		);

	RTPmodel model;
	CHK_PRIME( rtpModelCreate( context, &model ) );
	CHK_PRIME( rtpModelSetTriangles( model, 0, verticesDesc ) );
	CHK_PRIME( rtpModelUpdate(model, 0) );

	const int rayCount = 2;

	RTPbufferdesc raysDesc;
	Ray *raysBuffer = new Ray[rayCount];

	populateRays(raysBuffer);

	CHK_PRIME( rtpBufferDescCreate( 
		context, 
		RTP_BUFFER_FORMAT_RAY_ORIGIN_DIRECTION,
		bufferType, 
		raysBuffer, 
		&raysDesc )
		);

	RTPbufferdesc hitsDesc;
	Hit *hitsBuffer = new Hit[rayCount];

	CHK_PRIME( rtpBufferDescCreate( 
		context, 
		RTP_BUFFER_FORMAT_HIT_T,
		bufferType, 
		hitsBuffer, 
		&hitsDesc )
		);

	RTPquery query;
	CHK_PRIME( rtpQueryCreate( model, RTP_QUERY_TYPE_CLOSEST, &query ) );
	CHK_PRIME( rtpQuerySetRays( query, raysDesc ) );
	CHK_PRIME( rtpQuerySetHits( query, hitsDesc ) );
	CHK_PRIME( rtpQueryExecute( query, 0 /* hints */ ) );

	std::cout << std::endl << "Baseline test results: " << std::endl;

	// Parse hits
	for (int i = 0; i < rayCount; i++)
	{
		Hit hit = hitsBuffer[i];
		float distance = hit.t;

		std::cout << "Hit " << i + 1 << ": " << distance << std::endl;
	}

	std::cout << std::endl;

	CHK_PRIME( rtpModelDestroy(model) );

	delete[] raysBuffer;
	delete[] hitsBuffer;
}

And here is the populateRays function:

void populateRays(Ray *raysBuffer)
{
	Ray* rays = raysBuffer;

	Ray ray1; // Hit
	ray1.origin = make_float3(1.0f, 0.2f, 1.0f);
	ray1.dir = make_float3(0.0f, 0.0f, -1.0f);

	Ray ray2; // Miss
	ray2.origin = make_float3(-1.0f, 0.2f, 1.0f);
	ray2.dir = make_float3(0.0f, 0.0f, -1.0f); 

	rays[0] = ray1;
	rays[1] = ray2;
}

When I run this, I would expect a hit with a distance of 2 from the first ray, and a miss from the second ray. However, both hits come back as extremely small values near zero, e.g. 1.15714e-038. Here are pictures of both rays:

Ray 1: http://www.wolframalpha.com/share/clip?f=d41d8cd98f00b204e9800998ecf8427ehjkqalatr4
Ray 2: http://www.wolframalpha.com/share/clip?f=d41d8cd98f00b204e9800998ecf8427espf36m93ft

Any idea why these would not be returning properly?

By the way, I switched to the OptiX 3.6.0 beta SDK.
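
As a cross-check on the expected values, the same triangle and rays can be run through a small CPU reference test. The sketch below uses the Möller–Trumbore ray/triangle test, which is my choice for illustration and not necessarily what OptiX Prime does internally; Vec3 and intersectTriangle are hypothetical names. It returns -1 on a miss to mirror the "t < 0 for miss" convention. With the unit-length direction (0, 0, -1) and an origin at z = 1 above a triangle in the z = 0 plane, it gives t = 1 for the first ray and a miss for the second.

```cpp
#include <cmath>

// Hypothetical minimal vector type for the reference test.
struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b)
{
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

// Moller-Trumbore test without backface culling.
// Returns the ray parameter t on a hit, or -1.0f on a miss.
float intersectTriangle(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2)
{
    const float eps = 1e-8f;
    const Vec3 e1 = sub(v1, v0);
    const Vec3 e2 = sub(v2, v0);
    const Vec3 p = cross(dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < eps) return -1.0f;   // ray parallel to the triangle plane

    const float inv = 1.0f / det;
    const Vec3 s = sub(orig, v0);
    const float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return -1.0f;   // outside along the first edge

    const Vec3 q = cross(s, e1);
    const float v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return -1.0f;  // outside the triangle

    const float t = dot(e2, q) * inv;
    return t >= 0.0f ? t : -1.0f;             // hits behind the origin count as misses
}
```

Calling intersectTriangle with the vertices and the two rays from the baseline test gives a quick reference to compare against whatever ends up in the hit buffer.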

The code looks right. Have you tried using the CPU context? I assume the new beta didn’t make a difference.

Yes, this is using the CPU context. I also tried with the GPU context; neither returns accurate results.

Your example is small enough. Can you post a complete program that gives you bad results, and we will give it a try? Or at least post the code that calls the code you already posted (creation of the context etc.) so we have an exact repro.

I ran into a problem similar to the one described here. What’s missing in the above code excerpt is passing the size of the vertex, ray, and hit buffers to be used by the OptiX query, i.e. add CHK_PRIME( rtpBufferDescSetRange( raysDesc, 0, rayCount) ) after the initialization of raysDesc, and similarly for the vertex and hit buffers.

Hello all,

After receiving advice from Detlef on another thread, I began to thrash out some very primitive OptiX Prime code based on the SDK simplePrime sample.
I too ran into the same issues of ‘no hits’…
If I couldn’t get this running, there was no real point in going further…
I used IvJet’s supplied code as the baseline again, and also took karlason’s advice regarding the rtpBufferDescSetRange additional code snippet, but still no joy…

I reverted to the original simplePrime.cpp code and merged accordingly, rewriting where necessary, and lo and behold I got it to work…
It may not be close to the most streamlined code ever written, but I do now get console output listing Ray1 as a hit and Ray2 as a miss - as expected from the suggested geometry…

I am not bragging in any way, I am only putting this up for anyone else who is starting in the fundamentals of OptiX and Prime (like me)…

I hope this helps…(and apologies if the code format does not work properly)
Below is the main element of the rewritten simplePrime.cpp…

//------------------------------------------------------------------------------
int main( int argc, char** argv )
{
  // set defaults
  RTPcontexttype contextType = RTP_CONTEXT_TYPE_CPU;
  RTPbuffertype bufferType = RTP_BUFFER_TYPE_HOST;

  // parse arguments
  for ( int i = 1; i < argc; ++i ) 
  { 
    std::string arg( argv[i] );
    if( arg == "-h" || arg == "--help" ) 
    {
      printUsageAndExit( argv[0] ); 
    } 
    else if( (arg == "-o" || arg == "--obj") && i+1 < argc ) 
    {
      objFilename = argv[++i];
    } 
    else if( ( arg == "-c" || arg == "--context" ) && i+1 < argc )
    {
      std::string param( argv[++i] );
      if( param == "cpu" )
        contextType = RTP_CONTEXT_TYPE_CPU;
      else if( param == "cuda" )
        contextType = RTP_CONTEXT_TYPE_CUDA;
      else
        printUsageAndExit( argv[0] );
    } 
    else if( ( arg == "-b" || arg == "--buffer" ) && i+1 < argc )
    {
      std::string param( argv[++i] );
      if( param == "host" )
        bufferType = RTP_BUFFER_TYPE_HOST;
      else if( param == "cuda" )
        bufferType = RTP_BUFFER_TYPE_CUDA_LINEAR;
      else
        printUsageAndExit( argv[0] );
    } 
    else if( (arg == "-w" || arg == "--width") && i+1 < argc ) 
    {
      width = atoi(argv[++i]);
    } 
    else 
    {
      std::cerr << "Bad option: '" << arg << "'" << std::endl;
      printUsageAndExit( argv[0] );
    }
  }


  //
  // Create Prime context
  //
  RTPcontext context;
  CHK_PRIME( rtpContextCreate( contextType, &context ) );

  float3 vertices[3];

  vertices[0] = make_float3(0.0f, 0.0f, 0.0f);
  vertices[1] = make_float3(2.0f, 2.0f, 0.0f);
  vertices[2] = make_float3(2.0f, 0.0f, 0.0f);

  //
  // Create buffers for geometry data 
  //
 
  RTPbufferdesc verticesDesc;
  CHK_PRIME( rtpBufferDescCreate(
        context,
        RTP_BUFFER_FORMAT_VERTEX_FLOAT3,
        RTP_BUFFER_TYPE_HOST,
        vertices, 
        &verticesDesc )
      );

  CHK_PRIME(rtpBufferDescSetRange(verticesDesc, 0, 3));

  //
  // Create the Model object
  //
  RTPmodel model;
  CHK_PRIME(rtpModelCreate(context, &model));
  CHK_PRIME(rtpModelSetTriangles(model, 0, verticesDesc));
  CHK_PRIME(rtpModelUpdate(model, 0));

  
  //
  // Create buffer for ray input 
  //
  const int rayCount = 2;

  //RTPbufferdesc raysDesc;
  //Ray *raysBuffer = new Ray[rayCount];

  RTPbufferdesc raysDesc;
  Buffer<Ray> raysBuffer( 0, bufferType, LOCKED ); 

  raysBuffer.alloc(rayCount);

  if (raysBuffer.type() == RTP_BUFFER_TYPE_HOST)
  {
	  Ray* rays = raysBuffer.ptr();

	  /*RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX*/
	  Ray r1 = { make_float3(1.0f, 0.2f, 1.0f), 0.1f, make_float3(0.0f, 0.0f, -1.0f), 10.0f };
	  rays[0] = r1;

	  Ray r2 = { make_float3(-1.0f, 0.2f, 1.0f), 0.1f, make_float3(0.0f, 0.0f, -1.0f), 10.0f };
	  rays[1] = r2;
  }


  CHK_PRIME( rtpBufferDescCreate( 
        context, 
        Ray::format, /*RTP_BUFFER_FORMAT_RAY_ORIGIN_TMIN_DIRECTION_TMAX*/ 
        raysBuffer.type(), 
        raysBuffer.ptr(), 
        &raysDesc )
      );

  CHK_PRIME( rtpBufferDescSetRange( raysDesc, 0, raysBuffer.count() ) );

  
  //
  // Create buffer for returned hit descriptions
  //
  RTPbufferdesc hitsDesc;
  Buffer<Hit> hitsBuffer( raysBuffer.count(), bufferType, LOCKED );
  CHK_PRIME( rtpBufferDescCreate( 
        context, 
        Hit::format, /*RTP_BUFFER_FORMAT_HIT_T_TRIID_U_V*/ 
        hitsBuffer.type(), 
        hitsBuffer.ptr(), 
        &hitsDesc )
      );

  CHK_PRIME( rtpBufferDescSetRange( hitsDesc, 0, hitsBuffer.count() ) );

  //
  // Execute query
  //
  RTPquery query;
  CHK_PRIME( rtpQueryCreate( model, RTP_QUERY_TYPE_CLOSEST, &query ) );
  CHK_PRIME( rtpQuerySetRays( query, raysDesc ) );
  CHK_PRIME( rtpQuerySetHits( query, hitsDesc ) );
  CHK_PRIME( rtpQueryExecute( query, 0 /* hints */ ) );

  std::cout << std::endl << "Baseline test results: " << std::endl;

  // Parse hits

  Hit* hits = hitsBuffer.ptr();

  for (int i = 0; i < rayCount; i++)
  {
	  Hit hit1 = hits[i];
	  float distance1 = hit1.t;
	  std::cout << "Ray Distance " << i + 1 << ": " << distance1 << std::endl;
	  
  }

  std::cout << std::endl;

  //CHK_PRIME(rtpModelDestroy(model));

  
  //
  // cleanup
  //
  CHK_PRIME( rtpContextDestroy( context ) );
}