I am using the primeSimple example to intersect a single ray with a single triangle.

Here is my triangle:

```
int nVertex = 3;
float* vertex_data = new float[nVertex * 3];
int iVertex = 0;
vertex_data[iVertex++] = 0.0f; vertex_data[iVertex++] = 0.0f; vertex_data[iVertex++] = 0.0f;
vertex_data[iVertex++] = 1.0f; vertex_data[iVertex++] = 0.0f; vertex_data[iVertex++] = 0.0f;
vertex_data[iVertex++] = 0.0f; vertex_data[iVertex++] = 1.0f; vertex_data[iVertex++] = 0.0f;
int nTriangles = 1;
int* vertex_indices = new int[nTriangles * 3];
int iTriangleVertex = 0;
vertex_indices[iTriangleVertex++] = 0; vertex_indices[iTriangleVertex++] = 1; vertex_indices[iTriangleVertex++] = 2;
```

Here is my CPU code for creating the ray:

```
Ray* rays = raysBuffer.ptr();
float minValueOrMask = 0.0f;
Ray r = {make_float3(0.2f, 0.1f, 3.0f), minValueOrMask, make_float3(0.0f, 0.0f, -1.0f), 1e34f};
rays[0] = r;
```

Here is my GPU kernel for creating the ray:

```
__global__ void createRaysKernel
(
    float4* rays,
    int useMasking
)
{
    int idGlobalThreadx = threadIdx.x + blockIdx.x * blockDim.x;
    int idGlobalThready = threadIdx.y + blockIdx.y * blockDim.y;

    // The w component of the origin holds tmin, or the mask bits when masking is used.
    float minValueOrMask = 0.0f;
    if (useMasking)
    {
        minValueOrMask = __int_as_float(1);
    }

    // Only thread (0, 0) writes the single ray.
    if (idGlobalThreadx > 0 || idGlobalThready > 0)
        return;

    int iRay = 0;
    // origin, tmin
    rays[2 * iRay + 0] = make_float4(0.2f, 0.1f, 3.0f, minValueOrMask);
    // dir, tmax
    rays[2 * iRay + 1] = make_float4(0.0f, 0.0f, -1.0f, 1e34f);
}
```
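One thing I wanted to rule out is a layout mismatch between the CPU `Ray` struct and the two `float4` writes in the kernel. Below is a minimal host-side sketch of that check. `F3`, `F4`, `RayLayout`, `unpackRay` and `checkPackedRay` are hypothetical stand-ins I made up for illustration (plain structs instead of CUDA's `float3`/`float4`, which carry extra alignment requirements), and I am assuming the sample's `Ray` is laid out as origin, tmin, dir, tmax, as the CPU initializer above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-ins for CUDA's float3/float4 (hypothetical, no CUDA headers needed).
struct F3 { float x, y, z; };
struct F4 { float x, y, z, w; };

// Assumed layout of the sample's Ray: origin, tmin, dir, tmax.
struct RayLayout { F3 origin; float tmin; F3 dir; float tmax; };

// One ray must occupy exactly two float4 slots, with tmin/tmax in the w slots.
static_assert(sizeof(RayLayout) == 2 * sizeof(F4), "ray is not two float4s");
static_assert(offsetof(RayLayout, tmin) == 12, "tmin is not in origin.w");
static_assert(offsetof(RayLayout, dir) == 16, "dir does not start the 2nd float4");
static_assert(offsetof(RayLayout, tmax) == 28, "tmax is not in dir.w");

// Reinterpret the float4 pair written for ray iRay as a RayLayout.
RayLayout unpackRay(const F4* rays, int iRay) {
    RayLayout r;
    std::memcpy(&r, rays + 2 * iRay, sizeof r);
    return r;
}

// Mirrors the two writes done by the kernel (without masking) and checks them.
void checkPackedRay() {
    F4 rays[2] = { {0.2f, 0.1f, 3.0f, 0.0f},      // origin, tmin
                   {0.0f, 0.0f, -1.0f, 1e34f} };  // dir, tmax
    RayLayout r = unpackRay(rays, 0);
    assert(r.origin.z == 3.0f && r.tmin == 0.0f);
    assert(r.dir.z == -1.0f && r.tmax == 1e34f);
}
```

If this passes, the packing itself looks consistent under these assumptions. It says nothing about what the buffer actually contains on the device, which would still need to be copied back and inspected.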

The hit results when using the CPU (correct):

```
i = 0;
hits[i].t = 3
hits[i].triId = 0
hits[i].u = 7.0000e-01
hits[i].v = 2.0000e-01
```

The hit results when using the GPU (not correct):

```
i = 0;
hits[i].t = 0
hits[i].triId = 0
hits[i].u = 7.0000e-01
hits[i].v = 2.0000e-01
```

The GPU value hits[i].t = 0 is not correct. It should be 3.0, the same as on the CPU. The triId, u and v values are identical in both runs.
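To double-check the expected values independently of OptiX, here is a minimal reference sketch using the standard Möller-Trumbore ray/triangle test on the same triangle and ray. `Vec3`, `RefHit`, `intersectReference` and `checkPostedRay` are hypothetical names for this illustration, not part of the sample. The CPU result (u = 0.7, v = 0.2) is consistent with u and v being the weights of vertices 0 and 1; that mapping is only my inference from these numbers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types, not part of OptiX Prime or primeSimple.
struct Vec3 { float x, y, z; };
struct RefHit { bool hit; float t, b0, b1, b2; };  // b0..b2: weights of vertices 0..2

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moller-Trumbore intersection of the ray org + t * dir with triangle (v0, v1, v2).
RefHit intersectReference(Vec3 org, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2) {
    const RefHit miss = {false, 0.0f, 0.0f, 0.0f, 0.0f};
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return miss;       // ray parallel to the triangle
    float inv = 1.0f / det;
    Vec3 s = sub(org, v0);
    float b1 = dot(s, p) * inv;                    // weight of v1
    if (b1 < 0.0f || b1 > 1.0f) return miss;
    Vec3 q = cross(s, e1);
    float b2 = dot(dir, q) * inv;                  // weight of v2
    if (b2 < 0.0f || b1 + b2 > 1.0f) return miss;
    float t = dot(e2, q) * inv;                    // distance along the ray
    if (t < 0.0f) return miss;
    return {true, t, 1.0f - b1 - b2, b1, b2};
}

// Checks the ray and triangle from this post against the CPU hit values.
void checkPostedRay() {
    RefHit h = intersectReference({0.2f, 0.1f, 3.0f}, {0.0f, 0.0f, -1.0f},
                                  {0.0f, 0.0f, 0.0f}, {1.0f, 0.0f, 0.0f},
                                  {0.0f, 1.0f, 0.0f});
    assert(h.hit);
    assert(std::fabs(h.t - 3.0f) < 1e-5f);   // matches the CPU t
    assert(std::fabs(h.b0 - 0.7f) < 1e-5f);  // matches the reported u
    assert(std::fabs(h.b1 - 0.2f) < 1e-5f);  // matches the reported v
}
```

This agrees with the CPU hit, so the geometry and the ray values themselves appear fine and only the t returned on the GPU path is off.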

Am I doing something incorrect?

ExampleForOptixDevelopers.tar.gz (2.77 KB)