Hello, everybody!
I’ll show the relevant code and state the problem later. The snippet is part of a device function. Several threads are running the code in parallel.
GPUEdge edge0 = d_triEdges[currentTest.x];
GPUEdge edge1 = d_triEdges[currentTest.y];
unsigned int aIdx, bIdx , cIdx , dIdx;
aIdx = edge0.v.x;
bIdx = edge0.v.y;
cIdx = edge1.v.x;
dIdx = edge1.v.y;
float3 a = d_GL_vertices [aIdx].v;
float3 av = d_velocities [aIdx].v;
float3 b = d_GL_vertices [bIdx].v;
float3 bv = d_velocities [bIdx].v;
float3 c = d_GL_vertices [cIdx].v;
float3 cv = d_velocities [cIdx].v;
float3 d = d_GL_vertices [dIdx].v;
float3 dv = d_velocities [dIdx].v;
float fFirstTimeOfContact = Intersect_EE(a , b , c , d , av , bv , cv , dv , deltaTime);
unsigned int uiFirstTimeOfContact;
floatToSortableInt(fFirstTimeOfContact , uiFirstTimeOfContact);
atomicMin(&d_timeOfContact[aIdx] , uiFirstTimeOfContact);
atomicMin(&d_timeOfContact[bIdx] , uiFirstTimeOfContact);
atomicMin(&d_timeOfContact[cIdx] , uiFirstTimeOfContact);
atomicMin(&d_timeOfContact[dIdx] , uiFirstTimeOfContact);
This code tests whether two moving edges intersect (continuous collision). The edges are defined by their initial positions (d_GL_vertices)and velocities (d_velocities). The function Intersect_EE solves a generated cubic equation that defines the first time of contact of the edges. The atomicMins maintain the minimum time for each vertex.
My problem is that the vertex indices (aIdx, bIdx, cIdx and dIdx) are being corrupted. The corrupted values show after Intersect_EE function returns in a Nsight session. For example, when I debug code for threadIdx(6,0,0) the values for aIdx , bIdx, cIdx and dIdx are 3, 4, 0 and 1 before Intersect_EE and 1048664806, 1048664806, 1050784160, 1050784160 after. Its very weird behavior since they are local variables and the Intersect_EE function receives only const parameters as the next snippet shows.
__device__ __inline__ float
Intersect_EE(const float3 &ta0, const float3 &tb0, const float3 &tc0, const float3 &td0,
const float3 &tav, const float3 &tbv, const float3 &tcv, const float3 &tdv,
/*float3 &qi ,*/ const float deltaTime)
{
/* Default value returned if no collision occurs */
float collisionTime = -1.0f;
/*
* Compute scalar coefficients by evaluating dot and cross-products.
*/
float a, b, c, d; /* cubic polynomial coefficients */
_equateCubic_EE(ta0, tav, tb0, tbv, tc0, tcv, td0, tdv, a, b, c, d);
if (IsZero(a) && IsZero(b) && IsZero(c) && IsZero(d))
return collisionTime;
NewtonCheckData data;
data[0] = tav , data[1] = ta0;
data[2] = tbv , data[3] = tb0;
data[4] = tcv , data[5] = tc0;
data[6] = tdv , data[7] = td0;
/*
* iteratively solve the cubic (scalar) equation and test for validity of the solution.
*/
float l = 0;
float r = deltaTime;
float coeffs[4];
coeffs[0] = a, coeffs[1] = b, coeffs[2] = c, coeffs[3] = d; //changed the coeffs order
float3 pab;
if (solveCubic(l, r, false, data, coeffs)) {
collisionTime = (l+r)*0.5f;
}
return collisionTime;
}
The configuration I’m using is:
SO: Windows 7
CUDA Toolkit and SDK 4.0
Visual Studio 2010
Thanks in advance.