Loop unrolling not done? cannot deduce loop trip count

I have the following loop (see below), which is part of a B-spline interpolation algorithm. I’d like to see it completely unrolled, but nvcc complains that it cannot deduce the loop trip count. The trip count is fixed at 64, no special indexing is used, and the offsets could all be calculated beforehand. Why is the compiler not unrolling it?

int3 offsets = make_int3(0);
#pragma unroll 64
for (int i = 0; i != 64; ++i) {
	float weight = weights[offsets.x].x * weights[offsets.y].y * weights[offsets.z].z;

	/* put the transform jacobians into the correct position */
	transformJacobians[i] = weight;

	/* compute the indices */
	nonZeroJacobianIndices[i] = sum((make_float3(offsets) + index) * gridOffsets);

	/* offsetToIndexTable calculation for correct position */
	if (++offsets.x == 4) {offsets.x = 0; ++offsets.y;}
	if (offsets.y == 4) {offsets.y = 0; ++offsets.z;}
}

Have you tried ‘i < 64’ instead of ‘i != 64’?

No, but a loop just above that uses the same syntax, only with #pragma unroll 8, does get unrolled…
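
One more thing that might be worth trying: the offsets are carried from one iteration to the next through conditional increments, so the loop body depends on more than just i. Since the 4x4x4 layout means i = x + 4*y + 16*z, the offsets can instead be derived from i directly. Below is a minimal sketch of that idea; whether it actually helps nvcc deduce the trip count here is an assumption and has not been verified. The names HD, Offsets, offsetsFromIndex, advance and schemesAgree are made up for illustration (Offsets stands in for int3 so the sketch also builds without the CUDA headers), and the loop uses the ‘i < 64’ form suggested above.

```cpp
#include <cassert>  // only needed for the host-side check in the usage note

// Hypothetical helper macro: lets the sketch build as plain C++ or under nvcc.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical stand-in for int3.
struct Offsets { int x, y, z; };

// Derive the offsets from the loop index alone: i = x + 4*y + 16*z.
HD inline Offsets offsetsFromIndex(int i) {
    Offsets o;
    o.x = i & 3;         // i % 4
    o.y = (i >> 2) & 3;  // (i / 4) % 4
    o.z = i >> 4;        // i / 16
    return o;
}

// The incremental update from the original loop, kept for comparison.
HD inline void advance(Offsets &o) {
    if (++o.x == 4) { o.x = 0; ++o.y; }
    if (o.y == 4)   { o.y = 0; ++o.z; }
}

// Returns true if both schemes produce the same offsets for all 64 iterations.
HD inline bool schemesAgree() {
    Offsets inc = {0, 0, 0};
#ifdef __CUDACC__
#pragma unroll
#endif
    for (int i = 0; i < 64; ++i) {
        Offsets o = offsetsFromIndex(i);
        if (o.x != inc.x || o.y != inc.y || o.z != inc.z) return false;
        advance(inc);
    }
    return true;
}
```

On the host, assert(schemesAgree()); confirms that the two schemes match. In the original kernel, the body would then read its offsets from offsetsFromIndex(i) and the two if statements at the end of the loop could be dropped.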