Compiler Bug: Assertion failure at line 1923 of ../../be/cg/cgemit.cxx incorrect reg

The compiler gave me some trouble, so I reduced the failure to a small repro case:

__global__ void bug_kernel(long* vs) {
	long mask = 1;
	long v = threadIdx.x ? mask : 0;
	#pragma unroll
	for(int i = 1; i < 64; i++)
		v ^= mask << i;
	vs[threadIdx.x] = v;
}

For which the compiler outputs the error:

### Assertion failure at line 1923 of ../../be/cg/cgemit.cxx:
### Compiler Error in file /tmp/tmpxft_000058b4_00000000-7_bug.cpp3.i during Assembly phase:
### incorrect register class for operand 0
nvopencc INTERNAL ERROR: /usr/local/cuda/open64/lib//be returned non-zero status 1

Using CUDA 4.0 Production Release.

This kind of error message indicates an internal compiler bug. Since you already have a repro case in hand, please file a bug against the compiler. The compiler team may be able to suggest a workaround once they determine the underlying cause. Sorry for the inconvenience, and thank you for your help.
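
In the meantime, if you experiment with rewrites of the kernel, a host-side reference may help confirm that a candidate still computes the same values. The following is only a sketch: `expected_value` is a hypothetical helper name, not something from the original post, and it uses an unsigned 64-bit type (assuming `long` is 64 bits on the target, as the loop bound in the repro suggests) so that the shift into the top bit is well defined on the host.

```cpp
#include <cstdint>

// Hypothetical host-side reference for the value bug_kernel is meant to
// store in vs[tid]. Uses uint64_t instead of long so that shifting a bit
// into position 63 is well defined in host C++.
std::uint64_t expected_value(unsigned tid) {
    const std::uint64_t mask = 1;
    std::uint64_t v = tid ? mask : 0;   // bit 0 is set only for tid != 0
    for (int i = 1; i < 64; i++)
        v ^= mask << i;                 // toggles bits 1..63, one per iteration
    return v;
}
```

Since each of bits 1 through 63 is toggled exactly once, thread 0 should end up with every bit set except bit 0, and every other thread with all 64 bits set. Comparing the bit patterns copied back from `vs` against this function is one way to check a rewritten kernel.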
