A number of people on this forum have reported spilling issues with CUDA 7.5 and have dutifully filed bugs.
It's a real pain to deal with this issue. Hopefully an updated Toolkit gets released soon.
But until we see a new release of the Toolkit, here are a couple of observations and possible workarounds...
One workaround that works some of the time is to make sure variables that are declared but might not be set are initialized with a default value.
I've made this change and in some instances the spills were removed.
There are two cases where Iāve seen this problem frequently appear:
- Initializing a variable with 1 thread and then broadcasting it to the rest of the warp:
int x; // <--- if not initialized sometimes results in spills
if (warp_lane_is_first())
    x = atomicAdd(y,z); // or some other operation
x = __shfl(x,0); // broadcast lane 0 to rest of warp
This bug has been around a long time.
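For anyone who wants to try the workaround on this first case, here is a minimal sketch of what I mean. The helper `warp_lane_is_first()`, the kernel name, and the host wrapper `run_broadcast_demo()` are all made up for illustration, and the `#if` is only there so the snippet also builds on CUDA 9.0 and later, where `__shfl_sync` replaces `__shfl`. Whether the default value actually removes the spills depends on your kernel, so treat it as something to test and not as a guaranteed fix.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for warp_lane_is_first(); assumes a 1D thread block.
__device__ __forceinline__ bool warp_lane_is_first()
{
    return (threadIdx.x % warpSize) == 0;
}

// Lane 0 reserves `count` slots from a global counter, then the whole warp
// learns the base of the reservation.
__global__ void broadcast_kernel(int* counter, int* out, int count)
{
    int x = 0; // default value: the workaround being illustrated

    if (warp_lane_is_first())
        x = atomicAdd(counter, count); // only lane 0 sets x

#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
    x = __shfl_sync(0xffffffffu, x, 0); // CUDA 9.0+ spelling
#else
    x = __shfl(x, 0);                   // CUDA 7.5 spelling
#endif

    out[threadIdx.x] = x;
}

// Runs one warp. Returns the value all lanes agree on, or -1 on a CUDA
// error or if any lane saw a different value.
int run_broadcast_demo()
{
    const int lanes = 32;
    int start = 100;
    int host_out[lanes];
    int* counter = 0;
    int* out = 0;

    if (cudaMalloc(&counter, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&out, sizeof(host_out)) != cudaSuccess) { cudaFree(counter); return -1; }
    cudaMemcpy(counter, &start, sizeof(int), cudaMemcpyHostToDevice);

    broadcast_kernel<<<1, lanes>>>(counter, out, 4);
    cudaError_t err = cudaMemcpy(host_out, out, sizeof(host_out), cudaMemcpyDeviceToHost);

    cudaFree(counter);
    cudaFree(out);
    if (err != cudaSuccess) return -1;

    for (int i = 0; i < lanes; ++i)
        if (host_out[i] != start) return -1;
    return start;
}

int main()
{
    printf("broadcast value: %d\n", run_broadcast_demo());
    return 0;
}
```

Compiling with `-Xptxas -v` before and after adding the `= 0` is the quickest way I know to see whether the spill stores and loads go away.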
- Declaring several variables, initializing the first N out of M with loads from global or shared memory and then jumping to process the first N:
while (true)
{
    int a,b,c,d,e,f,g,h; // <--- if not initialized on SM_5x + 7.5 RC results in spills
    a = mem_ptr[offset+WARP_SIZE*0];
    if (rem == 1)
        goto process;
    b = mem_ptr[offset+WARP_SIZE*1];
    if (rem == 2)
        goto process;
    ...
    g = mem_ptr[offset+WARP_SIZE*6];
    if (rem == 7)
        goto process;
    h = mem_ptr[offset+WARP_SIZE*7];
process:
    // do something with a
    if (rem == 1)
        break;
    ...
    // do something with h
    if (rem == 8)
        break;
    offset += 8 * WARP_SIZE;
    rem -= 8;
}
This Duff-like idiom is pretty common and, if properly implemented, doesn't result in compiler warnings. Note that a switch statement generates similar SASS. [ When are we getting indirect branches? :) ]
Yet on Maxwell + 7.5 RC I'm seeing spillage unless the declared variables are initialized. Although initializing the variables in this second case squelches the spills, the SASS shows the unnecessary initializations.
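To make the second case concrete, here is a cut-down sketch of the same idiom with the workaround applied. It handles 4 rows per iteration where the snippet above handles 8, and "do something" is just a running sum. The kernel name, `run_strided_sum_demo()`, and the test data are my own illustration, and the kernel assumes a single warp and `rem >= 1`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define WARP_SIZE 32

// Each lane sums `rem` values spaced WARP_SIZE apart, 4 per iteration.
__global__ void strided_sum_kernel(const int* mem_ptr, int rem, int* out)
{
    int offset = threadIdx.x;
    int sum = 0;

    while (true)
    {
        int a = 0, b = 0, c = 0, d = 0; // initialized: the workaround

        a = mem_ptr[offset + WARP_SIZE * 0];
        if (rem == 1) goto process;
        b = mem_ptr[offset + WARP_SIZE * 1];
        if (rem == 2) goto process;
        c = mem_ptr[offset + WARP_SIZE * 2];
        if (rem == 3) goto process;
        d = mem_ptr[offset + WARP_SIZE * 3];

    process:
        sum += a; if (rem == 1) break;
        sum += b; if (rem == 2) break;
        sum += c; if (rem == 3) break;
        sum += d; if (rem == 4) break;

        offset += 4 * WARP_SIZE;
        rem -= 4;
    }

    out[threadIdx.x] = sum;
}

// Fills `rows` rows of WARP_SIZE ints, where row r holds the value r + 1.
// Returns the per-lane sum if all lanes agree, or -1 on error or mismatch.
int run_strided_sum_demo(int rows)
{
    const int max_rows = 16;
    if (rows < 1 || rows > max_rows) return -1;

    int host_in[max_rows * WARP_SIZE];
    int host_out[WARP_SIZE];
    for (int i = 0; i < rows * WARP_SIZE; ++i)
        host_in[i] = i / WARP_SIZE + 1;

    int* in = 0;
    int* out = 0;
    size_t in_bytes = rows * WARP_SIZE * sizeof(int);
    if (cudaMalloc(&in, in_bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&out, sizeof(host_out)) != cudaSuccess) { cudaFree(in); return -1; }
    cudaMemcpy(in, host_in, in_bytes, cudaMemcpyHostToDevice);

    strided_sum_kernel<<<1, WARP_SIZE>>>(in, rows, out);
    cudaError_t err = cudaMemcpy(host_out, out, sizeof(host_out), cudaMemcpyDeviceToHost);

    cudaFree(in);
    cudaFree(out);
    if (err != cudaSuccess) return -1;

    for (int i = 1; i < WARP_SIZE; ++i)
        if (host_out[i] != host_out[0]) return -1;
    return host_out[0];
}

int main()
{
    // 6 rows: one full pass of 4 plus a partial pass of 2.
    printf("per-lane sum: %d\n", run_strided_sum_demo(6));
    return 0;
}
```

The zero initializers are dead stores as far as the logic goes, which is exactly why it is annoying that they show up in the SASS.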
- If you're using "pointer + constant offset" addressing and you're compiling for a 64-bit target, then double-check the SASS and verify that you're seeing those offsets.
If you're not, then make sure you're using a signed integer offset. Otherwise, you'll see additional register pressure.
You should be seeing SASS like this:
LDG.E.64 R36, [R28+0x100];
LDG.E.64 R44, [R28+0x300];
LDG.E.64 R42, [R28+0x200];
LDG.E.64 R52, [R28+0x400];
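Here is a small sketch of the kind of source I'd expect to produce loads like those. The kernel and `run_load_four_demo()` are invented for illustration; the element offsets 32, 64, 96 and 128 are chosen so that, with 8-byte doubles, the byte offsets come out to 0x100 through 0x400. My understanding is that a signed index lets the compiler assume the 32-bit add doesn't wrap and so fold the constant into the 64-bit address, whereas an unsigned index has defined wraparound that blocks the fold. Please verify against your own SASS before relying on that.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Four loads off one base address plus constant offsets.
__global__ void load_four_kernel(const double* in, double* out)
{
    // Signed index on purpose. Try `unsigned int` here and compare the
    // output of `cuobjdump -sass` to see whether the immediates disappear.
    const int i = blockIdx.x * blockDim.x + threadIdx.x;

    out[i] = in[i + 32] + in[i + 64] + in[i + 96] + in[i + 128];
}

// Runs 32 threads over in[k] = k. Returns the number of wrong results,
// or -1 on a CUDA error.
int run_load_four_demo()
{
    const int threads = 32;
    const int n_in = threads + 128;
    double host_in[n_in];
    double host_out[threads];
    for (int k = 0; k < n_in; ++k)
        host_in[k] = (double)k;

    double* in = 0;
    double* out = 0;
    if (cudaMalloc(&in, sizeof(host_in)) != cudaSuccess) return -1;
    if (cudaMalloc(&out, sizeof(host_out)) != cudaSuccess) { cudaFree(in); return -1; }
    cudaMemcpy(in, host_in, sizeof(host_in), cudaMemcpyHostToDevice);

    load_four_kernel<<<1, threads>>>(in, out);
    cudaError_t err = cudaMemcpy(host_out, out, sizeof(host_out), cudaMemcpyDeviceToHost);

    cudaFree(in);
    cudaFree(out);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < threads; ++i)
        if (host_out[i] != 4.0 * i + 320.0) // (i+32)+(i+64)+(i+96)+(i+128)
            ++mismatches;
    return mismatches;
}

int main()
{
    printf("mismatches: %d\n", run_load_four_demo());
    return 0;
}
```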
Be safe out there. :)