Simple code crashes nvopencc: be.exe crashes with stack overflow or out of memory

typedef struct state {
    int val;
    struct state *ptr;
} state_t;

__device__ state_t states[2] = {
    {0, &states[1]},
    {0, &states[0]}
};

Having this in my code causes be.exe to crash with a stack overflow. I worked around it by replacing the pointers with int offsets, but now be.exe runs out of memory and returns the error “### Out of memory in Allocate_Block”. I don’t know whether that is related to this struct or is just because it is a complex kernel. Either way, I don’t have a workaround for it.
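For readers hitting the same stack overflow, the offset-based workaround mentioned above might look roughly like the sketch below. It is only an illustration under assumptions: the field name `next`, the kernel `walk`, and the host helper `run_walk` are hypothetical and do not come from the original code.

```cuda
#include <cuda_runtime.h>

// Index-based variant of the struct: `next` (hypothetical name) holds an
// offset into `states` instead of a pointer, so the static initializer
// no longer contains addresses that refer back to the array itself.
typedef struct state {
    int val;
    int next;   // index of the linked entry in `states`
} state_t;

__device__ state_t states[2] = {
    {0, 1},     // entry 0 links to entry 1
    {0, 0}      // entry 1 links back to entry 0
};

// Hypothetical kernel: follows two links starting from entry 0 and
// stores the index it ends up at.
__global__ void walk(int *out)
{
    int i = states[0].next;   // 0 -> 1
    *out = states[i].next;    // 1 -> 0
}

// Hypothetical host helper: launches the kernel and returns the index
// it reached, or -1 if any CUDA call failed.
int run_walk()
{
    int *d_out = NULL;
    int h_out = -1;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess)
        return -1;
    walk<<<1, 1>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}
```

Code that previously dereferenced `ptr` would index into `states` instead, at the cost of one extra array lookup per hop.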

This is using 4.0RC2. Using 3.2 doesn’t help.

Please file a bug, attaching a self-contained repro case. Thanks!

More info…

be.exe starts eating up gigs of memory with a single call to a one-line function. Inline that function and it eats almost no memory at all. If that inline function calls another simple one-line function, inline or not, it eats gigs of memory and dies again.

The problem is that these simple one-line functions write a single zero to an array declared as a local array in the topmost function. The address of the array is passed down and something can’t handle it.

Switching to malloc instead of a local array makes the problem go away.

I suspect the array (pretty big, 66x66 ints) is getting optimized into shared or registers and the whole buildchain can’t deal with it.
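A minimal sketch of the pattern described above, together with the malloc workaround, could look like the following. It is not the original kernel: all names (`clear_cell`, `local_array_kernel`, `malloc_kernel`, `run_kernel`) are hypothetical, the array is assumed to be a flat 66x66 int buffer, and "malloc" is assumed to mean device-side `malloc`, which needs a compute capability 2.0 or newer GPU. Whether this exact snippet reproduces the be.exe failure has not been verified.

```cuda
#include <cuda_runtime.h>

#define N 66   // array dimension taken from the post (66x66 ints)

// One-line helper: writes a single zero through a pointer handed down
// from the caller.
__device__ void clear_cell(int *grid, int row, int col)
{
    grid[row * N + col] = 0;
}

// Pattern reported as problematic: a large local array in the topmost
// function whose address is passed down to a helper.
__global__ void local_array_kernel(int *out)
{
    int grid[N * N];
    grid[0] = 1;
    clear_cell(grid, 0, 0);
    *out = grid[0];
}

// Workaround from the post: allocate the buffer with device-side malloc
// instead of declaring it as a local array.
__global__ void malloc_kernel(int *out)
{
    int *grid = (int *)malloc(N * N * sizeof(int));
    if (grid == NULL) {       // device heap exhausted
        *out = -1;
        return;
    }
    grid[0] = 1;
    clear_cell(grid, 0, 0);
    *out = grid[0];
    free(grid);
}

// Hypothetical host helper: runs one of the two kernels with a single
// thread and returns the value it wrote, or -1 if a CUDA call failed.
int run_kernel(bool use_malloc)
{
    int *d_out = NULL;
    int h_out = -1;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess)
        return -1;
    if (use_malloc)
        malloc_kernel<<<1, 1>>>(d_out);
    else
        local_array_kernel<<<1, 1>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : -1;
}
```

A self-contained file along these lines, plus the exact nvcc command line, is the sort of repro case a bug report would need.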

Nvidia, this is a pretty glaring bug for a release candidate, please fix!

Please file a bug with a repro case so our compiler team can have a look at this. Please also note whether this is a regression versus the r3.2 toolchain (i.e. code compiles fine with the r3.2 compiler, but encounters problems when the r4.0 compiler is used). Thank you for your help.