2.0 beta 2 compiler bugs

FullyArticulate · July 1, 2008, 1:04am

My code compiled, linked, and ran just fine with 2.0beta1, but 2.0beta2 has a number of problems.

Host code can no longer use VLAs.

__host__ int
run_swarm(float *metaArray, int elements)
{
  float signals[elements];
  ...
  Swarm<<< gridsz, blocksz >>>();
}

swarm.cu:433: undefined reference to `__vla_alloc'

(line 433 is the float signals[elements] line)

The compiler is confused by cmem (is cmem constant memory?). My makefile calls nvcc, nvcc -cubin, and nvcc -ptx, each time with --ptxas-options=-v. Here are the results:

nvcc -c swarm.cu ...
ptxas info: Compiling entry function _X5SwarmP5swarmP12swarm_memberPfi
ptxas info: 0 bytes lmem, 80 bytes smem, -1106984752 bytes cmem, 24 registers

nvcc -cubin swarm.cu ...
ptxas info: Compiling entry function _X5SwarmP5swarmP12swarm_memberPfi
ptxas info: 0 bytes lmem, 80 bytes smem, 1436764880 bytes cmem, 24 registers

nvcc -ptx swarm.cu ...
ptxas info: Compiling entry function _X5SwarmP5swarmP12swarm_memberPfi
ptxas info: 0 bytes lmem, 80 bytes smem, 367374048 bytes cmem, 24 registers

If I run it again, the cmem byte counts all change; they look random (and are probably wrong).

Simon_Green · July 1, 2008, 11:32am

Thanks for reporting these.

1 - Variable length arrays are a C99 feature (g++ also accepts them in C++ as an extension), and I’m not sure they are supposed to be supported in CUDA host code. I agree it’s bad if support for this has changed. Anyway, under Windows I get an error “expression must have a constant value” for this code. Are you running Linux?

2 - This is a known bug which should be fixed soon.

FullyArticulate · July 1, 2008, 5:51pm

Hi Simon,

Fedora Core 8 in 64-bit mode. This exact code compiled and ran fine with the beta 1 toolkit, so I’m guessing something changed. Easy to work around with an alloca(), but kind of annoying.

Thanks!
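
For anyone hitting the same __vla_alloc link error, here is a rough sketch of the alloca() workaround mentioned above, with a std::vector variant next to it. It is host-only C++ and assumes a Linux/glibc system where <alloca.h> exists. The function names and the fill_and_sum helper are made up for illustration, since the real body of run_swarm is elided in the original post, and the kernel launch is left out so the snippet builds with a plain C++ compiler.

```cpp
#include <alloca.h>   // alloca(); assumed available (glibc on Linux)
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the elided body of run_swarm:
// fills the scratch buffer with 0, 1, 2, ... and returns the sum.
static float fill_and_sum(float *signals, int elements)
{
    float total = 0.0f;
    for (int i = 0; i < elements; ++i) {
        signals[i] = static_cast<float>(i);
        total += signals[i];
    }
    return total;
}

// Variant A: alloca() replaces "float signals[elements];".
// The memory lives on the stack and is released when the function returns,
// so it must not be returned to the caller or used for very large sizes.
float run_swarm_alloca(int elements)
{
    if (elements <= 0) {
        return 0.0f;
    }
    float *signals =
        static_cast<float *>(alloca(static_cast<std::size_t>(elements) * sizeof(float)));
    return fill_and_sum(signals, elements);
}

// Variant B: std::vector puts the buffer on the heap instead,
// which is portable and safe for large element counts.
float run_swarm_vector(int elements)
{
    if (elements <= 0) {
        return 0.0f;
    }
    std::vector<float> signals(static_cast<std::size_t>(elements));
    return fill_and_sum(signals.data(), elements);
}
```

Variant A is closest to the original VLA in behavior. Variant B is probably the safer choice if elements can be large, since it does not depend on stack size or on a compiler extension.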