My code compiled, linked, and ran just fine with 2.0beta1, but 2.0beta2 has a number of problems.
- Host code can no longer use variable-length arrays (VLAs).
__host__ int
run_swarm(float *metaArray, int elements)
{
    float signals[elements];
    ...
    Swarm<<< gridsz, blocksz >>>();
}
swarm.cu:433: undefined reference to `__vla_alloc'
(line 433 is the float signals[elements] line)
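A possible workaround for the VLA problem, offered as a sketch and not as a confirmed fix for 2.0beta2: replace the VLA with a heap-backed buffer so the compiler never has to emit a call to __vla_alloc. The version below is host-only C++ that builds without a GPU. The name run_swarm_host, the copy loop, and the return values are made up for illustration; they stand in for the elided body of run_swarm, and the kernel launch appears only as a comment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for run_swarm(); only the buffer allocation matters here.
// Returns the number of floats allocated, or -1 on bad input (an assumed convention).
int run_swarm_host(const float *metaArray, int elements)
{
    if (metaArray == NULL || elements <= 0)
        return -1;

    // Heap-backed replacement for: float signals[elements];
    // The vector frees its storage automatically when the function returns.
    std::vector<float> signals(static_cast<std::size_t>(elements), 0.0f);

    // Placeholder work so the buffer is actually used (not from the original code).
    for (int i = 0; i < elements; ++i)
        signals[i] = metaArray[i];

    // In the .cu file the kernel launch would follow here, for example:
    // Swarm<<< gridsz, blocksz >>>();

    return static_cast<int>(signals.size());
}
```

If std::vector turns out to be a problem in a .cu file with this toolkit, a plain malloc/free pair should serve the same purpose, at the cost of freeing the buffer by hand on every return path.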
- The compiler reports nonsensical sizes for cmem (is cmem constant memory?). My makefile calls nvcc -c, nvcc -cubin, and nvcc -ptx, each time with --ptxas-options=-v. Here are the results:
nvcc -c swarm.cu ...
ptxas info: Compiling entry function _X5SwarmP5swarmP12swarm_memberPfi
ptxas info: 0 bytes lmem, 80 bytes smem, -1106984752 bytes cmem, 24 registers
nvcc -cubin swarm.cu ...
ptxas info: Compiling entry function _X5SwarmP5swarmP12swarm_memberPfi
ptxas info: 0 bytes lmem, 80 bytes smem, 1436764880 bytes cmem, 24 registers
nvcc -ptx swarm.cu ...
ptxas info: Compiling entry function _X5SwarmP5swarmP12swarm_memberPfi
ptxas info: 0 bytes lmem, 80 bytes smem, 367374048 bytes cmem, 24 registers
If I run the build again, all the cmem byte counts come out different and apparently random (and probably wrong).