Performance varies greatly between nvcc compiler versions

My suggestion is to file a bug.