To my surprise, I've noticed that constant variables are read through global memory rather than through the constant memory path on Fermi, i.e. when I compile with -arch=sm_20. Compiling with -arch=compute_13 -code=sm_20 instead fixes the problem: constant data is then fetched through the constant memory path, and register usage also drops by almost half.
Is this the intended behavior?
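
For anyone who wants to check this on their own setup, here is a minimal sketch of a test case. It is not the code from the original project: the `coeffs` table, the `scale` kernel, and the file names are hypothetical, and the build lines assume a toolkit old enough to still accept the sm_20 and compute_13 targets. The idea is to build the same kernel both ways and compare the disassembly and the register counts.

```cuda
// repro.cu: hypothetical minimal test case, not the original code.
//
// Build both variants and compare (file names are illustrative):
//   nvcc -arch=sm_20 -Xptxas -v -cubin -o repro_sm20.cubin repro.cu
//   nvcc -arch=compute_13 -code=sm_20 -Xptxas -v -cubin -o repro_c13.cubin repro.cu
//   cuobjdump -sass repro_sm20.cubin
//   cuobjdump -sass repro_c13.cubin
// Drop -cubin to build a runnable executable instead.
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical constant table; the name and size are assumptions.
__constant__ float coeffs[16];

__global__ void scale(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float acc = 0.0f;
        // All threads read the same coeffs[k] in each iteration, which is
        // the uniform access pattern the constant cache is designed for.
        for (int k = 0; k < 16; ++k)
            acc += coeffs[k] * in[i];
        out[i] = acc;
    }
}

int main()
{
    const int n = 256;
    float h_coeffs[16], h_in[n], h_out[n];
    for (int k = 0; k < 16; ++k) h_coeffs[k] = 1.0f;
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    // Copy the table into constant memory.
    cudaMemcpyToSymbol(coeffs, h_coeffs, sizeof(h_coeffs));

    float *d_in = 0, *d_out = 0;
    cudaMalloc((void **)&d_in, n * sizeof(float));
    cudaMalloc((void **)&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 127) / 128, 128>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("out[3] = %f\n", h_out[3]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

In the disassembly, reads that go through the constant path typically show up as `c[bank][offset]` operands or `LDC` instructions, while global or generic loads show up as `LD`. The `-Xptxas -v` output reports the register count for each build, so the two variants can be compared directly.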