I am seeing some mysterious behaviour when changing the target architecture in the Build Rule in VS2008. I am using a GTX285 (compute capability 1.3), and execution takes twice as long when compiling with sm_13 as with sm_11. I would expect performance to be better when selecting the architecture that matches the card.
The kernels mainly perform texture fetches and global stores, with very little computation.
Does anybody have an idea where this could come from?