22.5% performance degradation after moving from CUDA 3.1 to 3.2
What can be done to try to make the

A degradation this large makes me think there is something that can be tuned to make 3.2 perform better. NVIDIA folks, please respond!

Can you please add some more info on the application and the HW config?