cudaFuncSetCacheConfig(Kernel1, cudaFuncCachePreferL1) has no effect on shared memory


After calling cudaFuncSetCacheConfig(Kernel1, cudaFuncCachePreferL1);
there is no effect on shared memory. It is still 48 KB.
I understand that I should compile for the sm_20 architecture, which might give me the correct behavior.
How do I change the GPU architecture in Visual Studio 2008?
I tried: Properties -> CUDA Build Rule v3.0 -> General -> GPU Architecture
But GPU Architecture has no option for sm_20. When I type “sm_20” I get this error:

   Property Value is not valid 

Any suggestions?

OK, I was able to change the setting to compile for sm_20 by going to Custom Build Rules and selecting the latest (4.1) rules.

But I am still not able to set the L1 cache to 48 KB.
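For reference, here is a minimal sketch of the call described above, with the return code checked. The body of Kernel1 is a hypothetical placeholder (the real kernel is not shown in this post), and cacheConfigName is a helper invented for this example. Note that the runtime documentation describes the cache setting as a preference only, and that sharedMemPerBlock from cudaGetDeviceProperties is, as far as I understand, a per-device maximum, so it may not be a reliable way to confirm the L1/shared split.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; the real Kernel1 is not shown in the post.
__global__ void Kernel1(float *data)
{
    data[threadIdx.x] *= 2.0f;
}

// Helper invented for this example: readable name for a cache preference.
const char *cacheConfigName(cudaFuncCache c)
{
    switch (c) {
    case cudaFuncCachePreferNone:   return "PreferNone";
    case cudaFuncCachePreferShared: return "PreferShared";
    case cudaFuncCachePreferL1:     return "PreferL1";
    default:                        return "Other";
    }
}

int main()
{
    // Ask for the larger L1 split for this kernel. This is only a
    // preference; the runtime may pick a different split if it has to.
    cudaError_t err = cudaFuncSetCacheConfig(Kernel1, cudaFuncCachePreferL1);
    if (err != cudaSuccess) {
        printf("cudaFuncSetCacheConfig failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("requested %s for Kernel1\n", cacheConfigName(cudaFuncCachePreferL1));

    // Print what the device reports. sharedMemPerBlock is a device limit
    // (assumption: it does not change with the cache preference).
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("compute capability %d.%d, sharedMemPerBlock = %lu bytes\n",
           prop.major, prop.minor, (unsigned long)prop.sharedMemPerBlock);

    // Launch the kernel once so the preference is actually exercised.
    float *d_data = NULL;
    err = cudaMalloc((void **)&d_data, 32 * sizeof(float));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, 32 * sizeof(float));
    Kernel1<<<1, 32>>>(d_data);
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel launch failed: %s\n", cudaGetErrorString(err));
    }
    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

On the command line this would be built with something like nvcc -arch=sm_20 on a toolkit that still supports that target. If the first printf reports an error, the problem is in the call itself and not in the build settings.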