Fermi Shared Memory


I am trying to optimize some code for Fermi. When I try to allocate more than 16 KB of shared memory, the compiler throws the following error:

"ptxas error   : Entry function  uses too much shared data "

I have configured the kernel to use 48 KB of shared memory, and I am using nvcc version 3.1 on Fermi hardware.

Please advise: can we use more than 16 KB of shared memory in our kernel, or is this a hard restriction?

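You need to pass the flag -arch sm_20 to nvcc. Without it, nvcc 3.1 targets compute capability 1.x by default, where shared memory is limited to 16 KB per block, so ptxas rejects the kernel. Targeting sm_20 raises the limit to 48 KB per block.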
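
For illustration, here is a minimal sketch of a kernel that statically allocates the full 48 KB and only builds when compiled for sm_20. The kernel name fillShared, the file name shared48.cu, and the buffer size are made up for this example and are not from the original post. The call to cudaFuncSetCacheConfig is optional; it only asks the runtime to prefer the 48 KB shared / 16 KB L1 split.

```cuda
// Build (file name is hypothetical):
//   nvcc -arch sm_20 shared48.cu -o shared48
// Without -arch sm_20, nvcc 3.1 targets compute capability 1.x and ptxas
// reports "Entry function uses too much shared data".
#include <cstdio>
#include <cuda_runtime.h>

// 12288 floats * 4 bytes = 49152 bytes = 48 KB (per-block maximum on sm_20).
#define N_FLOATS 12288

__global__ void fillShared(float *out)
{
    __shared__ float buf[N_FLOATS];

    // Each thread fills a strided slice of the shared buffer.
    for (int i = threadIdx.x; i < N_FLOATS; i += blockDim.x)
        buf[i] = (float)i;
    __syncthreads();

    // One thread copies the last element back so the host can check it.
    if (threadIdx.x == 0)
        out[0] = buf[N_FLOATS - 1];
}

int main()
{
    float *d_out = NULL;
    float h_out = 0.0f;

    cudaMalloc((void **)&d_out, sizeof(float));

    // Optional: prefer 48 KB shared memory / 16 KB L1 for this kernel.
    cudaFuncSetCacheConfig(fillShared, cudaFuncCachePreferShared);

    fillShared<<<1, 256>>>(d_out);

    // cudaMemcpy waits for the kernel and reports any launch failure.
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("last element read from shared memory: %f\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

Note that 48 KB per block leaves room for only one resident block per multiprocessor, so it is worth checking whether the kernel really needs the full amount.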