I wrote a kernel that uses all of the available shared memory (16 KB) to hold double-precision values. Compiling it with the option -arch compute_13 works without a problem. The issue arises when I add the flag -code sm_13 to that option: ptxas then reports the following error.
“ptxas error : Entry function ‘_Z16ldldecompositionPd’ uses too much shared data (0x4008 bytes + 0x10 bytes system, 0x4000 max)”
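To make the numbers in that message easier to read, here is a quick decoding of the hex values. It is only arithmetic on the three figures ptxas printed (written in Python for convenience); the variable names are my own and nothing here comes from the kernel itself.

```python
# Figures copied from the ptxas message; the variable names are illustrative.
user_bytes = 0x4008    # shared data attributed to the entry function
system_bytes = 0x10    # shared data the message labels as "system"
limit_bytes = 0x4000   # maximum reported by ptxas

total_bytes = user_bytes + system_bytes
overshoot = total_bytes - limit_bytes

print(f"kernel: {user_bytes} B, system: {system_bytes} B, limit: {limit_bytes} B")
print(f"total: {total_bytes} B, over the limit by {overshoot} B")
# A double is 8 bytes, so this is how many fit if nothing else used the space.
print(f"doubles that fit in the limit alone: {limit_bytes // 8}")
```

So the function is charged 8 bytes more than the 16 KB I thought I was using, and another 16 bytes are counted on top of that, which puts the total 24 bytes over the maximum.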
By the way, it compiles cleanly with both flags when I use less shared memory, but I would still like to get to the bottom of this. Could someone please give me an idea of what is going on?