According to the specification, the C1060 has 16 KB of shared memory. I have a kernel that needs about 9728 bytes of shared memory.
I am using the CUDA driver API, and unfortunately the program dies with CUDA_ERROR_INVALID_VALUE on cuFuncSetSharedSize(…). If I reduce the amount of memory from 9728 to 4096 bytes, the call succeeds without problems. Also, executing the kernel with the wrong shared memory size does not give any error. I have also tried Fermi devices (S2050, C2050, GT430) with 9728 bytes; on these, cuFuncSetSharedSize() does not give any error.
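One thing that can be ruled out is a simple budget overflow: the size passed to cuFuncSetSharedSize() is added on top of whatever shared memory the kernel already uses statically, so the sum may exceed the per-block limit even when 9728 bytes alone would fit. Below is a minimal sketch of that arithmetic in plain C. The helper name `shared_request_fits` is hypothetical and not part of the CUDA API, and it assumes the two input values are obtained separately, for example from cuDeviceGetAttribute() with CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK and cuFuncGetAttribute() with CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES. It is not a claim about what the driver actually checks internally.

```c
#include <stddef.h>

/* Hypothetical helper (NOT part of the CUDA API): returns 1 if the requested
 * dynamic shared memory fits in one block's budget, 0 otherwise.
 *
 * limit_bytes   : per-block shared memory limit, assumed to come from
 *                 cuDeviceGetAttribute(...,
 *                     CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK, dev)
 * static_bytes  : shared memory the kernel already uses statically, assumed
 *                 to come from cuFuncGetAttribute(...,
 *                     CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES, func)
 * dynamic_bytes : the size intended for cuFuncSetSharedSize()
 */
int shared_request_fits(size_t limit_bytes, size_t static_bytes,
                        size_t dynamic_bytes)
{
    /* Guard first so the subtraction below cannot wrap around. */
    if (static_bytes > limit_bytes) {
        return 0;
    }
    return dynamic_bytes <= limit_bytes - static_bytes;
}
```

With the 16 KB (16384 byte) limit quoted above, a kernel with no static shared memory could request 9728 bytes, but a kernel that already used, say, 8192 bytes statically (an illustrative figure, not measured) could request only 4096 and not 9728.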
Am I missing something?