Does anyone out there have a good guide on how to decode the ptxas info output for Fermi GPUs? For example:
ptxas info : Function properties for progno_cloud
    1000 bytes stack frame, 1192 bytes spill stores, 1776 bytes spill loads
ptxas info : Used 63 registers, 48 bytes cmem, 8 bytes cmem, 280 bytes cmem
Now, I’m pretty sure that “stack frame” is per-thread local memory use because in my kernels with lots of local memory, “stack frame” goes up. Likewise, I think the “spills” refer to spilling of registers into local memory. Again, my big kernels where I easily crash into the 63 register limit get spills.
But I’m confused by the cmems. My first thought was that they refer to constant and shared memory use. Yet a kernel that uses lots of constant memory reports:
ptxas info : Used 63 registers, 4+0 bytes lmem, 64 bytes cmem, 4 bytes cmem, 276 bytes cmem
while a kernel that uses no constant memory reports:
ptxas info : Used 28 registers, 120 bytes cmem, 5600 bytes cmem, 8 bytes cmem
Now, I can say that the latter code is one where I have 4 or 5 global kernels that often share some “working space” in global memory, i.e., there is an array common to all kernels that is allocated but never copied to or from the host. (Oh, and neither of these uses shared memory!)
So: What does cmem refer to and what do the various [#] refer to? And is there something I should be careful of? Like, avoid cmem but cheer cmem?
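In case it helps anyone compare numbers across kernels, here is a minimal Python sketch for pulling the fields out of lines like the ones above. The function name `parse_ptxas_info` and the layout of the result are made up for illustration, and the patterns assume only the wording quoted in this post. Each cmem entry is kept together with its bracketed index when one is present, and an "a+b" size is kept as two numbers rather than summed, since I don't know what either of those means.

```python
import re

# Hypothetical helper: name and result layout are illustrative only.
# The patterns assume the "ptxas info" wording quoted in this post.
_REGS = re.compile(r"Used\s+(\d+)\s+registers")
_FIELD = re.compile(
    r"(\d+)(?:\+(\d+))?\s+bytes\s+"
    r"(stack frame|spill stores|spill loads|lmem|smem|cmem(?:\[(\d+)\])?)"
)


def parse_ptxas_info(text):
    """Collect the register count and every '<n> bytes <kind>' field."""
    info = {"registers": None, "cmem": []}

    match = _REGS.search(text)
    if match:
        info["registers"] = int(match.group(1))

    for size, extra, kind, bank in _FIELD.findall(text):
        # Keep "4+0" as (4, 0) and "64" as (64,), exactly as written.
        sizes = tuple(int(part) for part in (size, extra) if part)
        if kind.startswith("cmem"):
            # The bank index is None when the output shows no "[N]".
            info["cmem"].append((int(bank) if bank else None, sizes))
        else:
            info[kind] = sizes
    return info


if __name__ == "__main__":
    line = ("ptxas info : Used 63 registers, 4+0 bytes lmem, "
            "64 bytes cmem, 4 bytes cmem, 276 bytes cmem")
    result = parse_ptxas_info(line)
    print(result["registers"], result["lmem"], result["cmem"])
```

Running it over the output for each kernel makes it easier to see which cmem entry grows when constant data or kernel arguments are added, without assuming up front what each entry stands for.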