Does anyone out there have a good guide to decoding the ptxas info output for Fermi GPUs? For example:
ptxas info : Function properties for progno_cloud
1000 bytes stack frame, 1192 bytes spill stores, 1776 bytes spill loads
ptxas info : Used 63 registers, 48 bytes cmem[0], 8 bytes cmem[14], 280 bytes cmem[16]
Now, I’m pretty sure that “stack frame” is per-thread local memory use because in my kernels with lots of local memory, “stack frame” goes up. Likewise, I think the “spills” refer to spilling of registers into local memory. Again, my big kernels where I easily crash into the 63 register limit get spills.
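For comparing these numbers across kernels, it can help to pull them out of the compiler output mechanically. Here is a minimal Python sketch, not an official tool: `parse_function_properties` is a hypothetical helper name, and the regular expression assumes exactly the line layout quoted above, which may differ between toolkit versions.

```python
import re

# Assumed layout, copied from the ptxas output quoted in this post:
# "<N> bytes stack frame, <N> bytes spill stores, <N> bytes spill loads"
_PROPS = re.compile(
    r"(\d+) bytes stack frame, (\d+) bytes spill stores, (\d+) bytes spill loads"
)

def parse_function_properties(line):
    """Return the stack-frame and spill byte counts from one line, or None."""
    match = _PROPS.search(line)
    if match is None:
        return None
    stack, stores, loads = (int(g) for g in match.groups())
    return {"stack_frame": stack, "spill_stores": stores, "spill_loads": loads}

# The line reported for progno_cloud above.
props = parse_function_properties(
    "1000 bytes stack frame, 1192 bytes spill stores, 1776 bytes spill loads"
)
print(props)
```

Logging these three values per kernel after each build makes it easy to see whether a code change pushed more data into the stack frame or increased spilling.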
But I’m confused by the cmems. My first thought was that they refer to constant and shared memory use. Yet a kernel that uses lots of constant memory reports:
ptxas info : Used 63 registers, 4+0 bytes lmem, 64 bytes cmem[0], 4 bytes cmem[14], 276 bytes cmem[16]
while a kernel that uses no constant memory reports:
ptxas info : Used 28 registers, 120 bytes cmem[0], 5600 bytes cmem[2], 8 bytes cmem[16]
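To make the difference between those two kernels easier to see, the cmem entries can be tabulated bank by bank. The following is a rough Python sketch under the same caveat as any parsing of this output: `parse_used_line` is a hypothetical helper, the patterns assume the "Used ..." layout quoted above, and the code only extracts the numbers without saying what each bank means.

```python
import re

def parse_used_line(line):
    """Extract the register count and per-bank cmem bytes from a 'Used ...' line."""
    regs = re.search(r"Used (\d+) registers", line)
    # Map bank index -> bytes, e.g. "64 bytes cmem[0]" becomes {0: 64}.
    banks = {
        int(bank): int(nbytes)
        for nbytes, bank in re.findall(r"(\d+) bytes cmem\[(\d+)\]", line)
    }
    return {"registers": int(regs.group(1)) if regs else None, "cmem": banks}

# The two lines quoted above: heavy constant-memory use vs. none.
heavy = parse_used_line(
    "ptxas info : Used 63 registers, 4+0 bytes lmem, 64 bytes cmem[0], "
    "4 bytes cmem[14], 276 bytes cmem[16]"
)
light = parse_used_line(
    "ptxas info : Used 28 registers, 120 bytes cmem[0], 5600 bytes cmem[2], "
    "8 bytes cmem[16]"
)

# Print one row per bank that appears in either kernel (0 if absent).
for bank in sorted(set(heavy["cmem"]) | set(light["cmem"])):
    print(bank, heavy["cmem"].get(bank, 0), light["cmem"].get(bank, 0))
```

Laid out this way, it is at least clear which banks appear in only one of the two kernels (cmem[2] and cmem[14] here), which is the part I can't explain.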
Now, I can say that the latter code is one where I have 4 or 5 global kernels that often share some “working space” in global memory, i.e., there is an array common to all kernels that is allocated but never copied to or from the host. (Oh, and neither of these uses shared memory!)
So: What does cmem refer to and what do the various [#] refer to? And is there something I should be careful of? Like, avoid cmem[14] but cheer cmem[16]?
Thanks,
Matt