How do I tell nvcc to produce readable text assembly output so that I can analyze its register and instruction usage? Or, if that's not available, just some statistics about register usage and what the compiler has done on my behalf? For the life of me, I cannot find how to do this.
To view statistics, you can generate a .cubin file (nvcc -cubin); it will contain information on resource usage (registers, shared memory, local memory). There’s also a command-line switch in 1.1 that instructs nvcc to print resource usage information to the console.
To inspect the generated code, you can use an excellent tool named decuda, which can disassemble .cubin files into GPU instructions.
nvcc flag to print out register/smem/lmem usage: --ptxas-options=-v
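To put the suggestions above together, here is a rough sketch of how they might be used on a small test file. The file name saxpy.cu and the kernel in it are hypothetical, made up only for illustration. The -ptx step is not mentioned above; it is added on the assumption that readable intermediate (PTX) assembly is also useful. Note that PTX uses virtual registers, so the real register count comes from the ptxas -v output, not from the PTX text.

```shell
# Write a tiny, hypothetical kernel to a file so there is something to compile.
cat > saxpy.cu <<'EOF'
// Illustrative kernel only: y[i] = a * x[i] + y[i]
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}
EOF

if command -v nvcc >/dev/null 2>&1; then
    # Print register / shared memory / local memory usage per kernel.
    nvcc --ptxas-options=-v -c saxpy.cu -o saxpy.o

    # Emit the intermediate PTX as readable text (virtual registers only).
    nvcc -ptx saxpy.cu -o saxpy.ptx

    # Produce a .cubin file, as suggested earlier in the thread.
    nvcc -cubin saxpy.cu -o saxpy.cubin
else
    echo "nvcc not found; skipping compile steps"
fi
```

The resource usage lines are printed by ptxas during the first compile step, one summary per kernel, so they are easy to miss in a longer build log.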