The CUDA Binary Utilities document has a list of the assembly instructions for Compute Capability 1.2 and above.
CUDA Binary Utilities :: CUDA Toolkit Documentation
The Parallel Thread Execution ISA Version 3.2 (PTX) document describes the PTX intermediate language, which maps very closely to the final assembly instructions.
PTX ISA :: CUDA Toolkit Documentation
The best approach for learning how the GPU works is to single-step through the assembly of different programs in the Nsight VSE CUDA debugger or cuda-gdb. If you are not set up to debug, then simply writing small sample programs and using cuobjdump or nvdisasm to list the PTX and SASS (assembly) is a fairly easy way to learn.
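As a rough sketch of the "small sample program" route, something like the following could be used. The file name saxpy.cu, the kernel and helper names, and the sm_70 target in the comments are all assumptions for illustration; substitute the architecture of your own GPU.

```cuda
// saxpy.cu (hypothetical file name): a tiny kernel to disassemble.
//
// Possible commands (sm_70 is an assumed target, use your GPU's architecture):
//   nvcc -arch=sm_70 -ptx   -o saxpy.ptx   saxpy.cu   # emit PTX as text
//   nvcc -arch=sm_70 -cubin -o saxpy.cubin saxpy.cu   # emit a device binary
//   cuobjdump -sass saxpy.cubin                       # list the SASS
//   nvdisasm saxpy.cubin                              # alternative SASS listing

// Per-element computation, callable from host and device (illustrative helper).
__host__ __device__ inline float saxpy_elem(float a, float x, float y)
{
    return a * x + y;
}

// One thread per element: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = saxpy_elem(a, x[i], y[i]);
    }
}
```

Comparing the PTX file with the SASS listing for the same kernel shows how the index calculation, the bounds check, and the multiply-add are each lowered. Changing one line of the kernel and re-running the commands is a quick way to see which instructions that line produces.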