Looking for materials to self-study the hardware architecture / organization of CUDA GPUs

Useful knowledge is often not available in university classrooms, especially at universities ranked outside the top 20. I want to study the hardware architecture / organization of CUDA GPUs, but no course at my current university covers it. All the CUDA textbooks focus on the software side of CUDA, i.e., how to program in CUDA, and the available architecture / organization textbooks only cover CPUs. So, what materials can I use to self-study the hardware architecture / organization of CUDA? Assume I have finished Andrew S. Tanenbaum’s “Structured Computer Organization” (6th ed.). I am hoping for material written like that classic text, not in-depth published papers aimed at readers who already have the knowledge I am trying to acquire. Thank you for any recommendations.

PMPP (“Programming Massively Parallel Processors”, now in its 3rd edition) by Hwu/Kirk covers more hardware architecture than any of the other CUDA texts that I’m aware of. You won’t find the level of detail that you can for some CPUs, since NVIDIA does not publish those kinds of details externally.