Details of global memory and L2 cache configuration in Tesla K40


I am curious about some configuration details of the global memory and L2 cache of the Tesla K40. I searched for answers to a few questions but couldn’t find much information, so I thought I would ask here. My questions are listed below:

  1. How many banks are present in the global memory of the K40?
  2. I saw that the global memory bus width is 384 bits, but what is the width of each bank (that is, if I
    exceed that width, would the data be placed in the next bank)?
  3. How is the whole global memory divided among the banks; that is, what is the size of each bank?
  4. I searched for a microbenchmark but couldn’t find one. Is there a microbenchmark available that could
    help me understand these details?
  5. How is the L2 connected to each of the global memory banks?
    In this link for Maxwell, the L2 banks are the same as the global memory banks and connected to each
    bank. Is this the same for the K40 as well? If so, how would the same questions apply to the L2, and is there a microbenchmark to find those details for the L2?
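
To make question 2 concrete, here is a minimal sketch of the kind of mapping I have in mind, assuming simple round-robin interleaving of addresses across banks. The bank count (6) and interleave width (256 bytes) are placeholders I made up for illustration, not K40 values, and `bank_of` is a hypothetical helper; the real hardware mapping may well be different (for example, hashed).

```python
# Hypothetical model of address interleaving across memory banks.
# NUM_BANKS and INTERLEAVE_BYTES are made-up placeholders, NOT K40 values.
NUM_BANKS = 6
INTERLEAVE_BYTES = 256


def bank_of(address, num_banks=NUM_BANKS, interleave_bytes=INTERLEAVE_BYTES):
    """Return the bank index of a byte address under round-robin interleaving."""
    # Consecutive chunks of `interleave_bytes` go to consecutive banks,
    # wrapping around after the last bank.
    return (address // interleave_bytes) % num_banks


if __name__ == "__main__":
    # Show where the mapping switches to the next bank and where it wraps.
    for addr in (0, 255, 256, 1535, 1536):
        print(addr, bank_of(addr))
```

Under this assumed model, what I am asking for is the real values of the two parameters (and whether the mapping is this simple at all).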

Any help would be really appreciated. Thank you.