What exactly are a memory transaction and a memory request?

For accesses to an array of 4096 doubles (8 bytes each) that I am fairly sure are perfectly coalesced, nvprof reports the following metrics on an NVIDIA Tesla V100:

global_load_requests: 128
gld_transactions: 1024
gld_transactions_per_request: 8.000000

I cannot find a precise definition of what a transaction and a request to global memory are, so I am having trouble interpreting these metrics. Hence my questions:

  1. How is a memory request defined exactly? Is it something like a warp-level load instruction for 32 threads at once?
  2. How is a memory transaction defined? Is it something like a load instruction of fixed size 32 bytes?
  3. Does gld_transactions_per_request = 8.000000 indicate perfectly coalesced accesses to doubles?

In the meantime, I have found the following hint about what a memory transaction is in the Best Practices Guide, Section 9.2.1:

  • "The concurrent accesses of the threads of a warp will coalesce into a number of transactions equal to the number of 32-byte transactions necessary to service all of the threads of the warp."

That addresses question two but leaves questions one and three open.
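
As a sanity check, the reported numbers can be reproduced with a small model. The following is only a sketch under assumptions that are not confirmed by any documentation I have found: a request is one warp-level load instruction (32 threads), a transaction is one 32-byte sector as in the quote above, each of the 4096 threads loads exactly one double, and the array starts on a 32-byte boundary. The names (`sectors_touched`, `stride`, and the constants) are made up for illustration.

```python
WARP_SIZE = 32      # threads per warp
SECTOR_BYTES = 32   # transaction size quoted from the Best Practices Guide
ELEM_BYTES = 8      # sizeof(double)
N = 4096            # array length; assumed: one element loaded per thread


def sectors_touched(warp_id, stride=1):
    """Count the distinct 32-byte sectors touched by one warp-wide load.

    Assumes the array base address is 32-byte aligned (modelled as address 0)
    and that thread i reads element i * stride.
    """
    first_thread = warp_id * WARP_SIZE
    sectors = set()
    for lane in range(WARP_SIZE):
        byte_addr = (first_thread + lane) * stride * ELEM_BYTES
        sectors.add(byte_addr // SECTOR_BYTES)
    return len(sectors)


# Assumed: one request per warp-level load instruction.
requests = N // WARP_SIZE
# Assumed: one transaction per distinct 32-byte sector per request.
transactions = sum(sectors_touched(w) for w in range(requests))

print(requests, transactions, transactions / requests)  # 128 1024 8.0
```

Under these assumptions the model matches all three reported values: a fully coalesced warp reads 32 × 8 = 256 contiguous bytes, which is 8 sectors of 32 bytes. In the same model, `sectors_touched(0, stride=2)` gives 16, so a value of 8 transactions per request would be the minimum for 8-byte elements. Whether nvprof actually defines the metrics this way is exactly what I am asking.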