What are a memory transaction and a memory request?

an.roesti · March 5, 2020, 10:10am

For what I am fairly sure are perfectly coalesced accesses to an array of 4096 doubles (8 bytes each), nvprof reports the following metrics on an NVIDIA Tesla V100:

global_load_requests: 128
gld_transactions: 1024
gld_transactions_per_request: 8.000000

I cannot find a specific definition of what a transaction and a request to global memory are exactly, so I am having trouble understanding these metrics. Therefore my questions:

1. How is a memory request defined exactly? Is it something like a warp-level load instruction for 32 threads at once?
2. How is a memory transaction defined? Is it something like a load instruction of fixed size 32 bytes?
3. Does gld_transactions_per_request = 8.000000 indicate perfectly coalesced accesses to doubles?

an.roesti · March 6, 2020, 10:21am

In the meantime, I have found the following hint about what a memory transaction is, in Section 9.2.1 of the Best Practices Guide:

"The concurrent accesses of the threads of a warp will coalesce into a number of transactions equal to the number of 32-byte transactions necessary to service all of the threads of the warp."

That answers question two but leaves questions one and three open.
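
As a sanity check, the reported numbers are at least consistent with a simple guess: one request per warp-level load instruction, and one transaction per distinct 32-byte segment the warp touches. The Python sketch below is only a toy model of that guess, not the profiler's actual definition. The function name `count_loads`, the `stride` parameter, and the assumption that the array starts on a 32-byte boundary are all mine.

```python
# Toy model, NOT nvprof's definition. Assumptions:
#   - one "request" per warp-level load instruction
#   - one "transaction" per distinct 32-byte segment touched by that warp
#   - the array starts at a 32-byte-aligned address (offset 0 here)
WARP_SIZE = 32       # threads per warp
SEGMENT_BYTES = 32   # transaction size taken from the Best Practices Guide quote
ELEM_BYTES = 8       # sizeof(double)


def count_loads(num_threads, stride=1):
    """Return (requests, transactions) when thread i loads element i * stride."""
    requests = 0
    transactions = 0
    for warp_start in range(0, num_threads, WARP_SIZE):
        lanes = range(warp_start, min(warp_start + WARP_SIZE, num_threads))
        # Set of 32-byte segments hit by this warp's single load instruction.
        segments = {(i * stride * ELEM_BYTES) // SEGMENT_BYTES for i in lanes}
        requests += 1
        transactions += len(segments)
    return requests, transactions


requests, transactions = count_loads(4096)
print(requests, transactions, transactions / requests)  # 128 1024 8.0
```

Under this model, 4096 threads form 128 warps, and each warp reads 32 × 8 = 256 contiguous bytes, i.e. 8 segments, which matches the 128 / 1024 / 8.0 that nvprof reports. For comparison, `count_loads(4096, stride=4)` puts every thread in its own segment and gives 32 transactions per request. So if the guess is right, 8 would be the minimum possible for doubles, but that still needs confirmation.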