What does gld_transaction mean in nvprof metrics?

Hi, everyone.

I wanted to test texture cache line size on Pascal GPU, so I wrote some simple code to read a 1D array to register and write to another array.

__global__ void cacheLineTest(const float* src_, float* des_, unsigned int stride){
  int tid = blockIdx.x*blockDim.x+threadIdx.x*stride;
  des_[tid] = src_[tid];

and I got following results.

I have three questions:

  1. From these results, can I tell that 1D texture cache line of Pascal is 32 bytes?
  2. What does gld_transaction really mean? Why gld_transaction differ from L2_tex_read_transaction when stride is 1 and 3.
  3. I neither use restrict nor ldg(), why my load request still went through Texture + L2 path.