How is my data cached?

Hello everyone!

Let’s say I have an array d_a in the GPU's global memory, allocated with cudaMalloc(&d_a, 1000*sizeof(int)); If I read d_a[0] in my kernel, which other elements of the array will be cached along with it? More generally, how is the data of my array cached, and where is it cached (L1, L2)?
My GPU's architecture is Maxwell.
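
To make the question concrete, here is a minimal sketch of the setup I mean. The kernel name read_first, the output buffer d_out, and the single-thread launch are only assumptions for illustration and not my real code. At the end it also prints two fields of cudaDeviceProp (l2CacheSize and globalL1CacheSupported), which I believe report the L2 size and whether the device supports caching global loads in L1; they do not tell me which elements get cached together with d_a[0].

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: a single thread reads d_a[0] and writes it to out.
__global__ void read_first(const int *d_a, int *out)
{
    *out = d_a[0];
}

int main()
{
    const int n = 1000;
    int *d_a = nullptr;
    int *d_out = nullptr;  // assumed helper buffer, only here to keep the read observable

    // Same allocation as in the question, plus one int for the result.
    cudaMalloc(&d_a, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemset(d_a, 0, n * sizeof(int));

    // One block, one thread: the only global load is d_a[0].
    read_first<<<1, 1>>>(d_a, d_out);
    cudaDeviceSynchronize();

    int h_out = -1;
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("d_a[0] = %d\n", h_out);

    // Report what the runtime says about the caches on device 0.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("L2 cache size: %d bytes\n", prop.l2CacheSize);
    printf("global loads cacheable in L1: %d\n", prop.globalL1CacheSupported);

    cudaFree(d_out);
    cudaFree(d_a);
    return 0;
}
```

Error checking is left out to keep the sketch short; it should build with nvcc as a single .cu file.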