@dwoods Can you share a bit more about what l1tex__data_bank_reads actually represents? My hypothesis is that it should be proportional to the number of data banks accessed (whether they serve as shared memory or as L1 cache), but this is at odds with the profiling results of my kernel (more details here), so I would really like to learn how this counter is defined. Thanks!