A GPU has multiple physical instances of units. For example, a GPU may have 32 Streaming Multiprocessors (SMs). .avg is the average value for all SM instances. .sum is the total count for all SMs.
The GPU memory subsystem is also physically partitioned into multiple instances. The .avg is useful when determining log balancing via comparison to .max and .min and .avg is useful when determining the unit throughput with .avg.pct_of_peak_sustained_elapsed.
If trying to determining kernel memory throughput use .sum.per_second.