The deviceQuery output for the 2080 Ti says:
(68) Multiprocessors, ( 64) CUDA Cores/MP: 4352 CUDA Cores
L2 Cache Size: 5767168 bytes
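Since the L2 is shared among all SMs, dividing it evenly gives 5767168/68 ≈ 84811.29 bytes per SM, which is not a power of 2 (it is not even a whole number of bytes). Usually the number of sets, the number of ways, and the block size are all powers of 2. For that per-SM figure, the closest configurations I can come up with are (S=41)(W=16)(B=128), which gives 83,968 bytes, or (S=44)(W=15)(B=128), which gives 84,480 bytes. Neither matches exactly, and neither uses only powers of 2.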
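For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the usual capacity model (sets × ways × block size); the helper names `cache_bytes` and `power_of_two_split` are made up for this post and do not come from any CUDA tool. It also factors the total L2 size into an odd part and a power of 2, which is not something deviceQuery reports.

```python
# Values copied from the deviceQuery output above.
L2_BYTES = 5767168
NUM_SMS = 68


def cache_bytes(sets, ways, block):
    """Capacity of a set-associative cache: sets * ways * block size (bytes)."""
    return sets * ways * block


def power_of_two_split(n):
    """Return (odd, k) such that n == odd * 2**k."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return n, k


# Even split of the L2 across SMs (an assumption of the question, not a
# documented property of the hardware).
per_sm = L2_BYTES / NUM_SMS

# The two candidate (S, W, B) configurations from the post.
candidates = [(41, 16, 128), (44, 15, 128)]

if __name__ == "__main__":
    print(f"per-SM share: {per_sm:.4f} bytes")
    for s, w, b in candidates:
        size = cache_bytes(s, w, b)
        print(f"S={s} W={w} B={b}: {size} bytes, "
              f"off by {per_sm - size:.2f} bytes per SM")
    odd, k = power_of_two_split(L2_BYTES)
    print(f"total L2 = {odd} * 2**{k} bytes")
```

The total factors as 11 × 2^19 bytes (5.5 MiB), so the only non-power-of-2 factor in the L2 size is 11, while 68 = 4 × 17. That suggests the per-SM division may not be the natural unit to reason about, but that is only a guess on my part.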
Besides microbenchmarking, is there any other source of information about how the L2 is organized?