L2cache size of A800 80GB

Hi, I ran the deviceQuery sample from CUDA Samples; part of the output is shown below.

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA A800 80GB PCIe"
  CUDA Driver Version / Runtime Version          12.2 / 12.1
  CUDA Capability Major/Minor version number:    8.0
  Total amount of global memory:                 81051 MBytes (84987740160 bytes)
  (108) Multiprocessors, (064) CUDA Cores/MP:    6912 CUDA Cores
  GPU Max Clock rate:                            1410 MHz (1.41 GHz)
  Memory Clock rate:                             1512 Mhz
  Memory Bus Width:                              5120-bit
  L2 Cache Size:                                 41943040 bytes

The output shows an L2 Cache Size of 40 MB, but as far as I can tell from searching, the L2 cache size of the A800 80GB is 80 MB. What went wrong?

Also, some people say the A800 is made from two GA100 chips, each with 40 MB of L2 cache. Is that true?

No, the A800 SKU is a single GA100 GPU. Its L2 cache size is 40 MiB, exactly as deviceQuery reports.
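You can confirm this yourself without the full deviceQuery sample: the L2 size comes straight from the CUDA Runtime API via the `l2CacheSize` field of `cudaDeviceProp`. A minimal sketch (the MiB conversion is just for readability):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    // Query properties of device 0; l2CacheSize is reported in bytes.
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("Device 0: %s\n", prop.name);
    std::printf("L2 cache size: %d bytes (%.0f MiB)\n",
                prop.l2CacheSize,
                prop.l2CacheSize / (1024.0 * 1024.0));
    return 0;
}
```

On an A800 80GB PCIe this prints 41943040 bytes (40 MiB), matching the deviceQuery output above.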

Documentation on the GA100 GPU can be found in the NVIDIA A100 Tensor Core GPU Architecture whitepaper. Page 16 states the following:

40 GB HBM2 and 40 MB L2 Cache
To feed its massive computational throughput, the NVIDIA A100 GPU has 40 GB of high-speed
HBM2 memory with a class-leading 1555 GB/sec of memory bandwidth - a 73% increase
compared to Tesla V100. In addition, the A100 GPU has significantly more on-chip memory
including a 40 MB Level 2 (L2) cache - nearly 7x larger than V100 - to maximize compute
performance. With a new partitioned crossbar structure, the A100 L2 cache provides 2.3x the L2
cache read bandwidth of V100.

Hey @Greg, thanks for your answer, but the documentation only describes the A100 40GB. Is the A100 80GB the same as the A100 40GB in this respect?

Yes. The A100 80 GB and A100 40 GB are both GA100 SKUs; they differ in HBM capacity, not in L2 cache size.