900-21001-0300-030 A100 40G test result

Hi Experts,

I'd like to check on this.

I recently bought this part: 900-21001-0300-030

However, the test results differ from the values expected per the datasheet:

Discrepancy on        Actual result   Expected result as per datasheet
Shaders/CUDA Cores    6272            6912
TMUs/Tensor Cores     392             432

Is this normal? Could any expert explain this?

Many thanks!

Which test app did you run?

Possible explanations (non-exhaustive list): (1) OEM version (2) counterfeit product (3) engineering sample. I cannot find any GPU with 6272 CUDA cores in online databases of GPUs.

Did you buy this GPU from a reputable vendor? Is this supposed to be brand new merchandise in original packaging or was this a second-hand offering? NVIDIA usually sells GPUs like the A100 only to system integrators, which I assume you are not. If you visually compare your A100 with pictures of A100 GPUs on the internet, does it look the same (e.g. location of power connectors)?

An A100 in MIG mode will have 98 SMs available. 98x64 = 6272.
I don’t know if that applies here or not. It’s easy to tell if MIG mode is enabled based on nvidia-smi.
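For anyone who wants to check this from a script, here is a minimal sketch. It assumes your driver's nvidia-smi accepts the `mig.mode.current` query field (it should on MIG-capable drivers) and that the field prints `Enabled`, `Disabled`, or `[N/A]` per GPU. The helper names `parse_mig_modes` and `query_mig_modes` are my own, not part of any NVIDIA tool.

```python
import subprocess


def parse_mig_modes(csv_text):
    """Convert nvidia-smi CSV output (one line per GPU) into a list of booleans.

    True means MIG mode is reported as enabled for that GPU; anything else
    (e.g. "Disabled" or "[N/A]") is treated as not enabled.
    """
    modes = []
    for line in csv_text.splitlines():
        line = line.strip()
        if line:
            modes.append(line.lower() == "enabled")
    return modes


def query_mig_modes():
    """Ask nvidia-smi for the current MIG mode of every GPU.

    Requires an NVIDIA driver with nvidia-smi on the PATH.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=mig.mode.current", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_mig_modes(result.stdout)

# On a machine with the GPU installed you would call: print(query_mig_modes())
```

If the list contains True for the A100, the 98-SM figure is the expected MIG-mode count rather than a sign of a defective or counterfeit part.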

Hi Robert,

Thanks for the reply, I will go and try it.

By the way, what about the TMUs/Tensor Cores? Why is the actual result 392 TMUs/Tensor Cores instead of 432 as per the datasheet?

Hope you can advise.

A100 has 4 Tensor Cores per SM. If only 98 SMs of the 108 are available, that is 98 * 4 = 392 Tensor Cores. With all 108 SMs it would be 108 * 4 = 432, which matches the datasheet.
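To make the arithmetic explicit, here is a small sketch using only the per-SM figures quoted in this thread (64 CUDA cores and 4 Tensor Cores per SM). The function name `core_counts` is just for illustration.

```python
CUDA_CORES_PER_SM = 64    # from "98x64 = 6272" earlier in the thread
TENSOR_CORES_PER_SM = 4   # from the reply above


def core_counts(sm_count):
    """Return the CUDA core and Tensor Core totals for a given number of SMs."""
    return {
        "cuda_cores": sm_count * CUDA_CORES_PER_SM,
        "tensor_cores": sm_count * TENSOR_CORES_PER_SM,
    }


reported = core_counts(98)    # SMs visible in MIG mode
datasheet = core_counts(108)  # SMs on the full A100 product
print(reported, datasheet)
```

Both reported numbers (6272 and 392) fall out of the same 98-SM count, so the two discrepancies in the original post have a single cause.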

Hi sir,
Do you know why only 98 are available instead of 108? Is there an explanation for this? Hope someone can enlighten us.

From the A100 whitepaper:

A GPU Instance is constructed from multiple “GPU slices”, where each GPU slice includes a
“Sys Pipe” (defined below), one GPC, one L2 slice group (an L2 slice group includes 10 L2
cache slices), and access to a portion of frame buffer memory. The A100 GPU supports a total
of 7 GPU slices. Note: In MIG operating mode, the single GPC in each GPU slice has seven
TPCs (14 SMs) enabled, which allows all GPU slices to have the same consistent compute
performance.

7*14 = 98
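Spelled out as a sketch, using only the numbers from the whitepaper excerpt (7 GPU slices, one GPC per slice, 7 TPCs per GPC in MIG mode, and 2 SMs per TPC since 7 TPCs give 14 SMs). The constant and function names are mine.

```python
GPU_SLICES = 7          # total GPU slices on A100
GPCS_PER_SLICE = 1      # one GPC per GPU slice
TPCS_PER_GPC_MIG = 7    # TPCs enabled per GPC in MIG mode
SMS_PER_TPC = 2         # 14 SMs / 7 TPCs
FULL_SM_COUNT = 108     # SMs available outside MIG mode, per this thread


def mig_sm_count():
    """Total SMs exposed in MIG mode across all GPU slices."""
    return GPU_SLICES * GPCS_PER_SLICE * TPCS_PER_GPC_MIG * SMS_PER_TPC


sms = mig_sm_count()
print(sms, FULL_SM_COUNT - sms)  # SMs in MIG mode, SMs not exposed in MIG mode
```

So 10 of the 108 SMs are left out in MIG mode, which per the excerpt keeps every GPU slice at the same compute performance.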
