What is the TFLOPS for CUDA/Tensor Cores with FP16 on V100?

I couldn’t find the TFLOPS value for CUDA Cores with FP16 precision on NVIDIA’s official website… I can only find figures for H100 and A100. Could anyone kindly provide it?

For example here:
https://www.nvidia.com/en-us/data-center/h100/

It does not show the FP16 (half precision) throughput for CUDA Cores.

Non-tensor FP16 throughput should be double the FP32 throughput when operating on the half2 type, for add, multiply, and multiply-add, on cc7.0. This is based on the published per-SM throughput.
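As a rough sketch of that estimate, the snippet below multiplies per-SM throughput by SM count and clock. The per-SM figures (64 FP32 and 128 FP16 results per clock on cc7.0) are the values from the CUDA C++ Programming Guide throughput table; the 80 SMs and the 1455 MHz clock are assumptions for illustration, and `peak_tflops` is a hypothetical helper name.

```python
def peak_tflops(sms, results_per_clk_per_sm, clock_mhz, flops_per_result=2):
    """Theoretical peak in TFLOPS; a multiply-add counts as 2 FLOPs."""
    flops_per_clk = sms * results_per_clk_per_sm * flops_per_result
    return flops_per_clk * clock_mhz * 1e6 / 1e12

SMS = 80            # assumed: SMs enabled on V100
FP32_PER_SM = 64    # FP32 multiply-add results per clock per SM (cc7.0)
FP16_PER_SM = 128   # FP16 multiply-add results per clock per SM with half2 (cc7.0)
CLOCK_MHZ = 1455    # assumed boost clock; substitute your sub-model's clock

fp32 = peak_tflops(SMS, FP32_PER_SM, CLOCK_MHZ)
fp16 = peak_tflops(SMS, FP16_PER_SM, CLOCK_MHZ)
print(f"FP32: {fp32:.1f} TFLOPS, FP16 (half2): {fp16:.1f} TFLOPS")
# prints "FP32: 14.9 TFLOPS, FP16 (half2): 29.8 TFLOPS"
```

These are theoretical peaks; the FP16 figure is only reachable when the code actually operates on packed half2 values.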

Volta can do 64 FMAs (16-bit) per Tensor Core per cycle.
1 FMA counts as 2 FLOPs (multiply + add).
Volta has 8 Tensor Cores per SM (later generations, beginning with Ampere, have 4 Tensor Cores per SM).
V100 has 80 SMs enabled (the full GV100 die has 84), and
depending on the sub-model, a boost clock between 1290 and 1455 MHz (alternatively use the base clock, depending on what you want to estimate).
Just multiply it together.
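The multiplication can be sketched as follows. The function name `tensor_peak_tflops` is hypothetical, the SM count of 80 is an assumption, and the clocks are simply the two ends of the range quoted above.

```python
def tensor_peak_tflops(sms, clock_mhz, tensor_cores_per_sm=8, fmas_per_core_per_clk=64):
    """Theoretical FP16 Tensor Core peak in TFLOPS (Volta defaults)."""
    # Each FMA counts as 2 FLOPs (multiply + add).
    flops_per_clk = sms * tensor_cores_per_sm * fmas_per_core_per_clk * 2
    return flops_per_clk * clock_mhz * 1e6 / 1e12

# Assumed: 80 SMs; clocks are the range quoted in the post above.
for clock_mhz in (1290, 1455):
    tflops = tensor_peak_tflops(80, clock_mhz)
    print(f"80 SMs @ {clock_mhz} MHz: {tflops:.1f} TFLOPS")
```

With 80 SMs at 1455 MHz this gives roughly 119 TFLOPS, which is in line with the "about 120 TFLOPS" figure usually quoted for V100.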

The V100 datasheet lists the FP16 Tensor Core performance.

Thanks!!! So… can I put it this way: FP16 throughput is twice that of FP32, so the CUDA Cores' FP16 throughput is about 30 TFLOPS, and the Tensor Cores' is about 120 TFLOPS? (For V100)

Yes, on V100 (compute capability 7.0) 16-bit arithmetic has twice the throughput of 32-bit; see the CUDA C++ Programming Guide (chapter Arithmetic Instructions). Depending on the architecture, the compute units may support a single bit width (e.g. 16, 32, or 64 bits) or several, and integer only, floating-point only, or both.

For Tensor Cores, you can find comparative numbers for different data types (the V100, as the first Tensor Core GPU, is limited to 16-bit inputs) here: CUDA - Wikipedia (not official NVIDIA data; collected by the community). With Tensor Cores, especially on consumer cards, throughput often differs depending on the accumulator width.

“Volta can do 64 FMAs (16-bit) per Tensor core per cycle.” This is very helpful! Do you know the corresponding figure for Hopper?

Oh, never mind. I see it here: CUDA - Wikipedia