Titan V FP16 Performance

Can someone from NVIDIA provide a solid spec for the Titan V’s FP16 performance?

I’ve seen 15 TFLOPS FP32 and 110 TFLOPS using the Tensor Cores, but no spec in the marketing materials for FP16.

FP16 throughput not using the Tensor Cores should be double the FP32 rate on this V100-based product. This is a characteristic of the V100 device, and it is shared by all other GPUs with full-rate FP16 throughput (i.e. sm_53, sm_60, sm_62, sm_70). This general principle is documented here:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions
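To actually see the 2x rate in practice, the key point is that full-rate FP16 is exposed through packed half2 arithmetic, where each instruction operates on two FP16 values at once. A minimal sketch (assuming CUDA 9+ and an sm_60-or-newer target; the kernel and its names are illustrative, not from the thread):

```cuda
#include <cuda_fp16.h>

// Illustrative sketch: an FP16 AXPY using packed half2 math.
// Each __hfma2 performs two FP16 fused multiply-adds in one
// instruction, which is where the 2x-over-FP32 throughput comes from.
__global__ void fp16_axpy(const __half2 *x, __half2 *y, __half2 a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // y[i] = a * x[i] + y[i], applied to both FP16 lanes at once
        y[i] = __hfma2(a, x[i], y[i]);
    }
}
```

Scalar __half operations also work on these parts, but only the half2 path reaches the full advertised rate, so throughput-sensitive code should keep FP16 data packed.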

Thanks for the response.

I suspected as much, but wanted confirmation for the Titan V, since previous GeForce/Titan incarnations did not have full-rate FP16 support like their datacenter counterparts (e.g. Titan Xp vs. P100).

Can you confirm that the Titan V has native FP16?

It’s confirmed here:
http://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#volta

There’s a DP4A hardware instruction on the V100 chip.

DP4A has nothing to do with FP16 computation; you are thinking of INT8.
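For reference, a sketch of what DP4A actually does (an INT8 dot product, available on sm_61 and newer; the kernel here is illustrative, not from the thread):

```cuda
// Illustrative sketch of the INT8 path that DP4A serves.
// __dp4a treats each 32-bit operand as four packed 8-bit integers,
// computes the 4-element dot product, and adds a 32-bit accumulator,
// so it never touches FP16 data at all.
__global__ void int8_dot(const int *a, const int *b, int *acc, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // acc[i] += a[i].b0*b[i].b0 + ... + a[i].b3*b[i].b3
        acc[i] = __dp4a(a[i], b[i], acc[i]);
    }
}
```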

The Titan V is a compute capability 7.0 device.


The FP16 throughput (not using TensorCore) for compute capability 7.0 is given in the table I already linked:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions

Oh, you’re right. Sorry about the misinformation; I’ve been focusing too much on integer work recently.