integer arithmetic capabilities of Tesla GPUs & definition of terms

wkailey · December 6, 2017, 3:16pm

Many NVIDIA GPU datasheets, for example the P40 & P100, use the term “INT8” leaving developers to guess what exactly this means. Well, yes, I can guess that it means support for 8-bit arithmetic, but could you please provide YOUR definition so I am not reduced to guessing?

My other question in this area is: what about 16-bit arithmetic operations? The application I am working on requires 16-bit integer operations, and I see terms like “FP32” and “FP16” and even the aforementioned “INT8” on several of the data sheets, but nowhere do I see anything like “INT16”. Is 16-bit integer arithmetic supported by these machines or not?

Thanks!

njuffa · December 6, 2017, 3:50pm

Yes, you can operate on 16-bit data with a GPU. Use the appropriate type in your CUDA and C++ code, e.g. ‘short’, ‘unsigned short’, ‘int16_t’, ‘uint16_t’. CUDA also offers predefined packed types like ‘ushort2’ but not operations on those types, i.e. these are mostly useful for optimizing memory accesses.

Note that for languages in the C++ family (this includes CUDA) it is specified that during the evaluation of expressions integer data with types narrower than ‘int’ is converted to ‘int’ first before entering into the computation. On all platforms supported by CUDA, ‘int’ is a 32-bit type. So the use of 16-bit types mostly has the benefit of reducing storage requirements in memory, but as a trade-off can require additional conversions.
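The promotion rule described above can be checked at compile time. The following is a minimal host-side C++ sketch, assuming a platform where ‘int’ is 32 bits wide (as on the platforms CUDA supports); the helper names add_wide and add_narrow are hypothetical and only serve as illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Operands narrower than int are promoted to int before the addition,
// so the expression has type int, not int16_t.
static_assert(std::is_same<decltype(std::int16_t{} + std::int16_t{}), int>::value,
              "int16_t + int16_t is evaluated as int");

// Hypothetical helper: the sum is computed in int, so two large 16-bit
// inputs do not overflow as long as the result is kept in a wider type.
std::int32_t add_wide(std::int16_t a, std::int16_t b) {
    return a + b;
}

// Hypothetical helper: storing the result back into 16 bits needs a
// narrowing conversion, which is the extra step mentioned in the post.
std::int16_t add_narrow(std::int16_t a, std::int16_t b) {
    return static_cast<std::int16_t>(a + b);
}
```

With these definitions, add_wide(30000, 30000) yields 60000 because the arithmetic happens in ‘int’; add_narrow only preserves the value when the sum fits in 16 bits.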

wkailey · December 6, 2017, 4:12pm

Thanks, njuffa. I am familiar with the C automatic data conversions, yes. But, come to think of it, the data sheet makes no mention of any “INT” or “INT32” at all, so it kind of leaves the impression that only 8-bit ints are supported (as wildly unlikely as that would seem).

I have in fact already developed the first prototype of my application in CUDA C and C++, and it is right now running on an M60 GPU. I am looking to productize it and am thinking of using a P40 GPU for that, but I was a little put off by seeing “FP32”, “FP16”, and “INT8” on the data sheet with no mention of any “INT16”. I am also a little annoyed that such acronyms are thrown around while nowhere on the NVIDIA CUDA website is any definition of them provided. These may be standard acronyms in somebody’s world, but they are not standard in my world. uint16, int16, etc. are fairly standard in my experience, but the precise meaning of “INT8” is anyone’s guess to an old C and C++ guy like me.

cbuchner1 · December 6, 2017, 4:28pm

INT8, INT16 and FP16 are usually mentioned specifically in datasheets when there are native hardware arithmetic instructions that can deal with these data types, such as the DP2A and DP4A instructions on Pascal, exposed in CUDA as the __dp2a_lo(), __dp2a_hi() and __dp4a() intrinsics, which can accelerate AI inference.
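To make concrete what such an INT8 instruction computes, here is a host-side C++ sketch that emulates the arithmetic of the signed variant of __dp4a: a four-way dot product of packed 8-bit values accumulated into a 32-bit integer. This is a reference emulation, not the intrinsic itself; the names dp4a_reference and pack4 are hypothetical, and the sketch assumes the accumulator does not overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical host-side reference, not the CUDA intrinsic.
// Treats a and b as four packed signed 8-bit lanes, multiplies them lane
// by lane, and adds the four products to the 32-bit accumulator c.
std::int32_t dp4a_reference(std::int32_t a, std::int32_t b, std::int32_t c) {
    std::int8_t va[4];
    std::int8_t vb[4];
    std::memcpy(va, &a, sizeof va);  // reinterpret the 32-bit word as 4 bytes
    std::memcpy(vb, &b, sizeof vb);
    std::int32_t sum = c;
    for (int i = 0; i < 4; ++i) {
        sum += static_cast<std::int32_t>(va[i]) * static_cast<std::int32_t>(vb[i]);
    }
    return sum;
}

// Hypothetical helper: packs four signed 8-bit values into one 32-bit word.
std::int32_t pack4(std::int8_t x0, std::int8_t x1, std::int8_t x2, std::int8_t x3) {
    const std::int8_t lanes[4] = {x0, x1, x2, x3};
    std::int32_t word;
    std::memcpy(&word, lanes, sizeof word);
    return word;
}
```

For example, dp4a_reference(pack4(1, 2, 3, 4), pack4(5, 6, 7, 8), 10) computes 1*5 + 2*6 + 3*7 + 4*8 + 10. On the GPU the same work is done by a single instruction per 32-bit word, which is why the data sheet calls out INT8 throughput separately.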

njuffa · December 6, 2017, 4:31pm

The data sheet (not sure what you are looking at) presumably talks about hardware capabilities. That is what data sheets tend to do. Does the data sheet for a 64-bit CPU call out the fact that it can also process 16-bit data? Probably not. Because some of the most recent GPUs have special instructions for operating on INT8 (8-bit integer) data, it is called out in the data sheet. That does not mean GPUs were incapable of operating on 8-bit integer data previously.

GPUs are essentially 32-bit architectures, with some extensions to allow 64-bit addressing, for example. But C++ (and thus CUDA, which is currently based on C++11) abstracts from the machine, and you can use all the usual integer types.

Become a globetrotter :-) FP32 is one way of referring to single-precision floating-point (typically implied: of the IEEE-754 kind). In the lingo of IEEE-754 – the relevant standard – one would call that ‘binary32’, C/C++ folks would usually call that ‘float’ (although the language standard specifies no such equivalency!), and in older Fortran code it may be referred to as ‘REAL*4’. FP16 is half-precision floating point, defined as a ‘binary16’ storage format in the IEEE-754 (2008) standard. FP16 support is only slowly making its way into various programming languages.
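The terminology above can be tied to bits with a short host-side C++ sketch. It assumes a platform where ‘float’ is IEEE-754 binary32 (the static_assert lines check this, since the language standard does not guarantee it) and decodes a binary16 bit pattern by hand: 1 sign bit, 5 exponent bits with a bias of 15, and 10 stored significand bits. The function name half_bits_to_float is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// FP32: on IEEE-754 platforms, float is binary32 with 24 bits of
// significand precision (23 stored bits plus one implicit bit).
static_assert(std::numeric_limits<float>::is_iec559, "float is assumed to be IEEE-754");
static_assert(sizeof(float) == 4 && std::numeric_limits<float>::digits == 24,
              "float is assumed to be binary32");

// Hypothetical helper: decodes an FP16 (binary16) bit pattern into a float.
float half_bits_to_float(std::uint16_t h) {
    const int sign = (h >> 15) & 0x1;
    const int exponent = (h >> 10) & 0x1F;
    const int mantissa = h & 0x3FF;
    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: no implicit leading 1, scale is 2^-24.
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // All-ones exponent encodes infinity (mantissa 0) or NaN.
        magnitude = (mantissa == 0) ? std::numeric_limits<float>::infinity()
                                    : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal number: (1024 + mantissa) * 2^(exponent - 15 - 10).
        magnitude = std::ldexp(static_cast<float>(mantissa + 1024), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

The largest finite binary16 bit pattern, 0x7BFF, decodes to 65504, which illustrates how much narrower the FP16 range is than that of FP32.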

wkailey · December 6, 2017, 4:43pm

Totally cool. Thanks, guys. You’ve answered all my questions.
