I am trying to put together a brief summary of core size and computation speed for cores such as FP32, INT32, INT16, INT8 and INT4, but I can’t find this kind of information. When I search, I mostly find overviews of the GPU product architecture as a whole, with very little hardware detail. If you know where I can find this information, please help me. Thanks a lot.
Are you using a TK1 device?
The TK1 should only support general floating-point instructions.
Instructions such as HMMA (Half-Precision Matrix Multiply and Accumulate) and IMMA (Integer Matrix Multiply and Accumulate) are only available on devices with Tensor Core hardware.
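To make the point about Tensor Core instructions concrete, here is a minimal sketch that checks which matrix instruction families a device could use, based on its compute capability. The minimum compute capabilities are the ones the CUDA C++ Programming Guide lists for WMMA operand types; the table names, the operation labels, and the helper `supported_tensor_ops` are made up for this illustration and are not part of any NVIDIA API.

```python
# Minimum compute capability for each Tensor Core operand type
# (per the WMMA section of the CUDA C++ Programming Guide).
# The labels are illustrative, not official instruction names.
MIN_CC_FOR_TENSOR_OP = {
    "HMMA (FP16)": (7, 0),
    "IMMA (INT8)": (7, 2),
    "IMMA (INT4, experimental)": (7, 5),
}

# Compute capability of a few example devices.
DEVICE_CC = {
    "Jetson TK1": (3, 2),     # Kepler, no Tensor Cores
    "Tesla V100": (7, 0),     # Volta
    "Jetson Xavier": (7, 2),  # Volta-based integrated GPU
}


def supported_tensor_ops(device):
    """Return the Tensor Core op families available on `device` (hypothetical helper)."""
    cc = DEVICE_CC[device]
    # Tuples compare element-wise, so (7, 2) >= (7, 0) is True.
    return [op for op, min_cc in MIN_CC_FOR_TENSOR_OP.items() if cc >= min_cc]


if __name__ == "__main__":
    for name in DEVICE_CC:
        print(name, "->", supported_tensor_ops(name) or "no Tensor Core instructions")
```

On a real system, the compute capability would come from a device query rather than a hard-coded table.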
Thank you for your reply, but I still don’t quite understand. I looked at the structure of the Volta GV100 Streaming Multiprocessor and found that it contains three kinds of cores: FP32, FP64 and INT32. I wonder whether there are GPU architectures that use dedicated INT8 and INT6 hardware cores. If such cores are documented, I would like to know their size and computing speed. Or is INT8 currently just a precision, rather than a dedicated INT8 core?
Would you mind sharing the Volta GV100 Streaming Multiprocessor document with us first?
We just want to make sure we answer your question correctly.
Volta is a GPU architecture with Tensor Cores,
so you can measure the performance when running at different precisions.
Here is the document I saw before. On page 13 there is a picture of the GPU architecture. My original problem is that the figure shows INT, FP64 and FP32 cores, but there is no description of the hardware size of these cores. In addition, is there any other core with INT6 or INT8 precision (maybe in other products)? To put it another way: I am not interested in a general-purpose core that can calculate at several precisions (int8, int16, etc.). I want to know whether there are specialized cores, some optimized in hardware for INT8 operations and others optimized for INT6 operations. Thank you very much for your reply.
We just want to confirm first:
are you trying to find a similar document for the TK1?
The features supported on the TK1 are quite limited.
I want to know about more than just the TK1. I’m interested in any NVIDIA product, as long as it contains INT4, INT8, INT16, INT32, FP32 or other computing cores. I want to know the size and speed of these cores.
We don’t have such information for the Jetson platform,
but you can find some of it in the desktop GPU whitepapers.
For example, Jetson Xavier uses a Volta GPU,
so you can refer to the Volta GPU whitepaper you shared above.
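As a rough illustration of what the whitepaper does provide: it gives unit counts per SM and clock rates rather than die area per core, and peak throughput follows from those. The sketch below assumes the Tesla V100 figures (80 SMs, 1530 MHz boost clock, and 64 FP32, 32 FP64, 64 INT32 cores plus 8 Tensor Cores per SM). Counting one operation per INT32 core per clock is an assumption made here, since the whitepaper does not quote an integer peak, and the function name `peak_tera_ops` is hypothetical.

```python
# Tesla V100 (GV100) figures assumed from the Volta whitepaper.
SM_COUNT = 80
BOOST_CLOCK_HZ = 1.53e9

# Execution units per SM.
UNITS_PER_SM = {"FP32": 64, "FP64": 32, "INT32": 64, "Tensor Core": 8}

# Operations per unit per clock.
# A fused multiply-add (FMA) counts as 2 floating-point operations.
# Each Volta Tensor Core performs 64 FMAs per clock, i.e. 128 operations.
# INT32: ASSUMPTION of 1 operation per core per clock (no official peak quoted).
OPS_PER_UNIT_PER_CLOCK = {"FP32": 2, "FP64": 2, "INT32": 1, "Tensor Core": 128}


def peak_tera_ops(kind):
    """Theoretical peak throughput in tera-operations per second (hypothetical helper)."""
    ops_per_clock = SM_COUNT * UNITS_PER_SM[kind] * OPS_PER_UNIT_PER_CLOCK[kind]
    return ops_per_clock * BOOST_CLOCK_HZ / 1e12


if __name__ == "__main__":
    for kind in UNITS_PER_SM:
        print(f"{kind:12s} {peak_tera_ops(kind):8.2f} Tops/s (theoretical peak)")
```

The FP32, FP64 and Tensor Core results come out close to the 15.7, 7.8 and 125 TFLOPS peaks quoted for the Tesla V100. These are theoretical ceilings; measured throughput for a given precision depends on the kernel and on memory bandwidth.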