Nvidia A2: FP64 performance is lower than specified in specs

matthias.wurm · October 27, 2023, 11:07am

We have an A100, V100, A2, and GTX 1050 Ti.
We have tested the V100, A2, and GTX 1050 Ti in different (Windows) systems regarding their FP64 performance.
The V100 and the GTX 1050 Ti behave as expected. But the A2's FP64 performance is lower than that of the GTX 1050 Ti.

According to Wikipedia's "List of Nvidia graphics processing units", the A2 should offer 140 GFLOPS (FP64) and the GTX 1050 Ti ~66 GFLOPS (FP64).
So, it should be at least 2 times faster.
But with CUDA-Z and MATLAB, the A2 only achieves ~75% of the GTX 1050 Ti's FP64 performance.
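
To put numbers on the gap, here is a small sketch of the arithmetic using only the figures quoted above. The function names are made up for illustration, and the last step assumes the GTX 1050 Ti really reaches its listed ~66 GFLOPS, which the post implies ("behave as expected") but does not state as a measured value.

```python
# Figures quoted in the post (FP64, GFLOPS).
A2_SPEC_GFLOPS = 140.0
GTX_1050_TI_SPEC_GFLOPS = 66.0
# Measured A2 throughput relative to the GTX 1050 Ti (~75%).
OBSERVED_A2_VS_1050TI = 0.75


def expected_speedup(spec_a: float, spec_b: float) -> float:
    """Ratio of card A to card B if both hit their listed numbers."""
    return spec_a / spec_b


def implied_fraction_of_spec(observed_ratio: float, spec_a: float, spec_b: float) -> float:
    """Fraction of card A's listed number actually reached.

    Assumption: card B runs at its listed number.
    """
    return observed_ratio * spec_b / spec_a


expected = expected_speedup(A2_SPEC_GFLOPS, GTX_1050_TI_SPEC_GFLOPS)
fraction = implied_fraction_of_spec(
    OBSERVED_A2_VS_1050TI, A2_SPEC_GFLOPS, GTX_1050_TI_SPEC_GFLOPS
)

print(f"expected A2 / 1050 Ti ratio: {expected:.2f}")
print(f"observed A2 / 1050 Ti ratio: {OBSERVED_A2_VS_1050TI:.2f}")
print(f"implied fraction of A2 listed FP64: {fraction:.0%}")
```

Under that assumption the expected ratio is about 2.12, and the A2 would be reaching roughly 35% of its listed FP64 figure.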

What’s going wrong here?

cbuchner1 · October 27, 2023, 11:31am

"Double-Precision Tensor Cores are among a battery of new capabilities in the NVIDIA Ampere architecture, driving HPC performance as well as AI training and inference to new heights."

Is the A2 spec sheet maybe listing tensor core performance, while the 1050Ti is listing performance of CUDA cores? If so, you’d need a specialized benchmark that makes use of the Tensor cores.

matthias.wurm · October 27, 2023, 12:15pm

Thanks for the reply.
But the same benchmark on our A100 (also Ampere architecture) shows better results than expected, not worse.

Robert_Crovella · October 27, 2023, 2:23pm

A2 is limited by power. It’s expected that you won’t be able to achieve peak performance.

Here are related threads: 1 2 3

While the A2 is running your test, you may wish to run nvidia-smi -a and check the “clocks throttle reasons” section to see whether power capping is occurring. If it is, that is one possible explanation for falling short of the performance you expect.
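
If you would rather poll this from a script while the benchmark runs, something like the following sketch may help. It assumes the `--query-gpu` field names shown here (`clocks_throttle_reasons.sw_power_cap` and friends) are available on your driver; check `nvidia-smi --help-query-gpu`, since field names can vary between driver versions. The helper names `query_gpus` and `parse_rows` are hypothetical, and the sample line in the comment is illustrative rather than real output from an A2.

```python
import csv
import io
import subprocess

# Field names as listed by `nvidia-smi --help-query-gpu` (assumed available).
FIELDS = [
    "name",
    "clocks.sm",
    "power.draw",
    "power.limit",
    "clocks_throttle_reasons.sw_power_cap",
]


def query_gpus() -> str:
    """Run nvidia-smi once and return its raw CSV output."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_rows(text: str) -> list[dict]:
    """Turn CSV output into one dict per GPU, adding a boolean 'power_capped'."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        row = dict(zip(FIELDS, (value.strip() for value in record)))
        # nvidia-smi reports each throttle reason as "Active" or "Not Active".
        row["power_capped"] = row["clocks_throttle_reasons.sw_power_cap"] == "Active"
        rows.append(row)
    return rows


# Usage while the benchmark is running (requires nvidia-smi on PATH):
#   for gpu in parse_rows(query_gpus()):
#       print(gpu["name"], gpu["clocks.sm"], gpu["power.draw"], gpu["power_capped"])
#
# Illustrative input line (not real measurements):
#   "NVIDIA A2, 1200, 59.80, 60.00, Active"
```

Calling this in a loop during the FP64 test shows whether the SM clock drops at the same time the power cap is reported as active.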
