Why does the RTX 4090 perform far below the officially claimed level?

Hello,

I’m running a performance test on an RTX 4090 using gpu_burn. The result shows that my RTX 4090 delivers only 54 TFLOPS (FP32), which is far below the 82 TFLOPS claimed on Nvidia’s official website (NVIDIA GeForce RTX 4090 Graphics Cards).
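For reference, the 82 TFLOPS figure can be reproduced with simple arithmetic. The sketch below assumes the published RTX 4090 specs of 16,384 CUDA cores and a 2.52 GHz boost clock (neither number comes from the test itself), and counts one fused multiply-add per core per clock as 2 FLOPs. The function name is only illustrative.

```python
def peak_fp32_tflops(cuda_cores: int, boost_clock_ghz: float,
                     flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak: every core retires one FMA (2 FLOPs) on every clock."""
    gflops = cuda_cores * flops_per_core_per_clock * boost_clock_ghz
    return gflops / 1000.0  # GFLOPS -> TFLOPS


# Assumed spec-sheet values for the RTX 4090 (not measured here).
RTX_4090_CUDA_CORES = 16384
RTX_4090_BOOST_GHZ = 2.52

peak = peak_fp32_tflops(RTX_4090_CUDA_CORES, RTX_4090_BOOST_GHZ)
measured = 54.0  # TFLOPS reported by gpu_burn in this post

print(f"theoretical peak: {peak:.1f} TFLOPS")   # 82.6 TFLOPS
print(f"measured / peak:  {measured / peak:.0%}")  # 65%
```

So the advertised number is an upper bound that assumes every core issues an FMA on every cycle at boost clock. The measured 54 TFLOPS is about 65% of it.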

Mine is not an isolated case either: I asked another party to run the same test on their 4090, and the result was similar, just 50 TFLOPS.

Also, in both cases the 4090 only runs at the P2 performance state (as far as I know, P0 is the highest). I wonder whether the two are related, and why it runs at P2 instead of P0?

It shouldn’t be caused by the rest of the hardware, such as the CPU or motherboard. The details are below:

CPU: Intel Xeon 8368Q

Motherboard: Supermicro X12SPA-TF (PCIe 4.0 x16)

RAM: 128 GB DDR4-3200 RECC

See attachment for the test results:


Looking forward to your reply, thanks!

That’s because the theoretical compute figure (the one shown on TechPowerUp, for example) doesn’t account for memory bottlenecks. In addition, the Nvidia driver forces the GPU into the P2 state whenever a CUDA application is running, so the GPU memory clock is not maxed out: Remove "P2 forced" state from drivers
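One way to check this on your own machine is to query the performance state and clocks with nvidia-smi while gpu_burn is running in another terminal. The snippet below is a minimal sketch: the helper names are made up for illustration, and it relies only on the standard `--query-gpu` fields `pstate`, `clocks.sm`, `clocks.mem`, `clocks.max.sm` and `clocks.max.mem`.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in output order.
QUERY_FIELDS = ["pstate", "clocks.sm", "clocks.mem", "clocks.max.sm", "clocks.max.mem"]


def parse_gpu_line(line: str) -> dict:
    """Turn one CSV line of nvidia-smi output into a field -> value dict."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def query_gpus() -> list:
    """Run nvidia-smi once and return one dict per GPU (clocks in MHz)."""
    out = subprocess.run(
        ["nvidia-smi",
         f"--query-gpu={','.join(QUERY_FIELDS)}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(l) for l in out.splitlines() if l.strip()]


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        for gpu in query_gpus():
            print(gpu)
```

If `pstate` reads P2 under load and `clocks.mem` sits below `clocks.max.mem`, that matches the forced-P2 behaviour described above.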
