3080 fp16 performance poor?

Looking at this ResNet benchmark, the 3080 doesn’t seem to be much of an improvement for fp16: https://www.pugetsystems.com/labs/hpc/RTX3090-TensorFlow-NAMD-and-HPCG-Performance-on-Linux-Preliminary-1902/

Is this just down to a lack of optimisation (e.g. will TensorRT support change this picture), or is the 3080, as some have suggested, just not that great for fp16?

Hi @joemtgyu,
There should be some improvement in performance (we can’t say how much), and in stability in general, with the CUDA 11.1 build of TRT once it’s available. The primary change for 3080 users is SM 8.6 support.


Thanks. Shame that fp16 didn’t see the speed up that fp32 did.

I stumbled onto this thread and wanted to say that fp16 performance on the RTX3080 is actually very good: it’s more than double that of fp32. My more recent post is better than that early post on the 3080.

I will refresh all of the ML testing early in the new year, and things should be better overall because of the relentless updates from NVIDIA (thank you very much!)

I just ran a ResNet50 benchmark on the RTX3070 to see if there has been any improvement since the above post, but it is still very poor for some reason. Hopefully that will improve after a few more updates …

Interesting. Looking at this post (Reddit 3070 post), they managed to get fp16 working very well on the 3070 after initially bad results. Fingers crossed, as I’ve got a 3070 FE being delivered next week.