Hello,
I saw a similar topic just below mine about lower performance on Kepler hardware, but decided to open a new thread.
In my case it’s an RTX 2080 Ti running Ubuntu 18.04; the same problem occurs on Ubuntu 16.04.
I get about 35-40% lower performance on Linux than on Windows when running a chess neural network engine with the cuDNN backend at fp16 precision, using driver version 410.72. With fp32 precision the performance is as expected (and the same as on Windows).
With fp16 I’m also observing a lower GPU utilization of 40-50% (while it should be 95-100%).
The Nvidia bug report is attached; it was generated while the GPU was under load.
cuDNN version: v7.4.1 (Nov 8, 2018)
CUDA version: 10.0
Same problem with the previous cuDNN version, 7.4.0.
It looks to me as if the GPU just isn’t being used at full power in this mode.
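For anyone who wants to check for the same symptom, here is a rough sketch of how utilization, SM clock and power draw could be logged while the engine is running. It only wraps `nvidia-smi` and is not what produced the numbers above. The helper names (`parse_sample`, `read_gpu_sample`), the GPU index 0 and the one-second interval are my own assumptions for illustration.

```python
import subprocess
import time

# Properties understood by `nvidia-smi --query-gpu`; the selection here is
# just an assumption about what is useful for spotting an under-used GPU.
QUERY_FIELDS = "utilization.gpu,clocks.sm,power.draw"


def parse_sample(line):
    """Parse one CSV line such as '45, 1350, 120.50' into a dict.

    Assumes all three values are numeric; some boards report '[N/A]'
    for power draw, which this simple sketch does not handle.
    """
    util, sm_clock, power = [field.strip() for field in line.split(",")]
    return {
        "util_percent": int(util),
        "sm_clock_mhz": int(sm_clock),
        "power_watts": float(power),
    }


def read_gpu_sample(gpu_index=0):
    """Query nvidia-smi once for the given GPU (hypothetical helper)."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "-i", str(gpu_index),
            "--query-gpu=" + QUERY_FIELDS,
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_sample(out.strip().splitlines()[0])


if __name__ == "__main__":
    # Take ten samples, one per second, while the fp16 workload is running.
    for _ in range(10):
        print(read_gpu_sample())
        time.sleep(1.0)
```

Running this once with fp16 and once with fp32 should make it easy to see whether utilization stays in the 40-50% range in one mode only.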
Edit:
Solved by correcting the compile options. Performance on Linux is OK now.