We are seeing an ~85% CUDA performance drop - roughly 1/6th of the original throughput (!!!) - when using the latest available drivers on our Linux (Debian Stretch) K20, K40 and K80 setups, namely:
- driver 410.78 (latest NVIDIA, general-purpose GPUs) with CUDA 9.2.148
- driver 410.72 (latest NVIDIA, Tesla-specific) with CUDA 9.2.148
- driver 384.130 (stock Debian Stretch) with CUDA 8.0.44
We have been able to restore the expected performance using:
- driver 375.26 (an old forward-ported custom Debian Jessie build) with CUDA 8.0.44
This affects all our Kepler hardware (corresponding to a hundred-odd thousand dollars of investment) and prevents us from moving our infrastructure to CUDA 9.x (which is incompatible with the 375.xx drivers).
Is anyone aware of this issue?
Is there any known workaround, e.g. parameters passed to the driver (via modprobe) or set via nvidia-smi?
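To be concrete, the kind of knobs we have in mind are things like the following (illustrative examples only; we have not confirmed any of these are relevant to the regression):

```shell
nvidia-smi -q -d CLOCK,PERFORMANCE   # inspect current/application clocks and throttle reasons
nvidia-smi -pm 1                     # enable persistence mode
nvidia-smi -ac 2505,875              # set application clocks (example values for a K80)
# ...or kernel module parameters, e.g. in /etc/modprobe.d/nvidia.conf:
#   options nvidia NVreg_<some-parameter>=<value>
```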
Thanks in advance for your support,
PS: I can post our crude PyCUDA benchmarking script if need be (although it does nothing other than time the CPU-to-GPU memory transfer of two 1024 MB vectors and their subsequent element-wise addition, subtraction, multiplication and division; as simple as it gets)
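In the meantime, the script is essentially along these lines (a minimal sketch, not the exact script; variable names and vector count are illustrative, and it obviously needs a CUDA-capable GPU plus PyCUDA to run):

```python
# Sketch of the benchmark: time host-to-device transfer of two 1024 MB
# float32 vectors, then time the four element-wise arithmetic operations.
import time
import numpy as np
import pycuda.autoinit            # initializes a context on the first CUDA device
import pycuda.gpuarray as gpuarray

N = 256 * 1024 * 1024             # 256M float32 elements = 1024 MB per vector

a_host = np.random.rand(N).astype(np.float32)
b_host = np.random.rand(N).astype(np.float32)

t0 = time.time()
a_gpu = gpuarray.to_gpu(a_host)   # CPU-to-GPU transfer of vector A
b_gpu = gpuarray.to_gpu(b_host)   # CPU-to-GPU transfer of vector B
pycuda.autoinit.context.synchronize()
print("transfer:   %.3f s" % (time.time() - t0))

t0 = time.time()
for result in (a_gpu + b_gpu, a_gpu - b_gpu, a_gpu * b_gpu, a_gpu / b_gpu):
    pass                          # element-wise ops on the GPU, results discarded
pycuda.autoinit.context.synchronize()
print("arithmetic: %.3f s" % (time.time() - t0))
```

On the fast (375.26) and slow (410.xx) setups, the same script is what produces the ~6x gap described above.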