I am currently running a larger simulation and have discovered that my results differ between CUDA and CPU. Several colleagues with a 3090 / 4090 run exactly the same code, yet they do not see these deviations.
So the general “precision on GPU…” explanation does not really hold here. Is it a Titan Xp-specific issue? It is really hard to find anything on the topic.
My setup:
Titan Xp with NVIDIA driver 550
Python 3.8.19
Conda as environment manager
PyTorch 2.3.1 with CUDA 12.1
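
A minimal sketch of a check that might help narrow this down, assuming the deviation can be reproduced outside the full simulation. `run_on` is a hypothetical stand-in (a matrix multiply followed by a row sum), not the actual simulation code, and the matrix size and seed are arbitrary. It builds identical inputs on the CPU, runs the same operation on both devices in float32 and float64, and prints the largest absolute difference together with the device details that would be worth comparing against the 3090 / 4090 machines.

```python
import torch


def run_on(device, dtype, seed=0, n=256):
    """Hypothetical stand-in for one simulation step: matmul plus row sum."""
    # Generate the inputs on the CPU with a fixed seed so that both
    # devices receive bit-identical data.
    gen = torch.Generator().manual_seed(seed)
    a = torch.randn(n, n, generator=gen, dtype=torch.float64)
    b = torch.randn(n, n, generator=gen, dtype=torch.float64)
    a = a.to(device=device, dtype=dtype)
    b = b.to(device=device, dtype=dtype)
    # Bring the result back to CPU float64 so results are comparable.
    return (a @ b).sum(dim=1).cpu().double()


def max_abs_diff(x, y):
    """Largest element-wise absolute difference between two tensors."""
    return (x - y).abs().max().item()


if __name__ == "__main__":
    print("torch:", torch.__version__, "| CUDA build:", torch.version.cuda)
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))
        print("compute capability:", torch.cuda.get_device_capability(0))
        print("matmul TF32 allowed:", torch.backends.cuda.matmul.allow_tf32)
        print("cuDNN TF32 allowed:", torch.backends.cudnn.allow_tf32)
        for dtype in (torch.float32, torch.float64):
            diff = max_abs_diff(run_on("cpu", dtype), run_on("cuda", dtype))
            print(dtype, "max |CPU - CUDA| =", diff)
    else:
        print("CUDA is not available; nothing to compare.")
```

If the float64 difference is tiny while the float32 difference is large, that would point toward accumulation order or precision rather than a device fault. Running the same script on a colleague's machine would show whether the Titan Xp really behaves differently on identical inputs.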