This is the response I got from NVIDIA: they did not know the answer and suggested I bring the question to the forum.
Hello Steven,
Thank you for contacting NVIDIA Customer Care.
This is Subhanshu, assisting you with the query you have.
I understand from your email that you are experiencing performance issues with RTX 3090 Ti on your Linux system.
I apologize for any inconvenience this may have caused. Please be assured that I will do my best to help you further or point you in the right direction.
Please get in touch with us, if you need further assistance and I would be happy to help you.
Best Regards,
Subhanshu,
NVIDIA Customer Care
The two questions I had were: First, I cannot get the CUDA toolkit to work properly on this card I just bought; the benchmarks run much slower than on my 3090. Does the CUDA toolkit not support the 3090 Ti yet, or should I expect it to work?
Second, nvidia-smi always reports a large power draw, even when the card is idle. This has been consistent over many days. The card stays cool, so clearly that power is not actually being drawn. All my 3090 cards show almost no power draw when idle. Is this normal? Could it be impacting the performance?
Thu Jun  9 13:49:30 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:03:00.0 Off |                  Off |
| 30%   31C    P8   396W / 450W |     58MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       981      G   /usr/lib/xorg/Xorg                 46MiB |
|    0   N/A  N/A      1112      G   /usr/bin/gnome-shell               10MiB |
+-----------------------------------------------------------------------------+
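In case it helps anyone reproduce this, the fields relevant to the idle-power question can also be pulled directly instead of reading the full table. A minimal sketch (guarded so it exits cleanly on a machine without nvidia-smi installed):

```shell
# Query name, P-state, power draw/limit, and SM clock in CSV form.
if command -v nvidia-smi >/dev/null 2>&1; then
    SMI_OUT=$(nvidia-smi --query-gpu=name,pstate,power.draw,power.limit,clocks.sm --format=csv)
else
    SMI_OUT="nvidia-smi not found"
fi
echo "$SMI_OUT"
```

On the 3090 Ti this consistently shows P8 together with the near-cap power reading above.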
To give a more detailed example of the issue with the 3090 Ti under CUDA:
If I run cuda-samples-11.1/bin/x86_64/linux/release/immaTensorCoreGemm
With a 3090 I get
Initializing...
GPU Device 0: "Ampere" with compute capability 8.6
M: 4096 (16 x 256)
N: 4096 (16 x 256)
K: 4096 (16 x 256)
Preparing data for GPU…
Required shared memory size: 64 Kb
Computing… using high performance kernel compute_gemm_imma
Time: 1.063936 ms
TOPS: 129.18
when I run with the 3090ti I get
Initializing...
GPU Device 0: "Ampere" with compute capability 8.6
M: 4096 (16 x 256)
N: 4096 (16 x 256)
K: 4096 (16 x 256)
Preparing data for GPU…
Required shared memory size: 64 Kb
Computing… using high performance kernel compute_gemm_imma
Time: 2.178048 ms
TOPS: 63.10
So the plain 3090 is roughly twice as fast here, which makes me want to understand why the 3090 Ti is clearly not performing as it should. (I see similarly poor comparative performance with my preferred application, GROMACS.)
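For what it's worth, the reported TOPS figures are consistent with the kernel times, so the timing itself looks trustworthy. A quick sketch of the arithmetic, assuming the sample counts each multiply-accumulate as two operations (the usual GEMM throughput convention, and the numbers match to two decimals):

```python
# Sanity-check the TOPS numbers printed by immaTensorCoreGemm.
# TOPS = 2 * M * N * K ops / elapsed time (multiply + add per element).
M = N = K = 4096

def tops(time_ms: float) -> float:
    ops = 2 * M * N * K
    return ops / (time_ms * 1e-3) / 1e12

print(f"3090:    {tops(1.063936):.2f} TOPS")  # ~129.18, as reported
print(f"3090 Ti: {tops(2.178048):.2f} TOPS")  # ~63.10, as reported
```

So the slowdown is genuinely a 2x longer kernel time, not a reporting artifact.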
Looks broken somehow. Since the power readings are clearly wrong, I suspect the card is being throttled to a crawl because of them. Does the reading actually change when you put some load on it?
Unfortunately, the 3090 Ti isn't mentioned in the driver changelog at all, so it's hard to say when support for it was added. Have you already tried a 510 driver and checked the power readings? Does this also happen under Windows?
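One quick thing worth checking while it runs: whether the driver reports an active throttle reason. The PERFORMANCE section of the query output lists "Clocks Throttle Reasons" such as SW Power Cap or HW Slowdown. A sketch (guarded so it still runs without nvidia-smi):

```shell
# Dump the PERFORMANCE section, which includes the throttle-reason flags.
if command -v nvidia-smi >/dev/null 2>&1; then
    PERF_OUT=$(nvidia-smi -q -d PERFORMANCE)
else
    PERF_OUT="nvidia-smi not found"
fi
echo "$PERF_OUT"
```

If "SW Power Cap" shows Active under load, the bogus 396 W idle reading is almost certainly what's holding the clocks down.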