Yeah, that’s what I was thinking. I removed the previously installed driver and installed the latest one available for Linux, version 580.105.08. According to the documentation, this driver explicitly supports the RTX A2000 12 GB and the RTX 4000 Ada Generation. I can’t find the GeForce RTX 4080 listed anywhere, but I suppose it should be supported as well. While we do see the reduced memory throughput mentioned earlier when running my program on the RTX 4080, at least the PoC generally works there. On the RTX 4000 Ada Generation, though, we still get “no cuda capable device detected”, even with the latest driver installed. I have no idea why we can’t get it running there, while under Windows it works on this device.
Your remark did inspire me to do some more research. Until now I didn’t realize quite how powerful nvidia-smi is. I found this nvidia-smi query: nvidia-smi --query-gpu=timestamp,name,pci.bus_id,driver_version,pstate,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used --format=csv -l 5
I did a quick test under Windows, using this query while doing several runs of my PoC code on my own machine (RTX A2000 Laptop):
nvidia-smi on RTX A2000 Laptop
nvidia-smi --query-gpu=timestamp,name,pci.bus_id,driver_version,pstate,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used --format=csv -l 2
timestamp, name, pci.bus_id, driver_version, pstate, pcie.link.gen.max, pcie.link.gen.current, temperature.gpu, utilization.gpu [%], utilization.memory [%], memory.total [MiB], memory.free [MiB], memory.used [MiB]
2025/12/01 09:55:22.377, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 55, 0 %, 0 %, 4096 MiB, 3965 MiB, 0 MiB
2025/12/01 09:55:24.392, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 56, 0 %, 0 %, 4096 MiB, 3885 MiB, 81 MiB
2025/12/01 09:55:26.449, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 56, 0 %, 0 %, 4096 MiB, 3885 MiB, 81 MiB
2025/12/01 09:55:28.679, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 57, 0 %, 0 %, 4096 MiB, 3885 MiB, 81 MiB
2025/12/01 09:55:30.852, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 57, 0 %, 0 %, 4096 MiB, 3885 MiB, 81 MiB
2025/12/01 09:55:32.863, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 58, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:35.038, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 58, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:37.198, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 58, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:39.370, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 59, 62 %, 8 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:41.511, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 59, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:43.668, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 59, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:45.848, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 59, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:47.871, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 60, 1 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:49.949, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 59, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:52.126, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 60, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:54.300, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 61, 100 %, 16 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:56.464, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 60, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:58.619, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 60, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:00.793, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 60, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:02.808, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 62, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:04.974, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 61, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:07.153, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 61, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:09.306, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 62, 78 %, 48 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:11.494, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P5, 4, 2, 62, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:13.632, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 61, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:15.798, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 62, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:17.815, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 63, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:19.982, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 62, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:22.018, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 62, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:24.177, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 62, 45 %, 38 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:26.186, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 63, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:28.336, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 62, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:30.508, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 63, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:32.582, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 63, 48 %, 6 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:34.785, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 63, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:36.958, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 63, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:39.125, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 63, 0 %, 0 %, 4096 MiB, 3783 MiB, 183 MiB
2025/12/01 09:56:41.143, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, [Unknown Error], 4, 4, 55, [Unknown Error], [Unknown Error], 4096 MiB, 3965 MiB, 0 MiB
It’s interesting to see that whenever my code does CUDA calculations, the performance state drops from P3 to P8 while at the same time the current PCIe link generation drops from 4 to 1 (the queried field, pcie.link.gen.current, reports the link generation, not the lane width). Although I can’t say for sure right now, I suspect this is what happens on the GeForce RTX 4080, too. There, it appears to occur only under Linux, though. It would pretty much explain why memory throughput falls short of what the system is capable of. While doing some quick research on why this might be happening, I stumbled upon your reply to a question in this forum, referring to this blog entry. I will have a closer look at that tomorrow and ask my colleague to run some more tests on his private device. Maybe we’ll still find a solution to the issue, or at least an explanation.
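To make the relationship between performance state, link generation and GPU load easier to check than by eyeballing the log, the CSV output can be tallied per performance state. The following is only a minimal sketch: the names `SAMPLE`, `parse_percent` and `summarize` are my own, and `SAMPLE` holds just five rows copied from the log above, so the full output would have to be pasted or read from a file to draw any real conclusion.

```python
import csv
from collections import defaultdict

# Five rows copied verbatim from the nvidia-smi log above (illustrative subset only).
SAMPLE = """\
2025/12/01 09:55:32.863, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 58, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:35.038, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 58, 0 %, 0 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:55:39.370, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P3, 4, 4, 59, 62 %, 8 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:24.177, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, P8, 4, 1, 62, 45 %, 38 %, 4096 MiB, 3881 MiB, 85 MiB
2025/12/01 09:56:41.143, NVIDIA RTX A2000 Laptop GPU, 00000000:01:00.0, 573.57, [Unknown Error], 4, 4, 55, [Unknown Error], [Unknown Error], 4096 MiB, 3965 MiB, 0 MiB
"""


def parse_percent(text):
    """Return the integer from a value like '62 %', or None for '[Unknown Error]'."""
    value = text.strip().rstrip("%").strip()
    return int(value) if value.isdigit() else None


def summarize(log_text):
    """Count samples, busy samples and observed PCIe link generations per pstate."""
    stats = defaultdict(lambda: {"samples": 0, "busy": 0, "link_gens": set()})
    # skipinitialspace drops the blank that nvidia-smi puts after each comma.
    for row in csv.reader(log_text.splitlines(), skipinitialspace=True):
        # Skip the header line and anything that is not a full 13-column row.
        if len(row) < 13 or row[0].startswith("timestamp"):
            continue
        entry = stats[row[4]]              # column 4: pstate
        entry["samples"] += 1
        entry["link_gens"].add(row[6])     # column 6: pcie.link.gen.current
        util = parse_percent(row[8])       # column 8: utilization.gpu
        if util is not None and util > 0:
            entry["busy"] += 1
    return dict(stats)


for pstate, entry in sorted(summarize(SAMPLE).items()):
    print(pstate, entry["samples"], "samples,", entry["busy"], "with GPU load,",
          "link gen", sorted(entry["link_gens"]))
```

Run against the complete log, this would show for each performance state how many samples had non-zero GPU utilization and which link generations appeared. If the lane width is of interest as well, nvidia-smi also offers the pcie.link.width.current and pcie.link.width.max query fields; adding them would shift the column indices assumed in this sketch.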