Check the ROP unit count under Linux? Affects all RTX 50XX cards

zebcom · February 23, 2025, 9:01am

Hi,

Is there a way to get the actual number of ROP units on a GPU using Linux? I am asking because of the reports that some 50xx GPUs have shipped with a reduced number of ROPs. And this morning the first 5080 with a reduced ROP count was discovered.

NVIDIA has already acknowledged the problem and confirmed that it is grounds for replacement.

However, this can only be detected using the GPU-Z application under Windows. Is there a way to get the info under Linux? I tried nvidia-smi -q, but it does not seem to show that number. Thanks a lot for any suggestion that does not involve installing Windows.
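For anyone who wants to repeat that check, here is a minimal sketch that runs `nvidia-smi -q` and keeps only lines mentioning ROPs. The search terms and the helper names (`find_rop_lines`, `query_nvidia_smi`) are my own assumptions, not anything NVIDIA documents, and on current drivers the expected result is simply no match.

```python
import re
import shutil
import subprocess

# Whole-word, case-insensitive match: "ROP"/"ROPs" hit, "Properties" does not.
# The search terms are an assumption about how such a field might be labeled.
ROP_PATTERN = re.compile(r"\b(rops?|raster|render output)\b", re.IGNORECASE)


def find_rop_lines(report: str) -> list[str]:
    """Return the lines of a text report that mention ROPs."""
    return [line.strip() for line in report.splitlines() if ROP_PATTERN.search(line)]


def query_nvidia_smi() -> str:
    """Run `nvidia-smi -q` and return its output, or "" if the tool is not installed."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    hits = find_rop_lines(query_nvidia_smi())
    print("\n".join(hits) if hits else "No ROP-related field found in nvidia-smi -q output")
```

An empty result only shows that the field is not printed; it says nothing about whether the driver knows the value internally.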

nvidia.abmwo · February 23, 2025, 9:17pm

I have been looking into this too. Have you found anything yet?

rs277 · February 23, 2025, 9:42pm

I’m in the CUDA world, which doesn’t really touch graphics capability. I did wonder whether Nsight Graphics, the profiler, lists those stats somewhere, but I haven’t looked.

Even if it did, it could of course use a lookup table for the specs, rather than the actual number.

zebcom · February 24, 2025, 7:49am

So far the only suggestion I have had was to use GPU-Z from a portable USB key with Windows installed. The best outcome would be for nvidia-smi to report the same information.

It just so happens that the first case of missing ROPs on a 5080 has been discovered. The other way would be to verify how this impacts the graphics benchmarks available on Linux, such as the Unigine series. I will have a look at the Phoronix OpenBenchmarking.org site.
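As a rough way to size what a benchmark might show: the theoretical peak pixel fill rate scales with ROP count times clock, so the relative loss equals the share of missing ROPs. The sketch below assumes the counts reported in press coverage for the 5080 (112 nominal, 104 on affected cards) and a made-up 2600 MHz clock; both function names are hypothetical.

```python
def rop_deficit_percent(nominal_rops: int, observed_rops: int) -> float:
    """Share of ROPs missing, in percent of the nominal count."""
    if nominal_rops <= 0:
        raise ValueError("nominal_rops must be positive")
    return 100.0 * (nominal_rops - observed_rops) / nominal_rops


def pixel_fill_rate_gpixels(rops: int, clock_mhz: float) -> float:
    """Theoretical peak fill rate in Gpixel/s, assuming one pixel per ROP per clock."""
    return rops * clock_mhz / 1000.0


if __name__ == "__main__":
    # Assumed values: ROP counts as reported in the press, clock is a placeholder.
    nominal, observed, clock_mhz = 112, 104, 2600.0
    print(f"Missing ROPs: {rop_deficit_percent(nominal, observed):.2f} %")
    print(f"Nominal peak:  {pixel_fill_rate_gpixels(nominal, clock_mhz):.1f} Gpixel/s")
    print(f"Affected peak: {pixel_fill_rate_gpixels(observed, clock_mhz):.1f} Gpixel/s")
```

Measured benchmark differences are likely to be smaller than this upper bound, since most workloads are not purely ROP-bound, which also means a single benchmark run may not be conclusive.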

zebcom · February 24, 2025, 12:43pm

Thank you for the tip. Worth a try, I suppose. But you are right, there is also the risk that they display the info from a database, not from the driver. And at this stage we do not even know whether this info is built into the Linux driver at all.

birdie · February 24, 2025, 1:39pm

I’ve just grepped the entire driver for ROP and the Raster.Output.Pipeline regex and found nothing.

nv-kernel.o_binary contains a single instance of ROP but it looks like it’s accidental.
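A minimal sketch of that kind of search, for anyone who wants to repeat it on an extracted driver package. The directory is passed on the command line (no path is assumed), the pattern mirrors the two terms mentioned above, and `scan_bytes` / `scan_tree` are hypothetical helper names.

```python
import re
import sys
from pathlib import Path

# Same two search terms as above; "." matches any separator byte.
PATTERN = re.compile(rb"ROP|Raster.Output.Pipeline")


def scan_bytes(data: bytes, context: int = 16) -> list[bytes]:
    """Return each match with a few surrounding bytes so accidental hits are easy to spot."""
    return [
        data[max(0, m.start() - context): m.end() + context]
        for m in PATTERN.finditer(data)
    ]


def scan_tree(root: Path) -> dict[Path, list[bytes]]:
    """Scan every regular file below root and collect the files that contain a match."""
    hits: dict[Path, list[bytes]] = {}
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        try:
            found = scan_bytes(path.read_bytes())
        except OSError:
            continue  # unreadable file, skip it
        if found:
            hits[path] = found
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    for path, found in scan_tree(Path(sys.argv[1])).items():
        print(path)
        for snippet in found:
            print("   ", snippet)
```

Printing the surrounding bytes makes it easier to tell a real symbol or string from three bytes that happen to spell ROP inside unrelated data.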

It seemingly is neither retrieved nor exported directly. Linux is once again left crippled.

As was said earlier, you could install Windows on a flash drive, boot from it, and run GPU-Z. I see no other options unless you reverse engineer the GSP firmware to find out the API call to fetch this info. AFAIK, the firmware is the same between Linux and Windows.

zebcom · February 24, 2025, 1:44pm

Thank you very much for the feedback. I also grepped vulkaninfo outputs from 5080 and 5090 (found them on openbenchmarking.org) and I see no differences, apart from the device ID and memory heap sizes (32GB for the 5090 and 16GB for the 5080, as expected). So it does not seem to provide any useful info either in that regard :/ Unless nvidia engineers help with some API call, Windows on a portable drive seems to be the solution.
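A minimal sketch of such a comparison, assuming the two vulkaninfo dumps have been saved as text files. The function name `differing_lines` and the placeholder device IDs in the test data are hypothetical.

```python
import difflib
import sys
from pathlib import Path


def differing_lines(a: str, b: str) -> list[str]:
    """Return only the removed ("-") and added ("+") lines between two text dumps."""
    diff = list(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm="", n=0))
    # The first two entries are the "---"/"+++" file headers; "@@" lines are hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


if __name__ == "__main__" and len(sys.argv) == 3:
    first = Path(sys.argv[1]).read_text(errors="replace")
    second = Path(sys.argv[2]).read_text(errors="replace")
    for line in differing_lines(first, second):
        print(line)
```

Comparing two dumps from the same model, one known-good and one suspected, would be the more telling test, but as noted above there is no sign that Vulkan reports a ROP-related property at all.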

birdie · February 24, 2025, 2:01pm

ChatGPT continues to be stupid as hell.

It knows everything, but far too often you have to nudge it towards the right answer.

So the information should be there, but God knows how to retrieve it. Even on Windows, nvidia-smi is woefully incomplete. No idea how W1zzard from TechPowerUp managed to find the hidden API calls to extract it. Perhaps he could port GPU-Z to Linux, but given the small number of Linux users and the lack of anything like Win32 on Linux, I suspect he will be reluctant to do so.

zebcom · February 24, 2025, 2:12pm

We could maybe ask the GPU-Z authors which API call they use to retrieve that info. I suppose that nvidia-smi also makes calls to the driver’s public API?

birdie · February 24, 2025, 2:14pm

I’m 99.99% sure those calls are under NDA and/or are a trade secret; otherwise NVIDIA would have exported this info themselves in nvidia-smi.

zebcom · February 24, 2025, 2:17pm

GPU-Z would have access to those?

Otherwise, maybe in NVAPI? NVAPI Reference Documentation
There are a few “Raster” entries in the API, but it is not clear if they return the ROP number.

Or maybe in the CUDA driver API?

zebcom · February 24, 2025, 4:24pm

So I asked directly on the GPU-Z forum. The response from the dev was: “If it’s not listed in the public docs it’s under NDA. Only way is to reach out to NVIDIA for NDA access to NVAPI”.
He did not say whether he has NDA access himself, but I suppose so. At least we know he uses NVAPI.

birdie · February 24, 2025, 4:42pm

@aplattner

Aaron, is there a specific reason why these NVAPI calls are under NDA?

I cannot see any logical reason why NVIDIA would hide them. We already have GPU-Z, which uses them, and you cannot call them unless the card is already supported by the driver. This information is not something worth hiding. It’s just confusing. I understand your stance on the Hot Spot temperature (could be something you never intended to export) or the VRAM temperature (AFAIK there are different ways to access it and they are all proprietary).

BlueGoliath · February 24, 2025, 7:53pm

Can’t believe you asked ChatGPT lol.

zebcom · February 25, 2025, 5:00am

If you have anything useful to contribute regarding the original topic, please do not hesitate to share. Otherwise, how about you GTFO?

morgwai666 · February 25, 2025, 8:55am

This may be a stupid idea (I have no clue), but would it be possible to run GPU-Z with Wine on Linux?

birdie · February 25, 2025, 9:15am

Actually GPU-Z works under Wine (GPU-Z 2.62/Wine 10.0) but I cannot vouch for the correctness of the displayed data.

AFAIK W1zzard has said somewhere that when the respective NVAPI calls fail, GPU-Z falls back to its internal data, which means you cannot trust its output under Linux/Wine. Wine would have to route low-level NVAPI calls to the underlying NVIDIA Linux libraries, and I’m far from certain it actually does that. I’m almost sure it does not.

As you can see, CUDA, DirectCompute, DirectML, Ray Tracing and PhysX are all missing, even though CUDA and Ray Tracing are perfectly supported under Linux.

Even the driver version is incorrect. It displays 536.25, I have 565.77 installed.
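A quick sanity check along these lines can be scripted: compare the version a tool displays with what the kernel module reports in `/proc/driver/nvidia/version`. This is a minimal sketch; the regex and helper names are my own assumptions about the file's layout (a line starting with "NVRM version" that contains the version number), so treat a failed parse as inconclusive.

```python
import re
from pathlib import Path
from typing import Optional

PROC_VERSION = Path("/proc/driver/nvidia/version")
# Assumed layout: three digits, a dot, then one or two further numeric groups.
VERSION_RE = re.compile(r"\b(\d{3}\.\d+(?:\.\d+)?)\b")


def parse_driver_version(text: str) -> Optional[str]:
    """Extract the driver version from the 'NVRM version' line, or None if absent."""
    for line in text.splitlines():
        if line.startswith("NVRM version"):
            match = VERSION_RE.search(line)
            return match.group(1) if match else None
    return None


def tool_matches_kernel(tool_version: str, proc_text: str) -> bool:
    """True only if the tool shows the same version as the loaded kernel module."""
    return parse_driver_version(proc_text) == tool_version.strip()


if __name__ == "__main__":
    if PROC_VERSION.exists():
        print("Kernel module reports:", parse_driver_version(PROC_VERSION.read_text()))
    else:
        print("No NVIDIA kernel module information found")
```

If a tool fails this check, as GPU-Z under Wine does here with 536.25 against 565.77, its other fields should not be trusted either.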

zebcom · February 25, 2025, 9:31am

No, on Reddit some other users have reported that it does not work under Wine. Worse, as birdie explained, GPU-Z falls back to an internal database when the driver does not report a value, which leads to misleading data. Hence, even under Windows, the driver has to be installed for GPU-Z to show the reduced ROP count. Otherwise it would show the normal number even on a defective card.

I will try to build a live Windows on a USB stick with Ventoy (as explained here: How to Run Windows From a USB Drive | PCMag) and report back if I am successful.

Again, help from NVIDIA would be welcome here. I looked at NVAPI, and it is not clear what each function returns: you get handles and some data, so maybe the data is there even without an NDA, but not explicitly described.

nvidia.abmwo · February 25, 2025, 11:21am

Hey, I’m that guy on Reddit and I can confirm it was populating the wrong data. It was showing the wrong version number and said I had more ROPs than I should. I’m hoping someone can get the live cd working.

On March 1st, when the new ISO is released, I’ll be doing a system wipe and reinstalling Linux. Before I reinstall Linux I’m going to sacrifice the virginity of one of my drives and install Windows to check the value. If I have the correct amount, maybe we can find a benchmark to run and compare values.

I did see something about an nvidia debug tool but I haven’t had time to look into it.

zebcom · February 25, 2025, 11:58am

Hey thanks for the feedback.

I have seen some tutorials on how to create a live Windows on a USB drive and will start trying today. If others are interested, these are the ones I saw:

The first two use the same method, and this is what I will try first. Note that although you create a VHD, the OS runs on bare metal, not in a VM. This matters for GPU-Z, which does not work in a VM either.

The key thing is to have a live Windows OS on the stick, not an installation image. That way you do not need to fiddle with the bootloader and partitions.
