Need help understanding the board design and VBIOS flashing of Tesla K80 cards

Hi Folks,

I also posted this in another NVIDIA forum 6 days ago but got no response (I guess the forum choice was bad…), so please excuse me if you see it twice. If you know a better place to post, please let me know, thanks.

I’ve been struggling with some Tesla K80 cards for a few days…

Yes, old cards - but still good for my needs, so I’d really like them to run the way I think they should.

HW config: Supermicro 2028GR server (so enough power is available), 4 Tesla K80 cards, properly cabled and powered. (Very) short description of the problem(s): i) two of the cards won’t reach the maximum GPU boost clock, ii) I cannot find reliable (100% unmodified) VBIOS images for the card, and iii) I have a few technical questions…
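To make problem i) measurable per GPU, a small script along these lines could compare the application clock with the maximum clock that the driver reports. This is only a sketch: it assumes the NVIDIA driver and nvidia-smi are installed, it uses the nvidia-smi query fields index, pci.bus_id, vbios_version, clocks.max.graphics and clocks.applications.graphics, and the function names are made up for illustration.

```python
import csv
import io
import subprocess

# Query fields as listed by `nvidia-smi --help-query-gpu`.
FIELDS = ["index", "pci.bus_id", "vbios_version",
          "clocks.max.graphics", "clocks.applications.graphics"]


def parse_clock_report(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for rec in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(rec) != len(FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(FIELDS, (value.strip() for value in rec)))
        for key in FIELDS[3:]:
            # Clocks are in MHz; unsupported values are not numeric -> None.
            row[key] = int(row[key]) if row[key].isdigit() else None
        rows.append(row)
    return rows


def below_max(rows):
    """Return the GPUs whose application clock is below the maximum clock."""
    return [r for r in rows
            if r["clocks.max.graphics"] is not None
            and r["clocks.applications.graphics"] is not None
            and r["clocks.applications.graphics"] < r["clocks.max.graphics"]]


def query_gpus():
    """Run nvidia-smi (requires the NVIDIA driver) and parse its output."""
    cmd = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    return parse_clock_report(subprocess.check_output(cmd, text=True))
```

Calling below_max(query_gpus()) on the server would list the GPUs that are configured below their maximum clock, together with their PCI bus ID and VBIOS version, which makes it easier to match them to the nvflash output.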

Here is the output of “nvflash --list” (version 5.792 / linux x64):

[root@tesla80 Tesla_K80_flash_linux]# ./nvflash --list
NVIDIA Firmware Update Utility (Version 5.792.0)
Copyright (C) 1993-2022, NVIDIA Corporation. All rights reserved.

NVIDIA display adapters present in system:
Tesla K80 (10DE,102D,10DE,106C) S:00,B:04,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:05,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:83,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:84,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:89,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:8A,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:8D,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:8E,D:00,F:00
[root@tesla80 Tesla_K80_flash_linux]#
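To keep track of which GPU index belongs to which physical card, the list above can be parsed and grouped. The sketch below rests on two assumptions that are not confirmed anywhere in this post: that the nvflash index (-iX) follows the order of the list, and that the two GPUs of one K80 sit on consecutive PCI bus numbers. The function names are made up for illustration.

```python
import re

# Adapter lines copied from the "nvflash --list" output above.
NVFLASH_LIST = """\
Tesla K80 (10DE,102D,10DE,106C) S:00,B:04,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:05,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:83,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:84,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:89,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:8A,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:8D,D:00,F:00
Tesla K80 (10DE,102D,10DE,106C) S:00,B:8E,D:00,F:00
"""

LINE_RE = re.compile(
    r"^(?P<name>.+?) \((?P<ids>[0-9A-F,]+)\) "
    r"S:(?P<seg>[0-9A-F]+),B:(?P<bus>[0-9A-F]+),"
    r"D:(?P<dev>[0-9A-F]+),F:(?P<fn>[0-9A-F]+)$"
)


def parse_adapters(text):
    """Return one dict per adapter line; index = position in the list (assumed)."""
    adapters = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            adapters.append({"index": len(adapters),
                             "name": m["name"],
                             "bus": int(m["bus"], 16)})
    return adapters


def pair_into_cards(adapters):
    """Heuristic: GPUs on consecutive bus numbers are treated as one card."""
    ordered = sorted(adapters, key=lambda a: a["bus"])
    cards, i = [], 0
    while i < len(ordered):
        if i + 1 < len(ordered) and ordered[i + 1]["bus"] == ordered[i]["bus"] + 1:
            cards.append((ordered[i], ordered[i + 1]))
            i += 2
        else:
            cards.append((ordered[i],))
            i += 1
    return cards


for card in pair_into_cards(parse_adapters(NVFLASH_LIST)):
    print(" + ".join(f"GPU{a['index']} (bus {a['bus']:02X})" for a in card))
```

For the list above this groups the eight GPUs into four pairs, which matches the four physical cards in the server.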

So let’s start with the technical questions:

  1. As each K80 has two GPUs, I wonder whether it is treated as two individual graphics boards? When I look at the data of the two properly running cards (named B and B’ from now on; these are the cards with GPUs 4-7), I can always identify “pairs” of “Board ID” and “Version” (this being the VBIOS version, right?). The information was obtained with “nvflash --version -iX”, X being the GPU number.

Card B, GPU4:
Version : 80.21.1B.00.01
Board ID : 0xE505
Card B, GPU5:
Version : 80.21.1B.00.02
Board ID : 0xE504
Card B’, GPU6:
Version : 80.21.1B.00.01
Board ID : 0xE505
Card B’, GPU7:
Version : 80.21.1B.00.02
Board ID : 0xE504

This would make sense to me: the last digit of the VBIOS version correlates with a specific Board ID (1 → 0xE505, 2 → 0xE504).

However, this does not hold for the other cards (named A and A’ from now on; these are the cards with GPUs 0-3):

Card A, GPU0:
Version : 80.21.1F.00.06
Board ID : 0xE504
Card A, GPU1:
Version : 80.21.1F.00.05
Device Name(s) : Tesla K80
Card A’, GPU2:
Version : 80.21.1F.00.05
Board ID : 0xE505
Card A’, GPU3:
Version : 80.21.1F.00.05
Board ID : 0xE505

So, card A’ shows the same Version and Board ID twice, which leads me to two questions:

  1. Is this normal / OK?
  2. Is the Board ID coded in the hardware (my guess) or in the VBIOS?
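The readings above can be put into a small table to spot cards whose two GPUs report identical values. This is a sketch only: treating identical values as suspicious is my own assumption based on cards B and B’, not a documented NVIDIA rule, and the function name is made up. The Board ID of GPU1 does not appear in the output quoted above, so it is recorded as unknown.

```python
from collections import defaultdict

# (card, GPU index, VBIOS version, Board ID) copied from the output above.
# GPU1's Board ID is not shown in the quoted output, hence None.
READINGS = [
    ("A",  0, "80.21.1F.00.06", 0xE504),
    ("A",  1, "80.21.1F.00.05", None),
    ("A'", 2, "80.21.1F.00.05", 0xE505),
    ("A'", 3, "80.21.1F.00.05", 0xE505),
    ("B",  4, "80.21.1B.00.01", 0xE505),
    ("B",  5, "80.21.1B.00.02", 0xE504),
    ("B'", 6, "80.21.1B.00.01", 0xE505),
    ("B'", 7, "80.21.1B.00.02", 0xE504),
]


def cards_with_duplicates(readings):
    """Return the cards whose GPUs share a VBIOS version or a known Board ID."""
    by_card = defaultdict(list)
    for card, gpu, version, board_id in readings:
        by_card[card].append((gpu, version, board_id))

    suspicious = {}
    for card, gpus in by_card.items():
        versions = [version for _, version, _ in gpus]
        board_ids = [bid for _, _, bid in gpus if bid is not None]
        if (len(versions) != len(set(versions))
                or len(board_ids) != len(set(board_ids))):
            suspicious[card] = gpus
    return suspicious


for card, gpus in cards_with_duplicates(READINGS).items():
    for gpu, version, board_id in gpus:
        bid = "unknown" if board_id is None else f"0x{board_id:04X}"
        print(f"Card {card}, GPU{gpu}: version {version}, Board ID {bid}")
```

With the values above, only card A’ is flagged; card A cannot be judged on the Board ID because one of its two values is missing.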

What I’ve done so far:

I carefully read the thread “K80 application clock limited to 562 Mhz” :-)

When I got cards A and A’, they both ran at a maximum GPU clock of 562 MHz, so I decided to change / modify the BIOS. I made a backup of the original BIOS (unfortunately just one, as I was not aware that this might be important…) and flashed the cards using one original BIOS of card B. This caused the system to hang under both Linux (CentOS 7) and Win10 (I think at the very moment the drivers are loaded, as Win10 without installed K80 drivers boots fine). I saw the same behavior when I used the “pairs” of BIOSes from card B to flash cards A and A’ (I cannot remember for sure, but I think at that time card A’ also had two different Board IDs…).
So I used “Kepler BIOS Tweaker” from techpowerup to modify the original BIOS I got from card A, and flashed it to all GPUs of Card B, B’ (using nvflash for Windows with the Board ID check disabled…). This was partially successful: the first GPU of cards A, A’ ran at 875 MHz, but the second one ran at lower clocks (mostly 562 MHz, sometimes a bit more, but never full speed).
As I was not able to find original, unmodified VBIOSes for the K80, I decided to go for a firmware package from HP (containing the BIOS versions 80.21.1F.00.06 and 80.21.1F.00.05). This was a *.scexe file and ran without any issues. However, the behaviour was similar to before, except that the second GPU now runs at even lower clocks…
After googling (there is not much to find about the K80), I found (by chance) ONE entry at HP support describing a way to recover the original BIOS (the one the card had at the time of shipping). So I guess there is a ROM on the card whose contents can be copied to the EPROM? This is a feature every electronic device using any kind of firmware should have… The tool is named iromflsh_ext and is stated to recover the “Info ROM” (is this the same as the BIOS? If yes, why the different names? If no, what is it?). It runs under Linux and completed without any error, reporting success. However, the BIOS versions stayed the same, which I cannot believe to be correct.
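In hindsight, a backup of every single GPU before the first flash would have saved a lot of trouble. A sketch of how that could be scripted is below. It assumes nvflash is in the current directory, that “-iN” selects the GPU as in the commands above, and that “--save” writes the current VBIOS to a file. The output directory, file names and function names are made up, and the script only prints the commands unless dry_run is switched off.

```python
import hashlib
import subprocess
from pathlib import Path


def backup_commands(gpu_count, out_dir="vbios_backup", nvflash="./nvflash"):
    """Build one `nvflash -iN --save FILE` command per GPU index."""
    return [[nvflash, f"-i{n}", "--save", str(Path(out_dir) / f"gpu{n}.rom")]
            for n in range(gpu_count)]


def sha256_of(path):
    """Return the SHA-256 hex digest of a saved ROM file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_backups(gpu_count, out_dir="vbios_backup", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    digests = {}
    for n, cmd in enumerate(backup_commands(gpu_count, out_dir)):
        print(" ".join(cmd))
        if dry_run:
            continue
        Path(out_dir).mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)  # needs root and the real nvflash
        digests[n] = sha256_of(cmd[-1])  # cmd[-1] is the ROM file just written
    return digests
```

Comparing the digests afterwards shows at a glance which GPUs carry byte-identical images, for example whether both GPUs of card A’ really hold the same VBIOS.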

So, thank you for reading this far. I know it is a lot of text (but a lot of info, too).

What I would like to know now:

  1. Is it normal that one of the cards has two identical Board IDs? If not, how can this happen and how do I fix it?
  2. Does anyone have experience with the BIOS recovery tool iromflsh_ext? Is there really a backup VBIOS on the Tesla cards?
  3. Why is there no section on the NVIDIA website where one can easily get VBIOS images (maybe there is, but where???)
  4. Why is there no version of “nvflash” for Linux with the Board ID check disabled :-)

How can you help me?

  1. Answer one or more of my questions :-)
  2. Point me to the aspects I have been missing or neglecting so far.
  3. Help me get a VALID PAIR of VBIOSes for the K80. I guess they must be at least version “1F”, preferably “80.21.1F.00.07” and “80.21.1F.00.08”. If you have a running system with K80s, it’s about 5 minutes of work to get them…

Experts, freaks, tweakers and modders, any help would be highly appreciated. If you need more info on the hardware, please let me know what is of interest.

Thanks a lot for your precious time,

Greetings from Vienna,