K40 setup on Lenovo P510

Meta-question: is there a better place to ask this question? I’m not finding forums specific to Tesla cards.

I’m trying to set up a K40m in a Lenovo P510 workstation running Windows 10. There’s a Quadro K1200 installed which is driving the displays. The K1200 works fine.

The K40 is detected and the drivers installed, but in device manager I can see that the card isn’t working. The specific message is Code 10: Insufficient resources exist to complete the API.

The only pointer I can find online suggests I need to “Assign IRQ to VGA” but this seems like an outdated suggestion and I can’t find any BIOS settings that might address this.

Any pointers/suggestions or pointers to the right support forum would be most appreciated.

Both cards say their driver version is 24.21.13.9875, dated 7/24/2018.

K40m isn’t designed to be put into a workstation. The workstation will not provide proper cooling for it.

K40m is designed to be installed in a properly configured OEM server that has been qualified and certified by the OEM for K40m usage.

Sorry, this is a K40c with built-in fan. I read K40m off the Windows Device Manager; I wonder if that’s a clue something is wrong with the card or with power.

If the system cannot/will not assign the necessary resources (which include memory mapped regions, I/O regions, and interrupts) during the PCI plug-and-play process, there is nothing that NVIDIA or the GPU driver can do about that.

You’ll need to see if the P510 has any system BIOS settings that involve mapping of resources, e.g. above the 4G boundary. If you study carefully the device manager tabs for the display adapter corresponding to the K40c, you may get an idea of which resource is the problem. Typically it is the large memory BAR region associated with K40m that is the deal-breaker for system compatibility.

In any event, it’s basically a Lenovo support question if the system will not assign the necessary resources. Be sure to update your P510 if it is not running the latest system BIOS.

Or choose another system. The K40c is not guaranteed by NVIDIA to work in any system you place it in.

I happen to have a K40c plugged into a Dell rack workstation, albeit running Linux. Here is the lspci -vvv output:

82:00.0 3D controller: nVidia Corporation Device 1024 (rev a1)
        Subsystem: nVidia Corporation Device 0983
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 64
        Region 0: Memory at fa000000 (32-bit, non-prefetchable) 
        Region 1: Memory at 381fe0000000 (64-bit, prefetchable) 
        Region 3: Memory at 381ff0000000 (64-bit, prefetchable) 
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nvidia, nvidia-drm, nouveau, nvidiafb

The largest of the 3 memory regions that need to be mapped is only 256M, so I think it’s unlikely to be a mem-BAR mapping issue in your case.

K40m by comparison has a 16G mem-BAR region that needs to be mapped:

04:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)
        Subsystem: NVIDIA Corporation 12GB Computational Accelerator
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 141
        NUMA node: 0
        Region 0: Memory at c9000000 (32-bit, non-prefetchable) 
        Region 1: Memory at 3c0800000000 (64-bit, prefetchable) 
        Region 3: Memory at 3c0c00000000 (64-bit, prefetchable) 
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
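
On Linux the region sizes can also be read from sysfs rather than lspci. Below is a minimal Python sketch, assuming the usual layout of /sys/bus/pci/devices/<address>/resource (one "start end flags" line per region, all in hex). The function name bar_sizes is my own. The sample text is made up for illustration: it reuses the K40m start addresses from the listing above together with the 16G size of the large region, while the sizes of the other two regions and all flag values are assumptions.

```python
FOUR_GIB = 1 << 32


def bar_sizes(resource_text):
    """Parse the text of a sysfs PCI 'resource' file.

    Returns a list of (index, start, size_in_bytes, above_4g) tuples, one per
    populated region. Unused slots (start and end both zero) are skipped.
    The flags column is read but not interpreted.
    """
    regions = []
    for index, line in enumerate(resource_text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # not a "start end flags" line
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # empty slot, e.g. upper half of a 64-bit BAR
        size = end - start + 1
        regions.append((index, start, size, start >= FOUR_GIB))
    return regions


# Illustrative sample only (not captured from real hardware).
# Region 1 is 16 GiB as stated in the thread; the 16 MiB and 32 MiB sizes
# of regions 0 and 3, and the flag values, are assumptions.
SAMPLE = """\
0x00000000c9000000 0x00000000c9ffffff 0x0000000000040200
0x00003c0800000000 0x00003c0bffffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00003c0c00000000 0x00003c0c01ffffff 0x000000000014220c
"""

if __name__ == "__main__":
    for index, start, size, above in bar_sizes(SAMPLE):
        where = "above" if above else "below"
        print(f"region {index}: {size // 2**20} MiB at {start:#x} ({where} 4 GiB)")
```

On a real machine you would pass in the contents of the resource file for the GPU's bus address instead of SAMPLE.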

Thanks for the info. The system would not boot without first enabling mapping above the 4GB boundary. It’s interesting that the K40c needed this. I’ll see if lenovo support can yield any pointers. I’m optimistic I can get it working since according to this datasheet the P500 is supported, and I don’t think the P510 is very different:

How much installed system memory is in the workstation?

Currently 32GB. I have an upgrade to 128GB on order.

I’ll also check the power supply rating. It’s either 650W or 850W; the 650W supply may be insufficient. https://download.lenovo.com/pccbbs/thinkcentre_pdf/next_gen_power_configurator_v1.6.pdf

650W PSU seems OK, although at the limit of what is ideal. Rule of thumb: Total sum of rated power of all components ideally should be <= 0.6 * nominal wattage of PSU. Here we have:

Tesla K40c      235W
Quadro K1200     45W
CPU             100W  // guess
32 GB DDR4       13W  // @ 0.4W / GB
rest of system   10W  // assumes 1 HDD or 1 SSD @ 6W
----------------------------------------------------
total           403W

That is 0.62 of the PSU power rating, or 62%.
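
For anyone who wants to redo this arithmetic for their own parts list, here is a small Python sketch of the rule of thumb. The wattages are the ones from the table (the CPU figure is still a guess, and the DDR4 figure is 0.4W/GB rounded up to 13W); the function name psu_load and the dictionary layout are just my own choices.

```python
def psu_load(component_watts, psu_watts, max_fraction=0.6):
    """Apply the rule of thumb: total rated power <= max_fraction * PSU rating.

    Returns (total_watts, load_fraction, within_rule).
    """
    total = sum(component_watts.values())
    fraction = total / psu_watts
    return total, fraction, fraction <= max_fraction


# Rated power per component, taken from the table above.
COMPONENTS = {
    "Tesla K40c": 235,
    "Quadro K1200": 45,
    "CPU": 100,            # guess
    "32 GB DDR4": 13,      # 0.4 W/GB, rounded up
    "rest of system": 10,  # assumes 1 HDD or 1 SSD
}

if __name__ == "__main__":
    for psu in (650, 850):
        total, fraction, ok = psu_load(COMPONENTS, psu)
        verdict = "within" if ok else "over"
        print(f"{psu}W PSU: {total}W total, {fraction:.0%} load, {verdict} the 60% guideline")
```

With these numbers the 650W supply lands slightly over the 60% guideline and the 850W supply comfortably under it.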

Does this explicitly mean the OEM loads some sort of code onto the K40m that ensures it will only work with the OEM’s system?

I have 2 used K40ms (no fans) and have been trying to get either of them to work on an HPZ600 Workstation, with a Quadro K4000 as primary display.

The case side panel is off, and the K40m is powered externally by a 550w PSU, which is only used for the Tesla.

I’m running Win 7 x64 Pro. It has 2 x5650 Xeons and 48GB of ECC RAM.

Device Manager cannot assign enough resources for either K40m.

External cooling I can take care of, so it’s not an issue here and not the focus of the problem. I need to know/find out precisely what prevents this machine from assigning resources - is it a defective/damaged card, or did the OEM whose system it came out of do something to it which prevents use in other machines.

I am running the latest BIOS from HP for the Z600. I did not see any setting specifically referencing mapping of resources, or any sort of boundary. If this is due to OEM system-locking code loaded onto the card, I need to know how to remove it. GPU-Z cannot extract the BIOS from the card without triggering a Plug-and-Play BSOD, and TechPowerup doesn’t have a replacement/default Nvidia BIOS I can load onto it. This indicates to me there’s some system/brand-specific lock in the BIOS, and it triggers a hard reset when tampered with. It also BSODs if I try to manually install drivers from Nvidia’s site for it.

The Ebay listing stated “100% all data cleared and tested.” so I’m assuming the unknown OEM had some sort of proprietary lock in it?
As I don’t know what OEM it came from, I can’t exactly question them about drivers or BIOS edits they’ve done.

If it’s just a system incompatibility, I can dump it back to Ebay and maybe get what I paid. If it’s FUBAR, I need to find my sledgehammer and remove it from the GPU pool before some other poor shlep ends up in this boat.

Either way, the relevant info on Teslas is abhorrently lacking. It’s taken me a solid month just to find this much info.

It’s a system incompatibility. Tesla cards (with a few historical exceptions, e.g. C2075, K20c, K40c) are designed to be purchased and installed (only) in an OEM server system certified for their use. HP does not certify any of their workstations for any current Tesla cards, nor were any ever certified for K40m usage. If you buy a Tesla card believing you can install it in any system you want, you are asking for trouble. It is simply not possible, in the general case, and there is no design intent to make it possible.

There is no documentation to support this configuration. There is no OEM system load or card lock. The resources in question are resources that would be assigned to PCI BAR regions by the system BIOS during the PCI plug-and-play enumeration process. The K40m requires a large complement of resources. A system that cannot or will not assign these resources will cause the cards to be non-functional. There is nothing you can do to fix this (barring modification of user-accessible BIOS settings that modify the BIOS resource assignment behavior).
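
To make the resource problem concrete, here is a toy Python sketch of why a large BAR cannot be placed when the firmware only hands out addresses below 4 GiB. It is not how any real system BIOS works; it only applies the PCI rule that a BAR is a power-of-two size and must be aligned to its own size. The 256 MiB and 16 GiB sizes come from this thread; the 16 MiB and 32 MiB sizes for the other regions, the 3 GiB to 4 GiB window, and the name place_bars are all assumptions for illustration.

```python
MIB = 1 << 20
GIB = 1 << 30


def place_bars(bar_sizes, window_start, window_end):
    """Greedily place naturally aligned BARs, largest first.

    Each size must be a power of two. Returns a list of (size, address)
    pairs, or None if some BAR does not fit inside the window.
    """
    placements = []
    cursor = window_start
    for size in sorted(bar_sizes, reverse=True):
        # Round the cursor up to the next multiple of the BAR size.
        address = (cursor + size - 1) & ~(size - 1)
        if address + size > window_end:
            return None
        placements.append((size, address))
        cursor = address + size
    return placements


# Hypothetical BAR sets: largest region per this thread, others assumed.
K40C_LIKE = [16 * MIB, 256 * MIB, 32 * MIB]
K40M_LIKE = [16 * MIB, 16 * GIB, 32 * MIB]

# Hypothetical MMIO windows: 3 GiB..4 GiB only, or extended far above 4 GiB.
BELOW_4G = (3 * GIB, 4 * GIB)
WITH_ABOVE_4G = (3 * GIB, 1 << 46)

if __name__ == "__main__":
    print("K40c-like, below 4G only:", place_bars(K40C_LIKE, *BELOW_4G) is not None)
    print("K40m-like, below 4G only:", place_bars(K40M_LIKE, *BELOW_4G) is not None)
    print("K40m-like, above 4G allowed:", place_bars(K40M_LIKE, *WITH_ABOVE_4G) is not None)
```

In this toy model the set with the 16 GiB region only fits once the window extends above 4 GiB, which is the kind of behavior a BIOS setting for mapping above the 4G boundary changes. Whether a given board offers such a setting, and how large a region it will accept, is up to the system vendor.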

A server designed to support this card of course has taken these requirements into account in the design of the server, which includes the design of the system BIOS. It’s not a “lock” of any sort. In most cases, a PCIE Tesla GPU can be easily enough removed from a supported HP server configuration and placed in a supported Dell server configuration (just to pick a random pair/example), with full expectation that it should work normally.

But your workstation is not a supported configuration for that GPU. There are many statements like this on these forums.

Tesla K40 is an obsolete product. For non-obsolete products, you can find supported server configurations here:

https://www.nvidia.com/en-us/data-center/tesla/tesla-qualified-servers-catalog/

There is no suggestion anywhere that Tesla cards can be placed in any system you want with an expectation of proper behavior.

The principal issues when setting up Teslas are cooling, system BIOS configuration, and power supply.

These are passively cooled devices that rely on external fans to provide a specified amount of airflow across the heat sink fins (e.g. 15 to 25 CFM depending on intake temperature). BIOS issues may include setting up BAR1 apertures, which are typically much larger on Tesla GPUs than consumer GPUs, as well as PCIe configuration for peer-to-peer capability. Power supply is mostly about selecting a sufficiently beefy PSU, and cabling correctly.

As numerous posts in these forums attest, consumers who are not experienced with such matters tend to get one or several of these major issues wrong, plus many other small issues generally related to installing add-on cards. We have seen reports of very, shall we say, “adventurous” cooling and power supply solutions, for example. Tesla GPUs are expensive devices, and it is not in NVIDIA’s interest for customers to have an unhappy experience with them, up to and including permanent damage due to incorrect installation.

So the target customers for these GPUs are integrators: companies that (1) employ professionals who know what they are doing when configuring a system with these GPUs and (2) are able to provide their own customer support for the resulting integrated system, including GPU operation. NVIDIA keeps a list of approved integrators (partners), from regional suppliers to large multi-national corporations: https://www.nvidia.com/en-us/data-center/where-to-buy-tesla/

When people configure their own Tesla-based machine (in many cases using second-hand hardware from unknown sources), they are outside of NVIDIA’s intended sales channel, and they are pretty much on their own.

Thank you both for the detailed info. I bought these cards on November 20th, received them on about November 22nd, and have been up and down Google, HP, YouTube, and even stooped to asking IT professionals on Facebook, as well as searching this very website, every day for hours since then, and did not see any posts from anyone with this level of detail or clarity and logic. Most of what I saw was “these are intended for servers” and nothing else. No “why” other than “because”. It’s nice that someone actually got mad enough to give a detailed, logical answer to a legitimate question.

However, regarding the “there are many posts on these forums regarding this” - I found this thread on page 6 of search results. The first 5 dealt mainly with “K40 vs GTX vs M60”, and various *nix installations, and one which simply stated “power down, plug in, power up, it works” regarding a K40c with no other system info given by the author, nor such caveats as detailed here expressed by the Nvidia associate who said “it’s plug and play”.

I’m also well aware the K40 is an older, “dead” product. The extreme price drop on the secondary market, the removal of official documents from the main site, as well as various other Googlings pretty much spelled that out clearly. However, discontinued products that still function are “a thing”. There are 1957 Chevrolet Bel Airs that still work today as well as the day they rolled off the line in Detroit. There are Colt 1911 handguns that work just as well today as the day they were issued by the Army in 1911. Old people can be found in great abundance who are still functional members of society. The notion that “old is useless” went out about 100 years ago. This doesn’t change with computer tech. It may have limited use in a modern Enterprise operation, but the concept that computer technology has no business outside of business died in the 1970s. I was there. I saw it happen when the first home PCs came out. I had one.

So, thank you again for the help. These are going back on Ebay, with this information, because there are a lot of people out there looking at these particular cards for the same reasons I did, and they need to be forewarned.

As best I know, Kepler (compute capability 3.x) is still a supported architecture in the current CUDA version, 10.0. And given that the K80 first shipped four years ago, I would guess Kepler will be supported for another two years.

In practical terms I would consider the architecture obsolescent. As you no doubt are aware, different markets move at different speeds, depending on the advancement of technology. Even with cars you would eventually be hunting for spare parts at flea markets and swap meets, searching for after-market parts, and desperately trying to get a hold of a copy of the shop manual, because official vendor support has long ceased.

Five to six years is a standard time frame for an electronic product to hit the end of life stage. Support beyond that is simply not priced in and can be very expensive. With the death of Moore’s Law, we may see an extension of the “standard” lifetime of electronic equipment going forward, but if that happens it will be a gradual process. And those products will cost more, as the deflationary pricing pressure people have become accustomed to will have disappeared.

FWIW, a K40c is an actively cooled GPU for workstations that handles similarly to an actively cooled consumer GPU. You have a K40m, which is a different beast: designed for servers and designed to be sold as part of a server. So next time around, consider buying an entire old server with old GPU(s) already in place.

For bragging rights: The first computer I used was a TRS-80 whose system memory had been upgraded from 4KB to a whopping 16KB :-) I don’t find working with such old equipment terribly exciting or useful (that TRS-80 delivered about 1 single-precision KFLOPS), but I still have a working 386-40 machine with dual Conner disk drives here, purchased around 1990. And since the AARP keeps sending me invitations to join, I assume that means I am officially an old person now.

In the hobbyist 3D rendering (Iray) circles I stagger through, Kepler makes up 90% of the user base. When Iray finally stops supporting it, a great cry shall issue forth from the multitudes, and there will be wailing and gnashing of teeth. Maxwell prices simply won’t plummet fast enough, far enough to keep that market swimming in affordable hardware (and by “affordable”, I mean “for people who walked away from real jobs years ago to focus on being an artist and depending primarily on their art to pay their bills because they thought they were that good”. Fortunately for me, I still have my real job because I have come to terms with the fact my image-based art isn’t that good. More people need to be honest with themselves, honestly.).

As for “official” support of older products, there’s a difference between taking a 69 GTO to GM for service and asking someone who works for GM “why do the lug nuts not line up with my 75 Mustang?” I came here to find out why the lug nuts didn’t line up, not for factory warranty work.

The K40m vs K40c thing confuses me; take off the fan shroud and they’re still the same card, right? 12GB VRAM, 2880 CUDA cores, all that? Meh. Lemmings over the cliff at this point. If I can’t edit the BIOS of my machine to unlock resources, I may as well dump them and focus on building an RTX rendering cluster.

I know this is old, but it supplied a lot of good info to the issue similar to the one above I’m dealing with: K40M on a Dell T7600. As per the above info it seems it’s not gonna happen. However, I do still have the technical need to deploy the card somewhere–within a budget. Are there any suggestions of legacy servers (e.g., HP Proliant or Dell PowerEdge) that are not state-of-the-art but do have the BIOS+hardware amenable to host these cards? Suggestions are appreciated.

Adding an idea to this old discussion. Maybe it’s possible to flash the K40m BIOS with that of the K40c? If the boards don’t differ too much (and I think they don’t apart from what is needed for the cooling fan), this could work. Has anyone tried this?

I looked at TechPowerUp for a K40C BIOS to try that, but didn’t find one. If the board has pins for a fan, I can put a hole in the shroud, or replace it with a GTX shroud with fans if it’ll fit, but getting any detailed information out of anyone here or at HP forums is impossible. No money changing hands, I guess.

As well, everyone seems to think there’s no such thing as mounting the GPU externally and powering it with its own power supply, because they keep mentioning how the PSU of the system I wanted to use it with was too weak for it, as if I didn’t effing know that already!

If the only difference between a K40M and K40C is the fan, there has to be something in the BIOS that’s looking for something that a Z600 workstation doesn’t have. Some bit of info from the CPUs or motherboard that it looks for, like a server key or something? I’m seeing posts from quite a few people who have a K40C running fine in a Z800 (same system with a bigger PSU, same OS, same CPUs), and they all said the same: it’s Plug And Play. Put it in, turn it on, load the drivers. But all I get when I ask about the K40M is “oh you have to put that in a server and you have to know how to configure them”. Like this information is only available via purchase? At least include the link to where I can buy that information if that’s the case!

This whole pursuit has simply reinforced my general hatred of people.

There is this one: https://www.techpowerup.com/vgabios/200846/200846

I actually just tried changing the BIOS, but nvflash gives a ‘GPU Mismatch’ error. It can detect one BIOS from the K40c and won’t write that to the K40m. It detects different firmware. So that’s that…

Chiming in to say that I bought a K40 and also couldn’t figure out how to get it to work with my desktop PC; it wouldn’t even boot. Thanks to info from this thread I found out that I was missing the BIOS setting that lets PCIe devices be mapped above 4 GB. The BIOS even hinted that this setting could be used for crypto mining. Enabling that setting made it work.

Thanks!