I’m in the process of setting up a Tesla S1070 connected to a server running Windows Server 2008 R2 64-bit. The Tesla driver installs without complaint, but in Device Manager one of the four Tesla GPUs is marked as not being installed correctly. On closer inspection the reason given is “This device cannot find enough free resources that it can use. (Code 12)”.
Has anyone else experienced this problem?
The server is an HP ProLiant DL380 G5 with two quad-core Xeons, 16 GB RAM, two GigE NICs and an HP P400 RAID controller. The RAID controller sits in one of three PCIe x8 slots, while the Tesla boards are connected to two PCIe x16 slots on a riser card. (The RAID controller cannot be moved to any of the other slots because of the bulk of its connectors.) The BIOS, RAID controller and NICs have all been updated to the latest firmware. The Tesla driver version is 196.28.
I have no experience with either the OS or the hardware you are using, but the two obvious things that might bite you are PCI-e enumeration and the video memory aperture size. With 4x Tesla T10 GPUs, you are trying to map 16 GB of video memory. It could be that the BIOS has some arbitrary limit which is less than what you need. The other possibility is that the OS isn’t handling the nForce 200 switches correctly, so PCI-e device enumeration goes wrong. I know that there have been enumeration issues on WDDM versions of desktop Windows with multiple GTX295 cards, although I have no idea whether that could apply to Server 2008. NVIDIA has recently released a compute card driver for Vista/7 and derivatives. It might be worth giving that a try.
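To put rough numbers on the aperture point, here is a minimal Python sketch of the arithmetic. The 4 GB per GPU figure is just the 16 GB total divided by four, and the 12 GB limit is a made-up value for illustration only; I don't know what limit, if any, the DL380 G5 BIOS actually imposes, or how much of each GPU's memory the driver really asks to map.

```python
GIB = 1024 ** 3  # bytes in one GiB


def total_aperture_bytes(gpu_count, bytes_per_gpu):
    """Total address space needed if every GPU maps all of its memory."""
    return gpu_count * bytes_per_gpu


def fits(required_bytes, limit_bytes):
    """True if the requested mapping fits under the given limit."""
    return required_bytes <= limit_bytes


gpu_count = 4                 # four T10 GPUs in the S1070
bytes_per_gpu = 4 * GIB       # assumption: 16 GB total / 4 GPUs
required = total_aperture_bytes(gpu_count, bytes_per_gpu)

hypothetical_limit = 12 * GIB  # hypothetical BIOS limit, NOT a measured value

print("required GiB:", required // GIB)
print("fits under hypothetical limit:", fits(required, hypothetical_limit))
```

With a limit like that, three GPUs would get their resources and the fourth would not, which is at least consistent with one device out of four failing with Code 12.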
Is there any way (a special tool, for example) to check the maximum aperture size? It would be nice to know whether it’s the hardware or the OS that’s causing the problem.
I think this is the driver I already have. It is not a normal graphics driver, but one made specifically for the headless Tesla products, and it is labeled Windows 7 in the setup screens. Or is there yet another driver?