I’m trying to get the new 2.0 beta kernel drivers (174.55) to work with the Tesla C870. Previous drivers (such as 169.09) appear to work fine with the same C870 hardware. Also unusual is that the beta driver (174.55) does work with a Quadro NVS 290 card (also installed on the machine); it’s just the C870 card that has the problem.
Actually, to be more specific, I think the C870 might work with the beta drivers, but just be extremely slow. For example, the bandwidth test gives Device-Device copies of 200 MB/s, whereas I was expecting ~64 GB/s.
I’ve included some more information about my system. Thanks in advance for your help.
Linux version 2.6.18-53.1.14.el5 (mockbuild@builder6.centos.org) (gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Wed Mar 5 11:36:49 EST 2008
CentOS release 5 (Final)
kernel
NVIDIA-Linux-x86-174.55-pkg1.run
./bin/linux/release/bandwidthTest
Using device 1: Tesla C870
…
Device to Device Bandwidth
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 199.6
…
&&&& Test PASSED
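To put a number on the gap, here is a minimal Python sketch that pulls the data rows out of a pasted bandwidthTest run and compares them with the figure I expected. The helper names (`parse_bandwidth_rows`, `slowdown`) are made up for illustration, the row format is assumed from the output above, and 64 GB/s is taken as 64000 MB/s.

```python
import re

# Assumed row shape, based on the output above:
# "<transfer size in bytes> <bandwidth in MB/s>"
ROW = re.compile(r"^\s*(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_bandwidth_rows(text):
    """Return (transfer_size_bytes, bandwidth_mb_s) pairs found in the text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1)), float(match.group(2))))
    return rows


def slowdown(measured_mb_s, expected_mb_s):
    """How many times lower the measured bandwidth is than the expected one."""
    return expected_mb_s / measured_mb_s


# Sample copied from the device-to-device section of the run above.
sample = """Device to Device Bandwidth
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 199.6
"""

# Assumption: the ~64 GB/s expectation expressed as 64000 MB/s.
EXPECTED_MB_S = 64000.0

for size, mb_s in parse_bandwidth_rows(sample):
    factor = slowdown(mb_s, EXPECTED_MB_S)
    print(f"{size} bytes: {mb_s} MB/s, about {factor:.0f}x below the expected figure")
```

With the numbers above, the device-to-device copy comes out roughly 320 times slower than expected, so this is not a small regression.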
I have been having similar problems too. With a Quadro NVS 290 and a Tesla C870 on a dual socket 8-core Xeon system, the bandwidth test shows only about 4.4 GB/s of device-to-device bandwidth. This is much lower than the 64 GB/s I get with my 8800 GTX on Core2 Duo.
I have even tried replacing the Tesla with an 8800 Ultra, and the device-to-device bandwidth is even lower, about 200 MB/s. Is this a problem with the drivers or the hardware? I have tried both CUDA 1.1 and 2.0 and the result is the same with both versions.
@gopher
This is most likely a kernel or BIOS problem. Have you verified that you’re using the latest motherboard BIOS? What kind of motherboard are you using?
Running dmidecode gives me the following information. I have removed most of the extraneous info from the output. The system is a Dell workstation with dual socket Xeon processors.
SMBIOS 2.5 present.
123 structures occupying 4842 bytes.
Table at 0x000F0450.
BIOS Information
Vendor: Dell Inc.
Version: A01
Release Date: 01/31/2008
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 1024 kB
Characteristics:
PCI is supported
PNP is supported
APM is supported
BIOS is upgradeable
BIOS shadowing is allowed
ESCD support is available
Boot from CD is supported
Selectable boot is supported
EDD is supported
Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
3.5"/720 KB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
8042 keyboard services are supported (int 9h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
ACPI is supported
USB legacy is supported
BIOS boot specification is supported
Function key-initiated network boot is supported
Targeted content distribution is supported
BIOS Revision: 0.0
Handle 0x0100, DMI type 1, 27 bytes.
System Information
Manufacturer: Dell Inc.
Product Name: Precision WorkStation T7400
Version: Not Specified
Serial Number: xxxxxxxxxx
UUID: xxxxxxxxxxxxxxxxxx
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Not Specified
Handle 0x0200, DMI type 2, 8 bytes.
Base Board Information
Manufacturer: Dell Inc.
Product Name: 0RW199
Version:
Serial Number: xxxxxxxxxx
Processor Information
Socket Designation: CPU
Type: Central Processor
Family: Xeon
Manufacturer: Intel
I am having a similar problem. I have two Tesla C870s running on Linux x86_64 (Fedora 8).
With the previous driver 169.09 and CUDA 1.1 I had 64000 MB/s device-to-device bandwidth, but after upgrading to the CUDA 2.0 beta with driver 177.13 I have only 57091 MB/s device-to-device.
Like gopher, I have a dual-socket 8-core Xeon system (Dell Precision WorkStation T7400). I updated the BIOS to the latest version, but that didn't change anything.
Is anyone else experiencing something similar?
I am experiencing a similar problem with the beta driver 177.13. Also on the Precision T7400.
Edit 6/30:
I tried upgrading the BIOS from A01 to A02 with no luck. I also tried updating the kernel from 2.6.18-53.1.14.el5 to 2.6.18-92.1.6.el5, again with no luck. I believe this is the 32-bit kernel.
My bandwidth numbers are very similar to Kravell’s.
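Since I am only guessing that this is the 32-bit kernel, here is a small Python sketch that could confirm it. It relies on `platform.machine()`, which normally returns the same string as `uname -m`. The helper `is_64bit_arch` and its list of architecture names are my own illustration and not exhaustive.

```python
import platform


def is_64bit_arch(machine):
    """Classify a machine string such as the one printed by `uname -m`."""
    # Assumption: this list covers the common 64-bit names only.
    known_64bit = {"x86_64", "amd64", "aarch64", "ppc64", "ppc64le", "ia64"}
    return machine.lower() in known_64bit


machine = platform.machine()
print(f"kernel reports {machine!r}; 64-bit: {is_64bit_arch(machine)}")
```

A 32-bit x86 kernel typically reports something like `i686` here, even when the CPU itself is 64-bit capable.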