Compatible servers for S1070: collecting some information

mianlu · January 15, 2010, 1:59pm

Hi All:

Our department wants to buy a Tesla S1070. I know there is a compatibility list on the CUDA website; however, I would still like someone to help me confirm that particular systems work, since we have already run into trouble even with a server that is on the list, such as the R410. So if you are already using the Tesla S1070 successfully, please tell me your server information. If it has any special requirements, please also let me know. I really do not want to buy another server that ends up unable to use the Tesla S1070 :( Thanks very much!!

mianlu · January 17, 2010, 5:28am

Does no one use the S1070??

eyalhir74 · January 17, 2010, 11:45am

Hi,

We’re currently using 20 S1070 connected to 10 Super Micro machines (7046GT-TRF7046GT-TRF-TC4) with varying RAM sizes (average 36GB).

All are running Linux and seem to be working fine (we’ve only recently installed them).

Currently we’re facing issues connecting 3 S1070 per machine (using NVIDIA’s DHIC card). It would have been nice if we could connect 4 S1070 (making it 16 GPUs per machine), but I still don’t know if that is possible, and we are still having problems connecting 3, as I said.

eyal

mianlu · January 17, 2010, 1:53pm

Nice!!! Thanks!!!

Since we are only considering one GPU per machine currently, it looks like it will be fine. However, could you point out exactly which one it is in this list, http://www.nvidia.com/object/tesla_compati…orms.html#s1070, please? I have no idea about the numbers you gave me (7046GT-TRF7046GT-TRF-TC4)… Thanks again!

eyalhir74 · January 17, 2010, 2:10pm

Sorry, the numbers got merged :)

7046GT-TRF:

http://www.supermicro.com/products/system/…-7046GT-TRF.cfm

7046GT-TRF-TC4:

http://www.supermicro.com/products/system/…TRF.cfm?GPU=TC4

Why only one GPU per machine? Do you mean one S1070 per machine? Why not 2 S1070?

eyal

mianlu · January 17, 2010, 2:26pm

Thanks a lot! Actually our department currently only wants to buy one S1070… mostly just for research purposes :) You have so many S1070s, that must be really cool for computing!

mfatica · January 17, 2010, 5:17pm

I have several SuperServer 6016GT-TF and they work very well.
These are 1U solutions, with good PCIe transfer rates on both slots.

mianlu · January 17, 2010, 6:00pm

Yes… but I would still like to know what problem you have met when connecting 3 S1070, since I see the website says they can support up to 4.

mianlu · January 17, 2010, 6:01pm

Thank you~ it looks like Super Micro should be reliable.

eyalhir74 · January 17, 2010, 7:36pm

Currently I fail to run the simpleMultiGPU sample from the SDK. When I use more than 10 GPUs it crashes randomly, saying there is no CUDA-enabled device. I’ve opened a bug on the developer site; I hope they can assist.

Assuming 3 will go fine, we’ll test 4 S1070. However, there might be a problem there, since there will be no slot left for the card that drives the screen. I’m not sure how that will work.

In any case I’ll be happy with 3 S1070; 4 would be amazing… :)

eyal

mfatica · January 17, 2010, 7:50pm

You could use a GHIC (Graphic Host Interface Card) in one of the slots.
It is like a normal HIC for the S1070 but it has an integrated GPU and display output.
I will ask the driver team if there is a limit in the number of GPUs that the driver can enumerate. It could also be a BIOS issue.

eyalhir74 · January 17, 2010, 7:53pm

Thanks for the fast answer. We’re trying to update the BIOS this week; it’s from last September.

Using the GHIC: will it take the place of one S1070, so that I can only fit 3 S1070 and one GHIC?

thanks

eyal

mfatica · January 17, 2010, 8:06pm

To connect an S1070, you need two HICs, each driving 2 of the GPUs inside the S1070 via the thick cable.
The GHIC is just like the normal HIC, but it has an on-board GPU to drive a display and an additional connector for a monitor.
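The slot arithmetic implied here can be sketched in a few lines of Python. This is only an illustration: the figure of 4 GPUs per S1070 comes from the thread (4 units making 16 GPUs), the 2 GPUs per HIC is stated above, and the value for the DHIC (one card serving all 4 GPUs of a unit) is an assumption inferred from the posts, not a confirmed specification. The function name `cards_needed` is hypothetical.

```python
import math

GPUS_PER_S1070 = 4   # from the thread: 4 S1070 units make 16 GPUs
GPUS_PER_HIC = 2     # stated above: each HIC drives 2 GPUs
GPUS_PER_DHIC = 4    # assumption: one DHIC serves a whole S1070


def cards_needed(num_s1070, gpus_per_card):
    """Return how many host interface cards (and so PCIe slots) are needed."""
    total_gpus = num_s1070 * GPUS_PER_S1070
    return math.ceil(total_gpus / gpus_per_card)


for units in (1, 2, 3, 4):
    print(units, "x S1070:",
          cards_needed(units, GPUS_PER_HIC), "HICs or",
          cards_needed(units, GPUS_PER_DHIC), "DHICs")
```

Under these assumptions, every additional S1070 costs two slots with plain HICs but only one with a DHIC, which is why the number of free PCIe slots in the host (plus one for a display card or GHIC) decides how many units can be attached.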

eyalhir74 · January 18, 2010, 7:21am

Hi,

We’re currently using the DHIC cards, so we should theoretically be able to connect 4 S1070, right?

If I put 4 S1070 with 4 DHIC cards, can I still use the GHIC? I guess not… the GHIC will require a slot of its own, right?

Anyway here’s the bug reference for the 3 S1070 and simpleMultiGPU test I ran (#642453):

Synopsis: simpleMultiGPU fails with 12 GPUs (3 S1070)

We’re using a SuperMicro host (7046GT-TRF) connected to 3 S1070 (total of 12 GPUs).

When running the simpleMultiGPU code from the SDK with 8 GPUs everything runs fine.

When running with 12 GPUs we get the following error:

RUN 1:

-bash-3.2$ ./simpleMultiGPU

CUDA-capable device count: 12

main(): generating input data…

main(): waiting for GPU results…

Running kernel on device [0]

Running kernel on device [1]

Running kernel on device [2]

Running kernel on device [8]

Running kernel on device [10]

Running kernel on device [5]

Running kernel on device [3]

Running kernel on device [4]

Running kernel on device [9]

Running kernel on device [7]

Device [6] failed

Device [11] failed

Running kernel on device [6]

cutilCheckMsg() CUTIL CUDA error: reduceKernel() execution failed.

in file <simpleMultiGPU.cpp>, line 74 : no CUDA-capable device is available.

RUN 2:

-bash-3.2$ ./simpleMultiGPU

CUDA-capable device count: 12

main(): generating input data…

main(): waiting for GPU results…

Running kernel on device [0]

Running kernel on device [1]

Running kernel on device [3]

Running kernel on device [5]

Running kernel on device [7]

Running kernel on device [9]

Running kernel on device [11]

Running kernel on device [6]

Running kernel on device [2]

Running kernel on device [4]

Running kernel on device [10]

Device [8] failed

cudaSafeCall() Runtime API error in file <simpleMultiGPU.cpp>, line 62 : no CUDA-capable device is available.

Please advise

-------------------- Additional Information ------------------------

Computer Type: PC

Bus Type: AGP

Driver Version: cudadriver_2.3_linux_64_190.18

Products: other

Multithreaded Application: yes

Release: Public

How often does problem occur: Every time

CPUs (single or multi): 2

RAM (amount & type): 36 ddr3
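For comparing runs like the two above, a small Python helper can pull the device indices out of the sample's output. This is a sketch, not part of the SDK: the function name `summarize` is hypothetical, and the two strings simply repeat the device lines from RUN 1 and RUN 2 in the report.

```python
import re

RUN_1 = """\
Running kernel on device [0]
Running kernel on device [1]
Running kernel on device [2]
Running kernel on device [8]
Running kernel on device [10]
Running kernel on device [5]
Running kernel on device [3]
Running kernel on device [4]
Running kernel on device [9]
Running kernel on device [7]
Device [6] failed
Device [11] failed
Running kernel on device [6]
"""

RUN_2 = """\
Running kernel on device [0]
Running kernel on device [1]
Running kernel on device [3]
Running kernel on device [5]
Running kernel on device [7]
Running kernel on device [9]
Running kernel on device [11]
Running kernel on device [6]
Running kernel on device [2]
Running kernel on device [4]
Running kernel on device [10]
Device [8] failed
"""


def summarize(log_text):
    """Return (started, failed) device indices found in the log, sorted."""
    started = {int(n) for n in
               re.findall(r"Running kernel on device \[(\d+)\]", log_text)}
    failed = {int(n) for n in
              re.findall(r"Device \[(\d+)\] failed", log_text)}
    return sorted(started), sorted(failed)


for name, log in (("RUN 1", RUN_1), ("RUN 2", RUN_2)):
    started, failed = summarize(log)
    never_started = sorted(set(range(12)) - set(started))
    print(name, "failed:", failed, "never started:", never_started)
```

On these two logs the failing indices differ (6 and 11 in the first run, 8 in the second), which fits the description of the failure as random rather than tied to one particular GPU.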

DJK · July 16, 2010, 9:19pm

Any update on this?

Did you ever get 3 or 4 S1070s per node?

eyalhir74 · July 18, 2010, 7:30am

We have 2 S1070s connected per machine, and recently we connected 2 S2050s per machine as well.

Nothing has changed since the previous posts… :( but we also didn’t push it too much.

eyal

DJK · July 19, 2010, 7:15pm

Are you using two Dual Host Interface Cards (DHICs) or four HICs?

eyalhir74 · July 20, 2010, 7:16am

Currently 4 HICs.

DJK · July 20, 2010, 5:29pm

Thanks for the information. We tried four HICs on the Tyan S7025 motherboard, but could not get Windows 2008 HPC R2 to work (it works with only 3 HICs). We will try the same with the SuperMicro motherboard.

Also, we have some DHICs on order, so if we can get 2 S1070s going, we’ll try getting 4 S1070s per node. With the 1 CPU per GPU recommendation, 2 S1070s (8 GPUs) is all that’s recommended for a two-socket node.
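The recommendation above reduces to a one-line calculation. The sketch below reads "1 CPU per GPU" as one CPU core per GPU and assumes quad-core sockets, which is what makes 8 GPUs the limit for a two-socket node; both readings are assumptions, and the function name is hypothetical.

```python
GPUS_PER_S1070 = 4  # from the thread: 2 S1070s are 8 GPUs


def recommended_s1070_units(sockets, cores_per_socket):
    """Max S1070 units if every GPU should get its own CPU core (assumed rule)."""
    total_cores = sockets * cores_per_socket
    return total_cores // GPUS_PER_S1070


# Assumption: a two-socket node with quad-core CPUs.
print(recommended_s1070_units(2, 4))
```

With more cores per socket the same rule would allow more units, so the limit is a property of the host CPUs rather than of the S1070 itself.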

eyalhir74 · July 21, 2010, 6:20am

We now have, in production, at least 10 Supermicro machines, each with 2 S1070s connected, running ~24x7 for the last 6-8 months. No problems!! :)

They do run Linux, however, not Windows.

Also bear in mind that the DHIC will cut the PCIe bandwidth in half, so if this is a bottleneck in your application the DHIC will make it worse.

eyal
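A back-of-the-envelope version of the bandwidth point, as a Python sketch. It assumes that the GPUs behind one host card share that card's link evenly, that a HIC serves 2 GPUs and a DHIC serves 4, and it uses 8 GB/s as a nominal per-direction figure for a PCIe 2.0 x16 slot. The link width of the actual cards is not stated in this thread, so the absolute numbers are illustrative only.

```python
def per_gpu_bandwidth(link_gb_s, gpus_on_link):
    """Evenly split one host link's bandwidth across the GPUs behind it."""
    if gpus_on_link <= 0:
        raise ValueError("gpus_on_link must be positive")
    return link_gb_s / gpus_on_link


LINK_GB_S = 8.0  # assumption: nominal PCIe 2.0 x16 bandwidth per direction

hic = per_gpu_bandwidth(LINK_GB_S, 2)    # HIC: 2 GPUs share one slot
dhic = per_gpu_bandwidth(LINK_GB_S, 4)   # DHIC (assumed): 4 GPUs share one slot

print("per-GPU share with HIC:", hic, "GB/s")
print("per-GPU share with DHIC:", dhic, "GB/s")
```

Whatever the real link speed is, the ratio is what matters: when all GPUs transfer at once, each gets half as much host bandwidth behind a DHIC as behind a HIC, so transfer-bound applications are the ones that suffer.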
