The best article I could find on this technology comes from PC Perspective. Note that many review sites have some of the details wrong.

These are nVidia’s details:
Tesla C870 is the card.
Tesla D870 is the deskside unit (2 Tesla C870 cards, based on the previous QuadroPlex design).
Tesla S870 is the 1U rackmount (4-8 Tesla C870 cards).

According to nVidia’s website, there is no mention of PCIe Gen 2, although the review sites claim the D870 and S870 host adapter cards will use it. It may simply be offered as an option, with the user choosing between x8 and x16 PCIe Gen 1 or PCIe Gen 2 adapters.

A worthy thing to note is nVidia’s suggestion to keep a 1:1 ratio of GPUs to CPU cores. Considering Supermicro has released a 1U rackmount featuring 2 motherboards, each with 2 sockets for Intel’s Quad-Core Xeon 5300s, a 16-GPU, 16-CPU-core setup can be achieved in just 3U (using 1 of Supermicro’s SuperServer 6015T-TV with 2 Tesla S870 units of 8 GPUs each). That’s 8 TeraFLOPS of GPU power alone in 3U.
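As a back-of-envelope check of that figure (assuming nVidia’s quoted ~518 GFLOPS single-precision peak per C870; sustained throughput will be lower):

```python
# Rough sanity check of the "8 TFLOPS in 3U" claim.
# Assumption: ~518 GFLOPS single-precision peak per Tesla C870
# (nVidia's quoted peak figure, not a measured sustained rate).
GFLOPS_PER_C870 = 518

gpus = 2 * 8           # two Tesla S870 units, 8 GPUs each
cpu_cores = 2 * 2 * 4  # two motherboards x two sockets x quad-core Xeon

total_tflops = gpus * GFLOPS_PER_C870 / 1000.0
print(gpus, cpu_cores, round(total_tflops, 1))  # 16 16 8.3
```

So 16 GPUs against 16 CPU cores keeps the suggested 1:1 ratio, and the aggregate peak comes out a little above 8 TFLOPS.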

A correction needs to be made: it appears that a host adapter is needed for every 4 GPUs. Since the design I gave above supports only 1 PCIe x8 slot per motherboard, that design would be impossible, or at best the single link would be a huge bottleneck. Hopefully Supermicro or Tyan will introduce a design similar to the SuperServer 6015T-TV that features 2 PCIe x8/x16 Gen 1/Gen 2 connectors per mini-motherboard.

Congratulations to nVidia on a well developed dense computing product.

I saw that they changed the memory a bit (double the amount, but a bit slower). Any details on the amount of constant memory, shared memory, etc.?

It’s definitely a very interesting platform.

The Tesla hardware specifications are basically the same as the Quadro FX 5600’s - the sizes of the internal memories are unchanged.

Congratulations on the new computing stack! Any plans for an integrated InfiniBand version (6015T-INFV) of the computing nodes? Language-level (CUDA-supported) RDMA would also be a huge win in this area.

Thanks, a lot of people here have worked very hard on CUDA and Tesla, and they deserve all the credit. It’s certainly an exciting time to be involved in GPU computing!

I can’t comment on plans for future hardware, but we’re always investigating ways to improve system bandwidth.

The Tesla rackmount server is sweet. I can’t wait to build a cluster of them :)

I do have a question on the shared PCI Express setup. How does it work? Is the full bandwidth of the x16 slot available to each card, one at a time, or does each device get something like x4 bandwidth?

I ask because a typical clustered implementation of my application puts 1/N of the problem on each of N processors, which need to communicate every timestep. My initial testing indicates that the PCI Express transfers won’t be too much of an overhead for this (~20%), but that testing was done on my current system with one x16 slot feeding one 8800 GTX… Multiplying that overhead by 4 would ruin any performance gains from a multiple-GPU setup.
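A rough model of why that multiplication is so damaging (purely illustrative; it assumes transfer time scales inversely with each GPU’s bandwidth share and per-GPU compute time is unchanged):

```python
# Illustrative model: how PCI Express transfer overhead grows when an
# x16 link is shared among several GPUs. Assumes transfer time scales
# with the bandwidth divisor and compute time per GPU stays constant.
def overhead_fraction(base_overhead, bandwidth_divisor):
    """base_overhead: transfer share of total step time on a dedicated x16 link."""
    compute = 1.0 - base_overhead
    transfer = base_overhead * bandwidth_divisor
    return transfer / (compute + transfer)

print(round(overhead_fraction(0.20, 1), 2))  # 0.2 (dedicated x16 link)
print(round(overhead_fraction(0.20, 4), 2))  # 0.5 (effective x4 per GPU)
```

Under these assumptions, a 20% transfer overhead on a dedicated link becomes 50% of each timestep when four GPUs split the bandwidth, which is why the answer to the sharing question matters so much here.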

What would be really cool is if one could do transfers from one GPU to another GPU in the Tesla server, without going to the CPU and back :)

Is Tesla going to be fully IEEE 754 compliant?
If so, I presume it isn’t just a rebadged GPU but has some changes made to the actual architecture?

Likewise, I read on HPCWire that double precision is coming out by the end of the year. Will that be fully IEEE 754 compliant?


Is it compute capability 1.0, or more?

Current Tesla products have the same floating point behaviour as the existing GeForce and Quadro products and support compute capability 1.0.

The Tesla board was designed for professional applications where a Tesla computing board is added to a system for a specific computing task. The computing board has no graphics connectors and can be combined with a range of graphics boards. In addition, the Tesla board supports double the memory of the highest-end GeForce graphics board. GeForce, Quadro or Tesla may be the best fit depending on the features, product life or graphics capabilities your application requires.

Will Tesla use a separate, non-video-card driver? (This would eliminate the conflicts with cards from other video adapters.)

In that vein, would the Tesla driver be able to drive a 8800GTX card?

Thanks – looking forward to this product!

CUDA already uses a separate driver, although it does share some functionality with the graphics driver (resource management etc.), and they are distributed together.

What do you mean by “conflicts with cards from other video adapters”?

Maybe he means having an AMD/ATI graphics card for display with a Tesla card for computation?


Just want to add that Supermicro has two 1U servers which can take a full-sized PCI-E video card: this 1U DP quad-core Xeon 6015A-NTV http://www.supermicro.com/products/system/…S-6015A-NTV.cfm

and the 1U 4-way quad-core Opteron.

Both already support a PCI-E x16 riser card and will work with full-height, full-length video cards that are not of “double” height or thickness. This means thick cards which take the space of 2 expansion slots, like the nVidia 8800, won’t work, but ones of regular thickness (even full height and full length) will be fine.


The Tesla card, the Quadro FX 5600 and 4600, and the 8800 GTX and GTS all take two slots, and they also require 1 or 2 PCIe power connectors (difficult to find in a 1U chassis).
The Tesla and the FX 5600 are also 12" boards.
A better solution for a rackmounted environment is the Tesla deskside (and the upcoming 1U Tesla S870), which requires only a x16 low-profile/half-length daughtercard in the 1U server.