Bandwidth of GPUs

My server contains 8 graphics cards (8 GPUs).

When I run the command:
numactl --cpunodebind=0 --membind=0 /root/cudasdk/C/bin/linux/release/bandwidthTest --memory=pinned --device=all

The result is:

Running on…

Device 0: GeForce GTX 470
Device 1: GeForce GTX 470
Device 2: GeForce GTX 470
Device 3: GeForce GTX 470
Device 4: GeForce GTX 470
Device 5: GeForce GTX 470
Device 6: GeForce GTX 470
Device 7: GeForce GTX 470
Quick Mode

Host to Device Bandwidth, 8 Device(s), Pinned memory, Write-Combined Memory Enabled
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 4947.9

Device to Host Bandwidth, 8 Device(s), Pinned memory, Write-Combined Memory Enabled
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 4323.6

Device to Device Bandwidth, 8 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 89814.6

When I run the command:
numactl --cpunodebind=0 --membind=0 /root/cudasdk/C/bin/linux/release/bandwidthTest --memory=pinned --device=all --mode=shmoo

Then the result is:
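
Host to Device Bandwidth, 8 Device(s), Pinned memory, Write-Combined Memory Enabled
Transfer Size (Bytes) Bandwidth(MB/s)
…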
30720 25414.8
32768 26260.1
34816 26876.6
36864 27463.1
38912 28053.2
40960 28630.0
43008 29107.8
45056 29664.6
47104 30112.6
49152 30557.6
51200 30974.6
61440 32736.8
71680 34124.6
81920 35225.0
92160 36238.4
102400 36978.5
204800 40917.4
307200 42425.5
409600 43242.8
512000 43734.8
614400 44057.4
716800 44332.6
819200 44519.1
921600 44657.7
1024000 44779.8
1126400 44559.5
2174976 45159.3
3223552 45381.0
4272128 45495.2
5320704 45564.0
6369280 45606.3
7417856 45643.8
8466432 45667.6
9515008 45686.6
10563584 45702.0
11612160 45715.6
12660736 45727.0
13709312 45737.1
14757888 45745.4
15806464 45755.7
16855040 45756.0
18952192 45773.4
21049344 45790.2
23146496 45795.0
25243648 45802.7
27340800 45808.5
29437952 45812.6
31535104 45817.3
33632256 45819.3
37826560 45826.5
42020864 45830.3
46215168 45834.7
50409472 45837.7
54603776 45839.6
58798080 45842.5
62992384 45844.8
67186688 45846.8

Device to Host Bandwidth, 8 Device(s), Pinned memory, Write-Combined Memory Enabled
Transfer Size (Bytes) Bandwidth(MB/s)
1024 1889.5
2048 3628.7
3072 5234.9
4096 6746.6
5120 8148.3
6144 9465.4
7168 10726.5
8192 11921.2
9216 13023.8
10240 14067.5
11264 15043.3
12288 16030.1
13312 16918.2
14336 17733.1
15360 18566.0
16384 19352.3
17408 20058.8
18432 20789.0
19456 21388.3
20480 22096.7
22528 23299.4
24576 24407.7
26624 25377.0
28672 26355.5
30720 27244.1
32768 28054.6
34816 28714.5
36864 29527.0
38912 30152.4
40960 30813.7
43008 31359.2
45056 31917.4
47104 32482.7
49152 32966.5
51200 33423.4
61440 35375.4
71680 37014.3
81920 38297.0
92160 39455.3
102400 40323.9
204800 44908.8
307200 46721.4
409600 47705.9
512000 48287.6
614400 48650.4
716800 48854.7
819200 49080.0
921600 49276.5
1024000 49426.1
1126400 49057.2
2174976 49874.0
3223552 50173.1
4272128 50325.9
5320704 50415.4
6369280 50476.4
7417856 50518.9
8466432 50554.7
9515008 50581.5
10563584 50601.4
11612160 50618.2
12660736 50633.1
13709312 50645.4
14757888 50649.9
15806464 50547.4
16855040 50374.5
18952192 50060.6
21049344 50040.4
23146496 50055.2
25243648 50062.0
27340800 50065.5
29437952 50071.0
31535104 50087.9
33632256 50089.3
37826560 50095.6
42020864 50097.0
46215168 50106.0
50409472 50111.7
54603776 50118.2
58798080 50121.0
62992384 50124.0
67186688 50125.7

Device to Device Bandwidth, 8 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
1024 7262.2
2048 15604.5
3072 23259.5
4096 31108.7
5120 38523.3
6144 45580.6
7168 53125.6
8192 60421.6
9216 67830.7
10240 74703.7
11264 81986.0
12288 88278.6
13312 95144.0
14336 102452.5
15360 109463.3
16384 115977.6
17408 122885.2
18432 128076.8
19456 134699.9
20480 140513.5
22528 154263.0
24576 164487.3
26624 176863.8
28672 189946.6
30720 203124.4
32768 212384.6
34816 222868.4
36864 232674.3
38912 245111.6
40960 254289.2
43008 264265.3
45056 274363.9
47104 282259.6
49152 293537.6
51200 299556.7
61440 300564.8
71680 335064.6
81920 366600.5
92160 398631.3
102400 430692.7
204800 583425.6
307200 674584.4
409600 559979.6
512000 577389.5
614400 596240.2
716800 608410.9
819200 619893.8
921600 630345.9
1024000 637422.9
1126400 643796.6
2174976 676496.4
3223552 689523.1
4272128 696977.9
5320704 699947.9
6369280 702998.2
7417856 705962.5
8466432 705973.6
9515008 708093.7
10563584 708775.9
11612160 708505.4
12660736 709390.9
13709312 709102.6
14757888 710008.8
15806464 710127.8
16855040 710610.4
18952192 711056.5
21049344 711578.6
23146496 712431.3
25243648 712587.6
27340800 713011.4
29437952 712934.1
31535104 712973.9
33632256 715746.3
37826560 716474.5
42020864 714404.2
46215168 718161.9
50409472 716891.2
54603776 717777.8
58798080 718351.6
62992384 718188.3
67186688 717945.2


I want to know: what is the difference between Quick Mode and Shmoo Mode? Why are their results so different?
I am confused by the concept itself.
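
One way to make the two runs comparable is to normalize the Shmoo Mode figures per device. The sketch below is only an arithmetic check on the numbers posted above, not an explanation of how bandwidthTest works internally. It assumes (this is not confirmed anywhere in the thread) that the Shmoo Mode figures are a sum over all 8 devices, and that the first Shmoo table is the host-to-device one, matching the order of the Quick Mode output. The names `quick`, `shmoo` and `per_device` are made up for this example.

```python
NUM_DEVICES = 8  # from "8 Device(s)" in the posted output

# Quick Mode results at 33554432 bytes (MB/s), copied from the output above.
quick = {
    "host_to_device": 4947.9,
    "device_to_host": 4323.6,
    "device_to_device": 89814.6,
}

# Shmoo Mode results at 33632256 bytes, the nearest transfer size (MB/s).
# Assumption: the first Shmoo table is host-to-device.
shmoo = {
    "host_to_device": 45819.3,
    "device_to_host": 50089.3,
    "device_to_device": 715746.3,
}


def per_device(aggregate_mb_s, num_devices=NUM_DEVICES):
    """Split an aggregate figure evenly across devices (assumes it is a sum)."""
    return aggregate_mb_s / num_devices


shmoo_per_device = {name: per_device(value) for name, value in shmoo.items()}

for name in quick:
    ratio = shmoo_per_device[name] / quick[name]
    print(f"{name:17s} quick={quick[name]:9.1f}  "
          f"shmoo/8={shmoo_per_device[name]:9.1f}  ratio={ratio:.2f}")
```

Under that assumption the device-to-device figures nearly coincide (about 89468 MB/s per device against 89814.6 MB/s in Quick Mode), while the host transfers still differ noticeably, so summation alone would not account for the whole gap.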

Completely tangential question: How did you get eight GTX 470 cards into one computer???

It is just a server, not an ordinary desktop computer.

Out of interest, what server hardware do you use? The only one I can think of is the Tyan 4U barebone TYAN FT72B7015, which features two 5520s, each supporting 36 PCIe v2 lanes, and four x16 links are connected to a PLX PCIe switch to provide x16 lanes to all 8 PCIe slots.

I wonder how much your measured performance depends on the “directness” of the PCIe connection, i.e. how many hubs between endpoints?

Also, what happens in your setup if you transfer data in parallel?

Does bandwidthTest exercise the GPUs in series?

Cheers,

peter
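
For the questions about PCIe topology and parallel transfers, a rough ceiling can be worked out from the PCIe 2.0 link rate (5 GT/s per lane with 8b/10b encoding, per direction). The sketch below is a back-of-the-envelope estimate, not a measurement. The shared-uplink scenario is an assumption based on the description above (four x16 uplinks feeding 8 slots); the actual topology of this server has not been confirmed in the thread, and the function names are made up for this example.

```python
PCIE2_TRANSFERS_PER_S = 5.0e9    # PCIe 2.0: 5 GT/s per lane, per direction
ENCODING_EFFICIENCY = 8 / 10     # 8b/10b encoding: 8 data bits per 10 line bits


def pcie2_link_mb_s(lanes):
    """Theoretical one-direction payload rate of a PCIe 2.0 link in MB/s (10^6 bytes)."""
    bits_per_s = lanes * PCIE2_TRANSFERS_PER_S * ENCODING_EFFICIENCY
    return bits_per_s / 8 / 1e6


def per_gpu_share_mb_s(uplinks, lanes_per_uplink, gpus):
    """Even share per GPU if all GPUs transfer at once through shared uplinks.

    Hypothetical scenario: ignores protocol overhead and switch behaviour.
    """
    return uplinks * pcie2_link_mb_s(lanes_per_uplink) / gpus


x16 = pcie2_link_mb_s(16)
shared = per_gpu_share_mb_s(uplinks=4, lanes_per_uplink=16, gpus=8)
print(f"x16 link ceiling:            {x16:.0f} MB/s per direction")
print(f"per GPU, 8 GPUs on 4 uplinks: {shared:.0f} MB/s per direction")
```

If the GPUs are exercised one after another, each transfer could in principle approach the x16 ceiling; if all eight transferred simultaneously through four shared uplinks, the per-GPU share would be about half of that. Real figures will be lower because of packet and protocol overhead.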

What server? I’ve seen motherboards with enough slots to run this many GPUs, but without the physical space to hold 8 double-width cards. Does this computer have riser cards or something?

Big mistake! Never tell them you have more than 2 cards in one rig! :) Now they are going to ask you a bunch of questions, and they want them answered immediately (in Ahnuld’s accent), lol.

Given that this setup sounds physically improbable, yeah, I’m interested. :)