I have two machines with CUDA devices. One is running Ubuntu 7.04 with a Tesla S870, and the other is running Ubuntu 8.04 with a Tesla C870. Both are using the final CUDA 2.0 release and show the same device-to-device memory bandwidth numbers for the various driver versions.
Here are the results of bandwidthTest (as best I can remember) with the various drivers:
177.67 - 31 GB/s
177.13 - 52 GB/s
174.?? and below - about 65 GB/s
I could swear I used to get closer to 70 GB/s with CUDA 1.1, but I haven’t verified that recently. Anyway, has anyone else observed this issue? I’m thinking of switching to CentOS, because I get the feeling it’s better tested by NVIDIA.
If anyone else can run bandwidthTest with CUDA 2.0 and recent drivers, I would definitely appreciate the sanity check.
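For anyone who wants to cross-check outside the SDK tool, a measurement along these lines should be in the same ballpark: time a plain cudaMemcpyDeviceToDevice with events and count both the read and the write (which, as far as I recall, is how bandwidthTest reports the device-to-device number). This is only a sketch; the 32 MB buffer matches the quick-mode transfer size, but the iteration count is an arbitrary choice, not taken from the SDK source, and error checking is omitted.

// d2d_bw.cu -- rough sketch of a device-to-device bandwidth measurement
// (32 MB buffers and 20 iterations are arbitrary choices, not the SDK's exact method)
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 32 << 20;   // 32 MB, same transfer size as quick mode
    const int    iters = 20;

    void *src, *dst;
    cudaMalloc(&src, bytes);
    cudaMalloc(&dst, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);   // warm-up copy

    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // Each copy reads and writes 'bytes', so count the transfer size twice.
    double gbps = (2.0 * bytes * iters) / (ms / 1e3) / 1e9;
    printf("device-to-device: %.1f GB/s\n", gbps);

    cudaFree(src);
    cudaFree(dst);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}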
I just upgraded to the final 2.0 and ran the bandwidth test. My system is running Ubuntu 8.04 with the 2.6.24.19 kernel, and the CUDA display driver version is 177.67. I have an 8800 GTX and an 8600 GT in the 1st and 3rd slots of an ASUS Striker Extreme motherboard.
On the 8800 GTX, I observed that the device-to-device bandwidth is 8–21% slower (from 70 GB/s down to 55–65 GB/s), and the measurement is also less stable than before: it now ranges from 55 to 65 GB/s, whereas it stayed at 70 GB/s with 2.0 beta, and usually only 1 run out of 5 reaches 65 GB/s.
Here are the results of two runs:
./bandwidthTest --device=0 --memory=pinned
Running on......
   device 0: GeForce 8800 GTX
Quick Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                3149.4
Quick Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                2927.5
Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                55315.5
&&&& Test PASSED
./bandwidthTest --device=0 --memory=pinned
Running on......
   device 0: GeForce 8800 GTX
Quick Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                3156.1
Quick Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                2844.4
Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                65286.1
&&&& Test PASSED
However, on the 8600 GT, which is a G92-based core, the bandwidth is quite stable (I forgot what bandwidth number I got on the 8600 GT before, so I don’t know what percentage was lost here):
Running on......
   device 1: GeForce 8600 GT
Quick Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                1738.3
Quick Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                1688.6
Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                15469.0
&&&& Test PASSED
Just wondering whether the GTX 280/GTX 260 (or other CUDA cards with higher device-to-device bandwidth) show the same bandwidth variation as my 8800 GTX… anyone?
EDIT: on second thought, maybe the bandwidth variation is because my 8800 GTX is used as the primary display?
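To put a number on the run-to-run spread, something along these lines times a plain device-to-device copy repeatedly and reports min/max/mean. It is only a sketch: the buffer size, run count, and iteration count are arbitrary choices, not what the SDK tool uses, and error checking is omitted.

// d2d_variation.cu -- sketch to quantify run-to-run spread of device-to-device bandwidth
// (buffer size, run count, and iteration count are arbitrary, not from the SDK tool)
#include <cstdio>
#include <cfloat>
#include <cuda_runtime.h>

// Time 'iters' back-to-back device-to-device copies and return GB/s (read + write counted).
static double measure(void *dst, void *src, size_t bytes, int iters) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return (2.0 * bytes * iters) / (ms / 1e3) / 1e9;
}

int main() {
    const size_t bytes = 32 << 20;
    const int runs = 20, iters = 10;

    void *src, *dst;
    cudaMalloc(&src, bytes);
    cudaMalloc(&dst, bytes);
    cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);   // warm-up

    double lo = DBL_MAX, hi = 0.0, sum = 0.0;
    for (int r = 0; r < runs; ++r) {
        double gbps = measure(dst, src, bytes, iters);
        if (gbps < lo) lo = gbps;
        if (gbps > hi) hi = gbps;
        sum += gbps;
    }
    printf("min %.1f  max %.1f  mean %.1f GB/s over %d runs\n", lo, hi, sum / runs, runs);

    cudaFree(src);
    cudaFree(dst);
    return 0;
}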
On RHEL 4, I used to get 1.5 GB/s host-to-device speed with 174.55. Now, with 177.67, I’m getting only 730 MB/s!!! When I tried CUDA 2.0 with 177.67, the results were not much different.
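For what it’s worth, a pinned host-to-device check can also be done outside the SDK tool along these lines. Again just a sketch: the 32 MB transfer mirrors the quick-mode output above, the iteration count is arbitrary, and error checking is omitted.

// h2d_pinned.cu -- sketch of a pinned host-to-device bandwidth check
// (32 MB matches the quick-mode transfer size; iteration count is arbitrary)
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 32 << 20;
    const int    iters = 20;

    void *h_buf, *d_buf;
    cudaMallocHost(&h_buf, bytes);   // page-locked (pinned) host memory
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);   // warm-up

    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // Host-to-device only moves the data once, so no factor of two here.
    printf("host-to-device (pinned): %.1f MB/s\n",
           (double)bytes * iters / (ms / 1e3) / 1e6);

    cudaFreeHost(h_buf);
    cudaFree(d_buf);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}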
I also experienced bandwidth variation with my Tesla C870 running Fedora 8. My device-to-device bandwidth with various drivers was:
169.09 with CUDA 1.1 - 64000 MB/s
177.13 with CUDA 2.0 beta2 - 57000 MB/s
177.67 with CUDA 2.0 - 60000 MB/s
In my case, the final release of CUDA 2.0 partially fixed the decrease.
However, I didn’t see any performance change in my memory-bound kernels, so I wonder whether it’s only a change in the way the measurement is done, and not in the actual performance.
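One way to separate a measurement change from a real slowdown is to time a trivial memory-bound kernel and compute its effective bandwidth by hand. A rough sketch is below; the element count, block size, and iteration count are arbitrary choices, and error checking is omitted.

// copy_kernel_bw.cu -- sketch: effective bandwidth of a trivial memory-bound kernel
// (element count, block size, and iteration count are arbitrary choices)
#include <cstdio>
#include <cuda_runtime.h>

__global__ void copyKernel(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];   // one read + one write per element
}

int main() {
    const int    n     = 8 << 20;              // 8M floats = 32 MB per buffer
    const size_t bytes = n * sizeof(float);
    const int    iters = 50;

    float *in, *out;
    cudaMalloc((void**)&in,  bytes);
    cudaMalloc((void**)&out, bytes);

    dim3 block(256);
    dim3 grid((n + block.x - 1) / block.x);

    copyKernel<<<grid, block>>>(in, out, n);   // warm-up launch

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        copyKernel<<<grid, block>>>(in, out, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // Effective bandwidth: (bytes read + bytes written) / time.
    double gbps = (2.0 * bytes * iters) / (ms / 1e3) / 1e9;
    printf("copy kernel effective bandwidth: %.1f GB/s\n", gbps);

    cudaFree(in);
    cudaFree(out);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}

If that number tracks the bandwidthTest result across driver versions, the regression is real; if the kernel stays flat while the tool's number moves, it is more likely a change in how the tool measures.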
Will this be fixed in future releases, or was it a correctness issue? What I mean is, I would like to be able to write in my final report on CUDA that performance is getting better with every new release ;)