PCI card for CUDA will it work?

Dear All,

Sorry, I could not find the answer in the documentation. I would like to purchase a GeForce 8400 GS PCI video card to begin CUDA development with. I know it sounds crazy, but we do not have PCIe slots in our computers. I don’t need anything advanced for testing. My primary goal is to learn whether CUDA is the right way to go for our specific tasks.

So, my question is: do you think CUDA will work with a PCI card? Are there any problems to expect?

Thank you!

I have reasonable doubts about the feasibility. If I were in your position, I’d try to get a cheap Pentium 4 server with a PCIe slot, and a proper PCIe card for it. PCIe has been around long enough to be available in very cheap or free used computers.

P.s. - if anyone in class has a recent laptop, there’s a chance that it has a CUDA-compatible video card. For example, I’ve been using MacBook Pros as portable demo machines. It could be a nice workaround worth considering.

P.p.s. - Have fun!

Thank you for the prompt reply! I know that I can buy a motherboard with a PCIe slot. But I don’t want to spend too much just to test whether the idea works in principle. If it does, we will buy powerful modern computers with PCIe slot(s), but for now I only want to play with CUDA and learn whether this is the right way to go for us.

Yeps - the thing is that I personally have no idea what CUDA + the CUDA drivers do to data transfers over a PCI bus instead of PCIe. It’s a complete gamble.

To put it more clearly - my general thought is that it’s more worthwhile to spend a few hours finding a free or cheap server box with PCIe, which you know will work, than to order a PCI card and spend a few days potentially getting it to work :)

The cheap server box I’m thinking about is something along the lines of a crusty, five year old Pentium4 Dell box, or a homebuilt overclocking rig that’s collecting dust. There’s one in a nearby dorm, and we both know it! :ph34r:

This was discussed previously:

http://forums.nvidia.com/index.php?showtopic=77184

Apparently not even NVIDIA has tested the PCI cards with CUDA, and there was no report back as to whether it worked for that user. There was no reason for it not to work, but you will be blazing new territory. Give it a try if your parts dealer has a good return policy, and report back with the results. :)

(Also, this card will almost certainly be slower than your CPU, and so slow as to make extrapolation to the high-end cards very difficult. Don’t try to make any decisions based on the benchmark times you see.)
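To illustrate why extrapolation is so difficult, here is a rough back-of-the-envelope sketch. The core count and clock for the 8400 GS come from the deviceQuery output later in this thread; the GeForce GTX 280 is just an arbitrary high-end example of the same era, with approximate published figures:

```python
# Rough scaling estimate: why 8400 GS timings do not extrapolate
# well to high-end parts. Figures are approximate published specs.
low_cores, low_clock = 8, 1.40        # GeForce 8400 GS (this thread)
high_cores, high_clock = 240, 1.296   # GeForce GTX 280, for comparison

# Naive compute-throughput ratio (ignores memory bandwidth entirely)
compute_ratio = (high_cores * high_clock) / (low_cores * low_clock)
print(f"naive compute ratio: {compute_ratio:.1f}x")

# The memory-bandwidth gap is comparable: ~141.7 GB/s vs ~6.4 GB/s
bw_ratio = 141.7 / 6.4
print(f"memory bandwidth ratio: {bw_ratio:.1f}x")
```

And that still ignores the PCI vs. PCIe transfer gap, which can dominate depending on the workload. A kernel that is memory-bound on the small card may be compute-bound on the big one, so the two ratios don’t even combine in a predictable way.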

Thank you very much! Sorry, I did not see the thread where this problem was discussed. Somehow the search engine on the forum gave no results, and I did not read every post, of course.

I understand that the card is not powerful and the bandwidth is low. I need the card primarily to check whether the algorithm works and whether we can benefit from a GPU-based data analyzer.

Most likely I will buy that card and will report back the result of my tests. Thank you very much for your post! It is very helpful.

Just wanted to report back. I got the GeForce 8400 GS PCI card, installed it, installed CUDA, and compiled the examples from the NVIDIA SDK. Everything seems to work fine; at least I have not seen any issues so far.

Below is a printout of deviceQuery in case somebody else is interested. It is funny, though, that the NVIDIA CUDA programming guide claims the card has 2 multiprocessors (see Appendix A there), but deviceQuery reports only one. I hope it was a misprint in the programming guide rather than my video card being defective…

Device 0: "GeForce 8400 GS"
  Major revision number:                         1
  Minor revision number:                         1
  Total amount of global memory:                 536150016 bytes
  Number of multiprocessors:                     1
  Number of cores:                               8
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.40 GHz
  Concurrent copy and execution:                 No
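As a quick sanity check on the multiprocessor question (assuming the 8-cores-per-multiprocessor layout of compute capability 1.x parts, which matches what deviceQuery prints):

```python
# deviceQuery figures for this PCI GeForce 8400 GS
multiprocessors = 1
cores_per_mp = 8                     # fixed for compute capability 1.x
total_cores = multiprocessors * cores_per_mp
print(total_cores)                   # 8, matching "Number of cores" above

# If the card really had the 2 multiprocessors listed in Appendix A
# of the programming guide, deviceQuery would report 16 cores instead.
appendix_cores = 2 * cores_per_mp
print(appendix_cores)                # 16
```

So the output is internally consistent: 1 multiprocessor and 8 cores go together; it is the Appendix A entry that disagrees with this particular card.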

Can you run bandwidthTest and bandwidthTest --memory=pinned and post the results? I’m curious to see how close the PCI bus gets to its theoretical limit.

Sure. Here are the results:

$ ./bandwidthTest
Running on......
      device 0:GeForce 8400 GS

Quick Mode
Host to Device Bandwidth for Pageable memory
Transfer Size (Bytes)   Bandwidth(MB/s)
 33554432               88.7

Quick Mode
Device to Host Bandwidth for Pageable memory
Transfer Size (Bytes)   Bandwidth(MB/s)
 33554432               109.1

Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
 33554432               4123.0

&&&& Test PASSED

$ ./bandwidthTest --memory=pinned
Running on......
      device 0:GeForce 8400 GS

Quick Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
 33554432               89.2

Quick Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
 33554432               109.9

Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
 33554432               4121.9

&&&& Test PASSED

Not bad, given that the (standard) PCI bus has a theoretical bandwidth of 133 MB/sec. (It’s also amazing to realize that current PCI-Express 2.0 cards have 50x more bandwidth.)
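For reference, the 133 MB/s figure falls straight out of the bus parameters, and the measured transfers land at a respectable fraction of it. A small sketch (the efficiency figure uses the best measured result, the pinned device-to-host transfer above):

```python
# Theoretical peak of conventional 32-bit / 33 MHz PCI
bus_width_bytes = 4            # 32-bit data path
clock_mhz = 33.33              # nominal PCI clock
peak_mb_s = bus_width_bytes * clock_mhz
print(f"theoretical peak: {peak_mb_s:.0f} MB/s")   # ~133 MB/s

# Efficiency of the best measured transfer (pinned, device to host)
measured = 109.9
print(f"efficiency: {measured / peak_mb_s:.0%}")   # roughly 82%
```

Note also how little pinned memory helps here (109.1 vs 109.9 MB/s): the PCI bus itself is the bottleneck, not the host-side page handling, so pinning buys almost nothing on this setup.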