Most effective way to get a mobile CUDA GPU

mascarpone · September 20, 2008, 10:05pm

Hi!

I’m a student with an understanding of both parallel programming and advanced numerical topics, and I would like to start programming with CUDA and Matlab in a Linux environment. I’ll be working with one of my professors, since I think I might make CUDA (and more generally GPGPU) the subject of my thesis. Therefore I’ll need a mobile CUDA environment to bring with me. From my limited understanding of IT, I see three possible options:

A notebook with a mobile GeForce GPU. Unfortunately, very few laptops have a GeForce card, and almost all of them are really heavy and have little battery life.
A notebook without a mobile GeForce GPU, but with a dock capable of hosting a PCI card, like Lenovo’s Advanced Dock. It’s really practical, since I could write the code on the notebook and then compile it on a desktop-class GPU.
A desktop with a GeForce card, accessed through remote control software like VNC. Here I’m worried about latency and everything else… is this a practical option?

Thank you for all your polite replies.

kristleifur · September 20, 2008, 10:20pm

I would get a development machine with a G200 card, and use NoMachine (NX) to remote-desktop into it. Then I’d try to get a notebook with a CUDA-capable card as well, but not sweat it too much.

NoMachine is wonderful - you won’t go wrong with it. (Unless you need to do heavy-duty OpenGL display stuff - that’s trickier.)

My POV on CUDA laptops: the Apple MacBook Pro line is nice, and the small Zepto computers are interesting. Light, 14" monitor, nice GPU.

mascarpone · September 20, 2008, 10:23pm

How much does this cost? Remember, I have to fund the thesis myself :'(

Thank you for the useful info!

E.D_Riedijk · September 21, 2008, 5:36am

A GTX260 GPU is (I think) quite affordable. That gives you a little less RAM than a GTX280, but still plenty for most problems.

If you are starting with CUDA now, I would really advise using a GT200 GPU. They are much, much better for CUDA than previous generations.

NoMachine offers a free version of its software for a single machine.

mascarpone · September 21, 2008, 10:56am

I was thinking about buying an 8800 GT now for 93 bucks (it’s a deal at a local PC shop chain) rather than being an early adopter… and when the technology is more mature I would buy an SLI pair of the new GPUs…

Does the GT200 chipset have any advantage for CUDA, apart from being faster? (like libraries, or the SDK…)

Although I know C and MPI very well, I think it will still take me 6 months to learn CUDA, and another 3 months (at least) to rewrite the mathematical libraries I have written for MPI… so in 9 months I could buy an SLI pair of GT200s for next to nothing…

E.D_Riedijk · September 21, 2008, 11:11am

A few things:

CUDA does not support SLI; you program each GPU separately.
The GT200 architecture adds:
- double the number of registers. This in turn often leads to more than double the performance, but more importantly, more complex algorithms become possible
- double precision support. Some algorithms need double precision in some steps, so algorithms like that are not possible on older cards.
- shared-memory atomics
- warp voting

For me personally, the most important is the doubled number of registers, as some of my kernels need 70+ registers. I had a kernel where I needed to take the sinh() of a large number, which overflowed in single precision but not in double precision. I managed to rewrite the algorithm so that it no longer needs the sinh; otherwise I would have had to use double precision.
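To make the sinh() overflow concrete, here is a minimal sketch in Python (whose floats are double precision). The argument 100.0 and the helper names are illustrative assumptions, and the log-space rewrite shown is just one common way to avoid forming sinh(x) directly, not necessarily the rewrite used in the kernel described above.

```python
import math

# Largest finite IEEE 754 single-precision value (about 3.4e38).
FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def overflows_float32(value: float) -> bool:
    """Return True if a double-precision value is too large for float32."""
    return abs(value) > FLT_MAX


def log_sinh(x: float) -> float:
    """log(sinh(x)) for x > 0, computed without ever forming sinh(x).

    Uses sinh(x) = exp(x) * (1 - exp(-2x)) / 2.
    """
    return x + math.log1p(-math.exp(-2.0 * x)) - math.log(2.0)


x = 100.0                       # hypothetical "large" argument
y = math.sinh(x)                # still finite in double precision
print("finite in double:", math.isfinite(y))
print("overflows float32:", overflows_float32(y))
print("log(sinh(x)) without overflow:", log_sinh(x))
```

Whether a log-space form like this is usable depends on what the surrounding algorithm does with the result.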

kristleifur · September 21, 2008, 11:22am

Yes - most of all, the memory controller is way different. There used to be a whole headache in how you read from and wrote to memory in the kernels: accesses had to be “coalesced”, meaning each memory address had to be read by the correct thread.

If it wasn’t, performance tanked.

It’s a sizable brick of a concept to hold in your head while thinking about everything else that is CUDA.

I promise, G200 is worth it.

–

To rephrase - The G200 is not just a faster CUDA chip - it’s more flexible. It will change the way you work, for the better. You’ll be able to express yourself better in code. That it also runs faster is a nice bonus.

mascarpone · September 21, 2008, 3:14pm

Thank you for the valuable info… choosing the right hardware now is really important…

Now I’m thinking about a PC like this:

Intel P35 chipset (on an Abit motherboard)
Quad Core Q6600 (or Q9300, probably the former)
4 GB of RAM (DDR2/DDR3? I don’t know yet)
GTX 260
some crappy hard disk, since I’ll be using a ramdisk

Around 800 bucks… Any more advice?

How’s the support for the GTX 260 under Linux? I’m afraid that when I buy the system I’ll find something bad… like the NVIDIA driver not working well under Linux x86_64 :'(

Thank you for all your valuable advice!

seibert · September 21, 2008, 3:30pm

Linux support of CUDA is very good.

I currently run CUDA 2.0 with a GTX 280 + GT200 prototype board on RHEL5 x86_64 in one system, and a pair of 8800 GTX cards in another system. (I’m using AMD hardware, so I can’t comment on Intel motherboards.) Just make sure you install one of the supported Linux distributions listed on the CUDA download page.

kristleifur · September 21, 2008, 3:31pm

Get a p43 or p45 chipset - the difference is mostly PCI-Express 2.0 support, which means a hell of a lot faster transfers to/from the CUDA card. If P43/P45 boards are more expensive and you’re really REALLY short on money, I’d take a dual-core instead of the quad. P43 boards shouldn’t be very expensive right now, though.
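As a rough sketch of what PCI-Express 2.0 buys you, the snippet below compares theoretical x16 link bandwidth for the two generations. The per-lane figures are the nominal per-direction rates after encoding overhead (250 MB/s for PCIe 1.x, 500 MB/s for 2.0); the 256 MiB payload and the function names are assumptions for illustration, and real transfers (for example what bandwidthTest reports) come in below these peaks.

```python
# Nominal per-lane, per-direction throughput in MB/s (after 8b/10b encoding).
MB_PER_S_PER_LANE = {"1.x": 250, "2.0": 500}


def link_bandwidth_mb_s(generation: str, lanes: int = 16) -> int:
    """Theoretical one-direction bandwidth of a PCIe link in MB/s."""
    return MB_PER_S_PER_LANE[generation] * lanes


def transfer_time_ms(n_bytes: int, generation: str, lanes: int = 16) -> float:
    """Best-case time to move n_bytes across the link, in milliseconds."""
    bandwidth_bytes_s = link_bandwidth_mb_s(generation, lanes) * 1_000_000
    return 1000.0 * n_bytes / bandwidth_bytes_s


payload = 256 * 1024 * 1024     # hypothetical 256 MiB host-to-device copy
for gen in ("1.x", "2.0"):
    print(
        f"PCIe {gen} x16: {link_bandwidth_mb_s(gen)} MB/s peak, "
        f"{transfer_time_ms(payload, gen):.1f} ms for the payload"
    )
```

How much this matters depends on how often data crosses the bus relative to the time spent computing on the card.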

(If you’re very lucky, you can maybe find a nice used X38 Xeon server board for cheap … that would work nicely, too, with PCI-E 2.0.)

Edit - actually, may I suggest you consider getting a G43 or G45 chipset with onboard video? If it costs the same, it may be worth it. That way you can use the CUDA card only for CUDA calculation – Running an X server etc. on the CUDA card slows things down a little bit.

I’m running Linux x86_64 and it’s fine. I still only have an 8800, but AFAIK the G200 cards work just as well under Linux.

Edit 2 - Be careful when choosing the power supply :) Total wattage alone is not enough; the right rails have to be able to supply the right amperage, I believe. Search around the net / these forums.
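The rail point comes down to I = P / V: what matters is the current each rail can deliver, not just the total wattage on the label. A minimal sketch follows, with entirely hypothetical load and rail figures (not taken from any datasheet) and assumed helper names.

```python
def required_amps(load_watts: float, rail_volts: float = 12.0) -> float:
    """Current a rail must deliver for a given load (I = P / V)."""
    return load_watts / rail_volts


def rail_is_sufficient(load_watts: float, rail_amps: float,
                       rail_volts: float = 12.0) -> bool:
    """True if the rail's rated current covers the load."""
    return required_amps(load_watts, rail_volts) <= rail_amps


# Hypothetical numbers for illustration only.
load_on_12v = 280.0             # watts drawn from the 12 V rail (GPU + CPU)
weak_rail_amps = 18.0           # a "550 W" unit with a weak 12 V rail
strong_rail_amps = 30.0         # a unit with a stronger 12 V rail

print("amps needed:", round(required_amps(load_on_12v), 1))
print("weak rail ok:", rail_is_sufficient(load_on_12v, weak_rail_amps))
print("strong rail ok:", rail_is_sufficient(load_on_12v, strong_rail_amps))
```

The actual figures to plug in come from the PSU label and the card's documented power requirements.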

mascarpone · September 21, 2008, 3:50pm

Yeah, sorry, you’re right, I forgot about PCI-E 2.0. Thank you, I’ll look for a Xeon server board!

Right, I didn’t think about that.

Actually, could I turn off the GTX 260 when I’m not compiling?

Nice to hear!

OK, I’ll check it!

tmurray · September 21, 2008, 4:16pm

I’ve personally had good luck with Corsair power supplies. Also, my home machine uses an X38 motherboard and gets very good bandwidthTest results.

seibert · September 21, 2008, 4:19pm

The power management on the card should lower the power draw automatically when you aren’t running GPU code. I haven’t checked this myself with a power meter, but reviewers have reported this feature.

E.D_Riedijk · September 21, 2008, 7:03pm

Damn, how could I forget that… I think I am already used to it :D

And indeed, an integrated GPU + GTX260 looks to be an optimal option when running Linux.

When running Windows, you need to make sure they are both NVIDIA. In Vista, as I understand it, you cannot get around the fact that the OS will steal memory from your card.

alex_dubinsky · September 22, 2008, 4:35pm

What do you mean? You still have to think about coalescing on GT200… There’s just now a new class of “partly coalesced” accesses. On the flipside, the profiler doesn’t work and won’t tell you if you’re coalescing or not.
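The difference between the two generations can be sketched with a deliberately simplified model of how many memory transactions one half-warp (16 threads reading 32-bit words) generates. The rules below are assumptions condensed from the general shape of the documented behaviour: on older cards, one transaction only if thread k reads word k of an aligned 64-byte block, otherwise one per thread; on GT200, roughly one per aligned 128-byte segment touched. Transaction-size reduction and other details are ignored, and all names are illustrative.

```python
HALF_WARP = 16      # threads whose memory requests are issued together
WORD = 4            # bytes per 32-bit element


def transactions_pre_gt200(addresses):
    """Simplified rule for older cards: all-or-nothing coalescing."""
    base = addresses[0]
    in_order = all(a == base + k * WORD for k, a in enumerate(addresses))
    aligned = base % (HALF_WARP * WORD) == 0
    return 1 if (in_order and aligned) else len(addresses)


def transactions_gt200(addresses, segment=128):
    """Simplified rule for GT200: one transaction per segment touched."""
    return len({a // segment for a in addresses})


def half_warp_addresses(offset_words=0, stride_words=1):
    """Byte addresses read by threads 0..15 for a given offset and stride."""
    return [(offset_words + k * stride_words) * WORD for k in range(HALF_WARP)]


patterns = [
    ("aligned, sequential", half_warp_addresses()),
    ("shifted by one word", half_warp_addresses(offset_words=1)),
    ("stride of 32 words", half_warp_addresses(stride_words=32)),
]
for label, addrs in patterns:
    print(f"{label}: older cards {transactions_pre_gt200(addrs)}, "
          f"GT200 {transactions_gt200(addrs)}")
```

In this model a misaligned but contiguous read stops being a disaster on GT200, while a large stride is still as bad as before, which is consistent with both views in this thread.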

The doubled registers are nice, but you can run the same complex algorithms on older cards.

PCIe 2.0 is hardly important. Only some people even need it, and even if it turns out you do, again, it’s just a speedup.

The atomics features can, however, be important if you need them. They’re, for one, a feature and not just a speedup.

Now… I think the gtx260 is cheap and should be bought, but let’s not get too excited here. Eg, the expensive mobo can be done without. Do get a good PSU (but watch out for the many over-priced PSUs). In fact, depending on how much money you have and how long it’ll take you to complete your project, you may have the right idea to buy a cheap development rig now and invest in a performance-testing rig once everything’s done. NVIDIA will release new GT200 GPUs pretty soon, in fact.

mascarpone · September 23, 2008, 2:30am

For a better comparison… does anyone have a link to the specs of the various motherboards?

I’ve done a bit of research, but I can’t find many things, e.g. the size of the shared memory for each multiprocessor… the size of the register file…

Since I already know the size of the problem I’ll face, this data will greatly help me find which GPU best suits my needs :)

mascarpone · September 23, 2008, 5:59am

OK, after endless searching on the web, here’s a possible list (prices in euro):

Intel Q6600: 160

Asus P5Q-EM, microATX with G45: 120

RAM: 4 GB DDR2 800 OCZ (should I go for 1066 instead?): 82

550 W quality PSU: 60

Antec mini ATX P180 case: 60

Gainward GeForce 9600 GSO 384 MB DDR3: 90

Total: 593 euro, well done!

I didn’t choose the GTX 260 because right now the cheapest version is 280 bucks, and in 6 months I bet I’ll find it for around 120 euro… don’t you think?

Just waiting for your input, guys, before I place the order.

XFer · September 23, 2008, 9:24pm

Hi mate,

OK. Can be pushed to 3.0 GHz (333x9) and hardly breaks a sweat.

Have a look at the Q9450 too: 2.66 GHz Penryn quad core, with 12 MB of L2 cache and hardware idiv logic. A fast beast.

Ok. If you push the Q6600 to 3.0, you’ll want to run the memory at 667 to get synchro FSB/RAM. You should find the OCZ Platinum 1066 specimens at about the same price point you quote (got them from “RomaCC” in Rome).

I’d go higher here, if possible. 550W is about the minimum to reliably run a G92/G200.

Of course you can swap the 550W for a 650W when you buy a 260 in 6 months, but why make the purchase twice?

Mmm, I recommend getting a faster card here.

You may find a 9800GT 1GB for 130 euro and it’s much faster.

Plus, 1 GB of local memory is very nice when running CUDA: you want to keep data on the device as much as possible, since feeding bytes through PCIe is so slooow.

Fernando

mascarpone · September 23, 2008, 10:04pm

Thank you, Fernando, for your valuable input… I agree with most of what you say… I’m still thinking, but I have sad news for you boys:

http://forums.nvidia.com/index.php?showtopic=78064

Guys, I have to ship a feasibility report by Monday… and I still don’t know the hardware list :'(

alex_dubinsky · September 24, 2008, 2:25am

Don’t worry about it, just buy it. You can do this endlessly.

I’d say though don’t buy OCZ ram, the brand is a waste of money, even if you overclock. (I don’t know how this overpriced ram business still exists. Years ago cpu/fsb couldn’t be overclocked without pushing the ram hard, and there was a need for the fastest chips. Now… wtf.)

For the same money, get a PSU with a lot more watts. Don’t pay more, though.

BTW, although I couldn’t find anything more about the g45 compatibility issue (a couple mentions, but it doesn’t seem to affect most people), I have to warn you that if you get an Intel integrated-graphics mobo, you won’t be able to use the graphics. Windows only supports a single video driver at a time (mostly). You could get NVIDIA integrated gfx.
