CUDA area of application

Hi all,

I am just starting out with CUDA. I bought a laptop with the NVIDIA GeForce 8200M G chip, and all but one of the sample apps ran fine. Here’s my question: where can I best apply CUDA? Which classes of problems outside of scientific research is it best suited to? I am trying to work out whether it can be applied to financial applications beyond the few examples in option/derivative pricing. I work with databases a lot, particularly data warehouses, and I wonder how CUDA could be applied to day-to-day application development to accelerate application performance.

Basically, I want to know which class of problem CUDA is best suited to, and whether it can be applied on a more practical level than scientific research, etc.

Thanks for sharing your thoughts.

Gerry

PS: Sorry for the cross post, I originally posted this question in the CUDA Vista forum then I came across this forum, which I think is more appropriate for this question.
Bear with me, I am new here.

Sounds like you have a solution looking for a problem!

This should give you an idea of the application areas that CUDA has been successfully used in:
http://www.nvidia.com/object/cuda_home.html

The problem needs three characteristics:

  1. You can divide it into at least a few thousand pieces, without too much dependency between pieces (some dependency is fine, especially within a block)

  2. The pieces can be bundled into small groups (i.e., a 32-thread warp) where the pieces don’t constantly diverge in their flow control (some branching is fine)

  3. The problem is amenable to the GPU memory subsystem. This means the aforementioned warps should usually fetch data in a coalesced way (i.e., a particular load instruction should target nearby memory locations). Ideally your problem can also make use of the various on-die SRAMs.

Criterion 1 is self-explanatory. Criterion 2 is often satisfied whenever criterion 1 is. It’s number 3 that takes CUDA developers the most thought, and there is a lot of flexibility there (in reshaping the algorithm). It’s the key matter for optimization and the deciding factor between a 4x speedup and a 100x one.
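
As a rough illustration of the three criteria, here is a minimal sketch of an element-wise kernel. The names (scaleAndAdd, runScaleAndAdd) and the array contents are made up for this example, and error checking is left out to keep it short:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Criterion 1: each thread handles one independent element.
// Criterion 2: every thread in a warp follows the same path; only the
//              bounds check at the very end of the array can diverge.
// Criterion 3: thread i touches element i, so the 32 threads of a warp
//              read 32 consecutive floats (a coalesced access).
__global__ void scaleAndAdd(const float *x, float *y, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

// Hypothetical host helper: fills x with 1 and y with 2, runs the kernel,
// and returns y[0] after the computation.
float runScaleAndAdd(int n, float a)
{
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    size_t bytes = n * sizeof(float);

    float *dx = 0, *dy = 0;
    cudaMalloc((void **)&dx, bytes);
    cudaMalloc((void **)&dy, bytes);
    cudaMemcpy(dx, &hx[0], bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, &hy[0], bytes, cudaMemcpyHostToDevice);

    int threads = 256;                          // threads per block
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
    scaleAndAdd<<<blocks, threads>>>(dx, dy, a, n);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(&hy[0], dy, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    return hy[0];
}

int main()
{
    printf("y[0] = %f\n", runScaleAndAdd(1 << 20, 3.0f));
    return 0;
}
```

With about a million elements there are far more pieces than the "few thousand" minimum. If thread i instead read element i * stride for some large stride, the same arithmetic would run with uncoalesced loads, which is the kind of difference criterion 3 is about.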

Actually, the OP wants to go against the grain of all of these.

I don’t know that CUDA is useful for day-to-day application development. It is, however, useful in specific scenarios like alex mentioned; if you think about how vector processing instructions make normal CPU calculations faster (by processing several pieces of data in one instruction, vs. running an instruction for each), that is similar to how CUDA can speed up your application. Can you identify any scenarios where you are running simulations, solving systems of equations, or anything else that is very math-intensive? Problems like that are the ones where CUDA generally gives an excellent speed boost over the CPU.

It’s not for every application though; read through the programming guide (it’s not too long) and perhaps you’ll get an idea of how you could speed up certain applications in your line of work.

I think this is an interesting question.

What sorts of things do you deal with that are bounded by performance (and not memory or disk space)? Something like data mining might be accelerated well.

If you’re working with databases a lot, then perhaps you have some experience with SQLite. I had the idea a while back that SQLite might be ported to CUDA, so that when a database is created, the data is stored in some format in the memory of the graphics card. Then, when queries are run against the database, the parallel nature of CUDA may allow some interesting speedups in the search process, particularly for very complex queries with lots of sorts and joins and so forth.

Perhaps that would make an interesting problem to work on? The SQLite source code is in the public domain, and it is very widely used, so if you manage to get something working through CUDA (and get some kind of speedup), it would be useful to all kinds of developers.
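
To make the idea a bit more concrete, here is a minimal sketch of what a parallel scan over one column might look like. This is not how SQLite stores or searches data; the flat float column, the kernel countGreaterThan, and the helper countRowsAbove are all assumptions invented for illustration, standing in for something like SELECT COUNT(*) ... WHERE price > threshold. Error checking is omitted.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One thread per row: test the predicate and bump a shared counter.
// atomicAdd on a global int needs compute capability 1.1 or higher.
__global__ void countGreaterThan(const float *price, int n,
                                 float threshold, int *count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && price[i] > threshold)
        atomicAdd(count, 1);
}

// Hypothetical host helper: copies the column to the card, runs the scan,
// and returns the number of matching rows.
int countRowsAbove(const std::vector<float> &column, float threshold)
{
    int n = (int)column.size();
    if (n == 0)
        return 0;

    float *dPrice = 0;
    int *dCount = 0;
    cudaMalloc((void **)&dPrice, n * sizeof(float));
    cudaMalloc((void **)&dCount, sizeof(int));
    cudaMemcpy(dPrice, &column[0], n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    countGreaterThan<<<blocks, threads>>>(dPrice, n, threshold, dCount);

    int hCount = 0;
    cudaMemcpy(&hCount, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dPrice);
    cudaFree(dCount);
    return hCount;
}

int main()
{
    std::vector<float> price(100000);
    for (size_t i = 0; i < price.size(); ++i)
        price[i] = (float)(i % 100);            // values 0..99 repeating
    printf("rows matching: %d\n", countRowsAbove(price, 89.5f));
    return 0;
}
```

In this sketch the column is copied to the card on every call, and that transfer would likely dominate the run time. The scheme only has a chance of paying off if the data stays resident in graphics memory across many queries, as described above.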

EDIT: Another idea might be to use CUDA to speed up compiler performance. Not the actual compiling part, mind you, but things like dependency checks take a long time on the CPU for very large projects, and shouldn’t be too difficult to parallelize (since it’s really just doing some graph theory work, and there is already CUDA code available for both dense and sparse matrix math).

Data-parallel algorithms can be accelerated on CUDA.

Actually, I want to buy a laptop that has the GeForce 8200M G, and I’d like to get a measure of its computing power. Would you do me a favor and send me the specifications? Just run the Device Query sample from the SDK. By the way, which sample code can’t you run on it?
Thanks in advance.
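
For reference, the kind of information being asked for can also be read directly through the CUDA runtime API. The following is only a minimal sketch, not the SDK's Device Query sample itself; the helper name printDeviceSummary is made up here, and it prints just a handful of the fields in cudaDeviceProp:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a short summary for every CUDA device found
// and returns the number of devices (0 if none or if the query fails).
int printDeviceSummary()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count <= 0) {
        printf("No CUDA-capable device found.\n");
        return 0;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;   // skip devices whose properties cannot be read

        printf("Device %d: %s\n", dev, prop.name);
        printf("  Compute capability:    %d.%d\n", prop.major, prop.minor);
        printf("  Multiprocessors:       %d\n", prop.multiProcessorCount);
        printf("  Global memory:         %lu MB\n",
               (unsigned long)(prop.totalGlobalMem / (1024 * 1024)));
        printf("  Shared memory/block:   %lu bytes\n",
               (unsigned long)prop.sharedMemPerBlock);
        printf("  Warp size:             %d\n", prop.warpSize);
        printf("  Max threads per block: %d\n", prop.maxThreadsPerBlock);
    }
    return count;
}

int main()
{
    printDeviceSummary();
    return 0;
}
```

The multiprocessor count and the compute capability are probably the two numbers that say the most about how a laptop chip compares with a desktop card.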