Applications Fit for GPU Processing: types of application that work well with the platform

I have been watching the development of the GPU and am now very interested in the computational power available in a machine like a Tesla super desktop. My question is this: I have developed an application which reads in X number of text files and creates a record/structure for each file. Once the data has been read into the system, calculations must be made comparing each file.

If I had 400000 documents to process, the total possible calculations would be: (400000^2)*25

(This is not the actual number but works for example.)

Each calculation would be a simple division operation on relatively small numbers.

Would this technology be efficient for this type of operation? Currently I distribute the processing over a 35 node system.

Thanks in advance for the help.

That’s not enough information to say. If you explain in more detail, it would help. “I read in X data and do a simple division” is too vague.

Ideally you’d describe your problem as a goal and a rough strategy, like “We have a set of 10K+ documents ranging in size from 1KB to 10MB. We’re trying to find redundant duplication, or parts of files which are nearly identical. We do that by hashing each 1K of the file with an algorithm like SHA-1 and looking for collisions over the whole set of documents. Would a GPU help?”
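If that hypothetical were the actual problem, a toy CPU version is only a few lines (pure Python with the standard `hashlib` module; the chunking scheme and function names here are illustrative, not anything from the thread):

```python
import hashlib

def chunk_hashes(data, chunk_size=1024):
    """SHA-1 hash of each fixed-size chunk of a document."""
    return [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]

def find_collisions(docs):
    """Map each chunk hash to the files it occurs in; a hash that
    shows up more than once suggests near-duplicate content."""
    seen = {}
    for name, data in docs.items():
        for h in chunk_hashes(data):
            seen.setdefault(h, []).append(name)
    return {h: names for h, names in seen.items() if len(names) > 1}

docs = {"a.txt": b"x" * 2048, "b.txt": b"x" * 1024 + b"y" * 1024}
print(find_collisions(docs))  # the all-"x" chunk collides across both files
```

Note that the chained rounds inside SHA-1 itself are inherently serial, which is part of why hashing-heavy text workloads are awkward on a GPU.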

In vague generalities, GPUs tend not to be great for text processing, because those tend to be limited not by computation but by simple disk and memory transfer times.

The size of the files is an unknown variable. I am really only interested in the computational aspect. I am not shingling hash codes to identify duplication. So to make it very simple, I have to conduct the following:

I have a data table with 400000 records, and each record has 30 associated numerical values. All of these values must be compared to the same values in each of the 400000 documents. NOTE: I know the math below is not correct, but this is just for example purposes; in reality I would have to decrement the number of computations for each document by 1 as we go through the list.

(400000^2)*30 division calculations on numbers smaller than 10000
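The corrected count from that note works out to C(n, 2)·30, i.e. each unordered pair compared once. A quick back-of-the-envelope check (plain Python arithmetic, nothing assumed beyond the figures above):

```python
n = 400_000          # number of records
values = 30          # numerical values compared per pair

naive = n * n * values            # every ordered pair, self-comparisons included
unique_pairs = n * (n - 1) // 2   # each unordered pair counted exactly once
total = unique_pairs * values

print(naive)   # -> 4800000000000
print(total)   # -> 2399994000000, just under half the naive figure
```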

This would be the only process that I feel may suit this platform; the other processes just don't take that long to complete.


Where does division come in?

Is this something like you have a 30 dimensional vector per document and you’re looking at the dot product of each pairwise combination of vectors to find the highest correlations or something?

Yes, that's correct. We are clustering documents based upon the results of the 30 comparisons (all of the comparisons are mathematical division operations).

What is the comparison operation? dot products? (and you use the division to normalize the vectors?)

If brute force comparison is the search method and you’ve already boiled down each doc to its 30-D vector, then the GPU does sound very appropriate.
If the bottleneck is in processing the text… it’s usually not as GPU friendly because such summary/search algorithms in text aren’t always parallelizable, and they are bandwidth, not computation, limited.

I’ve done a lot of text processing research on the GPU. The fatal bottlenecks are often in chained hash computation or parsing, neither of which the GPU is happy about.

Basically we take record 1 and begin a division comparison against record 2. The result of each division must fall within a specified range; if not, it is flagged as a negative (the app allows for a few negatives). We are generally looking for a 90%+ positive result. We then compare document 1 to document 3, and so on.
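A minimal CPU sketch of that inner comparison as I understand it (pure Python; the range bounds, the negative allowance, and the function name are placeholders, not values from your app):

```python
def similar(rec_a, rec_b, lo=0.9, hi=1.1, max_negatives=3):
    """Compare two records field by field. Each ratio must fall
    inside [lo, hi]; a miss counts as a negative, and a few
    negatives are tolerated before the pair is rejected."""
    negatives = 0
    for a, b in zip(rec_a, rec_b):
        ratio = a / b if b != 0 else float("inf")
        if not (lo <= ratio <= hi):
            negatives += 1
            if negatives > max_negatives:
                return False
    return True

print(similar([100] * 30, [101] * 30))  # -> True  (ratios ~0.99, in range)
print(similar([100] * 30, [500] * 30))  # -> False (ratios 0.2, all negative)
```

Since every pair is independent, on the GPU one thread could own one pair, which is exactly the kind of brute-force independence that maps well to the hardware.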

I thought it might be good for this. What do you suggest as an entry-level card to test with? Ultimately we would go with the Fermi multi-card solution, but that is a lot to spend if we don't know the results yet.

Thanks for the help BTW.

I’ve been working on text processing algorithms in CUDA for a while now… both for string searching (with mismatches) and correlation measures.
It sounds like your problem is immune to some of the most common bottlenecks, so you should have good success with it.

Any card would work fine for you. You don’t need much storage or even bandwidth, so there’s no need for the professional Tesla or Quadros.
For testing, you could use any card you like, really.

If you are going to buy something, you might get a $100 GT240 (which is low wattage and fits into most PCs very easily) or a $350 GTX285 (which is about three times as fast, but needs more wattage and a bigger PC to hold the larger card). If you're ever GPU-speed limited in the future, you can scale all the way up to 8 GPUs in one PC for about $4000 for the whole PC.