Word stemming, good match for the GPU?

consolejoker · January 19, 2008, 5:45pm

I have been looking at the classical Porter Stemming algorithm (the C version can be found here: http://tartarus.org/martin/PorterStemmer/c.txt) with a view to adapting it to the GPU. There seem to be a few avenues that could be taken, but it still doesn’t scream out as a perfect match.

Any thoughts on this one?

seibert · January 19, 2008, 9:54pm

How are you approaching the parallelism? The simplest approach would be to load a large set of words into memory and let each thread handle one word. The variable-sized inputs and outputs will be a bit of a challenge. Sorting the words by length, so that the threads in a block are likely to get input words of the same size, will help. This doesn’t solve the output problem, since input words of the same length can produce stems of different lengths. Perhaps staging output in shared memory will solve that. (Or just pad the output words with nulls so that every output is the same size.)
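To make the padded-output idea concrete, here is a minimal sketch in plain C, run on the CPU. The slot width `SLOT`, the helper names, and `toy_stem` are all assumptions made for illustration. `toy_stem` only strips a trailing "s" and is not the Porter algorithm. The body of `stem_slot` is roughly the work a single GPU thread would do for word index `i`, and the loop in `stem_all` stands in for the kernel launch.

```c
#include <stddef.h>
#include <string.h>

#define SLOT 16  /* assumed fixed slot width in bytes, including null padding */

/* Hypothetical stand-in for the real stemmer: drops a trailing "s"
   (but not "ss") and returns the new length. NOT the Porter algorithm. */
static size_t toy_stem(const char *w, size_t len) {
    if (len > 2 && w[len - 1] == 's' && w[len - 2] != 's')
        return len - 1;
    return len;
}

/* Copy one word into slot i of a packed buffer, padding with nulls.
   Words longer than SLOT - 1 characters are truncated in this sketch. */
static void pack_word(char *buf, size_t i, const char *word) {
    memset(buf + i * SLOT, 0, SLOT);
    strncpy(buf + i * SLOT, word, SLOT - 1);
}

/* Work for a single word index i. On a GPU this would be the body of
   one thread: every thread reads and writes at a fixed offset i * SLOT,
   so no thread needs to know how long any other thread's output is. */
static void stem_slot(const char *in, char *out, size_t i) {
    const char *src = in + i * SLOT;
    char *dst = out + i * SLOT;
    size_t len = 0;

    while (len < SLOT - 1 && src[len] != '\0')
        len++;

    size_t n = toy_stem(src, len);
    memcpy(dst, src, n);
    memset(dst + n, 0, SLOT - n);  /* pad the stem out to the slot width */
}

/* CPU stand-in for a kernel launch over `count` words. */
static void stem_all(const char *in, char *out, size_t count) {
    for (size_t i = 0; i < count; i++)
        stem_slot(in, out, i);
}
```

The trade-off is wasted memory and bandwidth on padding, in exchange for output offsets that are known up front whatever length each stem turns out to be.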

consolejoker · January 24, 2008, 2:01pm

Sorting is out of the question due to the data format and delivery constraints. I have given it some thought and don’t think that the size of inputs and outputs really will matter. I’m going to attack the problem from a different angle. I’ll let you know how it goes, thanks for the feedback.
