finding frequency of letters

Hi all,

does any one know, about how to find frequecy of letters in text through CUDA efficiently ?



You can try a reduction with the type of letter as bin. There is reduction example in the CUDA SDK.

If it’s your only task on the gpu, you will be better of with the cpu using OpenMP.

It depends how long is your text.

This is a very similar problem to image histograms - see the histogram sample in the SDK.

I meant the histogram example instead of reduction … reduction has nothing to do with bins :confused:

I agree with CapJo. If this is the only task, it’s entirely bandwith limited. And on decent CPUs/mainboards, PCIe bandwith will be smaller than memory bandwith, so that even the data transfer to the graphics card (without any computation) is more expensive than doing everything on the CPU. I’d even guess that that just one thread would be enough to saturate the memory controller, so that not even OpenMP could improve the speed (unless on a NUMA system).

Not to mention the fact that the texts are likely to come from a harddisk…

I agree, just implement it on the cpu and see what is the limiting factor. Here is a c++ program that should do it. If the limit is on your cpu, you can start thinking about gpu:





using namespace std;

int main()


vector<size_t> count_vector(255);//or whatever

ifstream the_file(“the_file_name”);

while (true)


char c; //might have to use the unicode character, depends on your file

the_file >> c;

if (the_file.eof())






I couldn’t think of any more inefficient way of reading a file, than reading character by character and

checking for an EOF condition after every character read. You’ll spend most of your CPU cycles in the

ifstream object instead of histogramming.

Naah shouldn’t matter as most posts suspect the harddisk is the limiting factor, and this example can proof that as it should be faster than the disk (if the ifstream buffer is large enough).

I wouldn’t be so sure… On my computer, your code can only read 21 MB/sec from a file that is already cached in memory. A version which reads blocks of 4096 bytes at a time can read 570 MB/sec. Most hard drives should be able to do much better than 21 (but obviously won’t hit 570 unless you have a striped set of SSDs), so the character-by-character method really could be leaving performance on the table.