xiaoxi
June 30, 2011, 6:32am
1
Anybody have idea about how to find the frequency of each char in a string?
For example: In aabbcd, I need to figure out frequency of characters: a: 2, b: 2, c: 1, d: 1
My implementation is as below, but it doesn’t work, and there are multiple threads waiting for writing distribution[index], it is not so efficient.
Does anybody have better idea? Thanks a lot
global
void countblock(unsigned char *d_buf, unsigned int length, unsigned *output )
{
unsigned tid = threadIdx.x +
blockIdx.x * blockDim.x +
blockIdx.y * blockDim.x * gridDim.x;
unsigned char index = d_buf[tid];
atomicAdd(&distribution[index], 1);
__syncthreads();
if(tid < 257)
output[tid] = distribution[tid];
__syncthreads();
}
brano
June 30, 2011, 8:37am
2
Anybody have idea about how to find the frequency of each char in a string?
For example: In aabbcd, I need to figure out frequency of characters: a: 2, b: 2, c: 1, d: 1
My implementation is as below, but it doesn’t work, and there are multiple threads waiting for writing distribution[index], it is not so efficient.
Does anybody have better idea? Thanks a lot
global
void countblock(unsigned char *d_buf, unsigned int length, unsigned *output )
{
unsigned tid = threadIdx.x +
blockIdx.x * blockDim.x +
blockIdx.y * blockDim.x * gridDim.x;
unsigned char index = d_buf[tid];
atomicAdd(&distribution[index], 1);
__syncthreads();
if(tid < 257)
output[tid] = distribution[tid];
__syncthreads();
}
Hi,
you can take a look at this:
http://developer.download.nvidia.com/compute/cuda/1_1/Website/projects/histogram256/doc/histogram.pdf
http://csce.uark.edu/~jgauch/library/papers/Shams.2007.pdf
I guess you have 3 different approaches:
Atomic on global
Atomic on shared
Sort → Run Length Coding ( two kernels and no use of atomic)
If you know anything about your data-set you could improve the algorithm.
Is it possible for the string to contain all possible characters? or only alphabetic?
One thing to notice is that the ASCII ‘a’ is not 0, so in your example you need to allocate the memory needed otherwise you will hit outside the array.
In your example “aabbcd” you will get [2,2,1,1] in position [97,98,99,100].
As brano already said, your problem looks a lot like a histogram problem. So you can also take a look at our paper about GPU histogramming available via: [url=“http://dx.doi.org/10.1145/1964179.1964181 ”]http://dx.doi.org/10.1145/1964179.1964181[/url] . You can find an example implementation on our website: [url=“PARsE | Research | Algorithms and tools ”]http://parse.ele.tue.nl/research/tools[/url] .
Sort the string, and then do a keyed reduction. Here’s some example code using thrust [1]:
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/iterator/constant_iterator.h>
#include <string>
#include <iostream>
int main()
{
using namespace thrust;
std::string h_input = "aabbcd";
// move the input string to the device
device_vector<char> d_input(h_input.begin(), h_input.end());
// sort the input to bring similar keys together
sort(d_input.begin(), d_input.end());
// allocate storage for the resulting histogram
device_vector<char> d_keys(d_input.size());
device_vector<int> d_count(d_input.size());
// count the frequency of each character key
pair<device_vector<char>::iterator, device_vector<int>::iterator> new_end =
reduce_by_key(d_input.begin(), d_input.end(),
thrust::make_constant_iterator(1),
d_keys.begin(),
d_count.begin());
// eliminate extra unneeded storage in histogram
d_keys.erase(new_end.first, d_keys.end());
d_count.erase(new_end.second, d_count.end());
// print the result
std::cout << "character frequency:" << std::endl;
for(int i = 0; i < d_keys.size(); ++i)
{
std::cout << d_keys[i] << ": " << d_count[i] << std::endl;
}
}
Here’s the output:
yucky:~ jared$ nvcc character_frequency.cu -run
character frequency:
a: 2
b: 2
c: 1
d: 1
[1] Google Code Archive - Long-term storage for Google Code Project Hosting.
xiaoxi
July 9, 2011, 10:36pm
5
Thank you all, your suggestions are helpful :)