How can I implement an efficient algorithm in CUDA C for counting how often each character occurs in a large amount of text? More generally, I want to count the occurrences of each unsigned char or unsigned short value. This might also be a fairly simple example for demonstrating GPU acceleration.

At the moment I’m using CUDA Thrust and building the counts from standard algorithms: a sort followed by a binary search. But this is clearly not the best option.

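// Sort the input so that equal values become contiguous
thrust::sort(vec_sort.begin(), vec_sort.end());
// Fill alphabet_temp with 0, 1, 2, ... (one entry per possible value)
thrust::sequence(alphabet_temp.begin(), alphabet_temp.end());
// For each value, count the elements less than or equal to it (cumulative counts)
thrust::upper_bound(vec_sort.begin(), vec_sort.end(),
                    alphabet_temp.begin(), alphabet_temp.end(),
                    alphabet_count.begin());
// Differences of adjacent cumulative counts give the per-value counts
thrust::adjacent_difference(alphabet_count.begin(), alphabet_count.end(), alphabet_count.begin());
// alphabet_count now holds the desired result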
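
For context, here is a self-contained sketch of the same sort-based approach for the unsigned char case. The function name count_chars and the fixed 256-entry alphabet are assumptions made for illustration; the vector names match the snippet above. It is only meant to show the complete setup, not a tuned implementation.

```cuda
#include <thrust/adjacent_difference.h>
#include <thrust/binary_search.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>
#include <cstddef>

// Hypothetical helper (not part of Thrust): returns a 256-entry table where
// entry v is the number of bytes in text[0..n) that are equal to v.
thrust::host_vector<unsigned int> count_chars(const unsigned char* text, std::size_t n)
{
    const int alphabet_size = 256;  // assumption: one bin per unsigned char value

    // Copy the input to the device; the sort below works on this copy.
    thrust::device_vector<unsigned char> vec_sort(text, text + n);
    thrust::device_vector<unsigned char> alphabet_temp(alphabet_size);
    thrust::device_vector<unsigned int>  alphabet_count(alphabet_size);

    // Sort so that equal values become contiguous.
    thrust::sort(vec_sort.begin(), vec_sort.end());

    // alphabet_temp = 0, 1, ..., 255.
    thrust::sequence(alphabet_temp.begin(), alphabet_temp.end());

    // Vectorized binary search: alphabet_count[v] = number of elements <= v.
    thrust::upper_bound(vec_sort.begin(), vec_sort.end(),
                        alphabet_temp.begin(), alphabet_temp.end(),
                        alphabet_count.begin());

    // Turn the cumulative counts into per-value counts (in place).
    thrust::adjacent_difference(alphabet_count.begin(), alphabet_count.end(),
                                alphabet_count.begin());

    // Copy the result back to the host.
    return thrust::host_vector<unsigned int>(alphabet_count.begin(), alphabet_count.end());
}
```

Called on the 11 bytes of "hello world", the returned table has 3 at index 'l' and 2 at index 'o'. The sort dominates the cost, which is why I suspect this is not the best option for such a small alphabet.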

Perhaps such an algorithm has already been implemented somewhere, but I could not find one.