WFST beam search algorithm

francisr · December 9, 2015, 8:39pm

I’ve seen a few papers mentioning that they’ve used GPUs to implement beam search on WFSTs, but I wasn’t able to find any source code. Does anyone have an example?

njuffa · December 9, 2015, 9:57pm

Could you be a bit more specific, by citing relevant papers? Are you looking specifically for GPU-based open source code? An internet search for the terms WFST, GPU, and open source returns multiple links that look promising, but not being familiar with WFST I cannot tell whether they would suit your needs.

francisr · December 9, 2015, 11:06pm

My application is speech recognition, where we need to do a beam search on a large graph. In all the applications I have seen, this is currently done on the CPU, but there are some claims that doing it on the GPU could bring a 20-fold improvement in speed.
Slide 12 of http://www.nvidia.co.uk/docs/IO/147844/Deep-Learning-With-GPUs-MaximMilakov-NVIDIA.pdf states that “Beam search runs fast on GPU” and links to this article:
http://devblogs.nvidia.com/parallelforall/cuda-spotlight-gpu-accelerated-speech-recognition/ which in turn links to this research:
http://www.cs.cmu.edu/~ianlane/publications/SLT_JungsukKim.pdf

Other resources mention it as well:
http://www.gpucomputing.net/sites/default/files/papers/3038/is2009.pdf

WFSTs are quite a versatile data structure, so being able to do a beam search on them efficiently on a GPU would be convenient for many applications. I was hoping to find some code implementing it, or something similar.
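
To make the request concrete, here is a minimal CPU-side sketch in Python of the kind of frame-synchronous beam search I mean. It is not taken from any of the papers above. The toy graph, the labels, the costs, and the names `ARCS`, `FINAL_STATES`, and `beam_search` are all made up for illustration, and epsilon arcs are left out to keep it short. The two inner loops over active tokens and their outgoing arcs are presumably the part a GPU implementation would parallelize.

```python
# Toy WFST (hypothetical): state -> list of (input_label, output_label, weight, next_state).
# Weights are costs (negative log probabilities), so lower is better.
ARCS = {
    0: [("a", "x", 0.5, 1), ("a", "y", 1.0, 2)],
    1: [("b", "x", 1.0, 3)],
    2: [("b", "z", 0.1, 3)],
    3: [],
}
FINAL_STATES = {3}


def beam_search(arcs, final_states, frames, beam, start=0):
    """Frame-synchronous Viterbi beam search over a WFST without epsilon arcs.

    frames: one dict per time step, mapping input label -> acoustic cost.
    beam:   tokens costing more than (best cost + beam) are pruned each frame.
    Returns (total_cost, output_labels), or None if no final state survives.
    """
    # One token per active state: state -> (accumulated cost, output labels so far).
    tokens = {start: (0.0, [])}
    for frame in frames:
        new_tokens = {}
        for state, (cost, outputs) in tokens.items():
            for ilabel, olabel, weight, nxt in arcs[state]:
                if ilabel not in frame:
                    continue  # this input label is not scored in this frame
                new_cost = cost + weight + frame[ilabel]
                # Viterbi recombination: keep only the cheapest token per state.
                if nxt not in new_tokens or new_cost < new_tokens[nxt][0]:
                    new_tokens[nxt] = (new_cost, outputs + [olabel])
        if not new_tokens:
            return None
        # Beam pruning relative to the best token of this frame.
        best = min(cost for cost, _ in new_tokens.values())
        tokens = {s: t for s, t in new_tokens.items() if t[0] <= best + beam}
    finals = [t for s, t in tokens.items() if s in final_states]
    return min(finals, key=lambda t: t[0]) if finals else None


frames = [{"a": 0.0}, {"b": 0.0}]
wide = beam_search(ARCS, FINAL_STATES, frames, beam=10.0)
narrow = beam_search(ARCS, FINAL_STATES, frames, beam=0.1)
```

With the wide beam, both paths survive the first frame and the cheaper path through state 2 wins overall. With the narrow beam, the token in state 2 is pruned after the first frame, so the search returns the more expensive path through state 1. That trade-off between speed and search errors is what the beam width controls.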

njuffa · December 9, 2015, 11:32pm

Have you checked whether OpenFST offers GPU acceleration? I see references to a project called T³ that appears to offer some GPU acceleration, also something called Kaldi (on GitHub). With your knowledge of the field, you will likely have much better luck locating relevant projects.

Judging by the publication dates on some of the references I find, this seems to be pretty cutting edge research, so much of it may not yet have crystallized into solid open-source packages. You might want to look at this as your chance to become famous by contributing some kick-ass GPU acceleration to one of the existing speech-recognition software projects :-)

francisr · December 9, 2015, 11:54pm

OpenFST doesn’t seem to offer anything on GPU, and T3 and Kaldi use GPUs only for doing a forward pass with neural networks to get acoustic probabilities.

The research done by CMU dates back to 2012, more than 3 years ago now, which is not that cutting edge in this field, or in GPUs!

I guess that now I just have to find the time to become famous. :)

njuffa · December 10, 2015, 12:13am

I’d say check on this thread for a few more days. Someone who has deep insights into GPU use in speech recognition software may yet come along and give you the crucial pointer you are looking for.

I do not think three years is a terribly long time for research to show up in widely available software. After all, someone has to step up to the plate and do the work of re-shaping wild and wooly research code into something that can actually be used by more than one person or team and on more than one platform.

If you look to CUDA itself for an example, if memory serves, it took about three years between the published research on GPU computing at Stanford (Brook) and when the initial version of CUDA was made available to the public. And that was in the context of a company paying engineers to do the necessary work.

It is of course also possible that the relevant speech recognition researchers are now busy turning their work into commercial products, rather than open-source packages.