Mixed Precision Training for NLP and Speech Recognition with OpenSeq2Seq

Originally published at: Mixed Precision Training for NLP and Speech Recognition with OpenSeq2Seq | NVIDIA Technical Blog

The success of neural networks thus far has been built on bigger datasets, better theoretical models, and reduced training time. Sequential models, in particular, could stand to benefit even more from these advances. To this end, we created OpenSeq2Seq – an open-source, TensorFlow-based toolkit. OpenSeq2Seq supports a wide range of off-the-shelf models, featuring multi-GPU and…
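For readers who want to see the core idea in code, below is a minimal sketch of mixed precision training with dynamic loss scaling using the generic tf.keras API. This is not the OpenSeq2Seq configuration system; the model, layer sizes, and data are placeholders chosen purely for illustration.

```python
# Minimal sketch of mixed precision training with dynamic loss scaling
# in TensorFlow 2.x. Not the OpenSeq2Seq API; model and data are toy
# placeholders for illustration only.
import tensorflow as tf

# Compute in float16 where safe, keep variables in float32 for stability.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(128, activation="relu"),
    # Keep the final layer in float32 so the loss is computed at full precision.
    tf.keras.layers.Dense(10, dtype="float32"),
])

# LossScaleOptimizer multiplies the loss before the backward pass so that
# small float16 gradients do not underflow to zero, then unscales the
# gradients before the weight update.
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(tf.keras.optimizers.Adam())

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Placeholder data purely for illustration.
x = tf.random.normal((256, 20))
y = tf.random.uniform((256,), maxval=10, dtype=tf.int32)
model.fit(x, y, epochs=1, batch_size=32)
```

Dynamic loss scaling is the piece most relevant to gradient problems in float16: without it, gradients smaller than roughly 6e-8 underflow to zero, while overflows show up as NaNs.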

Did you run into the NaN gradient problem when training the model?
I mean the `Vanishing Gradient Problem`.