Introduction to Neural Machine Translation with GPUs (part 1)

Originally published at:

Note: This is the first part of a detailed three-part series on machine translation with neural networks by Kyunghyun Cho. You may enjoy part 2 and part 3. Neural machine translation is a recently proposed framework for machine translation based purely on neural networks. This post is the first of a series in which I will explain a simple…

Nice post! Thanks!

One equation is missing. Looks like a LaTeX error.

Thanks, I've fixed it. WordPress LaTeX is tricky...

This is fantastic. Thanks! Great to include the extra papers as jumping-off points.

Awesome. I can't wait for the next posts!

I had one question: towards the end of the post, a function g_theta is specified to model the conditional probability p(x_t | x_{<t}), but it is not defined anywhere. Is g_theta the softmax function? Also, is g_theta used at any point in the training?

I found this post so interesting, thank you for sharing! If I want to cite this (or the next two posts), what's the best thing to do?

Cite the NVIDIA Developer Blog, along with authors and title, as you would any source.