Efficient BERT: Finding Your Optimal Model with Multimetric Bayesian Optimization, Part 2

Originally published at: https://developer.nvidia.com/blog/efficient-bert-finding-your-optimal-model-with-multimetric-bayesian-optimization-part-2/

This is the second post in this series about distilling BERT with multimetric Bayesian optimization. Part 1 discusses the background for the experiment and Part 3 discusses the results. In my previous post, I discussed the importance of the BERT architecture in making transfer learning accessible in NLP. BERT allows a variety of problems to…