Efficient BERT: Finding Your Optimal Model with Multimetric Bayesian Optimization, Part 3

Originally published at: https://developer.nvidia.com/blog/efficient-bert-finding-your-optimal-model-with-multimetric-bayesian-optimization-part-3/

This is the third post in this series about distilling BERT with multimetric Bayesian optimization. Part 1 discusses the background for the experiment, and Part 2 discusses the setup for the Bayesian optimization. In my previous posts, I discussed the importance of BERT for transfer learning in NLP and established the foundations of this experiment’s…