Efficient BERT: Finding Your Optimal Model with Multimetric Bayesian Optimization, Part 1

Originally published at: Efficient BERT: Finding Your Optimal Model with Multimetric Bayesian Optimization, Part 1 | NVIDIA Technical Blog

This is the first post in a series about distilling BERT with multimetric Bayesian optimization. Part 2 discusses the set up for the Bayesian experiment, and Part 3 discusses the results. You’ve all heard of BERT: Ernie’s partner in crime. Just kidding! I mean the natural language processing (NLP) architecture developed by Google in 2018.…