Back in 2018, when the Bidirectional Encoder Representations from Transformers (BERT) paper first came out, people were saying that BERT was too big, too complex, and too slow to be practical: the pretrained large BERT model has 340 million parameters and weighs in at 1.35 GB. Fast-forward two years, and BERT and its variants were already used in almost every English search on Google.
Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications