Is BERT a Game Changer in NLP?


BERT (Bidirectional Encoder Representations from Transformers) is an open-source NLP pre-training model developed by researchers at Google in 2018. It has inspired multiple NLP architectures, training approaches and language models, including Google’s TransformerXL, OpenAI’s GPT-2, ERNIE2.0, XLNet, and RoBERTa.

BERT is now used by Google Search to provide more relevant results. Bharat S Raj expects that it can also power smarter chatbots and other conversational AI applications.

A direct descendant of GPT (Generative Pre-trained Transformer), BERT has outperformed several NLP models and achieved top results on Question Answering (SQuAD v1.1), Natural Language Inference (MNLI), and other benchmarks.

It builds on earlier work in pre-training contextual representations, including Semi-supervised Sequence Learning (by Andrew Dai and Quoc Le), ELMo (by Matthew Peters and researchers from AI2 and UW CSE), ULMFiT (by fast.ai founder Jeremy Howard and Sebastian Ruder), the OpenAI transformer (by OpenAI researchers Radford, Narasimhan, Salimans, and Sutskever), and the Transformer (Vaswani et al).

What sets it apart from other models is that it is the first deeply bidirectional, unsupervised language representation, pre-trained using only a plain text corpus. Since it’s open-sourced, anyone with machine learning knowledge can build an NLP model on top of it without having to source massive datasets for training, which saves time, energy, knowledge and resources.
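
To make "bidirectional" concrete, here is a minimal toy sketch in Python. It is not BERT's actual code: the function name, the mode labels and the example sentence are all assumptions made for illustration. It only shows which neighbouring words a model is allowed to look at when building the representation of one word.

```python
def visible_context(tokens, position, mode):
    """Return the tokens a model may look at when representing tokens[position].

    Hypothetical helper for illustration only; not part of any BERT library.
    """
    if mode == "left_to_right":
        # A unidirectional model only sees the words that came before.
        return tokens[:position]
    if mode == "bidirectional":
        # A bidirectional model sees the words on both sides.
        return tokens[:position] + tokens[position + 1:]
    raise ValueError(f"unknown mode: {mode}")


tokens = "I accessed the bank account".split()
target = tokens.index("bank")

left_only = visible_context(tokens, target, "left_to_right")
both_sides = visible_context(tokens, target, "bidirectional")

print(left_only)   # ['I', 'accessed', 'the']
print(both_sides)  # ['I', 'accessed', 'the', 'account']
```

In this toy example only the bidirectional view includes "account", the word that tells a reader which sense of "bank" is meant.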

Finally, BERT is pre-trained on a large corpus of unlabelled text that includes the whole of English Wikipedia (about 2,500 million words) and the BooksCorpus (800 million words).

For more information, read the full article from Bharat S Raj, Manager at Sutherland, Digital Business Strategy Consultant: 
