Applying Transformers to Legal and Financial Documents for AI Text Summarization
In the first six chapters, we explored the architecture of the original Transformer and how to train transformer models. We also implemented pretrained models that could perform downstream tasks after fine-tuning. Finally, in Chapter 6, Text Generation with OpenAI GPT-2 and GPT-3 Models, we discovered that OpenAI has begun to experiment with zero-shot models that require no fine-tuning.
The underlying concept of this evolution is that transformers strive to teach a machine to understand a language and express itself in a human-like manner. We have gone from training a model for a specific task to teaching languages to machines.
Raffel et al. (2019) designed a transformer meta-model based on a simple assertion: every NLP problem can be represented as a text-to-text function. Every type of NLP task takes some form of text as input context and generates some form of text as a response.
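A minimal sketch can make this assertion concrete. The snippet below assumes the Hugging Face transformers library (with sentencepiece installed) and the public t5-small checkpoint, a small variant of the T5 model Raffel et al. introduced; the task prefixes follow the conventions of their paper. Translation, summarization, and grammatical-acceptability judgment all become the same operation: text in, text out.

```python
# A minimal sketch of the text-to-text idea, assuming the Hugging Face
# transformers library and the public "t5-small" checkpoint.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Three different NLP tasks, all expressed as plain input text.
# A prefix names the task; the answer comes back as output text.
prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: The agreement shall terminate if either party fails "
    "to meet its payment obligations within thirty days of notice.",
    "cola sentence: The books is on the table.",  # grammatical acceptability
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_length=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that the same model and the same weights handle all three tasks; only the text prefix changes. That is the sense in which every NLP problem reduces to a single text-to-text function.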
A text-to-text representation of any...