BART stands for Bidirectional and Auto-Regressive Transformers, a reference to its neural network architecture. The BART paper proposes an architecture and denoising pre-training strategy that make it useful as a sequence-to-sequence (seq2seq) model for many NLP tasks, such as summarization, machine translation, text classification, and question answering under real-world conditions. In this article, we'll focus on its summarization capabilities.
Here are the main reasons for choosing BART over other models:
- Most Resilient to Real-World Noisy Data
- Acceptable Results Out-of-the-Box Across Many Domains
- Produces Grammatically Correct Summaries
- Overcomes Limitations of GPT-3
Reference: https://arxiv.org/pdf/1910.13461.pdf
Other powerful models, such as MoCa, PEGASUS, and GPT-4, could also be used; however, due to GPU limitations, we are using BART.
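As a concrete starting point, here is a minimal sketch of BART-based summarization, assuming the Hugging Face `transformers` library is installed and using the `facebook/bart-large-cnn` checkpoint; the sample text and generation parameters are illustrative, not prescriptive.

```python
# Minimal BART summarization sketch (assumes: pip install transformers torch)
from transformers import pipeline

# facebook/bart-large-cnn is a BART checkpoint fine-tuned for summarization on CNN/DailyMail
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "BART is a denoising autoencoder for pretraining sequence-to-sequence models. "
    "It is trained by corrupting text with an arbitrary noising function and learning "
    "a model to reconstruct the original text. BART is particularly effective when "
    "fine-tuned for text generation tasks such as summarization."
)

# max_length / min_length bound the generated summary length in tokens (illustrative values)
summary = summarizer(article, max_length=60, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```

With `do_sample=False`, the pipeline uses deterministic (greedy/beam) decoding, which tends to produce more stable summaries for repeated runs on the same input.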