RAG from First Principles
Retrieval post-processing techniques play a critical role in RAG system architecture, sitting between the retrieval and generation stages. Their purpose is to improve the accuracy, relevance, and efficiency of retrieval results before they reach the generator.
Re-ranking techniques reorder the initial retrieval results so that relevant documents rank higher. For example, Reciprocal Rank Fusion (RRF) is suited to fusing results from several different rankers and performs especially well when those rankers use different retrieval strategies. It balances the contribution of each ranker, limits the influence of any single ranker, and is simple and efficient. A cross-encoder concatenates the query and document and feeds the pair into a pre-trained model (such as BERT), using full interaction between the two to output a relevance score directly. ColBERT achieves fine-grained, token-level matching via late interaction, balancing efficiency and accuracy. Cohere/Jina directly leverage commercial APIs (such as...
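To make the RRF idea concrete, here is a minimal sketch in Python. It assumes the commonly used formulation in which each document's fused score is the sum of 1 / (k + rank) over all rankers, with 1-based ranks and a smoothing constant k = 60. The function name `reciprocal_rank_fusion`, the document IDs, and the two example rankings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant (60 is a common default; treat it as an assumption).
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break score ties by document ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical outputs of two rankers (for instance, a keyword and a dense retriever).
keyword_ranking = ["d1", "d2", "d3"]
dense_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([keyword_ranking, dense_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because only ranks enter the formula, the raw scores of the individual rankers never need to be normalized against each other, which is one reason the method stays simple when the rankers use very different scoring scales.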