RAG from First Principles
Modern generative models can handle significantly longer contexts than earlier ones. If the knowledge base is no larger than 200,000 tokens (roughly equivalent to 500 pages of material), you can consider skipping the retrieval step and including the entire knowledge base directly in the prompt, with no chunking at all.
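The decision described above can be sketched as a simple budget check. This is a minimal illustration, not a library API: the function names, the 4,000-token reserve for the question and the answer, and the "about four characters per token" estimate are all assumptions; in practice you would swap in the tokenizer of the model you actually use.

```python
from typing import Callable, Sequence

# Threshold quoted in the text; adjust to your model's real context window.
CONTEXT_BUDGET_TOKENS = 200_000


def approx_token_count(text: str) -> int:
    """Rough estimate (assumption): about 4 characters per token for English text."""
    if not text:
        return 0
    return max(1, len(text) // 4)


def fits_in_context(
    documents: Sequence[str],
    budget: int = CONTEXT_BUDGET_TOKENS,
    reserved: int = 4_000,  # hypothetical headroom for the question and the answer
    count_tokens: Callable[[str], int] = approx_token_count,
) -> bool:
    """Return True if the whole knowledge base plus the reserve fits in the budget."""
    total = sum(count_tokens(doc) for doc in documents)
    return total + reserved <= budget


def build_prompt(question: str, documents: Sequence[str]) -> str:
    """Place the entire knowledge base in the prompt, with no retrieval or chunking."""
    context = "\n\n".join(documents)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

If `fits_in_context` returns `False`, the usual retrieval pipeline with chunking remains the safer choice.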
Figure 7.9: Diagram showing natural language model workflow, prompt growth, and challenges with emojis
However, very long contexts make generative models prone to the “lost in the middle” problem, in which information placed in the middle of the prompt is used less reliably than information near the beginning or the end. It is therefore still necessary to compress the knowledge base and make it more concise, so that the model can process the information more efficiently.
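To make the idea of compression concrete, here is a toy extractive sketch: it keeps only the sentences of a chunk that share terms with the query. This is an illustration under simple assumptions (regex sentence splitting, plain word overlap, hypothetical function names) and is not how any particular library implements compression.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def terms(text: str) -> set[str]:
    """Lowercased alphanumeric terms; no stop-word removal, so use content words in the query."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def compress_chunk(chunk: str, query: str, min_overlap: int = 1) -> str:
    """Keep only sentences sharing at least `min_overlap` terms with the query."""
    query_terms = terms(query)
    kept = [
        sentence
        for sentence in split_sentences(chunk)
        if len(terms(sentence) & query_terms) >= min_overlap
    ]
    return " ".join(kept)
```

For example, with the chunk `"Paris is the capital of France. Bananas are yellow. France borders Spain."` and the query `"capital France"`, the sentence about bananas is dropped and the other two are kept. Real compressors typically rely on embeddings or a language model rather than word overlap.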
LangChain provides a Contextual Compression Retriever, which includes the following two components: