RAG from First Principles
Anna: Lewis, evaluating traditional machine learning models is relatively straightforward. Classifiers and object detection systems, for example, have clear criteria for right and wrong. With RAG systems, though, when a client asks about performance, I have a hard time giving a definitive answer.
Lewis: That’s true. RAG systems are evaluated differently from traditional machine learning models. Traditional model evaluation usually relies on clear quantitative metrics, such as classification accuracy, object detection precision, the Gini coefficient, R-squared, the F1 score, and the confusion matrix. In contrast, evaluating a RAG system requires separately assessing its two core components: the retriever and the generator. This involves evaluating not only the relevance of the retrieved content but also whether the generated answers meet the user’s needs.

Figure 9.1: Conversation between Lewis and Anna on...
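
To make Lewis's point concrete, here is a minimal sketch of scoring the two components separately. It is an illustration under stated assumptions, not a prescribed evaluation method: the function names `precision_recall_at_k` and `token_f1`, the document IDs, and the sample sentences are all hypothetical. Token-overlap F1 is used here only as a crude stand-in for answer quality; it does not by itself show that an answer meets the user's needs.

```python
from collections import Counter


def precision_recall_at_k(retrieved_ids, relevant_ids, k):
    """Retriever check (hypothetical helper): compare the top-k retrieved
    document IDs against a hand-labelled set of relevant IDs."""
    top_k = retrieved_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


def token_f1(generated, reference):
    """Generator check (hypothetical helper): token-overlap F1 between the
    generated answer and a reference answer, split on whitespace."""
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(gen_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up example data: IDs and sentences are illustrative only.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d5"}
retriever_precision, retriever_recall = precision_recall_at_k(retrieved, relevant, k=3)

generator_score = token_f1(
    "the capital is paris",
    "paris is the capital of france",
)
```

Keeping the two scores separate is the point of the sketch: a low retriever score with a high generator score (or the reverse) tells you which component to investigate, which a single end-to-end number cannot.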