There are some concepts about apriori that need to be understood before going further in this chapter: association rules, itemsets, support, confidence, and lift.
An association rule is the explicit mention of a relationship in the data, in the form X => Y
, where X
(the antecedent) can be composed of one or several items. X
is called an itemset. In what we will see, Y
(the consequent) is always one single item. We might, for instance, be interested in what the antecedents of lemon are if we are interested in promoting the purchase of lemons.
Frequent itemsets are items or collections of items that occur frequently in transactions. Lemon is the most frequent itemset in the previous example, followed by cherry coke and chips. Itemsets are considered frequent if they occur more frequently than a specified threshold. This threshold is called minimal support. The omission of itemsets with support less than the minimal support is called support...