Apriori is a classical algorithm for mining frequent itemsets, from which association rules can then be derived. In a retail setting, such rules can inform decisions such as product placement and bundling, which in turn can help revenue.
The anti-monotonicity of the support measure is one of the prime concepts around which Apriori revolves. This property states the following:
- All subsets of a frequent itemset must be frequent
- Similarly, for any infrequent itemset, all its supersets must be infrequent too
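The property above can be checked directly on a handful of baskets. The following is a minimal sketch, not a full Apriori implementation; the `baskets` data and the names `support` and `holds_anti_monotonicity` are made up purely for illustration.

```python
from itertools import combinations

# Hypothetical baskets, chosen only to illustrate the property.
baskets = [
    {"milk", "butter", "cereal"},
    {"butter", "cereal", "bread"},
    {"milk", "butter"},
    {"bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def holds_anti_monotonicity(itemset, baskets):
    """True if every non-empty proper subset has support >= that of `itemset`."""
    full = support(itemset, baskets)
    return all(
        support(subset, baskets) >= full
        for size in range(1, len(itemset))
        for subset in combinations(itemset, size)
    )

triple = {"milk", "butter", "cereal"}
print(support({"butter"}, baskets))          # support of a single item
print(support({"milk", "butter"}, baskets))  # support of a pair
print(support(triple, baskets))              # support of the triple
print(holds_anti_monotonicity(triple, baskets))
```

Adding an item to an itemset can only keep or shrink the set of baskets that contain it, so support never increases as the itemset grows. This is what lets Apriori discard every superset of an infrequent itemset without counting it.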
Let's look at an example to see how this works:
Transaction ID | Milk | Butter | Cereal | Bread | Book |
t1 | 1 | 1 | 1 | 0 | 0 |
t2 | 0 | 1 | 1 | 1 | 0 |
t3 | 0 | 0 | 0 | 1 | 1 |
t4 | 1 | 1 | 0 | 1 | 0 |
t5 | 1 | 1 | 1 | 0 | 1 |
t6 | 1 | 1 | 1 | 1 | 1 |
The table lists each transaction ID against five items: milk, butter, cereal, bread, and book. A 1 denotes that the item is part of the transaction, and a 0 means that it is not.
- We build a frequency table for all the items, along with their support (the number of transactions containing the item divided by 6, the total number of transactions):
Items | Number of transactions | Support |
Milk | 4 | 67% |
Butter | 5 | 83% |
Cereal | 4 | 67% |
Bread | 4 | 67% |
Book | 3 | 50% |
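The frequency table can be reproduced with a few lines of Python. This is a small sketch that encodes the six transactions above as sets; the variable names (`transactions`, `counts`, `support`) are illustrative choices, not part of any library.

```python
from collections import Counter

# The six transactions from the table above, written as item sets.
transactions = {
    "t1": {"Milk", "Butter", "Cereal"},
    "t2": {"Butter", "Cereal", "Bread"},
    "t3": {"Bread", "Book"},
    "t4": {"Milk", "Butter", "Bread"},
    "t5": {"Milk", "Butter", "Cereal", "Book"},
    "t6": {"Milk", "Butter", "Cereal", "Bread", "Book"},
}

# Count how many transactions contain each item.
counts = Counter(item for items in transactions.values() for item in items)

# Support = transactions containing the item / total number of transactions.
support = {item: count / len(transactions) for item, count in counts.items()}

for item in ["Milk", "Butter", "Cereal", "Bread", "Book"]:
    print(f"{item} | {counts[item]} | {support[item]:.0%}")
```

Each printed row gives the item, its transaction count, and its support rounded to a whole percentage, matching the rows of the table.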
- We will set a support threshold of 60%, which will filter out the items...
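Applying the threshold amounts to a simple filter over the item counts. The sketch below hard-codes the counts from the frequency table and assumes an item is kept when its support is greater than or equal to the threshold; `MIN_SUPPORT`, `item_counts`, and the other names are illustrative.

```python
# Counts taken from the frequency table; 6 is the number of transactions.
N_TRANSACTIONS = 6
MIN_SUPPORT = 0.60  # assumed to be an inclusive lower bound

item_counts = {"Milk": 4, "Butter": 5, "Cereal": 4, "Bread": 4, "Book": 3}

# Split the items into those that meet the threshold and those that do not.
frequent_items = [
    item for item, count in item_counts.items()
    if count / N_TRANSACTIONS >= MIN_SUPPORT
]
dropped_items = [item for item in item_counts if item not in frequent_items]

print("kept:   ", frequent_items)
print("dropped:", dropped_items)
```

By the anti-monotonicity property, any itemset containing a dropped item is also infrequent, so only the kept items need to be considered when forming larger candidate itemsets.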