I am interested in understanding how the Apriori algorithm is applied in practice. Given a data source with a transaction ID column and a list of items for each transaction, how can we use the Apriori algorithm to extract single-dimensional, single-level, boolean association rules? The end goal is to understand which items were bought together.
Step 1. Build C1, the list of candidate itemsets with 1 item each, and then L1
(the "1" after the letter C says that each itemset in C1 includes only one item)
L1 is the cleaned version of C1, that is, C1 with the itemsets removed that do not satisfy a given condition (in Apriori, a minimum support threshold).
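A minimal sketch of Step 1, assuming the data has already been loaded into a dictionary from transaction ID to a set of items. The toy transactions, the `min_support_count` value, and the helper names `build_c1` and `prune` are all made up for illustration.

```python
from collections import Counter

# Hypothetical toy data: transaction id -> set of items bought.
transactions = {
    1: {"bread", "milk"},
    2: {"bread", "diapers", "beer", "eggs"},
    3: {"milk", "diapers", "beer", "cola"},
    4: {"bread", "milk", "diapers", "beer"},
    5: {"bread", "milk", "diapers", "cola"},
}
min_support_count = 3  # assumed threshold: itemset must occur in >= 3 transactions


def build_c1(transactions):
    """C1: every single item as a 1-itemset, with its transaction count."""
    counts = Counter()
    for items in transactions.values():
        for item in items:
            counts[frozenset([item])] += 1
    return dict(counts)


def prune(candidates, min_count):
    """L_k: keep only the candidates that meet the minimum support count."""
    return {itemset: n for itemset, n in candidates.items() if n >= min_count}


c1 = build_c1(transactions)
l1 = prune(c1, min_support_count)
```

Here `c1` holds all six items with their counts, and `l1` drops the ones that occur in fewer than three transactions.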
Step 2. Build C2, the list of itemsets with 2 items each, by joining L1 to itself, and then L2
Step 3. Build C3, the list of itemsets with 3 items each, by joining L2 to itself, and then L3
Step 4. Build C4, the list of itemsets with 4 items each, by joining L3 to itself
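The join-and-clean loop in these steps could be sketched roughly as follows. This assumes the transactions are given as a list of sets and that the condition is a minimum support count; the function names (`generate_candidates`, `count_support`, `apriori`) and the toy data are hypothetical. The join used here is a simple "union of two frequent itemsets with exactly k items" rather than the prefix-based join often shown in textbooks, followed by the usual check that every (k-1)-subset is frequent.

```python
from itertools import combinations


def generate_candidates(prev_frequent, k):
    """C_k: join L_(k-1) with itself, keeping k-item unions whose
    (k-1)-item subsets are all in L_(k-1)."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in prev for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def count_support(candidates, transactions):
    """Count how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def apriori(transactions, min_count):
    """Return every frequent itemset with its support count."""
    singles = {frozenset([item]) for t in transactions for item in t}
    counts = count_support(singles, transactions)
    current = {s: n for s, n in counts.items() if n >= min_count}  # L1
    frequent = dict(current)
    k = 2
    while current:
        counts = count_support(generate_candidates(current, k), transactions)
        current = {s: n for s, n in counts.items() if n >= min_count}  # L_k
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy transactions (one set of items per transaction id).
toy = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(toy, min_count=2)
```

The loop stops as soon as some L_k comes out empty, so the number of steps depends on the data and the threshold rather than being fixed at four.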
After following this process, how do we generate all of the non-empty subsets of each itemset in L3 in order to form association rules?
Next, how do we extract association rules from each subset?
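One way to picture the subset step, as a sketch only: for a frequent itemset, each non-empty proper subset can serve as the left-hand side of a candidate rule, with the remaining items as the right-hand side. The full itemset is skipped here because it would leave an empty right-hand side. The helper names and the example itemset are hypothetical.

```python
from itertools import combinations


def nonempty_proper_subsets(itemset):
    """Yield every non-empty subset of itemset except itemset itself."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


def candidate_rules(itemset):
    """Pair each subset (antecedent) with the remaining items (consequent)."""
    itemset = frozenset(itemset)
    return [(a, itemset - a) for a in nonempty_proper_subsets(itemset)]


# Hypothetical frequent 3-itemset.
rules = candidate_rules({"bread", "milk", "diapers"})
for antecedent, consequent in rules:
    print(sorted(antecedent), "->", sorted(consequent))
```

An itemset with n items yields 2^n - 2 candidate rules this way, so a 3-itemset gives six.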
Once we have determined the association rules, we can calculate the confidence of each rule item1 -> item2 via
c = Frequency(item1 union item2) / Frequency(item1)
Finally, how can we select the association rules that satisfy a given condition on this confidence (for example, a minimum confidence threshold)?
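As a rough sketch of the selection step, assuming the condition is a minimum confidence threshold and that support counts for the relevant itemsets are already available: the `support` values, the candidate rules, the 0.7 threshold, and the function names below are all invented for illustration.

```python
# Hypothetical support counts (number of transactions containing each itemset).
support = {
    frozenset(["bread"]): 4,
    frozenset(["milk"]): 4,
    frozenset(["bread", "milk"]): 3,
    frozenset(["bread", "milk", "diapers"]): 2,
}


def confidence(antecedent, consequent, support):
    """c = Frequency(antecedent union consequent) / Frequency(antecedent)."""
    return support[antecedent | consequent] / support[antecedent]


def select_rules(rules, support, min_confidence):
    """Keep the rules whose confidence meets the assumed threshold."""
    selected = []
    for antecedent, consequent in rules:
        c = confidence(antecedent, consequent, support)
        if c >= min_confidence:
            selected.append((antecedent, consequent, c))
    return selected


# Hypothetical candidate rules as (antecedent, consequent) pairs.
candidate_rules = [
    (frozenset(["bread"]), frozenset(["milk"])),
    (frozenset(["milk"]), frozenset(["bread"])),
    (frozenset(["bread", "milk"]), frozenset(["diapers"])),
]
strong_rules = select_rules(candidate_rules, support, min_confidence=0.7)
```

With these numbers the first two rules have confidence 3/4 and are kept, while the third has confidence 2/3 and is dropped.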