Data Mining - Association Rules Flashcards
What are itemsets in association rules?
Any possible combination of items, not necessarily one that people actually buy together.
-> An itemset can also be a single item.
How are rules represented?
IF-THEN format.
If -> antecedent
Then -> consequent
Which two steps are there in the association rules process?
1. Generating frequent item sets (Apriori)
2. Selecting strong rules
What are frequent item sets?
Item sets that occur often enough in the data. Checking every possible combination of items is not feasible, so only the frequent item sets are considered.
What is the criterion for frequent item sets?
Support
Support(A and C) = frequency (A and C)
Easy computation: count how often that itemset appears in the records of your table.
- Divide by the total number of records in the table to express support as a fraction or percentage.
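A minimal sketch of the support computation described above. The transactions and the function names `support_count` and `support` are made up for illustration; they are not part of the flashcards.

```python
# Hypothetical toy data: each record is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support_count(itemset, records):
    """Number of records that contain every item in `itemset`."""
    return sum(1 for record in records if set(itemset) <= record)

def support(itemset, records):
    """Support count divided by the total number of records."""
    return support_count(itemset, records) / len(records)

count = support_count({"bread", "milk"}, transactions)
fraction = support({"bread", "milk"}, transactions)
print(count, fraction)
```

Here `count` is the raw frequency and `fraction` is the same value divided by the number of records.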
How does the process of generating the frequent item sets go?
You set a minimum support.
You check which single-item item sets are above that support.
Then you check which two-item item sets are above that support, and so on.
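The level-by-level search above could be sketched roughly as follows. The data, the function name `frequent_itemsets`, and the threshold of 0.5 are illustrative assumptions. The sketch also assumes the usual Apriori pruning idea (a candidate is kept only if all of its smaller subsets are frequent), which the flashcards do not spell out.

```python
from itertools import combinations

# Hypothetical toy data: each record is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_itemsets(records, min_support):
    """Return {itemset: support} for all item sets at or above min_support."""
    n = len(records)

    def support(itemset):
        return sum(1 for r in records if itemset <= r) / n

    # Level 1: single-item item sets that meet the minimum support.
    items = sorted({item for r in records for item in r})
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]

    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        previous = set(current)
        # Candidates of size k: unions of two frequent item sets of size k-1.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Assumed pruning step: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in previous
                             for s in combinations(c, k - 1))}
        current = [c for c in candidates if support(c) >= min_support]
    return result

frequent = frequent_itemsets(transactions, min_support=0.5)
for itemset, value in frequent.items():
    print(sorted(itemset), value)
```

The search stops as soon as a level produces no item set above the minimum support, so larger combinations are never counted.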
How do you measure how strong those associations are?
- Support
- Confidence
- Lift
How do you calculate confidence?
Confidence(A -> C) = frequency (A and C) / frequency (A)
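A small sketch of the confidence calculation, with A as the antecedent and C as the consequent. The transactions and the helper names `frequency` and `confidence` are hypothetical.

```python
# Hypothetical toy data: each record is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequency(itemset, records):
    """Number of records that contain every item in `itemset`."""
    return sum(1 for record in records if itemset <= record)

def confidence(antecedent, consequent, records):
    """Frequency of antecedent and consequent together, divided by
    the frequency of the antecedent alone."""
    both = frequency(antecedent | consequent, records)
    return both / frequency(antecedent, records)

conf_bread_milk = confidence({"bread"}, {"milk"}, transactions)
print(conf_bread_milk)
```

In this toy data, bread appears in three records and two of those also contain milk, so the rule IF bread THEN milk has confidence 2/3.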
How do you calculate lift?
Lift(A -> C) = Confidence(A -> C) / (frequency (C) / n), where n is the total number of records
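A sketch of the lift calculation following the formula above: confidence divided by the share of records that contain the consequent. The data and the helper names `frequency` and `lift` are assumptions made for this example.

```python
# Hypothetical toy data: each record is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequency(itemset, records):
    """Number of records that contain every item in `itemset`."""
    return sum(1 for record in records if itemset <= record)

def lift(antecedent, consequent, records):
    """Confidence of the rule divided by (frequency of consequent / n)."""
    n = len(records)
    conf = (frequency(antecedent | consequent, records)
            / frequency(antecedent, records))
    return conf / (frequency(consequent, records) / n)

lift_bread_milk = lift({"bread"}, {"milk"}, transactions)
lift_bread_butter = lift({"bread"}, {"butter"}, transactions)
print(lift_bread_milk, lift_bread_butter)
```

In this toy data the rule IF bread THEN milk has a lift below 1 (about 0.89), while IF bread THEN butter has a lift above 1 (about 1.33).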
What does a lift of > 1 indicate?
The antecedent and consequent are positively associated: the consequent occurs more often in records that contain the antecedent than in the records overall.
What does a lift of < 1 indicate?
The antecedent and consequent are negatively associated: the consequent occurs less often in records that contain the antecedent than in the records overall.
What does a lift of approx 1 indicate?
The antecedent and consequent are independent.