Lecture 10 - Association Rules Flashcards

Question 1

Q

Association Rules pt1

Answer

A

Identify item clusters in event-based or transaction-based databases
Study of “what goes with what”
- Symptoms related to diagnosis
- Customers who bought X also bought Y

Association Rules also called: Market basket analysis or affinity analysis

Question 2

Q

Example Association Rules

Answer

A

Market basket databases

Consist of a large number of transaction records
Each record lists all items bought by a customer in a single purchase transaction
Detect whether certain groups of items are consistently purchased together

Information can be used to

Make decisions on store layouts
Design the upcoming catalog
Identify customer segments based on buying patterns

Amazon uses information for recommendations!!

Question 3

Q

Rules

Answer

A

Represented in an IF-THEN format
- “IF” part: antecedent, “THEN” part: consequent
Both correspond to sets of items (called itemsets)
Itemsets are
- Possible combinations of items (e.g., products)
- Can also be a single item
- NOT records of what people buy
Antecedent and consequent are disjoint
- I.e., have no items in common
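
A minimal Python sketch of the IF-THEN structure described above. The `Rule` class and the item names are hypothetical, chosen only for illustration; the one constraint enforced is the one the card states, that antecedent and consequent are disjoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for an IF-THEN rule over itemsets."""
    antecedent: frozenset  # the "IF" itemset
    consequent: frozenset  # the "THEN" itemset

    def __post_init__(self):
        # Antecedent and consequent must have no items in common.
        if self.antecedent & self.consequent:
            raise ValueError("antecedent and consequent must be disjoint")


# Illustrative rule: IF {bread, butter} THEN {milk}
rule = Rule(frozenset({"bread", "butter"}), frozenset({"milk"}))
print(sorted(rule.antecedent), "->", sorted(rule.consequent))
```

Building a rule with a shared item, such as `Rule(frozenset({"milk"}), frozenset({"milk", "bread"}))`, raises `ValueError`.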

Question 4

Q

Example transaction

Question 5

Q

Finding Association Rules

Answer

A

One item can appear in many association rules
Every transaction is one itemset

→ Supports several rules

Two-stage Process:

Generation of frequent itemsets, e.g., with the Apriori algorithm
Selecting the strong rules, i.e., applying criteria for judging the strength of the rules
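
The second stage can be sketched as follows, assuming the supports of the frequent itemsets are already known and that confidence (covered in a later card) is the selection criterion. The function name `rules_from_itemset`, the threshold and the support values are all hypothetical.

```python
from itertools import combinations


def rules_from_itemset(itemset, supports, min_confidence):
    """Split one frequent itemset into every antecedent/consequent pair
    and keep the rules whose confidence reaches min_confidence."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for combo in combinations(sorted(itemset), size):
            antecedent = frozenset(combo)
            consequent = itemset - antecedent  # disjoint by construction
            # confidence = support(antecedent AND consequent) / support(antecedent)
            conf = supports[itemset] / supports[antecedent]
            if conf >= min_confidence:
                rules.append((antecedent, consequent, conf))
    return rules


# Hypothetical supports, as stage one might have produced them.
supports = {
    frozenset({"bread"}): 0.8,
    frozenset({"milk"}): 0.8,
    frozenset({"bread", "milk"}): 0.6,
}
rules = rules_from_itemset({"bread", "milk"}, supports, min_confidence=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

One frequent two-itemset already yields two candidate rules, which illustrates why a single itemset supports several rules.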

Question 6

Q

Generation of rules

Question 7

Q

Frequent Itemsets

Question 8

Q

Quiz 1

Question 9

Q

Apriori algorithm

Answer

A

Goal: generate the frequent itemsets

for k items:

User sets a minimum support criterion
Generate list of one-itemsets
Drop the ones below the support criterion
Use the list of one-itemsets to generate the two-itemsets
Drop the ones below the support criterion
Use the list of two-itemsets to generate the three-itemsets
Drop the ones below the support criterion
…(continue until k-itemsets)
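
The steps above can be sketched in Python roughly as follows. This is a simplified illustration, not a tuned implementation: it counts support with a full scan per candidate, builds k-itemsets by joining pairs of surviving (k-1)-itemsets, and omits the extra candidate pruning that full implementations usually add. The function name `apriori` and the toy transactions are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Share of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        # Drop the candidates below the support criterion.
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Start with the one-itemsets.
    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([item]) for item in items})

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Use the surviving (k-1)-itemsets to generate the k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        current = keep_frequent(candidates)
    return frequent


# Hypothetical toy data: five purchase transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "butter", "milk"},
]
frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this toy data, all one-itemsets and two-itemsets pass the 0.5 threshold, while the three-itemset appears in only two of the five transactions and is dropped.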

Question 10

Q

Assessment of rule strength

Answer

A

We need to measure the strength of the association implied by a rule

Measures:

Support
Confidence
Lift ratio

Question 11

Q

Confidence

Answer

A

Compares the co-occurrence of items in antecedent and consequent to the occurrence of items in antecedent. Shows the percentage of transactions containing the antecedent (A) that also contain the consequent (C).

Question 12

Q

Relationship of Support with Confidence

Answer

A

Support: (Estimated) probability that a transaction selected randomly from the database will contain all items in the antecedent and the consequent. P(hat) (antecedent AND consequent)

Confidence: (Estimated) conditional probability that a transaction selected randomly will include all the items in the consequent given that the transaction includes all the items in the antecedent. P(hat) (consequent | antecedent)

High value of confidence suggests a strong association rule, i.e., rule in which we are highly confident
Can be deceptive when antecedent and consequent are independent but each is very common, e.g.:
- Nearly all customers buy bananas and nearly all customers buy ice cream
- High confidence level of “IF bananas THEN ice-cream”
- Regardless of whether there is an association between the items
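
A small Python sketch of the two estimates and of the bananas and ice cream pitfall. The helper names and the ten toy transactions are hypothetical, and `confidence` assumes the antecedent occurs at least once (otherwise it would divide by zero).

```python
def support(itemset, transactions):
    """Estimated P(itemset): share of transactions containing all its items."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical data: nearly everyone buys bananas, nearly everyone buys ice cream.
transactions = [{"bananas", "ice cream"}] * 8 + [{"bananas"}, {"ice cream"}]

rule_support = support({"bananas", "ice cream"}, transactions)
rule_confidence = confidence({"bananas"}, {"ice cream"}, transactions)
baseline = support({"ice cream"}, transactions)

print(f"support    = {rule_support:.2f}")
print(f"confidence = {rule_confidence:.2f}")
print(f"share of all transactions with ice cream = {baseline:.2f}")
```

The confidence is high, but it is no higher than the share of all transactions that contain ice cream, so the rule says little about any real association.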

Question 13

Q

Lift Ratio

Answer

A

Better way to judge the strength of a rule
Compares the confidence of the rule with a benchmark value
Confidence: percentage of antecedent transactions that also have the consequent item set
Lift: ratio of confidence to benchmark confidence
Benchmark confidence: transactions with consequent as percentage of all transactions
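
The definitions above translate into a short Python sketch. The helper names and both toy data sets are hypothetical; the first data set mimics two very common but unrelated items, the second mimics items that are always bought together.

```python
def support(itemset, transactions):
    """Share of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions) / len(transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the benchmark confidence."""
    both = frozenset(antecedent) | frozenset(consequent)
    conf = support(both, transactions) / support(antecedent, transactions)
    benchmark = support(consequent, transactions)  # consequent among all transactions
    return conf / benchmark


# Hypothetical data 1: two items that are each very common.
common_items = [{"bananas", "ice cream"}] * 8 + [{"bananas"}, {"ice cream"}]
# Hypothetical data 2: two items that always appear together.
paired_items = [{"pasta", "sauce"}] * 4 + [{"bread"}] * 6

lift_common = lift({"bananas"}, {"ice cream"}, common_items)
lift_paired = lift({"pasta"}, {"sauce"}, paired_items)
print(f"bananas -> ice cream: lift = {lift_common:.2f}")
print(f"pasta -> sauce:       lift = {lift_paired:.2f}")
```

The first rule has high confidence but a lift close to 1, while the second has a lift well above 1, which is the distinction lift is meant to expose.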

Question 14

Q

Lift intuition

Answer

A

Lift is a value between 0 and infinity
Value>1 indicates that antecedent and consequent are dependent on each other, and the degree of dependence is given by its value
Value<1 indicates that the presence of the antecedent makes the consequent less likely
Value≈1 indicates that antecedent and consequent are independent and no rule can be derived from them

Question 15

Q

Alternative data representation

Question 16

Q

Example

Answer

A

Question 17

Q

Lecture summary

Answer

A

Association rule mining produces rules on associations between items from a data set of transactions
Widely used in recommender systems
Most popular method is Apriori algorithm
To reduce computation, we consider only “frequent” itemsets (i.e., those meeting a minimum support)
Performance is measured by confidence and lift
Can produce a profusion of rules; review is required to identify useful rules and to reduce redundancy

(18 cards)