Mining Association Rules Flashcards

1
Q

What’s the motivation for studying Mining Association Rules?

A

To look for interesting relationships between objects in large datasets.

2
Q

What are we trying to do when studying Mining Association Rules?

A

Find all rules that correlate the presence of one set of items with the presence of another set of items. E.g., 80% of customers who buy {diapers} also buy {beer, milk}.

3
Q

Provide formal definitions of the following:

item

itemset

k-itemset

transaction

transaction dataset

A
  • An item: a single product (object) that can appear in a basket.
  • An itemset: a set of items, e.g., X = {milk, bread, cereal}.
  • A k-itemset: an itemset containing exactly k items.
  • A transaction: the set of items purchased in one basket; it may have a TID (transaction ID).
  • A transactional dataset: a set of transactions.
4
Q

What do we mean when we say X -> Y in Mining Association Rules?

A

Customers who buy X tend to also buy Y. X and Y are itemsets, and the rule describes a tendency, not a certainty.

5
Q

What is support and confidence in Association Rule Mining?

A

Support measures how frequently an itemset appears in the dataset: the fraction of transactions that contain it.

E.g., half of the baskets at Woolworths contain milk, so the support of {milk} is 0.5 (50%).

Confidence measures how likely Y is to be bought given that X is bought (X -> Y). E.g., if 80% of the people who buy milk also buy bread, the confidence of {milk} -> {bread} is 0.8.
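
A minimal Python sketch of both measures, assuming a made-up list of ten baskets chosen so that the numbers match the milk and bread example; the dataset and the function names `support` and `confidence` are illustrative only.

```python
# Hypothetical toy dataset: each transaction is the set of items in one basket.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "cereal"},
    {"milk", "bread", "eggs"},
    {"milk", "bread", "beer"},
    {"milk", "diapers"},
    {"bread", "eggs"},
    {"beer", "diapers"},
    {"cereal"},
    {"eggs", "beer"},
    {"diapers", "cereal"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions.

    Assumes the antecedent occurs at least once (otherwise this divides by zero).
    """
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


milk_support = support({"milk"}, transactions)                  # 5 of 10 baskets
milk_to_bread = confidence({"milk"}, {"bread"}, transactions)   # 4 of the 5 milk baskets
print(milk_support, milk_to_bread)
```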

6
Q

What do we call association rules that satisfy both the Min_Support and Min_Confidence?

A

These are Strong Association Rules.

7
Q

What does minimum support mean?

A

The minimum frequency we care about.

If minimum support equals 3, any itemset that occurs only 2 times is not frequent and is dropped from the analysis.
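
As a rough illustration, the sketch below applies a minimum support of 3 (taken as an absolute count, as in the example above) to single items. The baskets and the name `MIN_SUPPORT_COUNT` are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical baskets, used only for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk", "beer"},
    {"bread", "eggs"},
    {"beer", "diapers"},
]

MIN_SUPPORT_COUNT = 3  # absolute count, not a fraction

# Count in how many transactions each single item occurs.
item_counts = Counter(item for t in transactions for item in t)

# Keep only the items that meet the threshold; the rest are ignored.
frequent_items = {item for item, n in item_counts.items() if n >= MIN_SUPPORT_COUNT}
print(sorted(frequent_items))
```

Here eggs and beer each occur only twice, so they fall below the threshold.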

8
Q

What is the conditional probability formula for confidence?

A

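Confidence(X -> Y) = P(Y | X) = P(X ∪ Y) / P(X), where P(X ∪ Y) is the probability that a transaction contains every item of both X and Y, i.e., support(X ∪ Y) / support(X).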
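
A worked instance of the formula, assuming a hypothetical dataset of 10 transactions in which 5 contain X and 4 contain all items of both X and Y:

```latex
\[
\text{Confidence}(X \rightarrow Y)
  = P(Y \mid X)
  = \frac{P(X \cup Y)}{P(X)}
  = \frac{4/10}{5/10}
  = 0.8
\]
```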

9
Q

What is the goal of association rule mining? What do we minimally want for a rule?

A

The goal of association rule mining is to find all rules having

  1. support ≥ min_sup threshold
  2. confidence ≥ min_conf threshold
10
Q

What algorithms do we use for Mining Association Rules?

A
  1. Apriori Algorithm
  2. Frequent Pattern (FP) Growth Algorithm
11
Q

What are the two steps in Mining Association Rules?

A
  1. Frequent Itemset Generation: get all itemsets whose support ≥ minsup.
  2. Rule Generation: generate high-confidence rules from each frequent itemset.
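
The rule-generation step can be sketched as follows, assuming the frequent-itemset step has already produced support counts for one frequent itemset and all of its subsets. The counts, the 0.75 confidence threshold, and the function name `generate_rules` are hypothetical.

```python
from itertools import combinations

# Hypothetical support counts, as the frequent-itemset step might report them.
support_count = {
    frozenset({"milk"}): 6,
    frozenset({"bread"}): 7,
    frozenset({"eggs"}): 4,
    frozenset({"milk", "bread"}): 5,
    frozenset({"milk", "eggs"}): 3,
    frozenset({"bread", "eggs"}): 3,
    frozenset({"milk", "bread", "eggs"}): 3,
}


def generate_rules(itemset, support_count, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    itemset = frozenset(itemset)
    rules = []
    # Every non-empty proper subset is tried as the antecedent X;
    # the remaining items form the consequent Y.
    for size in range(1, len(itemset)):
        for antecedent in map(frozenset, combinations(sorted(itemset), size)):
            consequent = itemset - antecedent
            conf = support_count[itemset] / support_count[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, consequent, conf))
    return rules


rules = generate_rules({"milk", "bread", "eggs"}, support_count, min_conf=0.75)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

With these counts, three of the six possible rules reach the threshold: {eggs} -> {bread, milk}, {bread, eggs} -> {milk}, and {eggs, milk} -> {bread}.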
12
Q

What is the principle of the Apriori Algorithm?

A

If an itemset is frequent, then all of its subsets must also be frequent.
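
Read the other way round, the principle says that an itemset with an infrequent subset cannot be frequent, so such candidates can be discarded without counting them. Below is a minimal, unoptimized sketch of a level-wise search built on that idea; the toy transactions, the absolute threshold of 3, and the function name `apriori` are assumptions for illustration, not a reference implementation.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for all itemsets with count >= min_count."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}

        # Prune step (Apriori principle): drop any candidate that has an
        # infrequent (k-1)-subset, without scanning the data.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }

        # Count the surviving candidates with one pass over the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy dataset.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_count=3)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

On this data, {bread, beer} and {milk, beer} are infrequent, so the 3-item candidates containing them are pruned before any counting takes place.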

13
Q

Explain how to perform the Apriori Algorithm on this itemset.

A
14
Q

What are some factors that affect the complexity of the Apriori Algorithm?

A
  • The choice of minimum support threshold
  • Dimensionality (number of items) in the data set
  • Size of database
  • Average transaction width
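
To illustrate the dimensionality factor only: with d distinct items there are 2^d - 1 possible non-empty itemsets, which bounds the search space from above. The helper name below is hypothetical, and how much of this space is actually explored depends on the minimum support threshold and the data.

```python
def max_candidate_itemsets(num_items):
    """Number of non-empty itemsets that can be formed from `num_items` items."""
    return 2 ** num_items - 1


# The upper bound grows exponentially with the number of distinct items.
for d in (5, 10, 20, 30):
    print(d, max_candidate_itemsets(d))
```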