Association Rules and Apriori Flashcards

1
Q

What are frequent patterns in data commonly referred to as?

A

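Frequent patterns are commonly referred to as association rules. Strictly speaking, association rules are derived from frequent patterns such as frequent itemsets.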

2
Q

What are the two types of analysis done using frequent patterns?

A

1. Frequent itemset analysis, which leads to the discovery of associations and correlations among items.
2. Frequent subsequence analysis, which allows the discovery of patterns across time or positions in a dataset.

3
Q

What is Market Basket Analysis?

A

Market Basket Analysis identifies sets of items that appear together in transactional datasets, such as which wines are sold with which dishes in a restaurant.

4
Q

What is the Apriori Algorithm used for?

A

The Apriori Algorithm is used to identify frequent itemsets in a dataset.

5
Q

Name two other frequently used algorithms aside from Apriori.

A

1. FP-Growth
2. Eclat

6
Q

What is the objective of frequent substructures?

A

The objective of frequent substructure analysis is to find interesting subgraphs in data, which can be combinations of frequent itemsets and frequent subsequences.

7
Q

Define frequent subsequences in the context of datasets.

A

Frequent subsequences are patterns that recur across time or positions in a dataset, such as the sequential order of items in a customer's purchasing history.

8
Q

What are some algorithms used for exploring sequences?

A

1. GSP (Generalized Sequential Patterns)
2. SPADE
3. PrefixSpan

9
Q

What issues can arise with association rules?

A

Problems include redundancy, too many rules (which makes it difficult to find interesting patterns), and too few rules if the minimum support or confidence thresholds are too high.

10
Q

What are actionable rules in association rules context?

A

Actionable rules contain high-quality information that leads to insights the business can act on.

11
Q

What is the definition of support in association rules?

A

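Support is the fraction of all observations (transactions) in which a rule's items occur, that is, the observations that contain both the left-hand side and the right-hand side.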
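
As a minimal sketch of this definition, the snippet below counts how many baskets contain a given itemset. The `transactions` data and the `support` helper are made up for illustration and do not come from any particular library.

```python
# Hypothetical toy data: each set is one observation (basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# {"bread", "milk"} appears in 2 of the 4 baskets.
print(support({"bread", "milk"}, transactions))  # 0.5
```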

12
Q

What does confidence measure in the context of association rules?

A

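Confidence is the probability that a rule holds for a new observation: given that the left-hand side occurs, how likely the right-hand side is to occur as well. It is computed as the support of the whole rule divided by the support of the left-hand side.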
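
A rough sketch of that ratio, using hypothetical toy baskets and an illustrative `confidence` helper (it assumes the left-hand side occurs at least once, otherwise the division fails):

```python
# Hypothetical toy data: each set is one observation (basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def confidence(lhs, rhs, transactions):
    """Share of the transactions containing `lhs` that also contain `rhs`."""
    lhs, rhs = set(lhs), set(rhs)
    lhs_count = sum(1 for t in transactions if lhs <= t)
    both_count = sum(1 for t in transactions if (lhs | rhs) <= t)
    return both_count / lhs_count  # assumes lhs_count > 0

# Bread is in 3 baskets; 2 of those also contain milk.
conf_bread_milk = confidence({"bread"}, {"milk"}, transactions)
print(round(conf_bread_milk, 3))
```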

13
Q

How is lift defined in association rules?

A

Lift is the ratio of a rule's confidence to its expected confidence (the support of the right-hand side). It indicates how much more likely the right-hand side is to occur when the left-hand side is true; a lift above 1 suggests a positive association.
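
The ratio can be sketched as follows. The data and the helper names `support` and `lift` are hypothetical, chosen only to illustrate the calculation.

```python
# Hypothetical toy data: each set is one observation (basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def lift(lhs, rhs, transactions):
    """Confidence of lhs -> rhs divided by the support of rhs."""
    lhs, rhs = set(lhs), set(rhs)
    confidence = support(lhs | rhs, transactions) / support(lhs, transactions)
    return confidence / support(rhs, transactions)

# Every butter basket also contains bread, while bread is in 3 of 4 baskets
# overall, so the lift of butter -> bread is above 1.
value = lift({"butter"}, {"bread"}, transactions)
print(value > 1)  # True
```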

14
Q

What is required for data to find frequent subsequences?

A

1. Timestamps or sequencing information, to determine when transactions occurred relative to each other.
2. Identifying information, such as a customer ID, to know which transactions belong to the same entity.

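
To see why both pieces of information are needed, here is a small sketch that turns raw records into one ordered sequence per customer. The record layout `(customer_id, timestamp, item)` and the function name `build_sequences` are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records: (customer_id, timestamp, item), in arbitrary order.
records = [
    ("c1", 3, "case"),
    ("c1", 1, "phone"),
    ("c2", 2, "phone"),
    ("c2", 5, "charger"),
    ("c1", 4, "charger"),
]

def build_sequences(records):
    """Group records by customer ID, then order each group by timestamp."""
    by_customer = defaultdict(list)
    for customer_id, timestamp, item in records:
        by_customer[customer_id].append((timestamp, item))
    return {
        customer_id: [item for _, item in sorted(events)]
        for customer_id, events in by_customer.items()
    }

sequences = build_sequences(records)
print(sequences["c1"])  # ['phone', 'case', 'charger']
```

Without the customer ID the records could not be grouped, and without the timestamp the order inside each group would be unknown.
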
15
Q

How are rules generated from frequent itemsets?

A

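Rules are generated by splitting each frequent itemset into candidate left-hand and right-hand sides, calculating the confidence of each candidate rule, and removing those that do not meet the parameter criteria (such as minimum confidence).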
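
A minimal sketch of this step, assuming the frequent itemsets and their supports are already available as a dictionary. The values below and the name `generate_rules` are hypothetical.

```python
from itertools import combinations

# Hypothetical frequent itemsets with their supports.
frequent = {
    frozenset({"bread"}): 0.75,
    frozenset({"milk"}): 0.75,
    frozenset({"butter"}): 0.5,
    frozenset({"bread", "milk"}): 0.5,
    frozenset({"bread", "butter"}): 0.5,
}

def generate_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for every rule meeting min_confidence."""
    rules = []
    for itemset, itemset_support in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for lhs_items in combinations(itemset, size):
                lhs = frozenset(lhs_items)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is expected to be in the dictionary.
                conf = itemset_support / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

rules = generate_rules(frequent, min_confidence=0.8)
for lhs, rhs, conf in rules:
    print(set(lhs), "->", set(rhs), conf)
```

With these numbers only butter -> bread reaches the 0.8 threshold; the other three candidate rules have a confidence of about 0.67 and are dropped.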

16
Q

What is the significance of pruning in the Apriori Algorithm?

A

Pruning involves removing items or itemsets that do not reach the minimum support, which helps streamline the analysis to only relevant data.
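
The pruning idea can be sketched as a simplified level-wise search. This is an illustrative toy version with hypothetical data and a hypothetical `apriori` function, not an optimized or reference implementation.

```python
from itertools import combinations

# Hypothetical toy data: each set is one observation (basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets reaching min_support."""
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Scan the data and prune candidates below the minimum support.
        level = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        # Join surviving itemsets into candidates one item longer.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune candidates that have any infrequent (k-1)-item subset.
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent

result = apriori(transactions, min_support=0.5)
print(len(result))  # 5
```

Here {milk, butter} is dropped for low support, so {bread, milk, butter} is pruned as a candidate without its support ever being counted.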

17
Q

What is the typical structure of association rules?

A

Association rules consist of two parts: antecedents (the left-hand side, LHS) and consequents (the right-hand side, RHS).

18
Q

What are common applications of association rules in business?

A

Common applications include cross-selling, upselling, product placement, and understanding customer behavior.

19
Q

What is the disadvantage of the Apriori Algorithm?

A

Finding frequent itemsets requires scanning the entire dataset, once for each candidate itemset length, which can be computationally expensive if itemsets are very large and the minimum support is very low.

20
Q

Explain the concept of redundant rules in association analysis.

A

Redundant rules are rules that contain other, more general rules and do not provide new insights beyond them; they should be removed to reduce complexity.
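
One common way to make this concrete, assumed here purely for illustration, is to call a rule redundant when a more general rule (same right-hand side, smaller left-hand side) has equal or higher confidence. The rules and the `is_redundant` helper below are hypothetical.

```python
# Hypothetical rules as (lhs, rhs, confidence).
rules = [
    (frozenset({"butter"}), frozenset({"bread"}), 1.0),
    (frozenset({"butter", "milk"}), frozenset({"bread"}), 1.0),
    (frozenset({"milk"}), frozenset({"bread"}), 0.67),
]

def is_redundant(rule, rules):
    """True if a more general rule predicts the same RHS at least as well."""
    lhs, rhs, conf = rule
    return any(
        other_rhs == rhs and other_lhs < lhs and other_conf >= conf
        for other_lhs, other_rhs, other_conf in rules
    )

kept = [rule for rule in rules if not is_redundant(rule, rules)]
print(len(kept))  # 2
```

The rule {butter, milk} -> {bread} is dropped because {butter} -> {bread} already reaches the same confidence with fewer conditions.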

21
Q

What are typical parameters set in association rule mining?

A

Typical parameters include minimum support, minimum confidence, and maximum itemset length.

22
Q

What does ‘frequent subgraphs’ refer to?

A

Frequent subgraphs are common patterns in data represented as graphs; they help to identify relationships between items.

23
Q

How does sequential pattern analysis differ from traditional association rule analysis?

A

Sequential pattern analysis considers not only which items are associated but also the sequence in which they occur, so the order in time is important.
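
A small sketch of the difference, using hypothetical purchase sequences and illustrative helper names: the same two items give different support depending on the order of the pattern, while an unordered itemset view treats them identically.

```python
# Hypothetical purchase sequences, one per customer, in time order.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["charger", "phone"],
]

def contains_in_order(sequence, pattern):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)

def sequence_support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as an ordered subsequence."""
    hits = sum(1 for s in sequences if contains_in_order(s, pattern))
    return hits / len(sequences)

phone_then_charger = sequence_support(["phone", "charger"], sequences)
charger_then_phone = sequence_support(["charger", "phone"], sequences)
print(phone_then_charger > charger_then_phone)  # True
```

As an unordered itemset, {phone, charger} appears in all three sequences, so the ordering information is what separates the two patterns.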

24
Q

What type of rules are classified as trivial rules?

A

Trivial rules are those that are already known by anyone familiar with the business, such as well-established associations.

25
Q

What does the term ‘inexplicable rules’ refer to in association analysis?

A

Inexplicable rules are those that seem to have no explanation and do not imply any clear course of action.

26
Q

Describe the final step in the Apriori Algorithm process.

A

The final step is to generate rules from the frequent itemsets that survived pruning by calculating their confidence, retaining only the rules that meet the established parameters.