Weeks 8-9: Association Rules & Temporal Data Mining Flashcards

1
Q

Association Rule

A

An expression in if-then format.

The if part is the precondition or antecedent; it consists of a series of attribute tests.

The then part is the conclusion or consequent; it assigns values to one or more attributes.

Association rules may have any attribute, or combination of attributes, as the target.
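
As an illustration (a hypothetical example): the rule IF bread = yes AND butter = yes THEN milk = yes predicts that instances containing bread and butter also contain milk.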

2
Q

Bottom-up Generate-and-test

A

A procedure for generating association rules: candidate rules are generated bottom-up, starting from small item sets and growing them, and each candidate is tested against the data to check that it meets the minimum coverage and confidence requirements.

3
Q

Coverage/Support Count

A

The number of instances covered by the rule.

4
Q

Support

A

Proportion of instances covered by the rule.
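
In symbols (a standard formulation), for a rule A ⇒ B over N instances:

support(A ⇒ B) = (number of instances containing both A and B) / N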

5
Q

Confidence/Accuracy

A

The proportion of instances that the rule predicts correctly, out of all the instances satisfying the precondition.
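
Equivalently, in the standard formulation:

confidence(A ⇒ B) = support(A ∪ B) / support(A)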

6
Q

Apriori Algorithm

A

This is an algorithm for generating association rules.

  1. Find all 1-item sets, calculate their coverage, and discard any 1-item sets below the predefined minimum coverage value.
  2. Find all 2-item sets by pairing the frequent 1-item sets, and retain only the 2-item sets with coverage above the minimum coverage value.
  3. Similarly, find all 3-item sets, 4-item sets, and so on.
  4. Repeat until no larger item sets are possible.
  5. For each item set, produce as many association rules as there are ways to split its items between the precondition and the conclusion.
  6. For each rule, evaluate the confidence and retain the rule if it meets the minimum requirement.

Note that the algorithm exploits the property that if an item set is not frequent, then none of its supersets is frequent either. Consequently, it generates (k+1)-item sets only from frequent k-item sets. The worst-case running time of the algorithm is exponential in the number of attributes.
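
A minimal Python sketch of the frequent-item-set phase (steps 1-4). The function name, the set-based transaction format, and the brute-force candidate generation are illustrative choices, not from the source; a full implementation would also prune candidates with infrequent subsets:

def apriori_itemsets(transactions, min_count):
    # Step 1: frequent 1-item sets.
    counts = {}
    for t in transactions:
        for item in t:
            s = frozenset([item])
            counts[s] = counts.get(s, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    all_frequent = dict(frequent)
    k = 1
    while frequent:
        # Steps 2-4: build (k+1)-item candidates only from frequent k-item sets.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k + 1}
        counts = {}
        for t in transactions:
            items = frozenset(t)
            for cand in candidates:
                if cand <= items:
                    counts[cand] = counts.get(cand, 0) + 1
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        all_frequent.update(frequent)
        k += 1
    return all_frequent  # {item set: coverage (support count)}

For example, apriori_itemsets([{'bread', 'milk'}, {'bread', 'butter'}, {'bread', 'milk', 'butter'}], min_count=2) returns the item sets occurring in at least two transactions.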

7
Q

FP-Growth Algorithm

A
  1. Count the frequency of each item and sort the items in descending order of frequency.
  2. Perform a second pass over the data to create the FP-tree.
  3. Recursively process the tree to identify the frequent item sets.

The tree structure allows association rules to be extracted from a dataset more efficiently. It usually has a better run time than the Apriori algorithm.
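
In practice, both algorithms are available in libraries. A sketch using the mlxtend package (assuming it is installed; API details may vary between versions, and the toy transactions are hypothetical):

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules

transactions = [['bread', 'milk'], ['bread', 'butter'],
                ['bread', 'milk', 'butter'], ['milk', 'butter']]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

frequent = fpgrowth(df, min_support=0.5, use_colnames=True)  # usually faster than apriori
rules = association_rules(frequent, metric='confidence', min_threshold=0.7)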

8
Q

Header Table

A

It appears on the left-hand side of the FP-tree diagram. It shows the frequencies of the individual items that satisfy the minimum coverage threshold, sorted in descending order.

9
Q

Lift

A

This metric measures the usefulness of a rule relative to random guessing. It compares the confidence of the rule with the confidence of the null rule, which has the same right-hand side but an empty left-hand side.
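
In the standard formulation, for a rule A ⇒ B:

lift(A ⇒ B) = confidence(A ⇒ B) / support(B)

A lift greater than 1 indicates that A and B occur together more often than expected if they were independent; a lift of 1 means the rule does no better than random guessing.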

10
Q

Contingency Table

A

There are two dimensions to this table.

Does the antecedent of the association rule appear in the transaction?
No: Absent
Yes: Present

Does the consequent of the association rule appear in the transaction?
No: Absent
Yes: Present

The number of transactions equals the sum of the four resulting cells. If the cell where both the antecedent and the consequent are present is larger than the other cells, this is a good indication that the association rule is a good one.
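
A hypothetical example over 100 transactions (the counts are illustrative only):

                     Consequent present   Consequent absent
Antecedent present           40                  10
Antecedent absent            15                  35

Here the present/present cell dominates its row and column, which supports the rule.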

11
Q

Sequential Analysis/Sequential Pattern Analysis

A

This method adds the element of time to association analysis. It considers sequences of items, not just associations, where the order in time is important.

As such, the support of an item set also depends on the order of the items. Generalised Sequential Patterns (GSP) is a suitable algorithm for sequence mining, based on the Apriori algorithm.
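
As an illustration (hypothetical): the sequential pattern ⟨{bread}, {milk}⟩ is supported only by customers who bought bread on one visit and milk on a later visit; a customer who bought milk before bread does not support it.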

12
Q

Link Analysis

A

An approach for analysing and exploiting relationships and connections between items. It uses graphs, with items as nodes and relationships as edges. It can be applied in both supervised and unsupervised settings.

Graphs can be directed, undirected, multigraphs (when multiple edges between the same pair of nodes are allowed), and connected (if there’s a path between every pair of nodes).

13
Q

Degree Centrality

A

This metric evaluates the relative importance of a node within the graph, based on the number of edges incident to it (its degree).

14
Q

Closeness Centrality

A

This metric measures how close a node is to all the other nodes in the graph, based on its shortest-path distances to them.
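
A sketch of both centrality measures using the networkx library (assuming it is installed; the toy graph is hypothetical):

import networkx as nx

# A small undirected graph: node 'a' is a hub.
G = nx.Graph([('a', 'b'), ('a', 'c'), ('a', 'd'), ('d', 'e')])

# Degree centrality: degree normalised by (n - 1).
print(nx.degree_centrality(G))

# Closeness centrality: based on shortest-path distances to all other nodes.
print(nx.closeness_centrality(G))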

15
Q

Time Series

A

A set of values obtained from measurements over time. A dataset may have a single or multiple time series.

T = (x_1,…,x_n)
x_t = time series value at time t

16
Q

Decomposition

A

Traditional approaches decompose a time series into additive components:

x_t = T_t + C_t + S_t + \epsilon_t

T_t: trend, variations of low frequency
C_t: cycle, periodic variations
S_t: seasonality, regular predictable changes
\epsilon_t: error, captures the uncertainty of the model
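
A sketch using the statsmodels library (assuming it is installed; the monthly series is synthetic):

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic series: linear trend + yearly seasonality + noise.
t = np.arange(120)
x = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + np.random.normal(0, 1, 120)
series = pd.Series(x, index=pd.date_range('2015-01-01', periods=120, freq='MS'))

result = seasonal_decompose(series, model='additive', period=12)
# result.trend, result.seasonal and result.resid hold the estimated components.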

17
Q

Moving Average (MA)

A

\hat{x}_{t+1} = \frac{x_{t-k+1} + … + x_{t-1} + x_t}{k}

k = window size (the number of most recent values averaged)

18
Q

AutoRegressive (AR)

A

\hat{x}_{t+1} = \beta_0 + \beta_1 x_t + \epsilon_t

This is the first-order model AR(1); higher-order models add more lagged terms.

19
Q

Exponential Smoothing

A

\hat{x}_{t+1} = \alpha x_t + (1 - \alpha)\hat{x}_t, for t \ge 1

\alpha = smoothing factor
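
A minimal numpy sketch of the three forecasts above (the AR coefficients would normally be fitted by least squares; here they are assumed given):

import numpy as np

def moving_average_forecast(x, k):
    # Average of the k most recent observations.
    return np.mean(x[-k:])

def ar1_forecast(x, beta0, beta1):
    # First-order autoregressive forecast with given coefficients.
    return beta0 + beta1 * x[-1]

def exponential_smoothing_forecast(x, alpha):
    # Applies the smoothing recursion over the series; returns the next forecast.
    s = x[0]
    for value in x[1:]:
        s = alpha * value + (1 - alpha) * s
    return s

x = np.array([3.0, 3.2, 3.1, 3.4, 3.8])
print(moving_average_forecast(x, k=3))              # (3.1 + 3.4 + 3.8) / 3
print(ar1_forecast(x, beta0=0.1, beta1=1.0))
print(exponential_smoothing_forecast(x, alpha=0.5))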

20
Q

Representation

A

This is a challenge with time series data: the goal is a representation that reduces the dimensionality of the data while preserving its essential characteristics.

21
Q

Similarity

A

This is a challenge with time series data, as one of the goals is to evaluate how similar a pair of time series are.

22
Q

Indexing

A

This is a challenge with time series data: the time series need to be organised so as to achieve low memory requirements and allow fast querying.

23
Q

Non-Adaptive Representation

A

Common representation for all parts of a time series.

24
Q

Adaptive

A

Local properties are used to construct a non-uniform representation of a time series.

25
Q

Piecewise Linear Approximation

A

Approximates a time series using piecewise linear segments.

26
Q

Discrete Fourier Transformation (DFT)

A

Any signal can be represented as a sum of sine/cosine waves; a time series of length n can be reconstructed exactly from n Fourier coefficients, and often approximated well by keeping only the first few.
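
A sketch of DFT-based approximation with numpy, keeping only the first few coefficients (the series is synthetic):

import numpy as np

t = np.linspace(0, 1, 128, endpoint=False)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)

coeffs = np.fft.rfft(x)                   # DFT of a real-valued series
coeffs[8:] = 0                            # keep only the first 8 coefficients
approx = np.fft.irfft(coeffs, n=len(x))   # reconstruct an approximation
print(np.max(np.abs(x - approx)))         # near zero: both sinusoids survive truncation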

27
Q

Singular Value Decomposition (SVD)

A

Represents the shape of each series as a linear combination of basis shapes derived from the variance of the dataset. In contrast to DFT, which is local (applied to each series independently), SVD is a global transformation (computed over the whole dataset).

28
Q

Segmentation

A

Given a time series T with a large number n of time points, compute a model with k < n piecewise segments approximating T.
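
A minimal sketch with equal-width segments, each fitted by a least-squares line via numpy (real segmentation algorithms choose the breakpoints adaptively):

import numpy as np

def piecewise_linear(x, k):
    # Approximate series x with k equal-width linear segments.
    approx = np.empty(len(x))
    for idx in np.array_split(np.arange(len(x)), k):
        slope, intercept = np.polyfit(idx.astype(float), x[idx], deg=1)
        approx[idx] = slope * idx + intercept
    return approx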

29
Q

Measuring Similarity

A

Whole series matching: compare two time series T and T’ using their entire length.

Subsequence matching: given a long sequence T and a query subsequence Q, find the subsequences of T that match Q.

30
Q

Euclidean Distance for Time Series

A

It’s the Euclidean distance between corresponding points x_t of T and T’, for series of equal length.

The series are usually standardised first, but the method doesn’t account for acceleration and deceleration along the time axis.
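
In the standard formulation, for series T = (x_1,…,x_n) and T’ = (x’_1,…,x’_n):

D(T, T’) = \sqrt{\sum_{t=1}^{n} (x_t - x’_t)^2}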

31
Q

Dynamic Time Warping

A

Warp the time axis of one or both sequences for better alignment.
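
A minimal dynamic-programming sketch of DTW in Python (quadratic time; practical implementations add windowing constraints):

import numpy as np

def dtw_distance(a, b):
    # D[i, j] = cost of the best alignment of a[:i] with b[:j].
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0: warping absorbs the shift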

32
Q

Longest Common Subsequences

A

Match two sequences with potentially unmatched elements.

33
Q

Indexing

A

Given a time series T and similarity measure D(T,T’), find the most similar time series to T among a set of different time series.

34
Q

Anomaly Detection

A

Given a time series and a model of its behaviour, identify subsequences with anomalies.

35
Q

Motif Discovery

A

Find subsequences, called motifs, that appear repeatedly in a longer time series.

36
Q

Survival Analysis or Time-to-Event Analysis

A

Focuses on estimating when an important event will happen.

Examples:
- Churn: customers who stop using a company’s service, typically moving to another company that offers them better services or prices.
- Tenure: the length of time that an individual remains a customer of a company.

37
Q

Survival Curve

A

This plot starts at 100% and keeps decreasing. It’s similar in concept to the half-life of an atom. It is used to predict the tenure of a customer. Survival curves may or may not reach 0%.

38
Q

Survival Function

A

S(t) = P(T > t)

T = survival time (the time until the event of interest)

39
Q

Discrete Survival Time Data

A

Time is partitioned into contiguous, non-overlapping time intervals specified by the time points t_0 = 0 < t_1 < … < t_{\ell}.

The time intervals are (t_0, t_1], (t_1, t_2], …, (t_{\ell - 1}, t_{\ell}].

40
Q

Hazard for Discrete Survival Time Data

A

The probability h_i that the subject of interest will fail during (t_{i-1}, t_i], given survival at least until t_{i-1}:

h_i = P(t_{i-1} < T \le t_i \mid T > t_{i-1})

Hazard functions may increase or decrease over time.

41
Q

Kaplan-Meier Estimator of the Survival Function

A

\hat{S}(t) = \prod_{i: t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)

d_i = number of events (failures) at time t_i
n_i = number of subjects at risk just before t_i

Kaplan-Meier survival curves decrease over time; when combined with survival trees, each curve models a different leaf of the tree.
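
A from-scratch numpy sketch of the estimator (the toy durations and event indicators are hypothetical; the lifelines library offers a production implementation):

import numpy as np

def kaplan_meier(durations, observed):
    # Returns the distinct event times t_i and the estimated survival S(t_i).
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    times = np.unique(durations[observed])
    s, survival = 1.0, []
    for t in times:
        n_i = np.sum(durations >= t)               # subjects at risk just before t_i
        d_i = np.sum((durations == t) & observed)  # events at t_i
        s *= 1.0 - d_i / n_i
        survival.append(s)
    return times, np.array(survival)

# Toy data: 1 = event observed, 0 = censored.
t, s = kaplan_meier([2, 3, 3, 5, 7, 8], [1, 1, 0, 1, 0, 1])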

42
Q

Survival Trees

A

A single tree groups subjects according to their survival behaviour, based on covariates.