Exam 3 Flashcards

1
Q

What makes time series data different from what we have studied so far in this course?

A

Each observation is tied to a time or date, so the order of the observations matters

2
Q

What are the major components of a time-series signal?

A

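Level: Average value of the series
Trend: Whether the series is increasing, decreasing, or static over time
Seasonality: A pattern that repeats at a regular interval
Noise: Random variation not explained by the other components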

3
Q

What are the two models we can use to understand the major components?

A

Multiplicative: the components are multiplied together
Additive: the components are added together

4
Q

How do we choose between multiplicative and additive models?

A

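When to use the multiplicative model: when the size of the repetition (the seasonal swing) changes over time, for example growing as the series rises
When to use the additive model: when that is not true, i.e. the size of the repetition stays roughly constant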
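
The difference between the two models can be sketched in a few lines of Python. This is only an illustration: the level, slope, period, and amplitudes are made-up values, the seasonal shape is assumed to be a sine wave, and noise is left out.

```python
import math

LEVEL, SLOPE, PERIOD = 100.0, 5.0, 4  # hypothetical values for illustration

def baseline(t):
    # Level plus a linear trend.
    return LEVEL + SLOPE * t

def pattern(t):
    # A repeating shape that completes one cycle every PERIOD steps.
    return math.sin(2 * math.pi * t / PERIOD)

def additive(t, amplitude=10.0):
    # Seasonality is added: its size does not depend on the baseline.
    return baseline(t) + amplitude * pattern(t)

def multiplicative(t, amplitude=0.1):
    # Seasonality is a factor around 1: its size scales with the baseline.
    return baseline(t) * (1 + amplitude * pattern(t))

def seasonal_swing(model, start):
    # Peak-to-trough range over one period after removing the baseline.
    detrended = [model(t) - baseline(t) for t in range(start, start + PERIOD)]
    return max(detrended) - min(detrended)
```

Under these assumptions, `seasonal_swing(additive, 0)` and `seasonal_swing(additive, 40)` are equal, while `seasonal_swing(multiplicative, 40)` is larger than `seasonal_swing(multiplicative, 0)` because the baseline has grown.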

5
Q

If a signal does not have seasonality, what should we expect to see in the graphs for the two models?

A

Additive = the seasonality graph would stay close to 0 (adding 0 changes nothing)
Multiplicative = the seasonality graph would stay close to 1 (multiplying by 1 changes nothing)
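
A small sketch of why the neutral values are 0 and 1. It assumes the baseline (level plus trend) is already known, which real decomposition tools have to estimate; the function name `seasonal_indices` and the example numbers are hypothetical.

```python
def seasonal_indices(series, baseline, period, model):
    # Average the de-trended values at each position within the period.
    # additive: subtract the baseline; multiplicative: divide by it.
    sums = [0.0] * period
    counts = [0] * period
    for t, (y, b) in enumerate(zip(series, baseline)):
        detrended = y - b if model == "additive" else y / b
        sums[t % period] += detrended
        counts[t % period] += 1
    return [s / c for s, c in zip(sums, counts)]

# Hypothetical series with level and trend but no seasonality and no noise.
baseline = [100.0 + 5.0 * t for t in range(12)]
series = list(baseline)

additive_idx = seasonal_indices(series, baseline, 4, "additive")
multiplicative_idx = seasonal_indices(series, baseline, 4, "multiplicative")
```

With no seasonality in the series, every additive index is 0 and every multiplicative index is 1; with real, noisy data they would only be close to those values.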

6
Q

Which components are present in all signals, and which are not guaranteed?

A

All series have level and noise; trend and seasonality are not guaranteed

7
Q

Why is k-means considered unsupervised learning?

A

The data has no labels: we do not know which group each observation belongs to before clustering

8
Q

What are the steps for k-means clustering?

A

Step 1: Randomly select K observations as the initial cluster centroids (centers)
Step 2: Use a distance (similarity) metric to assign each observation to one of the K clusters
Step 3: Recalculate each cluster centroid (center)
Step 4: If any data points changed clusters in Step 2 AND we have not reached our max iterations, go back to Step 2
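
The steps above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name `kmeans`, the `seed` parameter, and the choice to keep the old centroid when a cluster ends up empty are assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: randomly select K observations as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each observation to the nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop when no observation changed cluster (convergence).
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recalculate each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids
```

For example, `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2)` puts the first three points in one cluster and the last three in the other; which cluster gets label 0 depends on the random start.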

9
Q

What is the difference between the K in KNN and the K in k-means?

A

K in k-means: number of clusters
K in KNN: number of neighbors to compare for class assignment
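
For contrast with k-means, here is a minimal sketch of how K is used in KNN. The function name `knn_predict` and the majority-vote rule are assumptions for illustration; note that KNN needs labeled training data, while k-means does not.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    # K here is the number of nearest labeled neighbors that vote.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    # Assign the class that the majority of the K neighbors belong to.
    return votes.most_common(1)[0][0]
```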

10
Q

What is convergence?

A

Convergence: no data point changes cluster from one iteration to the next

11
Q

How do we understand the similarity of a datapoint to each cluster center?

A

Most similar = smallest distance
We measure it with Euclidean distance. Because we are calculating a distance, the features must be numeric for k-means clustering

12
Q

What do we use for predictors when using linear regression on time-series data?

A

Trend and seasonality

13
Q

How do we use seasonality as a predictor?
What do we need to know to create a sub-interval for one repetition?

A

We need to know the period, which is the size of one repetition (for example, 4 for quarterly data that repeats every year)

14
Q

How are the training and validation sets different for time-series datasets?

A

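They must stay in chronological order: train on the earlier part of the series and validate on the later part, instead of splitting at random. (The regression itself assumes the relationship between time steps is linear.)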
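
A minimal sketch of a training/validation split for a time series, where the data stays in chronological order and the most recent observations are held out. The function name `chronological_split` and the `n_validation` parameter are hypothetical.

```python
def chronological_split(series, n_validation):
    # Keep time order: the earliest observations train the model,
    # and the most recent n_validation observations are held out.
    if not 0 < n_validation < len(series):
        raise ValueError("n_validation must be between 1 and len(series) - 1")
    return series[:-n_validation], series[-n_validation:]
```

For example, splitting ten observations with `n_validation=3` trains on the first seven and validates on the last three, with no shuffling.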

15
Q

How does the time step affect the linear regression models for forecasting? (give example)

A

If we use quarters to represent seasonality, we end up with 4 linear models
If we choose to use months, we end up with 12 linear models
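
One way to picture "one linear model per season" is to fit a separate least-squares line to the observations at each position in the period. This is a sketch under that reading; a course may instead use a single regression with seasonal dummy variables that share one slope. The function names are hypothetical, and each season needs at least two observations.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = intercept + slope * x.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_seasonal_lines(series, period):
    # One line per position in the period (4 for quarters, 12 for months).
    models = {}
    for season in range(period):
        ts = list(range(season, len(series), period))
        models[season] = fit_line(ts, [series[t] for t in ts])
    return models

def forecast(models, t, period):
    # Pick the line that belongs to time step t's season and evaluate it.
    intercept, slope = models[t % period]
    return intercept + slope * t
```

With `period=4` the dictionary returned by `fit_seasonal_lines` holds 4 lines; with `period=12` it holds 12.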

16
Q

How does linear regression differ for time-series data from traditional linear regression?
* Hint: How many lines are used to estimate the target?

A

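Traditional linear regression uses one line. For time-series data the number of lines depends on the time step your data is in and how you want to represent it (months, days, etc.), for example 4 lines for quarters or 12 for months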

17
Q

What does the seasonality graph look like if seasonality does not exist?

A

A straight (flat) line

18
Q

What is the significant difference between the multiplicative and additive models?

A

Level and trend are going to be the same in both. The difference you want to look for is in the seasonality and noise graphs (the y-axis values differ)

19
Q

Which components of a signal are always guaranteed, and which are not?

A

Guaranteed = noise and level
Not guaranteed = trend and seasonality

20
Q

How do time-series models differ from traditional linear regression?

A

Time (seasonality) is accounted for, and we fit multiple lines instead of one in order to estimate the repetition. The training and validation sets must also be in chronological order