Exam 3 Flashcards

1
Q

What makes time series data different from what we have studied so far in this course?

A

The observations are recorded in time order (each has a time or date), so the order of the data matters

2
Q

What are the major components of a time-series signal?

A

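Level: Average value of the series
Trend: Whether the series is increasing, decreasing, or static over time
Seasonality: Repetition in the data at a regular interval
Noise: Random variation not explained by the other components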

3
Q

What are the two models we can use to understand the major components?

A

Multiplicative: signal = level × trend × seasonality × noise
Additive: signal = level + trend + seasonality + noise
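
As a rough illustration of the two models, the sketch below builds one synthetic series each way and compares the size of the seasonal swing in year 1 and year 2. All numbers (level of 100, 12-step period, 2% growth per step) are made up for the example, and noise is left out to keep the comparison exact.

```python
import numpy as np

t = np.arange(24)                       # 24 monthly time steps (hypothetical)
season = np.sin(2 * np.pi * t / 12)     # one repetition every 12 steps

# Additive: components are summed; seasonality is an offset centred on 0.
level, trend_add, season_add = 100.0, 2.0 * t, 10.0 * season
y_add = level + trend_add + season_add

# Multiplicative: components are multiplied; trend and seasonality
# are factors centred on 1.
trend_mul, season_mul = 1.02 ** t, 1.0 + 0.1 * season
y_mul = level * trend_mul * season_mul


def swing(values):
    """Peak-to-trough size of a stretch of the series."""
    return values.max() - values.min()


# Remove level and trend, then measure the seasonal swing per year.
seasonal_part_add = y_add - (level + trend_add)
seasonal_part_mul = y_mul - level * trend_mul
swing_add = [swing(seasonal_part_add[:12]), swing(seasonal_part_add[12:])]
swing_mul = [swing(seasonal_part_mul[:12]), swing(seasonal_part_mul[12:])]
```

In the additive series the swing stays the same from year to year, while in the multiplicative series it grows as the series grows.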

4
Q

How do we choose between multiplicative and additive models?

A

When to use multiplicative model: when the size of the repetition changes over time
When to use additive model: when the size of the repetition stays roughly the same over time (the multiplicative condition isn’t true)

5
Q

If a signal does not have seasonality, what should we expect to see in the graphs for the two models?

A

Additive = seasonality would be close to 0
Multiplicative = seasonality would be close to 1
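
To see why the two baselines are 0 and 1, here is a minimal sketch that estimates a seasonal component for a synthetic series with level, trend, and noise but no seasonality. It is a simplified stand-in for a real decomposition: the trend is estimated with a straight-line fit rather than a moving average, and the period of 12 and all values are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 12
t = np.arange(5 * period)

# Level + trend + noise only: no seasonal pattern is added.
y = 100.0 + 0.5 * t + rng.normal(0.0, 0.5, size=t.size)

# Estimate level and trend with a straight-line fit.
slope, intercept = np.polyfit(t, y, deg=1)
fitted = intercept + slope * t

# Average what is left over at each position within the period.
position = t % period
seasonal_add = np.array([(y - fitted)[position == p].mean() for p in range(period)])
seasonal_mul = np.array([(y / fitted)[position == p].mean() for p in range(period)])

print(seasonal_add.round(2))  # additive: leftovers are differences, near 0
print(seasonal_mul.round(3))  # multiplicative: leftovers are ratios, near 1
```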

6
Q

Which components are present in all signals, and which are not guaranteed?

A

All series have level and noise; trend and seasonality are not guaranteed

7
Q

Why is K-means considered unsupervised learning?

A

We do not know which group the data belongs to before clustering

8
Q

What are the steps for K-means clustering?

A

Step 1: Randomly select K observations as initial cluster centroids (centers)
Step 2: Use a distance (similarity) metric to assign each observation to one of K clusters
Step 3: Recalculate each cluster centroid (center)
Step 4: If any data points changed clusters in Step 2 AND we have not reached our max iterations, go back to Step 2
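
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library implementation: the function name `kmeans`, its parameters, and the toy data are all assumptions made for the example, and Euclidean distance is used as the metric.

```python
import numpy as np


def kmeans(X, k, max_iter=100, seed=0):
    """Minimal K-means sketch following the four steps on this card."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k distinct observations as initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step 2: assign each observation to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 4: stop once no point changed cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recalculate each centroid as the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids


# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids = kmeans(X, k=2)
```

Which group ends up with label 0 depends on the random starting centroids, so only the grouping is meaningful, not the label numbers.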

9
Q

What is the difference between the K in KNN and the K in K-means?

A

K in k-means: number of clusters
K in KNN: number of neighbors to compare for class assignment

10
Q

What is convergence?

A

Convergence: no data points change clusters from one iteration to the next

11
Q

How do we understand the similarity of a datapoint to each cluster center?

A

Most similar = smallest distance
* We use Euclidean distance. Because we are calculating a distance, the features must be numeric for K-means clustering
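
As a small worked example, the sketch below computes the Euclidean distance from one data point to three cluster centers and picks the closest. The point and the centers are made-up values.

```python
import numpy as np

point = np.array([2.0, 3.0])
centroids = np.array([[0.0, 0.0],
                      [2.0, 4.0],
                      [5.0, 7.0]])

# Euclidean distance: square root of the summed squared feature differences.
distances = np.sqrt(((centroids - point) ** 2).sum(axis=1))

# The most similar cluster is the one with the smallest distance.
nearest = int(distances.argmin())
```

Here the distances are roughly 3.61, 1.0, and 5.0, so the point is assigned to the second cluster (index 1).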

12
Q

What do we use for predictors when using linear regression on time-series data?

A

Trend and seasonality

13
Q

How do we use seasonality as a predictor?
* What do we need to know to create a sub-interval for one repetition?

A

We need the period: the number of time steps in one repetition (the size of the sub-interval)
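
One possible way to turn the period into a seasonality predictor is to label each time step with its position inside the repetition and then encode that position as 0/1 columns. The sketch below assumes quarterly data with a period of 4; the variable names are illustrative only.

```python
import numpy as np

period = 4              # assumed: quarterly data that repeats every year
t = np.arange(12)       # 12 time steps = 3 full repetitions

# Position of each time step inside its repetition: 0, 1, 2, 3, 0, 1, ...
position = t % period

# One 0/1 column per position, usable as seasonality predictors.
season_dummies = np.eye(period)[position]
```

The time-step index `t` can then serve as the trend predictor alongside these seasonal columns.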

14
Q

How are the training and validation sets different for time-series datasets?

A

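They must be kept in chronological order: train on the earlier time steps and validate on the later ones, rather than splitting at random. The linear regression model also assumes the relationship between time steps is linear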
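
For the training/validation question above, a chronological split can be sketched as a simple slice, assuming the observations are already sorted oldest first. The 36 placeholder values and the 12-step hold-out are assumptions for the example.

```python
import numpy as np

# Placeholder for 36 monthly observations, sorted oldest first.
y = np.arange(36, dtype=float)

n_valid = 12                 # hold out the most recent year
train = y[:-n_valid]         # earlier time steps
valid = y[-n_valid:]         # later time steps, never shuffled into training
```

Every validation point comes after every training point, which mirrors how a forecast is actually used.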

15
Q

How does the time step affect the linear regression models for forecasting? (give example)

A

If we use quarters to represent seasonality, we end up with 4 linear models
* If we choose to use months, we end up with 12 linear models
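
The "4 linear models" idea can be illustrated with a regression that has one trend column and one 0/1 column per quarter. This is a sketch under the assumption that the four lines share a slope and differ only in their intercepts; the cards do not say whether the course lets the slope vary by quarter. The data is synthetic and noise-free.

```python
import numpy as np

period = 4
t = np.arange(20, dtype=float)                  # 5 years of quarterly steps
quarter = (t % period).astype(int)

# Synthetic signal: level 50, slope 1.5, plus a fixed effect per quarter.
quarter_effect = np.array([5.0, -2.0, 0.0, 8.0])
y = 50.0 + 1.5 * t + quarter_effect[quarter]

# Predictors: the trend (t) and one dummy column per quarter.
X = np.column_stack([t, np.eye(period)[quarter]])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
slope, intercepts = coef[0], coef[1:]

# Each quarter gets its own line: y = intercepts[q] + slope * t
next_t = 20.0
forecast = intercepts[int(next_t) % period] + slope * next_t
```

With monthly seasonality the same construction would use `period = 12` and produce 12 intercepts.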

16
Q

How does linear regression differ for time-series data from traditional linear regression?
* Hint: How many lines are used to estimate the target?

A

Instead of a single line, we fit one line per seasonal sub-interval. The number of lines depends on the time step used to represent seasonality (e.g., 4 for quarters, 12 for months)

17
Q

If seasonality does not exist, what do we expect to see in the seasonality graph?

A

A flat, straight line in the seasonality graph

18
Q

What is the significant difference between the multiplicative and additive models?

A

Level and trend will be the same in both models. The difference shows up in the seasonality and noise graphs (compare their y-axes)

19
Q

Which signal components are always guaranteed, and which are not?

A

Guaranteed = noise and level
Not guaranteed = trend and seasonality

20
Q

How do time-series models differ from regular linear regression?

A

Time (seasonality) is accounted for, and we fit multiple lines instead of one to estimate the repetition. The training and validation sets must also be in chronological order

21
Q

K-means is used to classify datapoints. (True or False)

A

False

22
Q

In linear regression, to determine the repetition sub-interval, we only need to know the time-step of the dataset.

A

False

23
Q

K-means always converges to an ideal grouping.

A

False

24
Q

If linear regression is applied to a time-series dataset without seasonality, it will produce the same results as regular linear regression (i.e., MLR)

A

True

25
Q

Noise graphs from both the additive and multiplicative models can be used to choose which decomposition model to use.

A

True

26
Q

In K-means clustering, we group data based on an expected target.

A

False

27
Q

Ideally, in clustering, the distance between centroids is minimized.

A

False

28
Q

Your goal is to understand sales and demographic data from eight different store locations and identify the differences between high-performing stores and low-performing stores. Which model would be best?

A

KMeans

29
Q

You run a landscaping company and have tracked the last 3 years of demand for your services. What models can you use to help predict the demand for year 4?

A

Linear Regression

30
Q

Select everything we need to know in order to use multiple linear regression with a time series signal

A

Trend
Seasonality
Time-step