Chapter 6 - Statistical Data Flashcards

1
Q

We collect data from ___________ and use them to ____________.

A

real-world experiences, draw conclusions

2
Q

We choose a ___________ to study, collect measurements from a small representative subset or ________ of that population, then apply our findings back to the larger population to assess if they ____.

A

population, sample, fit

3
Q

What is a simple model?

A

An average or a mean

4
Q

What is one way to calculate a simple model?

A

Calculate an average, then look at the difference between each data point and the average value.

5
Q

How can we observe how well our data fits the model?

A

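Take the sum of the squared differences between each data point and the mean (the sum of squared errors). Dividing that sum by N - 1 gives the variance, and the square root of the variance is the Standard Deviation.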
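As a rough illustration of the two cards above, the sketch below treats the mean as the model and measures how far the data sit from it. The five data values are made up for the example, and the N - 1 divisor assumes the data are a sample rather than a whole population.

```python
import math

# Hypothetical sample, invented for illustration only.
data = [1, 2, 3, 3, 4]

# The simple model: one number (the mean) stands in for every observation.
mean = sum(data) / len(data)

# Difference between each data point and the model.
deviations = [x - mean for x in data]

# Squaring stops positive and negative deviations from cancelling out.
sum_of_squares = sum(d ** 2 for d in deviations)

# Sample variance (divide by N - 1), then its square root.
variance = sum_of_squares / (len(data) - 1)
std_dev = math.sqrt(variance)

print(mean, sum_of_squares, variance, std_dev)
```

A smaller standard deviation means the mean is a closer fit to the individual data points.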

6
Q

How do you quantify model accuracy?

A

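Outcome = Model + Error. Each observed value is what the model predicts plus some error; the smaller the error, the more accurate the model.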

7
Q

What does the shape of the frequency distribution indicate about the deviation around the average? There are two main takeaways.

A
  • Flat Distribution = More deviation
  • Skewed Distribution = a few outliers pulling the average up or down
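Both takeaways can be checked on small invented data sets. The sketch below is only an illustration: the lists `tight`, `flat`, and `skewed` are hypothetical, and the median is used purely as a reference point that outliers barely move.

```python
import statistics

# Two hypothetical data sets with the same mean (5) but different spread.
tight = [4, 5, 5, 5, 5, 6]   # scores bunched around the mean
flat = [1, 2, 4, 6, 8, 9]    # scores spread evenly, a flatter distribution

print(statistics.stdev(tight), statistics.stdev(flat))

# A hypothetical skewed data set: one outlier added to otherwise similar scores.
symmetric = [4, 4, 5, 5, 5, 6, 6]
skewed = symmetric + [40]

# The outlier drags the mean upward while the median stays put.
print(statistics.mean(symmetric), statistics.median(symmetric))
print(statistics.mean(skewed), statistics.median(skewed))
```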
8
Q

What does the Standard Deviation measure?

A

The amount of variation among the individuals you sampled.

9
Q

How does the Standard Deviation measure variation among values?

A

By comparing each value (measurement) to the mean of all values.

10
Q

How do you estimate the Standard Error?

A
  1. Take a bunch of subsamples
  2. Calculate the mean of each subsample
  3. Calculate the standard deviation of the subsample means
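The three steps above can be sketched as a small simulation. Everything here is an assumption made for the example: the population is simulated (normal, mean 100, standard deviation 15), and the subsample size and number of subsamples are arbitrary. The last lines compare the result with the usual single-sample shortcut, standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population, simulated for illustration only.
population = [random.gauss(100, 15) for _ in range(10_000)]
n = 50  # size of each subsample (arbitrary choice)

# Steps 1 and 2: take many subsamples and record the mean of each.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(2000)
]

# Step 3: the standard deviation of those means is the standard error.
se_empirical = statistics.stdev(sample_means)

# Shortcut from a single sample: s / sqrt(n).
one_sample = random.sample(population, n)
se_formula = statistics.stdev(one_sample) / math.sqrt(n)

print(se_empirical, se_formula)
```

The two numbers should land in the same neighbourhood, which is why the shortcut is normally used when only one sample is available.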
11
Q

What is a Confidence Interval?

A

A range with a lower and an upper limit, constructed so that it contains the true population value (such as the mean) in 95% of samples.

12
Q

What do we use to find the Confidence Intervals?

A

A formula based on the sample mean and the Standard Error that gives the lower and upper limits of the 95% interval.

13
Q

What do Confidence Intervals indicate?

A

That you can be 95% confident the true population mean falls within that range.
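A minimal sketch of the idea in the three Confidence Interval cards, under stated assumptions: the interval is computed as the sample mean plus or minus 1.96 standard errors (a normal approximation, which may not be the formula the chapter uses), and the population is simulated with a known mean so the intervals can be checked against it. The function name `confidence_interval` is hypothetical.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable
TRUE_MEAN = 100  # known only because the population is simulated


def confidence_interval(sample, z=1.96):
    """Return (lower, upper) limits using mean +/- z * standard error."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - z * se, m + z * se


# Draw many samples and count how often the interval captures the true mean.
trials = 1000
hits = 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, 15) for _ in range(50)]
    low, high = confidence_interval(sample)
    if low <= TRUE_MEAN <= high:
        hits += 1

coverage = hits / trials
print(coverage)
```

The printed proportion should be close to 0.95: roughly 95 out of every 100 intervals built this way contain the true population mean.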

14
Q

What is the equation of the simplest model?

A

Linear Model: y = mx + b
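To show what fitting y = mx + b can look like, here is a small least-squares sketch. Least squares is an assumption on my part (the card does not say how the line is fitted), the helper name `fit_line` is hypothetical, and the data points are invented so that they lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

m, b = fit_line(xs, ys)
print(m, b)

# Error of the fitted model for each point (all zero for this perfect line).
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print(residuals)
```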

15
Q

True or False: Not all relationships can be modelled well using a straight line.

A

True

16
Q

What are the stages of constructing a Model?

A
  1. State a problem
  2. Develop a hypothesis about a population
  3. Make a prediction
  4. Collect data from a sample
  5. Fit a model to the data points
  6. Test how well the model represents the data
17
Q

What does it mean if a model does not explain variation (relationships) between data points very well?

A

We have less confidence in that model to predict what’s going on in the broader population.

18
Q

What are two ways to compare means?

A
  1. Independent Sample T-Tests
  2. Analysis of Variance (ANOVA)
19
Q

What does ANOVA stand for?

A

Analysis of Variance

20
Q

Which method of comparing means takes “two groups that are independent of each other”?

A

Independent Sample T-Test
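As a sketch of what the test computes, the code below works out the t statistic for two independent groups. It assumes the classic pooled-variance (equal variances) form, which may differ from the variant used in the chapter; the function name `independent_t` and both groups are made up for the example. Turning t into a p-value needs the t distribution, which is left out here.

```python
import math
import statistics


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance, N - 1 divisor
    var_b = statistics.variance(b)
    df = n_a + n_b - 2
    # Weighted average of the two group variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df


# Hypothetical scores for two unrelated groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_stat, df = independent_t(group_a, group_b)
print(t_stat, df)
```

The statistic is the difference in means divided by its standard error, so it grows with a bigger difference, larger groups, or less variation within groups.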

21
Q

Which method of comparing means takes “more than two groups”?

A

Analysis of Variance
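For more than two groups, a one-way ANOVA compares the variation between group means with the variation inside the groups. The sketch below computes only the F ratio; the function name `one_way_f` and the three groups are hypothetical, and converting F into a p-value (which needs the F distribution) is not shown.

```python
def one_way_f(groups):
    """F ratio for a one-way ANOVA: between-group / within-group mean squares."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_f(groups)
print(f_ratio)
```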

22
Q

When comparing means, how can you determine whether the difference between the means is “real”?

A

By analyzing the p-value.

23
Q

What does the p-value indicate in comparing differences in means?

A

It incorporates
  • the magnitude of the difference in means
  • the sample size within each group, and
  • the variation of values in each group
to make its judgement

24
Q

The __________ tells you whether the difference between groups is probably just a function of random chance or whether it’s something that likely holds true throughout the population.

A

p-value
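One way to see the "random chance" idea is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference in means at least as large as the observed one. This is a swapped-in technique chosen because it needs no distribution tables, not necessarily the method the chapter uses; `permutation_p_value` and all data are hypothetical.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable


def permutation_p_value(a, b, n_shuffles=5000):
    """Share of random relabelings with a mean difference >= the observed one."""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_shuffles):
        random.shuffle(pooled)
        new_a = pooled[:len(a)]
        new_b = pooled[len(a):]
        diff = abs(statistics.mean(new_a) - statistics.mean(new_b))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles


# Hypothetical groups: one pair clearly separated, one pair overlapping.
far_apart = permutation_p_value([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
overlapping = permutation_p_value([4, 5, 6, 7, 8], [2, 4, 5, 6, 9])

print(far_apart, overlapping)
```

A small value (conventionally below 0.05) suggests the difference is unlikely to be random chance; a large value means chance alone produces differences that big quite often.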

25
Q

How can you test whether two numeric variables are significantly related to one another?

A

By using a correlation coefficient or linear regression analysis.
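A short sketch of the correlation coefficient, assuming Pearson's r is the coefficient meant (the card does not name one). The helper `pearson_r` and the data are invented for illustration, and testing whether r is significant would need an additional step that is not shown.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together...
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...scaled by how much each varies on its own.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
print(r)
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.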