Chapter 6 - Statistical Data Flashcards

Question 1

Q

We collect data from ___________ and use them to ____________.

Answer

A

real-world experiences, draw conclusions

Question 2

Q

We choose a ___________ to study, collect measurements from a small representative subset or ________ of that population, then apply our findings back to the larger population to assess if they ____.

Answer

A

population, sample, fit

Question 3

Q

What is a simple model?

Answer

A

An average or a mean

Question 4

Q

What is one way to calculate a simple model?

Answer

A

Calculate an average, then look at the difference between each data point and the average value.
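
A minimal sketch of this answer in Python, using only the standard library. The five measurements are made up for illustration and are not from the chapter.

```python
from statistics import mean

# Hypothetical sample: five made-up measurements.
data = [2.0, 4.0, 4.0, 5.0, 10.0]

# The simple model: one number (the mean) stands in for every observation.
model = mean(data)

# Difference between each data point and the model.
deviations = [x - model for x in data]

print(model)       # 5.0
print(deviations)  # [-3.0, -1.0, -1.0, 0.0, 5.0]
```

The deviations always sum to zero, which is why they are squared before being added up.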

Question 5

Q

How can we observe how well our data fits the model?

Answer

A

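Take the sum of the squared differences between each data point and the mean (the sum of squares). Dividing that sum by n - 1 gives the variance, and the square root of the variance is the Standard Deviation.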
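
A sketch of the path from squared differences to the Standard Deviation, assuming the usual sample formula that divides by n - 1. The data are the same made-up values as before, chosen so the arithmetic is easy to follow.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample, chosen so the numbers come out round.
data = [2.0, 4.0, 4.0, 5.0, 10.0]
m = mean(data)

# Sum of squared differences from the mean: 9 + 1 + 1 + 0 + 25.
sum_of_squares = sum((x - m) ** 2 for x in data)

# Variance: sum of squares divided by n - 1 (sample formula).
variance = sum_of_squares / (len(data) - 1)

# Standard deviation: square root of the variance.
sd = sqrt(variance)

print(sum_of_squares, variance, sd)  # 36.0 9.0 3.0
print(stdev(data))                   # same value from the standard library
```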

Question 6

Q

How do you quantify model accuracy?

Answer

A

Outcome = Model + Error. The smaller the error term, the more accurately the model represents the data.

Question 7

Q

What does the shape of the frequency distribution indicate about the deviation around the average? There are two main takeaways.

Answer

A

Flat Distribution = More deviation
Skewed Distribution = a few outliers pulling the average up or down
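
A small illustration of the second takeaway, with made-up numbers: a single outlier drags the mean upward while the median stays put.

```python
from statistics import mean, median

# Hypothetical data sets: identical except for the last value.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
skewed = [1.0, 2.0, 3.0, 4.0, 100.0]

print(mean(symmetric), median(symmetric))  # 3.0 3.0
print(mean(skewed), median(skewed))        # 22.0 3.0
```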

Question 8

Q

What does the Standard Deviation measure?

Answer

A

The amount of variation among the individuals you sampled.

Question 9

Q

How does the Standard Deviation measure variation among values?

Answer

A

By comparing each measured value to the mean of all values.

Question 10

Q

How do you estimate the Standard Error?

Answer

A

Take a bunch of subsamples
Calculate the mean of each subsample
Calculate the standard deviation of the subsample means
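
The three steps above can be sketched as a simulation. The population, the sample size of 25, and the 2,000 subsamples are all assumptions made for illustration. The last lines show the common single-sample shortcut, SD divided by the square root of n, for comparison.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values with mean 50 and SD 10.
population = [random.gauss(50, 10) for _ in range(10_000)]

sample_size = 25
n_subsamples = 2_000

# Steps 1 and 2: take many subsamples and calculate the mean of each.
subsample_means = [
    mean(random.sample(population, sample_size)) for _ in range(n_subsamples)
]

# Step 3: the standard deviation of those means is the standard error.
standard_error = stdev(subsample_means)

# Shortcut when only one sample is available: s / sqrt(n).
one_sample = random.sample(population, sample_size)
shortcut = stdev(one_sample) / sqrt(sample_size)

print(standard_error, shortcut)
```

With these settings both numbers should land near 10 / sqrt(25) = 2, though the single-sample shortcut is noisier.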

Question 11

Q

What is a Confidence Interval?

Answer

A

The upper and lower limits of a range that, across repeated samples, would contain the true population mean 95% of the time.

Question 12

Q

What do we use to find the Confidence Intervals?

Answer

A

A formula that builds a range around the sample mean from the Standard Error. For a 95% interval this is roughly the mean plus or minus 1.96 Standard Errors.
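
One common formula, sketched under the assumption that the normal approximation is acceptable, is the sample mean plus or minus 1.96 Standard Errors. The function name `mean_confidence_interval` and the data are hypothetical. For small samples a t critical value would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    return m - z * se, m + z * se

# Hypothetical sample (mean 5, SD 3).
low, high = mean_confidence_interval([2.0, 4.0, 4.0, 5.0, 10.0])
print(low, high)
```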

Question 13

Q

What do Confidence Intervals indicate?

Answer

A

That we can be 95% confident the true population mean falls within that range.

Question 14

Q

What is the equation of the simplest model?

Answer

A

Linear Model: y = mx + b
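
A sketch of fitting y = mx + b by ordinary least squares, written out by hand so each term is visible. The helper `fit_line` and the data points are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx          # slope
    b = y_bar - m * x_bar  # intercept: the line passes through the two means
    return m, b

# Hypothetical measurements that lie close to a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

m, b = fit_line(xs, ys)

# Error term: what the line leaves unexplained at each point.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print(m, b)
```

For these points the slope comes out near 1.97 and the intercept near 1.09, and the residuals sum to zero up to rounding.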

Question 15

Q

True or False: Not all relationships can be modelled well using a straight line.

Answer

A

True

Question 16

Q

What are the stages of constructing a Model?

Answer

A

State a problem
Develop a hypothesis about a population
Make a prediction
Collect data from a sample
Fit a model to the data points
Test how well the model represents the data

Question 17

Q

What does it mean if a model does not explain variation (relationships) between data points very well?

Answer

A

We have less confidence in that model’s ability to predict what’s going on in the broader population.

Question 18

Q

What are two ways to compare means?

Answer

A

Independent Sample T-Tests
Analysis of Variance (ANOVA)

Question 19

Q

What does ANOVA stand for?

Answer

A

Analysis of Variance

Question 20

Q

Which method of comparing means takes “two groups that are independent of each other”?

Answer

A

Independent Sample T-Test
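
A sketch of the test statistic for two independent groups, assuming the classic pooled-variance (Student) form. The function name and data are hypothetical. Turning t into a p-value needs the t distribution, which the standard library does not provide, so that step is left out.

```python
from math import sqrt
from statistics import mean, variance

def independent_t_statistic(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their own df.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / se_diff
    return t, df

# Hypothetical groups.
t, df = independent_t_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(t, df)
```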

Question 21

Q

Which method of comparing means takes “more than two groups”?

Answer

A

Analysis of Variance
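
A sketch of the one-way ANOVA F statistic, which compares the variation between group means with the variation inside the groups. The function name and data are hypothetical, and the p-value lookup (which needs the F distribution) is not shown.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA across two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: three groups of three.
f_stat = one_way_anova_f([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0])
print(f_stat)  # 27.0
```

With exactly two groups, F equals the square of the pooled-variance t statistic, which is why ANOVA is described as the extension to more than two groups.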

Question 22

Q

When comparing means, how can you determine whether the difference between the means is “real”?

Answer

A

By analyzing the p-value.

Question 23

Q

What does the p-value indicate in comparing differences in means?

Answer

A

Whether the difference is likely to be real. It takes into account:
- the magnitude of the difference in means
- the sample size within each group, and
- the variation of values in each group
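
One way to see how those three ingredients combine is a permutation test. This is a swapped-in technique used here for illustration, not the t-test or ANOVA formula itself: shuffle the group labels many times and count how often chance alone gives a difference in means at least as large as the observed one. The function name, defaults, and data are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign values to the two groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / n_permutations

# Hypothetical groups with clearly different means.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0]
print(permutation_p_value(a, b))
```

A larger gap between means, bigger groups, or less spread within each group all make extreme shuffles rarer, which lowers the p-value.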

Question 24

Q

The __________ tells you whether the difference between groups is probably just a function of random chance or whether it’s something that likely holds true throughout the population.

Answer

A

p-value

Question 25

Q

How can you test whether two numeric variables are significantly related to one another?

Answer

A

By using a correlation coefficient or linear regression analysis.
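
A sketch of the Pearson correlation coefficient computed by hand, plus the t statistic commonly used to judge its significance (with n - 2 degrees of freedom). The function names and data are hypothetical, and the p-value lookup is not shown.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric variables."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_t_statistic(r, n):
    """t statistic for testing r against zero; undefined when |r| is 1."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(xs, ys)
print(r, correlation_t_statistic(r, len(xs)))
```

An r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none.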