WINTER BREAK REVIEW VOCAB Flashcards

1
Q

What points have leverage in regression?

A

Those with x-values far to the left or right of x-bar

2
Q

What points are outliers in regression?

A

Those that don’t follow the flow (points with large residuals, far from the overall pattern).

3
Q

What points have influence in regression?

A

Those that would change the slope if removed (typically outliers that also have leverage)

4
Q

Interpret r^2 ?

A

The percent of variability in Y explained by the model with X

5
Q

Interpret y-intercept?

A

When X=0, the model predicts this much Y.

6
Q

Interpret SLOPE

A

For every 1-unit increase in x, the model predicts a change of SLOPE units in y

7
Q

Find P( Z > 1.5) ?

A

normcdf( 1.5, 9999)
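As a cross-check away from the calculator, the same upper-tail area can be sketched in Python. This assumes Python 3.8+ and its standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original card.

```python
from statistics import NormalDist

# Standard normal model: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# P(Z > 1.5) is the area to the right of 1.5,
# which is 1 minus the area to the left of 1.5.
upper_tail = 1 - z.cdf(1.5)

print(round(upper_tail, 4))  # 0.0668
```

The 9999 in the calculator entry is just a stand-in for "infinity"; subtracting the left-hand area from 1 plays the same role here.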

8
Q

Describe independence and association with quantitative examples.

A

Height and IQ are independent. Height and weight are associated.

9
Q

Describe independence and association with categorical examples.

A

Grade and pizza preference are independent; gender and gaming status are associated.

10
Q

What function finds the area under a normal curve?

A

normcdf

11
Q

What function finds a percentile in a normal model?

A

invNorm
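A rough Python counterpart, for comparison only: the sketch below assumes Python 3.8+ and uses `statistics.NormalDist.inv_cdf` to go from an area (percentile) back to a z-score. The 90th percentile is an arbitrary example choice.

```python
from statistics import NormalDist

# Standard normal model: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# z-score that has 90% of the area to its left (the 90th percentile).
z_90 = z.inv_cdf(0.90)

# Feeding that z-score back into cdf should recover the area 0.90.
area_check = z.cdf(z_90)

print(round(z_90, 3), round(area_check, 3))
```

The inverse function and the area function undo each other, which is why the second printed value comes back to the starting area.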

12
Q

What are the measures of spread we use?

A

standard deviation, variance, range, interquartile range, standard error

13
Q

What are the measures of center we use?

A

mean, median, mode

14
Q

How do you describe the distribution of a single data set (a histogram)?

A

SHAPE (#modes, skewness), CENTER (measure of center), SPREAD (measure of spread), STRANGE (outliers or gaps)

15
Q

How do you describe an association between two quantitative variables? (scatter plot)

A

DIRECTION (pos/neg) FORM (linear,curved) STRENGTH (strong, moderate, report “r” value)

16
Q

What does rSy/Sx mean?

A

It is the slope formula: slope = r(Sy/Sx). For each 1 SD increase in x, the predicted y changes by r SDs of y.
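To see that r times Sy/Sx really is the least-squares slope, here is a small sketch in Python. The data are made up for illustration, and the variable names are assumptions rather than anything from the original card.

```python
from statistics import mean, stdev

# Made-up illustrative data (not from the original cards).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)

# Correlation r: sum of the products of z-scores, divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)) / (n - 1)

# Slope from the flashcard formula.
slope_from_formula = r * s_y / s_x

# Least-squares slope computed directly from the deviations.
slope_direct = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
                / sum((xi - x_bar) ** 2 for xi in x))

print(round(slope_from_formula, 4), round(slope_direct, 4))
```

Both routes should agree up to rounding, since the formula is just the direct calculation rewritten in terms of r and the two standard deviations.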

17
Q

What does SD of residuals tell us?

A

Average distance to the model. About how far off we expect model to be.
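A minimal sketch of that calculation in Python, using made-up data. It assumes the common convention of dividing the squared residuals by n - 2 for a least-squares line; all names are illustrative.

```python
from math import sqrt
from statistics import mean

# Made-up illustrative data (not from the original cards).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = mean(x), mean(y)

# Least-squares line: y-hat = intercept + slope * x.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Residual = actual y minus predicted y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# SD of residuals, assuming the n - 2 divisor.
s_e = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(round(s_e, 4))
```

The result is in the units of y, so it reads as "predictions from this line are typically off by about this much."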

18
Q

What graphs for QUANTITATIVE data?

A

histogram, box/whisker, stemplot, dot plot, ogive, time plot, line graph

19
Q

What graphs for CATEGORICAL data?

A

segmented bar, bar, pie, mosaic

20
Q

Difference between standard deviation and standard error?

A

Standard deviation is the typical distance to the mean for a data point; standard error is the typical distance to the parameter for a statistic in a sampling distribution.

21
Q

What is variance?

A

A measure of spread: the average squared distance to the mean. It equals SD^2.

22
Q

What is a Z score?

A

the number of SD a data value is away from the mean
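As a quick worked example, the sketch below turns that definition into a small Python function. The function name and the numbers (a value of 130 with mean 100 and SD 15) are made up for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# A value of 130 in a distribution with mean 100 and SD 15.
z = z_score(130, 100, 15)

print(z)  # 2.0
```

A positive result means the value sits above the mean; a negative one means it sits below.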

23
Q

What is a test statistic?

A

The number of SE a statistic is away from the hypothesized parameter.

24
Q

What is the formula for nCr?

A

n! / (r! (n-r)!)
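The factorial formula can be checked against a built-in in Python. This sketch assumes Python 3.8+ for `math.comb`; the helper name `n_choose_r` is hypothetical.

```python
from math import comb, factorial

def n_choose_r(n, r):
    """nCr from the factorial formula: n! / (r! * (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

# Number of ways to choose 2 items from 5, computed two ways.
print(n_choose_r(5, 2), comb(5, 2))  # 10 10
```

Note that both r! and (n - r)! belong in the denominator.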

25
Q

What is margin of error?

A

Distance you reach up and down when making CI. It is CRIT * SE
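Here is a hedged sketch of CRIT * SE for a one-proportion interval in Python (3.8+). The sample values (p-hat = 0.60, n = 150) and the 95% confidence level are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Made-up sample results (not from the original cards).
p_hat, n = 0.60, 150

# Critical value for 95% confidence: leaves 2.5% in each tail.
crit = NormalDist().inv_cdf(0.975)

# Standard error of a sample proportion.
se = sqrt(p_hat * (1 - p_hat) / n)

# Margin of error, then reach down and up from the statistic.
margin = crit * se
interval = (p_hat - margin, p_hat + margin)

print(round(margin, 4), tuple(round(end, 4) for end in interval))
```

The interval is always statistic minus margin to statistic plus margin, so the margin of error is half the interval's width.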

26
Q

What is error?

A

Distance from a statistic to the parameter. How far off your stat is from the truth.

27
Q

What is a confidence interval?

A

A parameter catcher. It tries to catch the truth.

28
Q

What does “95% confident” mean?

A

If you took 100 samples and made 100 confidence intervals, about 95 would contain the parameter and about 5 would not.
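A small simulation can make this concrete. The sketch below assumes a normal population with a known mean and SD (values made up), builds many 95% z-intervals for the mean, and counts how often they catch the true mean. It uses a fixed random seed so the run is repeatable.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)

# Assumed population (made up for illustration).
MU, SIGMA, N = 50.0, 10.0, 40

CRIT = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence
SE = SIGMA / sqrt(N)                 # standard error of the sample mean

def interval_catches_mu():
    """Draw one sample, build one interval, report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    return x_bar - CRIT * SE <= MU <= x_bar + CRIT * SE

trials = 1000
hits = sum(interval_catches_mu() for _ in range(trials))
coverage = hits / trials

print(coverage)
```

The printed proportion should land near 0.95, though not exactly on it, because the "95%" describes the long-run success rate of the method rather than any single interval.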

29
Q

What is alpha?

A

It is the rejection threshold. Reject Ho when p-value is below alpha.

30
Q

What is a p-value?

A

The probability of obtaining your statistic, or one more extreme, by chance alone if the null were actually true.
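As an illustration, this Python sketch (3.8+) computes a test statistic and a one-sided p-value for a one-proportion z-test. The scenario is made up: Ho says p = 0.50, Ha says p > 0.50, and a sample of 100 gives p-hat = 0.56.

```python
from math import sqrt
from statistics import NormalDist

# Made-up scenario (not from the original cards).
p0 = 0.50      # hypothesized parameter under Ho
p_hat = 0.56   # observed sample proportion
n = 100

# Standard error uses the hypothesized p, since we assume Ho is true.
se = sqrt(p0 * (1 - p0) / n)

# Test statistic: number of SEs the statistic is from the hypothesized value.
z_stat = (p_hat - p0) / se

# One-sided p-value for Ha: p > p0 (area to the right of z_stat).
p_value = 1 - NormalDist().cdf(z_stat)

print(round(z_stat, 2), round(p_value, 4))
```

With a p-value this size (about 0.115, above 0.05), this made-up sample would not give enough evidence to reject Ho.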

31
Q

Suppose p value = 0.003. How would you interpret?

A

With a p-value this low (0.003 < 0.05), I reject the Ho, there is enough evidence to say [Ha in context]

32
Q

What are the sample size requirements for inference for both means and proportions?

A

1. You need a random sample. 2. (Not too big) The sample must be less than 10% of the population: 10n < N. 3. (Big enough) For means, n>30 unless the population is normalish. For props, np>10 and nq>10.

33
Q

Minimum sample size for means?

A

If population is normalish, then there is no minimum sample size. If it is skewed or bimodal or any other non-normal distribution, then n>30.

34
Q

Minimum sample size for proportions?

A

You need at least 10 successes, np>10, and at least 10 failures, nq>10

35
Q

What is the golden sentence?

A

I was curious about a population parameter, but a census was too costly, so instead I took a sample, used the data to calculate a statistic, and then made an inference about the parameter with that statistic.

36
Q

What is probability?

A

Long run relative frequency. (the long run percent)

37
Q

What is the Law of Large Numbers?

A

In the long run, after many many trials, the % of successes approaches the true probability. Think: if you flip a coin twice, you may get 0% heads, 50% heads or 100% heads. If you flip 10,000 times, you probably will have about 50% heads (def not 0 or 100)
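The coin-flip idea can be sketched as a quick simulation in Python. The helper name, the seed, and the flip counts are illustrative assumptions.

```python
import random

random.seed(42)

def percent_heads(flips):
    """Flip a fair coin the given number of times; return the proportion of heads."""
    heads = sum(random.random() < 0.5 for _ in range(flips))
    return heads / flips

# Few flips: the proportion can be far from 0.5.
small_run = percent_heads(2)

# Many flips: the proportion settles close to the true probability 0.5.
large_run = percent_heads(10_000)

print(small_run, large_run)
```

The small run can only be 0, 0.5, or 1, while the large run should sit within a couple of percentage points of 0.5.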

38
Q

What is a critical value?

A

About 1 for 68% confidence, 2 for 95%, and 3 for 99.7% (more precisely, 1.96 for 95%). It is the number of SE you want to reach out in a confidence interval.

39
Q

Where are outliers located in a data set ?

A

Outside the fences: below the lower fence, Q1 - 1.5(IQR), or above the upper fence, Q3 + 1.5(IQR).
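The fence rule can be sketched in Python (3.8+) as below. The data set is made up, and `statistics.quantiles` with its default method is only one quartile convention; calculators and textbooks sometimes compute Q1 and Q3 slightly differently, so treat the quartiles here as an assumption.

```python
from statistics import quantiles

# Made-up data set with one suspiciously large value.
data = [2, 4, 5, 6, 7, 8, 9, 10, 30]

# Quartiles using the library's default ("exclusive") method.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Fences: 1.5 IQRs below Q1 and above Q3.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers are the values outside the fences.
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(lower_fence, upper_fence, outliers)  # -3.0 17.0 [30]
```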

40
Q

What is a sampling distribution?

A

A pile of statistics taken from many many many samples

41
Q

What are the two sampling distributions we have discussed?

A

MEANS: N(mu, sigma/sqrt(n)) and PROPORTIONS: N(p, sqrt(pq/n))
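The two spreads in those models can be computed with a short Python sketch. The function names and the example numbers are assumptions for illustration only.

```python
from math import sqrt

def sd_of_sample_means(sigma, n):
    """Spread of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def sd_of_sample_proportions(p, n):
    """Spread of the sampling distribution of a proportion: sqrt(p * q / n)."""
    q = 1 - p
    return sqrt(p * q / n)

# Made-up examples: sigma = 12 with n = 36, and p = 0.5 with n = 100.
spread_means = sd_of_sample_means(12, 36)
spread_props = sd_of_sample_proportions(0.5, 100)

print(spread_means, round(spread_props, 4))
```

In both formulas n sits under a square root, so quadrupling the sample size only cuts the spread in half.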

42
Q

When we combine random variables, what do we add?

A

Add means and add variances (for independent random variables). DO NOT ADD ST DEV. You add variances and take the square root of the sum to find the combined SD.
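A tiny worked example in Python shows why adding standard deviations gives the wrong answer. The means and SDs are made up, and the sketch assumes the two random variables are independent.

```python
from math import sqrt

# Made-up independent random variables X and Y.
mean_x, sd_x = 10, 3
mean_y, sd_y = 20, 4

# Means add directly.
mean_sum = mean_x + mean_y

# Variances add; standard deviations do not.
var_sum = sd_x ** 2 + sd_y ** 2
sd_sum = sqrt(var_sum)

print(mean_sum, var_sum, sd_sum)  # 30 25 5.0
```

The combined SD is 5, not 3 + 4 = 7.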