Main Vocab Flashcards

1
Q

when describing a distribution of data, mention..

A
  1. shape
  2. center
  3. variability (spread)
  4. unusual features (outliers or gaps)
    Always include context!
2
Q

A five-number summary includes..

A

Minimum, Q1, median, Q3, maximum
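A minimal sketch of computing the five-number summary, assuming Python 3.8+ and the standard `statistics` module. The function name and data are made up for illustration, and the "inclusive" quartile method is only one of several conventions, so a calculator or textbook may report slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(data)
    # n=4 splits the data into quarters and returns the three cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Illustrative data only.
summary = five_number_summary([2, 4, 4, 5, 7, 8, 9, 10, 12])
print(summary)  # (2, 4.0, 7.0, 9.0, 12)
```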

3
Q

a z-score tells us..

A

The number of standard deviations a value lies above or below the mean.
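As a formula, z = (value − mean) / (standard deviation). A small sketch follows; the function name and data are illustrative, and it assumes the data set is the whole population (so it uses `pstdev`; swap in `stdev` for a sample).

```python
import statistics

def z_score(value, data):
    """How many standard deviations `value` lies from the mean of `data`."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # population SD; an assumption for this sketch
    return (value - mean) / sd

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, standard deviation 2
z = z_score(9, scores)
print(z)  # 2.0, so 9 is two standard deviations above the mean
```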

4
Q

the empirical rule

A

in a normal distribution:
about 68% of the values lie within 1 standard deviation of the mean
about 95% of the values lie within 2 standard deviations of the mean
about 99.7% of the values lie within 3 standard deviations of the mean
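These percentages can be checked against the standard normal curve. A short sketch using `statistics.NormalDist` (Python 3.8+); the helper name is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Prints values close to 68%, 95%, and 99.7%.
    print(k, f"{proportion_within(k):.1%}")
```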

5
Q

when describing a bivariate distribution (scatterplot), mention..

A
  1. direction (positive or negative)
  2. strength (strong or weak)
  3. form (linear or non-linear)
  4. unusual features
    Always include context!
6
Q

residual

A

Observed value - predicted value. y − ŷ
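A quick illustration of the subtraction, assuming a hypothetical regression line ŷ = 2.2 + 0.6x (the line and the data point are invented for this example).

```python
def predict(x):
    """Predicted value from a hypothetical line y-hat = 2.2 + 0.6x."""
    return 2.2 + 0.6 * x

def residual(observed_y, predicted_y):
    """Observed minus predicted: positive means the line under-predicted."""
    return observed_y - predicted_y

# Observed point (3, 5); the line predicts about 4.0 at x = 3.
r = residual(5, predict(3))
print(round(r, 2))  # 1.0
```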

7
Q

When asked to interpret the slope of a regression line in context, say..

A

“On average, there is a predicted [increase or decrease] of [slope] [units] in [the dependent variable] for every
increase of one [unit] in [the independent variable].”

8
Q

When asked to interpret the y-intercept of a regression line in context, say..

A

“On average, when the value of [the independent variable] is zero [units], [the dependent variable] is predicted
to be [y-intercept] [units].”

9
Q

When asked to interpret the coefficient of determination of a regression line, r², in context, say..

A

“[r²] percent of the variation in [the dependent variable] can be explained by the linear relationship between
[the dependent variable] and [the independent variable].”

10
Q

When writing about bias in sampling methods..

A
  1. Identify the population and sample.
  2. Explain how the sampled individuals differ from the general population
  3. Explain how this leads to an overestimate or underestimate
11
Q

Two events A and B are independent if

A

P(A|B) = P(A) and P(B|A) = P(B)
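A sketch of checking this definition with exact fractions. The events are an invented example: one roll of a fair die, A = "even" and B = "at most 4".

```python
from fractions import Fraction

def conditional(p_both, p_given):
    """P(X | Y) = P(X and Y) / P(Y)."""
    return p_both / p_given

p_a = Fraction(3, 6)        # {2, 4, 6}
p_b = Fraction(4, 6)        # {1, 2, 3, 4}
p_a_and_b = Fraction(2, 6)  # {2, 4}

independent = (conditional(p_a_and_b, p_b) == p_a
               and conditional(p_a_and_b, p_a) == p_b)
print(independent)  # True
```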

12
Q

Two events A and B are mutually exclusive if

A

P(A ∩ B) = 0

13
Q

The probability of the union of two events A and B can be found by

A

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
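A worked example of the addition rule, using an invented card scenario: drawing one card from a standard 52-card deck, with A = "heart" and B = "face card".

```python
from fractions import Fraction

def p_union(p_a, p_b, p_a_and_b):
    """General addition rule; subtracting the overlap avoids double counting."""
    return p_a + p_b - p_a_and_b

p_heart = Fraction(13, 52)
p_face = Fraction(12, 52)
p_heart_and_face = Fraction(3, 52)  # jack, queen, king of hearts

p_heart_or_face = p_union(p_heart, p_face, p_heart_and_face)
print(p_heart_or_face)  # 11/26
```

For mutually exclusive events the last term is 0, so the rule reduces to P(A) + P(B).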

14
Q

The probability of the intersection or “joint probability” of two events A and B can be found by

A

P(A ∩ B) = P(A) ∙ P(B|A)
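A worked example of the multiplication rule, using an invented scenario: drawing two cards without replacement, with A = "first card is an ace" and B = "second card is an ace".

```python
from fractions import Fraction

def p_intersection(p_a, p_b_given_a):
    """General multiplication rule: P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a

p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # one ace is already gone

p_two_aces = p_intersection(p_first_ace, p_second_ace_given_first)
print(p_two_aces)  # 1/221
```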

15
Q

Conditions for a one-sample z interval for a population proportion

A
  1. the data are collected using a random sample
  2. when sampling without replacement, the sample size is less than 10% of the population
  3. np̂ ≥ 10 and n(1 − p̂) ≥ 10
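The three conditions can be checked mechanically, as in this sketch. The function and argument names are hypothetical, and whether the sample was random is something you must judge from the study design, so it is passed in as a flag.

```python
def z_interval_conditions(successes, n, population_size, is_random_sample):
    """Return each condition for a one-sample z interval as True/False."""
    failures = n - successes
    return {
        "random": is_random_sample,
        "10% condition": n < 0.10 * population_size,
        # n * p-hat is the number of successes; n * (1 - p-hat) the failures.
        "large counts": successes >= 10 and failures >= 10,
    }

# Illustrative numbers: 60 successes in a random sample of 100 from 5000.
checks = z_interval_conditions(60, 100, 5000, True)
print(checks)
```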
16
Q

Margin of error

A

(critical value)(standard error)
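A sketch of this product for a one-sample z interval for a proportion, where the critical value is z* and the standard error is √(p̂(1 − p̂)/n). The function name and numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (lower, upper, margin of error) for a one-sample z interval."""
    p_hat = successes / n
    critical_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z*
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = critical_value * standard_error
    return p_hat - margin, p_hat + margin, margin

# Illustrative numbers: 60 successes out of 100.
low, high, margin = proportion_interval(60, 100)
print(round(low, 3), round(high, 3), round(margin, 3))  # 0.504 0.696 0.096
```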

17
Q

When asked to interpret a confidence interval, say..

A

“We are [confidence level] confident that the interval from ___ to ___ captures the true [parameter in context].”

18
Q

When asked to interpret a confidence level, say..

A

“If we take many random samples of size [n] from the population of [population in context] and use each sample to construct a
[confidence level] confidence interval, about [confidence level] of those intervals would capture the true
[parameter in context].”
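This long-run meaning can be illustrated by simulation. The sketch below assumes a hypothetical true proportion of 0.3 and samples of size 200, builds a 95% interval from each of 1000 simulated samples, and reports the fraction that capture the true value. All names and settings are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(true_p=0.3, n=200, confidence=0.95, repetitions=1000, seed=1):
    """Fraction of simulated confidence intervals that contain true_p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    captured = 0
    for _ in range(repetitions):
        # Simulate one random sample and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            captured += 1
    return captured / repetitions

rate = coverage_rate()
print(rate)  # expected to land near 0.95
```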

19
Q

Conditions for a one-sample z test for a population proportion

A
  1. the data are collected using a random sample
  2. when sampling without replacement, the sample size is less than 10% of the population
  3. np0 ≥ 10 and n(1 − p0) ≥ 10
20
Q

When concluding a hypothesis test, say..

A

Because the p-value of ____ ≤ α = _____ , we reject H0. There is convincing statistical evidence that [Ha in context].
OR
Because the p-value of ____ > α = _____ , we fail to reject H0. There is not convincing statistical evidence that
[Ha in context].
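A sketch of how the p-value and the decision fit together for a one-sample z test for a proportion. It assumes a two-sided alternative and rejects when the p-value is at most α; the function name and numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided one-sample z test; returns (z, p_value, decision)."""
    p_hat = successes / n
    # The standard error uses p0 because H0 is assumed true.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision

# Illustrative numbers: 60 successes out of 100, testing H0: p = 0.5.
z, p_value, decision = one_proportion_z_test(60, 100, 0.5)
print(round(z, 2), round(p_value, 4), decision)
```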

21
Q

Conditions for a two-sample z interval for a difference in population proportions

A
  1. the data are collected using random samples or random assignment to treatment groups
  2. when sampling without replacement, the sample sizes are both less than 10% of the population
  3. n1p̂1 ≥ 10
    n1 (1 − p̂1) ≥ 10
    n2p̂2 ≥ 10
    n2 (1 − p̂2) ≥ 10
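Once the conditions hold, the interval is (p̂1 − p̂2) ± z*·√(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2). A sketch with invented counts; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Two-sample z interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star * standard_error
    return (p1 - p2) - margin, (p1 - p2) + margin

# Illustrative numbers: 60 of 100 in group 1, 45 of 100 in group 2.
low, high = two_proportion_interval(60, 100, 45, 100)
print(round(low, 3), round(high, 3))  # 0.013 0.287
```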
22
Q

power

A

the probability of correctly rejecting a false null hypothesis, that is, the probability of not making a Type II error (1 − β)
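Power depends on the specific alternative value, the sample size, and α. The sketch below approximates it for one invented setting: a one-sided z test of H0: p = 0.5 against Ha: p > 0.5 when the true proportion is really 0.6. The function name and numbers are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(p0, p_alt, n, alpha=0.05):
    """Approximate power of H0: p = p0 vs Ha: p > p0 when p is really p_alt."""
    # Smallest sample proportion that leads to rejecting H0.
    cutoff = p0 + NormalDist().inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p-hat if the alternative value is true.
    se_alt = sqrt(p_alt * (1 - p_alt) / n)
    type_ii_probability = NormalDist(p_alt, se_alt).cdf(cutoff)
    return 1 - type_ii_probability

power = power_one_sided_z(0.5, 0.6, 100)
print(round(power, 2))  # about 0.64
```

Raising n or α, or moving the true value further from p0, increases power.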

23
Q

Conditions for a two-sample z test for a difference in population proportions

A
  1. the data are collected using random samples or random assignment to treatment groups
  2. when sampling without replacement, the sample sizes are both less than 10% of the population
  3. n1p̂c ≥ 10
    n1(1 − p̂c) ≥ 10
    n2p̂c ≥ 10
    n2(1 − p̂c) ≥ 10
    where p̂c is the combined (pooled) sample proportion
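The combined proportion pools both samples: (x1 + x2) / (n1 + n2). It is used in the conditions and in the standard error because H0 assumes the two population proportions are equal. A sketch with invented counts, assuming a two-sided alternative; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided two-sample z test; returns (pooled proportion, z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return pooled, z, p_value

# Illustrative numbers: 60 of 100 in group 1, 45 of 100 in group 2.
pooled, z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(round(pooled, 3), round(z, 2), round(p_value, 3))
```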
24
Q

Conditions for inference about a population mean

A
  1. the data are collected using random samples or random assignment to treatment groups
  2. when sampling without replacement, the sample size is less than 10% of the population
  3. n ≥ 30
    OR
    If n < 30, the sample data show no strong skew and no outliers
25
Q

Conditions for a two-sample t interval for a difference in population means

A
  1. the data are collected using random samples or random assignment to treatment groups
  2. when sampling without replacement, the sample sizes are less than 10% of the population
  3. n1 ≥ 30 and n2 ≥ 30
    OR
    If either sample size is less than 30, that sample's data show no strong skew and no outliers
26
Q

Conditions for a Chi-Square test

A
  1. the data are collected using random samples or random assignment to treatment groups
  2. all expected counts are greater than or equal to 5
  3. trials are independent
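A sketch of checking the expected-count condition, assuming the data form a two-way table (as in a chi-square test for independence or homogeneity), where each expected count is (row total)(column total)/(grand total). The function names and table are invented for illustration.

```python
def expected_counts(observed):
    """Expected counts for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def expected_counts_ok(observed, minimum=5):
    """True when every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)

def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

table = [[20, 30], [30, 20]]  # illustrative counts
print(expected_counts(table), expected_counts_ok(table), chi_square_statistic(table))
```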