psych 218 - F Flashcards
1
Q
t-test
A
- used when population mean (μ) is specified, but population standard deviation (σ) is unknown
- when using it:
- set a hypothetical population mean, assuming null is true (difference between pre and post will be 0)
- estimate population standard deviation (σ) from the sample (s) because it is our best guess
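A minimal sketch of the single sample t-test described on this card, using only the Python standard library. The difference scores and the function name `single_sample_t` are made up for illustration; the null hypothesis assumes the mean pre/post difference is 0.

```python
import math
import statistics

def single_sample_t(scores, mu):
    """Return (t_obt, df) for a single sample t-test against a hypothesized mean mu."""
    n = len(scores)
    mean = statistics.mean(scores)
    s = statistics.stdev(scores)   # sample SD (N - 1 in the denominator), our estimate of sigma
    se = s / math.sqrt(n)          # estimated standard error of the mean
    return (mean - mu) / se, n - 1

# Hypothetical pre/post difference scores; the null says the true mean difference is 0.
diffs = [2, 4, 6, 8]
t_obt, df = single_sample_t(diffs, mu=0)
print(f"t({df}) = {t_obt:.2f}")   # t(3) = 3.87
```

t-obt would then be compared against t-crit for df = N - 1 from a t-table.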
2
Q
z-test v t-test
A
- t-test uses standard deviation of the sample (s), while z-test uses it from the null population (σ)
- difference in formulas:
- t-obt: divide by the estimated standard error of the mean (s x̄) instead of the true standard error of the mean (σ x̄)
- standard error: use s instead of σ
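- need to consider sample size when using t-test: the t-distribution changes shape with df (unlike the z-distribution, which is always normal) and only approximates the normal distribution at about N ≥ 30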
3
Q
t-test and impact of using standard deviation (s)
A
- as we’re assuming that s = σ, we will be systematically underestimating σ
- will think there is less variability than there actually is
- correct for this using degrees of freedom
- degree of freedom: # of scores that are free to vary in calculating that statistic
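A rough simulation of why the df correction matters, assuming a normal population with σ = 10 (so σ² = 100) and samples of N = 5. These values, the seed and the function name are illustrative assumptions. It compares the average variance estimate when dividing the sum of squares by N versus by N - 1.

```python
import random

def mean_variance_estimates(n=5, sigma=10.0, reps=20000, seed=0):
    """Average the variance estimate over many samples, dividing SS by N and by N - 1."""
    rng = random.Random(seed)
    total_by_n = 0.0
    total_by_df = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
        total_by_n += ss / n                        # no df correction
        total_by_df += ss / (n - 1)                 # df = N - 1 correction
    return total_by_n / reps, total_by_df / reps

divide_by_n, divide_by_df = mean_variance_estimates()
print(divide_by_n, divide_by_df)   # true population variance is 100
```

Dividing by N should land noticeably below 100 on average (there is less variability in the sample than in the population), while dividing by N - 1 should land close to 100.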
4
Q
t-distribution
A
- becomes closer to the normal distribution as degrees of freedom increases
- peak gets higher and higher
- gets closer to normal distribution because sample size increases (estimate of s = σ gets closer)
- t-distribution will have more extreme values than z-distribution (because there is more variability in t due to estimate of s = σ)
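One way to see the extra extreme values is a small simulation: draw many samples from a population where the null is true, compute t for each, and count how often |t| exceeds the z cutoff of 1.96. The sample sizes (N = 5 and N = 30), seed and function names are assumptions chosen for illustration.

```python
import math
import random

def t_statistic(sample, mu):
    """Single sample t: (mean - mu) divided by the estimated standard error."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu) / (s / math.sqrt(n))

def share_beyond(n, cutoff=1.96, reps=20000, seed=1):
    """Proportion of simulated t values more extreme than the cutoff when the null is true."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if abs(t_statistic(sample, 0)) > cutoff:
            extreme += 1
    return extreme / reps

small_df = share_beyond(n=5)    # df = 4
large_df = share_beyond(n=30)   # df = 29
print(small_df, large_df)
```

With df = 4 the share beyond ±1.96 should be well above the 5% expected under the normal distribution; with df = 29 it should be much closer to 5%.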
5
Q
reporting conclusions
A
- specific statistical language:
- hypotheses: H0: x̄ = μ, H1: x̄ ≠ μ
- t-test: Students at UBC check their phones reliably less than the population, t(23) = -6.86, p < 0.001, d = -1.40
- bivariate correlation: there is a small, positive and reliable correlation between current salary and number of months of hire, r(472) = 0.08, p(1-tail) = 0.30
- confidence intervals: Our lightbulbs had a long life, CI-95% [213.52, 216.48]
- ANOVA: a one-way ANOVA revealed a reliable difference in memory span between the 3 stimulus types, F(2, 14) = 15.86, p < 0.001, η² = 0.537
6
Q
cohen’s d effect sizes
A
- captures magnitude of the effect (part of descriptive stats)
- expressed in standard deviation units, not the original units of measurement (a raw difference in original units is only useful if the original units are meaningful)
- allows us to compare across different studies (put on the same scale by dividing by s in the formula)
- d = 1 means sample mean is 1 standard deviation above the population mean
- how to get a larger effect size?
- decrease SD by conducting a more controlled experiment
- increase strength of manipulation (to have bigger difference between x̄ and μ)
- choose what groups to investigate (to have bigger difference between x̄ and μ)
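A short sketch of Cohen's d for a single sample, d = (x̄ - μ) / s. Because t-obt = (x̄ - μ) / (s / √N), it follows that d = t-obt / √N, which can be checked against the t-test reported in the "reporting conclusions" card (t(23) = -6.86, so N = 24). The function names and the example scores are illustrative, not from the course.

```python
import math
import statistics

def cohens_d_single(scores, mu):
    """Cohen's d for one sample: mean difference expressed in sample SD units."""
    return (statistics.mean(scores) - mu) / statistics.stdev(scores)

def d_from_t(t_obt, n):
    """Recover d from a single sample t-obt, since t = d * sqrt(N)."""
    return t_obt / math.sqrt(n)

# Reported result: t(23) = -6.86, so N = df + 1 = 24.
d_reported = d_from_t(-6.86, 24)
print(round(d_reported, 2))   # -1.4

# Hypothetical scores compared with a population mean of 3.
d_example = cohens_d_single([2, 4, 6, 8], mu=3)
```

The recovered value agrees with the d = -1.40 in the reported conclusion.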
7
Q
types of t-test
A
- single sample t-test
- compares x̄-obt with μ (difference = x̄-obt - μ)
- use Cohen’s d̂
- paired samples t-test
- compares x̄-pre with x̄-post (difference = x̄-pre - x̄-post)
- use Cohen’s d-z
- potentially more powerful because it maximizes possibility of a high correlation between the scores
- independent samples t-test
- compares x̄1 with x̄2 (difference = x̄1 - x̄2)
- use Cohen’s d-s
- use weighted standard deviation when n1 ≠ n2
- has a more efficient use of df (higher df = lower t-crit)
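A sketch contrasting the paired and independent formulas on the same made-up scores. The data and function names are hypothetical. The paired test works on difference scores; the independent test pools the two sums of squares, which is the weighted approach that also handles n1 ≠ n2.

```python
import math
import statistics

def paired_t(pre, post):
    """Paired samples t: a single sample t-test on the difference scores against 0."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

def independent_t(group1, group2):
    """Independent samples t using the pooled (weighted) variance estimate."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    ss1 = sum((x - m1) ** 2 for x in group1)
    ss2 = sum((x - m2) ** 2 for x in group2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)          # weights each group by its df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2

pre = [10, 12, 14, 16]
post = [8, 9, 13, 14]
t_paired, df_paired = paired_t(pre, post)           # t is about 4.90 with df = 3
t_indep, df_indep = independent_t(pre, post)        # t is about 1.02 with df = 6
```

On these scores the pre and post values are highly correlated, so the paired t-obt is much larger even though the independent test has more df.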
8
Q
results of a t-test
A
- larger t-obt value = higher likelihood that null would be rejected = a more powerful test
- how to increase t-obt?
- increase real effect of IV (will increase numerator)
- increase sample size (will decrease denominator)
- decrease variability through controlled experiments (will decrease denominator)
9
Q
assumptions of the independent t-test
A
- sampling distribution is normally distributed
- homogeneity of variance (if σ1 ≠ σ2, the 2 samples are probably not random samples from the same population)
- t-test is robust, and thus relatively insensitive to violations of these assumptions
10
Q
confidence intervals
A
- range of values that probably contains the population value (how confident are you that the range of values captures the true value?)
- center of interval will be the mean of the data
- very sensitive to sample size
- for regression: shows how much the regression line could shift given the variability of the sample and the sample size
- if p < 0.05, the 95% confidence interval for the slope will not include 0 (if the sample slope is positive, every slope in the interval is positive, so X and Y have a positive relationship throughout)
- we can choose our confidence levels
- higher confidence level = wider interval = less practically useful because it will include a wider range of values to expect
- 95% confidence interval: use t-crit for 0.025 in each tail (divide 5% by 2)
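A minimal sketch of a confidence interval for a mean, x̄ ± t-crit × (s / √N). The summary statistics (x̄ = 50, s = 10, N = 25) are hypothetical; the t-crit values for df = 24 (2.064 for 95%, 2.797 for 99%) are taken from a standard t-table and passed in by hand rather than computed.

```python
import math

def confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper) limits: mean plus/minus t_crit times the estimated standard error."""
    se = s / math.sqrt(n)
    margin = t_crit * se
    return mean - margin, mean + margin

# Hypothetical sample: mean = 50, s = 10, N = 25, so df = 24 and se = 2.
ci_95 = confidence_interval(50, 10, 25, t_crit=2.064)   # t at 0.025 in each tail
ci_99 = confidence_interval(50, 10, 25, t_crit=2.797)   # t at 0.005 in each tail
print(ci_95)   # roughly (45.87, 54.13)
print(ci_99)   # roughly (44.41, 55.59)
```

The interval is centered on the sample mean, and the 99% interval is wider than the 95% one.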
11
Q
SD v CL
A
- sd: describes variability of observations around a statistical model
- how much do observations differ from sample mean / regression line?
- i.e. variability of x around x̄
- cl: describes variability of statistical model itself
- how different might sample mean / regression line be if we collected a new sample?
- i.e. variability of x̄ around μx̄
12
Q
F-test (ANOVA)
A
- use for any experiment comparing more than 2 groups (levels) of an IV
- one-way ANOVA: one IV with multiple levels (e.g. 3 different levels)
- can make 1 overall comparison to find significant difference between means of the 3 groups
- 500 mg pill, 1000 mg pill, placebo pill
- factorial ANOVA: experiment with multiple IVs
- must be fully factorial (= have all the conditions)
- dose (500mg, 1000mg, placebo) and schedule (x1, x2)
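A sketch of how F-obt could be computed by hand for a one-way ANOVA with 3 groups, as the ratio of between-groups to within-groups variance estimates. The scores for the placebo, 500 mg and 1000 mg groups are invented, and `one_way_f` is an illustrative name.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups SS: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups SS: how far each score sits from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_obt = (ss_between / df_between) / (ss_within / df_within)
    return f_obt, df_between, df_within

placebo = [1, 2, 3]
mg_500 = [2, 3, 4]
mg_1000 = [5, 6, 7]
f_obt, df_b, df_w = one_way_f([placebo, mg_500, mg_1000])
print(f"F({df_b}, {df_w}) = {f_obt:.2f}")   # F(2, 6) = 13.00
```

F-obt would then be compared against F-crit for (2, 6) df from an F-table.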
13
Q
ANOVA assumptions
A
- sampled populations are normally distributed
- DV is interval or ratio
- homogeneity of variance
*but ANOVA is generally robust to violations of these assumptions
14
Q
limitations of ANOVA
A
- cannot say which of the means are higher than the others
- can only say:
- [1] not all of the means are the same
- [2] there is a difference between at least 2 of the means
- conclusions are also non-directional (not satisfying)
15
Q
why ANOVA instead of several t-tests
A
- would need to do 3 tests (A & B, B & C, A & C)
- as # of tests increases, the overall (familywise) α increases
- added probability of making type I error would increase
- doing α/3 will be too conservative (> will lose statistical power)
- beta will increase greatly > won’t detect real effect when there is one
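The inflation of α can be sketched with a quick calculation. Assuming the tests are independent (a simplification; pairwise t-tests on the same groups are not fully independent), the chance of at least one type I error across k tests is 1 - (1 - α)^k. The function name is illustrative.

```python
import math

def familywise_alpha(alpha, n_tests):
    """Probability of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_groups = 3
n_tests = math.comb(n_groups, 2)            # A & B, B & C, A & C -> 3 pairwise tests
overall = familywise_alpha(0.05, n_tests)
per_test = 0.05 / n_tests                   # the alpha / 3 correction from the card
print(n_tests, round(overall, 4), round(per_test, 4))   # 3 0.1426 0.0167
```

With 3 tests at α = 0.05 the overall type I error rate is about 14%, while testing each at α/3 brings it back near 5% at the cost of a stricter criterion for every comparison.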