TT2 Flashcards
hypothesis test steps
- state hypothesis and select alpha level
- locate crit. region boundaries (t or z value)
- collect data and calculate sample stats (t or z score)
- make decision based on criteria (is it in the crit region? reject or retain?); see the sketch below
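A minimal Python sketch of these four steps for a one-sample z test; all of the numbers (μ = 100, σ = 15, M = 106, n = 25, α = .05 two-tailed) are made up for illustration.

```python
from math import sqrt

# Step 1: state hypotheses (H0: mu = 100, H1: mu != 100) and select alpha
mu, sigma, alpha = 100, 15, 0.05

# Step 2: locate the critical region boundaries (two-tailed, alpha = .05 -> z = +/-1.96)
z_crit = 1.96

# Step 3: collect data and calculate the sample statistic
M, n = 106, 25                 # hypothetical sample mean and sample size
sigma_M = sigma / sqrt(n)      # standard error of M
z = (M - mu) / sigma_M         # z = (106 - 100) / 3 = 2.0

# Step 4: make a decision -- is z in the critical region?
print(z, "reject H0" if abs(z) > z_crit else "retain H0")
```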
characteristics of a distribution of sample means
- normal if variable is normal OR n>30
- the larger the sample size, the closer the sample means should be to the population mean; lower n means the sample means are more widely scattered (larger variance)
standard deviation for a distribution of sample means
standard error of M
how much distance to expect between a sample mean and the population mean
σ_M = σ/√n
mean for a distribution of sample means
expected value of M
= population mean
law of large numbers
as n increases, the error between the sample mean and the population mean should decrease
this is bc as n increases, samples should be more representative of the population, reducing variance and therefore the standard error of M
when is standard error of M identical to standard deviation
when n = 1
bc when n = 1, the distribution of sample means is the same as just the distribution of scores
what is the “starting point” for standard error?
standard deviation, bc standard error = SD when n = 1; as n increases, standard error decreases from there
central limit theorem
- for any population with mean μ and standard deviation σ, the distribution of sample means for samples of size n has a mean of μ and a standard error of σ/√n, and approaches a normal distribution as n gets larger
- this is where the law of large numbers and "standard error = SD when n = 1" come from
the standard error can be viewed as a measure of the ____ of a sample mean
reliability
If the standard error is small, all the possible sample means are clustered close together and a researcher can be confident that any individual sample mean will provide a reliable measure of the population.
the expected value of M (when n = 100) will be ____ the expected value of M (when n = 25), because
equal to
they should both be equal to the population mean
the standard error of M (when n = 100) will be ____ the standard error of M (when n = 25) because of ___
less than
the law of large numbers
random sampling criteria/assumptions for a z test
sampling with replacement; selections must be independent (each selection is not influenced by the last, e.g. the gambler's fallacy)
type 1 error
reject the null hypothesis when in fact the treatment has no effect
probability of type 1 error
alpha level
ex. if alpha = 0.05, there is a 5% chance that the sample is extreme just by chance and therefore a 5% chance of a type 1 error
type 2 error
retaining/failing to reject the null hypothesis when in fact there is a treatment effect
level of confidence
chance that we will correctly retain the null, aka say there isn't an effect when there isn't one
= 1 - alpha
if alpha is 0.05, there is a 5% chance of a type 1 error and a 95% chance there isn't
chance of a type 2 error
probability represented by beta (β)
When does a researcher risk a Type I error?
when null is rejected
When does a researcher risk a Type 2 error?
when null is retained
In general, increasing the variability of the scores produces a larger ___ and a z score ____.
standard error
closer to 0
the ____ the variability, the lower the likelihood of finding a significant treatment effect.
larger
increasing the number of scores in the sample produces a ___ standard error and a ___ value for the z-score
smaller
larger
the ___ the sample size is, the greater the likelihood of finding a significant treatment effect.
larger
selections are not independent when…
ppts were sourced from the same place and are more likely to have similar responses
or if sampling was done without replacement, so each remaining person has a higher likelihood of being picked than before
does the hypothesis use M or μ?
μ, because we are making predictions about the population
statistical hypotheses for positive directional hypothesis
Null: μ ≤ (μ value)
Alt: μ > (μ value)
z score boundary for alpha level 0.05 for one tailed test vs two tailed test
1.65 vs 1.96
APA description
mean and SD for each group, test value (z or t(df)), p value, one- vs two-tailed
statistical hypotheses for non directional hypothesis
Null: μ = (μ value)
Alt: μ ≠ (μ value)
d = 1 means…
the treatment changed the mean by a full standard deviation
evaluating effect size of cohens d
0.2 - small effect (0.2 of an SD)
0.5 - medium effect
0.8 - large effect
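A minimal sketch of Cohen's d for a single-sample situation, reusing the hypothetical z-test numbers above (μ = 100, σ = 15, M = 106); d = (M - μ)/σ.

```python
# Cohen's d: mean difference measured in standard deviation units
mu, sigma = 100, 15     # hypothetical population mean and SD
M = 106                 # hypothetical treated sample mean

d = (M - mu) / sigma    # 6 / 15 = 0.4 -> between a small (0.2) and medium (0.5) effect
print(d)
```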
The power of a test
the probability that the test will correctly reject a false null hypothesis if the treatment really has an effect, aka the test will identify a treatment effect if one really exists
= 1 - beta
as effect size increases, ___ increases
power
___ sample produces greater power of a test
larger
____ alpha level reduces the power of a test
reducing
___ tailed test increases the power of a test
one
type 1 error is ___ ___ null hypothesis
rejecting, true
type 2 error is ___ ___ null hypothesis
failing to reject, false
use t test when…
population SD/variance is unknown
estimated standard error
used in t tests when the population SD/variance is unknown; uses the sample SD/variance (an unbiased stat) instead
s_M = s/√n = √(s²/n)
z statistic vs t statistic
same formula, but z uses the standard error (σ_M, from the population SD) while t uses the estimated standard error (s_M, from the sample SD)
In general, the ____ the sample size (n) is, the ____the degrees of freedom are, and the ____the t distribution approximates the normal distribution.
greater
larger
better
the t distribution has more ___than a normal z distribution (distribution of sample means), especially when df values are ___. Because….
variability
small
t scores are more variable bc the sample variance changes from sample to sample while the population variance doesn't; this effect lessens with larger sample sizes
SS to variance
s² = SS/(n - 1)
steps to t test
- s² = SS/(n - 1)
- s_M = √(s²/n)
- t = (M - μ)/s_M (see the sketch below)
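A minimal sketch of these three steps for a one-sample t test; the summary numbers (n = 9, M = 13, SS = 72, H0: μ = 10) are hypothetical.

```python
from math import sqrt

# hypothetical summary data: n = 9 scores, M = 13, SS = 72, testing H0: mu = 10
n, M, mu, SS = 9, 13, 10, 72

s2 = SS / (n - 1)        # sample variance: 72 / 8 = 9
s_M = sqrt(s2 / n)       # estimated standard error: sqrt(9 / 9) = 1
t = (M - mu) / s_M       # t = 3.0, compared against the critical t for df = n - 1 = 8
print(s2, s_M, t)
```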
for t tests, large variance means that you are ____ to obtain a significant treatment effect
less likely
large samples tend to produce ___ t statistics
bigger
r^2
percentage of variance accounted for by the treatment
r^2 interpretation
0.01 small effect
0.09 medium effect
0.25 large effect
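One standard way to get r² from a single-sample t test is r² = t²/(t² + df); a minimal sketch reusing the hypothetical t = 3.0 and df = 8 from the t-test sketch above.

```python
# percentage of variance accounted for, computed from a t statistic
t, df = 3.0, 8              # hypothetical values from the t-test sketch above
r2 = t**2 / (t**2 + df)     # 9 / 17 ≈ 0.53 -> large effect by the cutoffs above
print(r2)
```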
confidence interval
interval around the sample mean in which the population mean likely resides
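A minimal sketch of a 95% confidence interval around a sample mean for a single-sample t situation, M ± t*·s_M; it reuses the hypothetical one-sample numbers above and assumes scipy is available for the critical t.

```python
from scipy import stats

# hypothetical single-sample values: M = 13, s_M = 1, df = 8, 95% confidence
M, s_M, df, conf = 13.0, 1.0, 8, 0.95

t_crit = stats.t.ppf((1 + conf) / 2, df)   # two-tailed critical t (about 2.306 for df = 8)
ci = (M - t_crit * s_M, M + t_crit * s_M)  # interval expected to contain mu
print(ci)                                  # roughly (10.69, 15.31)
```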
___ sample size leads to smaller confidence intervals
bigger
larger sample sizes lead to ___ cohens d and r^2 values
the same
A researcher rejects the null hypothesis with a regular two-tailed test. If the researcher used a directional (one-tailed) test with the same data and the same alpha level, then what decision would be made?
Definitely reject the null hypothesis if the treatment effect is in the predicted direction.
estimated value of d
cohens d with sample SD instead of population SD
unbiased
As df increases, the shape of the t distribution ____ a normal distribution.
approaches
hypotheses for independent-measures t test
Null: μ1 - μ2 = 0
Alt: μ1 - μ2 ≠ 0
For the independent-measures t formula, the standard error measures the amount of error that is expected when …
you use a sample mean difference to represent a population mean difference. When the null hypothesis is true, however, the population mean difference is zero. In this case, the standard error is measuring how far, on average, the sample mean difference is from zero. However, measuring how far it is from zero is the same as measuring how big it is.
the standard error for the sample mean difference
s_(M1 - M2)
it measures the standard, or average, size of M1 - M2 if the null hypothesis is true; that is, it measures how much difference is reasonable to expect between the two sample means
= √(s²_p/n1 + s²_p/n2), using the pooled variance
- the version using the two separate sample variances is biased if the sample sizes are different, which is why the pooled variance is used
if two samples are exactly the same size, the pooled variance is simply the ___ of the two sample variances.
average
steps for independent samples t test
- find crit region based on POOLED df (df1 + df2)
- pooled variance (USE DF)
- estimated standard error (use n)
- calc t value (see the sketch below)
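A minimal sketch of these steps using hypothetical summary statistics for two independent samples.

```python
from math import sqrt

# hypothetical summary statistics for two independent samples
n1, M1, SS1 = 10, 93, 200
n2, M2, SS2 = 10, 85, 160

df1, df2 = n1 - 1, n2 - 1
s2_p = (SS1 + SS2) / (df1 + df2)       # pooled variance: 360 / 18 = 20
s_M1_M2 = sqrt(s2_p/n1 + s2_p/n2)      # estimated standard error: sqrt(2 + 2) = 2
t = (M1 - M2) / s_M1_M2                # t = 4.0, compared against the critical t for df = 18
print(s2_p, s_M1_M2, t)
```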
assumption of homogeneity of variance
for an independent samples t test, the two samples being compared must have the same theoretical population variance
find cohens d for independent samples t test
- pooled variance (USE DF)
- square root the pooled variance to get the pooled SD (s_p)
- put into the formula: d = (M1 - M2)/s_p (see the sketch below)
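A minimal sketch of the estimated d, continuing the hypothetical independent-samples numbers from the sketch above.

```python
from math import sqrt

# estimated Cohen's d for the hypothetical independent-samples example above
M1, M2, s2_p = 93, 85, 20

d = (M1 - M2) / sqrt(s2_p)   # 8 / 4.47 ≈ 1.79 -> large effect
print(d)
```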
s²_p
pooled variance - weighted mean of sample variances
s_(M1 - M2)
estimated standard error for an independent-samples test (the standard error of the mean difference)
s_p
pooled SD
used for Cohen's d
why is a repeated-measures design better than an independent (individual) design?
individual differences: the independent design has more variance because of differences between ppts, so it is harder to see a treatment effect; it also costs more participants
related sample t-test steps
SS → s² → s_MD → t (see the sketch below)
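A minimal sketch of the SS → s² → s_MD → t chain, using a small set of hypothetical difference (D) scores.

```python
from math import sqrt

# hypothetical difference scores (e.g. post - pre) for a repeated-measures study
D = [3, 5, 2, 4, 6, 4]

n = len(D)
M_D = sum(D) / n                      # mean difference: 4.0
SS = sum((d - M_D) ** 2 for d in D)   # SS of the difference scores: 10.0
s2 = SS / (n - 1)                     # sample variance: 2.0
s_MD = sqrt(s2 / n)                   # estimated standard error of M_D
t = (M_D - 0) / s_MD                  # H0: mu_D = 0, df = n - 1 = 5
print(M_D, s2, s_MD, t)
```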
d=?
n
x/2
related null hypothesis
nondirectional: μ_D = 0
directional: μ_D ≥ 0 (or ≤ 0, depending on the predicted direction)
related samples assumptions
- observations within a group must be independent
- distribution of d scores must be normal
cons of repeated measures design
other factors like time may affect scores, practice effects/order effects
solution: counterbalance order
for repeated measures, null hypothesis assumes…
that the population mean difference (μ_D) is 0
for anova, null and alt. hypothesis
Null: μ1 = μ2 = μ3 (and so on for all conditions)
Alt: at least one of the treatment means is different
denominator of f ratio is called
the error term, bc it represents the random, unsystematic error you can expect if the null is true
k
number of levels of the factor/treatment groups
n in ANOVA
number of scores in each treatment group
N in ANOVA
number of total scores in the study
= kn
T
Treatment total
sum of all the scores in a treatment group
= sum of X for that treatment group
G
Grand total
sum of all the scores across all treatments
= sum of X for all scores
= sum of T
what to put on an ANOVA summary table
SS, df, and MS for between and within treatments, plus SS and df for the total
also F (= MS between / MS within); see the sketch below
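A minimal sketch of how the summary-table values come from the T, G, k, n, and N notation above, using the usual computational formulas; the scores are hypothetical and use equal n.

```python
# hypothetical equal-n data for k = 3 treatment conditions
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

k = len(groups)                        # number of treatment conditions
n = len(groups[0])                     # scores per condition (equal n)
N = k * n                              # total number of scores
T = [sum(g) for g in groups]           # treatment totals
G = sum(T)                             # grand total
sum_X2 = sum(x**2 for g in groups for x in g)

SS_total = sum_X2 - G**2 / N
SS_between = sum(t**2 / n for t in T) - G**2 / N
SS_within = SS_total - SS_between

df_between, df_within = k - 1, N - k
MS_between = SS_between / df_between
MS_within = SS_within / df_within      # the "error term"
F = MS_between / MS_within
print(SS_between, SS_within, F)
```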
ANOVA assumptions
observations within groups must be independent, the populations from which the samples are selected must be normal, and the populations must have homogeneity of variance
what happens to Pearson r if a constant is added or a positive constant is multiplied? what if a negative constant is multiplied?
nothing
sign flips
uses of correlation
predictions, test validity (compare against another measure), test reliability (compare two scores at different times; they should have a strong pos correlation), theory verification
r^2 in correlation
coefficient of determination
the % of the variability in the Y scores that can be predicted from the relationship with X
e.g. if r² = 0.36, 36% of the variance in GPA can be explained by IQ
same cutoffs as the regular r²
correlation null and alt hypotheses
Null: ρ = 0 (there is no population correlation)
Alt: ρ ≠ 0 (there is a population correlation)
or directional versions
r vs ρ
sample correlation vs population correlation
df for correlation
n - 2, bc 2 points always make a perfect correlation
spearman correlation
when X and Y are ordinal or when you're looking for the consistency of a nonlinear (monotonic) relationship
if you want to measure the consistency of a relationship for a set of scores, you can simply
convert the scores to ranks and then use the Pearson correlation formula to measure the linear relationship for the ranked data
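A minimal sketch of that idea: rank each variable, then apply the Pearson formula to the ranks (scipy's spearmanr is shown only as a check); the data are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

# hypothetical X and Y scores: consistently increasing, but not linear
x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])          # y = x**2: monotonic, not linear

rx, ry = rankdata(x), rankdata(y)        # convert each variable to ranks
r_on_ranks = np.corrcoef(rx, ry)[0, 1]   # Pearson r on the ranked data: 1.0

rho, p = spearmanr(x, y)
print(r_on_ranks, rho)                   # both 1.0
```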
point biserial correlation
used when one variable is dichotomous (only has two values); see the sketch below
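A minimal sketch: code the dichotomous variable as 0/1 and use the regular Pearson formula; the data are hypothetical.

```python
import numpy as np

# hypothetical data: a dichotomous variable coded 0/1 and a numeric score
group = np.array([0, 0, 0, 1, 1, 1])     # e.g. two conditions coded 0 and 1
score = np.array([4, 5, 6, 8, 9, 10])

r_pb = np.corrcoef(group, score)[0, 1]   # point-biserial r = Pearson r on the 0/1 codes
print(r_pb)
```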