Final Exam Flashcards
null hypothesis (H0)
statement that is the skeptical viewpoint of your research question
- no difference
4 steps of a hypothesis test
- define null and alternative hypothesis
- establish null distribution
- conduct statistical test
- draw scientific conclusions
null distribution
sampling distribution we expect from sampling a statistical population when the null hypothesis is true
alternative hypothesis (HA)
statement that is the positive viewpoint of your research question
- everything not in null (mutually exclusive)
- there is a difference
3 factors to the hypotheses
- mutually exclusive
- they describe all possible outcomes = exhaustive
- null always includes the equality statement
non-directional hypothesis
states that there should be a difference in the alternative hypothesis, without specifying a direction
directional hypothesis
state that the difference should be in a specific direction (smaller vs. larger)
statistical inference
conclusion that a set of data are unlikely to come from the null hypothesis
statistical decision
whether we believe our data came from the null distribution or not
- if it is likely data came from null distribution = “fail to reject”
- if it is unlikely data came from null distribution = “reject null”
2 probabilities for null distribution
- type 1 error rate
- p-value
type 1 error rate (alpha)
probability of rejecting the null hypothesis when it is true
- set by researcher without any inference to data
p-value (p)
probability of seeing your data, or something more extreme, under the null hypothesis
- area under curve from data to more extreme values
rules for making a statistical decision
- if p-value is less than type 1 error rate, then we “reject null hypothesis”
- if p-value is greater than or equal to type 1 error rate, then we “fail to reject null hypothesis”
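The two decision rules above can be sketched as a small helper (the alpha value of 0.05 is an assumed, researcher-chosen default, not from the cards):

```python
def statistical_decision(p_value, alpha=0.05):
    # Reject the null when the p-value falls below the type 1 error rate;
    # otherwise fail to reject. Alpha is set before looking at the data.
    if p_value < alpha:
        return "reject null hypothesis"
    return "fail to reject null hypothesis"

# p-value right at alpha is NOT less than alpha, so we fail to reject
decision_small_p = statistical_decision(0.03)   # p < alpha
decision_equal_p = statistical_decision(0.05)   # p == alpha
```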
what the scientific conclusions consider
- strength of inference: how strong evidence is
- effect size: only consider it when we reject null hypothesis (small = low impact)
error rates
probability of making a mistake
- type I and II have an inverse relationship (when one increases the other decreases)
type II error rates
probability of failing to reject null hypothesis when it is false
- area under alternative distribution from data point to something more extreme
types of t-tests
- single-sample t-tests
- paired-sample t-tests
- two-sample t-tests
single-sample t-tests
evaluate whether mean of your sample is different from some reference value
ex. is mean test score from a sample of high school students different than national standards
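A minimal sketch of the single-sample t-score, using only Python's standard library; the test scores and the reference value of 70 are hypothetical:

```python
import math
import statistics

def single_sample_t(sample, mu):
    # t = (sample mean - reference value) / (s / sqrt(n))
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation
    t = (m - mu) / (s / math.sqrt(n))
    df = n - 1                        # degrees of freedom
    return t, df

# Hypothetical test scores vs. an assumed national standard of 70
t, df = single_sample_t([72, 75, 68, 80, 74], mu=70)
```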
paired-sample t-tests
evaluate whether mean of paired data is different from some reference value
- looks at changes within a sampling unit (SU)
ex. does tutoring improve grade for a student
two-sample t-tests
evaluate whether means of two groups are different from each other (compare two groups)
ex. do dogs sleep more than cats
mean
= m
reference value
= mu (μ)
- it is given
the reporting of a single-sample t-test should include…
- sample mean and standard deviation
- observed t-score
- degrees of freedom
- p-value
observed t-score
calculated using sample mean, standard deviation, size and reference value
reference value for paired t-tests
typically 0
null and alternative hypotheses for paired t-tests
the statements about how difference between the paired measurements is related to reference value
scientific conclusions for paired t-tests
- if we reject null = the sample data provide strong evidence that the difference between the paired measurements is different from reference value
- if we fail to reject null = the sample data do not provide strong evidence that the difference between paired measurements is different from reference value
the reporting of a paired t-test should include…
- mean difference between paired measurements, and standard deviation of the differences
- observed t-score
- degrees of freedom
- p-value
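The paired t-test reduces to a single-sample t-test on the within-SU differences; a sketch with hypothetical before/after tutoring grades:

```python
import math
import statistics

def paired_t(before, after, mu=0):
    # Take the difference within each sampling unit, then run a
    # single-sample t on the differences against the reference (typically 0)
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    md = statistics.mean(diffs)       # mean difference
    sd = statistics.stdev(diffs)      # sd of the differences
    t = (md - mu) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical grades before and after tutoring for four students
t, df = paired_t([60, 65, 70, 55], [68, 70, 74, 60])
```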
sample means for two-sample t-test
m1=sample mean of first group
m2=sample mean of second group
- which group is labelled 1 vs. 2 is arbitrary (labels can change)
scientific conclusions for two-sample t-tests
- if we reject the null = the sample data provide strong evidence that the means of the two groups are different
- if we fail to reject the null = the sample data do not provide strong evidence that the means of the two groups are different
the reporting of a two-samples t-test should include…
- mean, standard deviation, and sample size for each group
- observed t-score
- degrees of freedom
- p-value
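A sketch of the two-sample t-score, assuming the pooled (equal-variance) form; the sleep numbers for dogs and cats are hypothetical:

```python
import math
import statistics

def two_sample_t(x, y):
    # Pooled-variance two-sample t (assumes both groups share a variance)
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    sp2 = ((n1 - 1) * statistics.variance(x) +
           (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2             # df = n1 + n2 - 2

# Hypothetical hours of sleep: dogs vs. cats
t, df = two_sample_t([12, 14, 13, 12], [15, 16, 15, 16])
```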
expected contingency table
expected frequencies under null hypothesis
1-way contingency table
one categorical variable
- is there a difference in counts among the levels of that variable?
- under the null, all counts are distributed equally among the levels
key features of 1-way contingency table
- ECT is always given as counts
- sum of all expected counts must be same as sum of all counts in observed contingency table
- ECT can have fractional values
2-way expected contingency table
two categorical variables
- looking for an interaction between the variables
- counts are distributed independently among cells
ex. is age independent of year
calculating a 2-way ECT (step 1)
calculate marginal distributions as proportions
- row and column sums/table total
calculating a 2-way ECT (step 2)
product of row and column proportions for each cell x table total
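The two calculation steps above combine into: expected cell count = row sum × column sum / table total. A sketch with a hypothetical 2×2 observed table:

```python
def expected_table(observed):
    # Expected count for each cell = row sum * column sum / table total
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]

# Hypothetical observed counts for two categorical variables
obs = [[10, 20], [30, 40]]
exp = expected_table(obs)
# sum of expected counts always matches the observed table's total
```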
chi-squared score
measure of the distance between the observed and expected contingency tables
steps to calculating the chi-squared score
- take difference between each observed and expected cell
- square the difference
- divide by the expected value
- sum over all cells in the table
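The four steps above, as a sketch (the observed and expected tables are hypothetical):

```python
def chi_squared_score(observed, expected):
    # For each cell: (observed - expected)^2 / expected, summed over all cells
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))

# Hypothetical observed table and its expected table under the null
obs = [[10, 20], [30, 40]]
exp = [[12.0, 18.0], [28.0, 42.0]]
x2 = chi_squared_score(obs, exp)
```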
chi-squared distribution
when you sample an imaginary statistical population where the null hypothesis is true, you would get the distribution of chi-squared scores
key features of chi-squared distribution
- area under curve sums to one
- degrees of freedom determines shape of distribution - different for 1 and 2 way tables
- only positive values
chi-squared test
used with only categorical data and accounts for the variation we expect from sampling error
- always directional
statistical decision of chi-squared test
- reject the null if observed score is greater than critical score or if p-value is less than type 1 error rate (a)
- fail to reject the null if observed score is less than or equal to critical score or if p-value is greater than or equal to a
what side do p-value and type 1 error always go on in chi-squared tests
the right side
scientific conclusions for 1-way tables
- reject null and conclude there is evidence to support that the counts are not equal among cells
- fail to reject null and conclude that there is no evidence to support that the counts are not equal among cells
scientific conclusions for 2-way tables
- reject null and conclude there is evidence to support that the variables are not independent of each other
- fail to reject null and conclude there is no evidence to support that the variables are not independent of each other
the reporting of a chi-squared test should include…
- short name of the test (X2)
- degrees of freedom
- total count in the observed table
- observed chi-squared value
- p-value
factors of a correlation test
- no implied causation between variables (one variable doesn’t cause another)
- both variables are assumed to have variation
- is not used for prediction
Pearson's correlation coefficient
measures the strength of association between two numerical variables
r=sample statistic, measured from the sample
ρ (rho)=population parameter - about the statistical population
correlation coefficients
ρ=-1 indicates a perfect negative correlation
ρ=0 indicates no association
ρ=1 indicates a perfect positive correlation
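The sample coefficient r can be sketched as covariance divided by the product of the standard deviations; the data pairs are hypothetical, chosen so the perfectly linear cases land on ±1:

```python
import statistics

def pearson_r(x, y):
    # r = sample covariance(x, y) / (sd(x) * sd(y))
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly linear, increasing
r_neg = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly linear, decreasing
```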
assumptions behind a correlation tests
- each pair of numerical values is measured on same sampling unit
- numerical values come from continuous numerical distributions with non-zero variation
- association is assumed to be a straight line
null and alternative hypothesis for correlation tests
H0: correlation coefficient is 0
HA: correlation coefficient is not 0
directional = if there is a positive or negative association
null distribution for correlation tests
sampling distribution of correlation coefficients from a statistical population with no association between variables (ρ=0)
statistical decision for correlation tests
same as chi-squared tests
scientific conclusions for correlation tests
different depending on direction
- no direction = just based on association
- directional = based on positive or negative association
the reporting of a correlation tests should include…
- symbol for tests (r)
- degrees of freedom
- observed correlation value
- p-value
linear regression
used to evaluate whether changes in one numerical variable can predict changes in a second numerical variable
linear regressions in experimental studies
prediction reflects a causal relationship between the variables
- predictor variable is independent variable, and response is dependent variable
linear regressions in observational studies
choice of predictor variable depends on research question
two parameters of linear regressions
- slope (b)
- intercept (a)
slope (b)
amount that the response variable (y) increases or decreases for every unit change in the predictor variable (x)
- +ve values = increasing relationship
- 0 = no relationship
- -ve values = decreasing relationship
intercept (a)
value of response v. (y) when predictor v. (x) is at 0 (x=0)
3 components of the statistical model for linear regressions
- systematic component
- random component
- link function
systematic component of statistical model
describes mathematical function used for predictions (linear equation)
random component of statistical model
describes probability distribution for sampling error (normal distribution for response variable)
link function of statistical model
connects the systematic component to the random component
fitting the statistical model
estimate the intercept and slope that best explain the data
- done by minimizing the residual variance
residual (ri)
difference between an observed data point (Yi) and its predicted value (yi)
- squared and summed to give the sum of squares
ri = Yi - yi
steps to calculating residual variance
- calculate residual for each data point
- take square of each residual
- sum the squared residuals across all data points
- divide by degrees of freedom (df=n-2)
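Fitting the line and then the residual-variance steps above can be sketched as follows; the x/y data are hypothetical:

```python
import statistics

def fit_line(x, y):
    # Least-squares slope (b) and intercept (a), which minimize the
    # residual variance
    mx, my = statistics.mean(x), statistics.mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) /
         sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

def residual_variance(x, y, a, b):
    # Square each residual (Yi - predicted), sum them, divide by df = n - 2
    ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return ss / (len(x) - 2)

x = [1, 2, 3, 4]
y = [2.1, 3.9, 6.2, 7.8]          # roughly linear hypothetical data
a, b = fit_line(x, y)
rv = residual_variance(x, y, a, b)
```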
intercept hypothesis
used to answer questions at x=0
- how does intercept (a) relate to a reference value (Ba)
slope hypothesis
used to answer questions about how much y changes for a unit change in x
- how does slope (b) relate to reference value (Bb)
linear regression test
same as chi-squared and correlation tests
- compare observed and critical t-scores, or p and a
scientific conclusions for intercept hypothesis
- reject the null and conclude there is evidence that the predicted response v. is different from the reference (Ba) at x=0
- fail to reject the null and conclude there is no evidence that the predicted response v. is different from the reference (Ba) at x=0
scientific conclusions for slope hypothesis
- reject the null and conclude there is evidence that changes in predictor v. can be used to predict changes in the response v.
- fail to reject the null and conclude there is no evidence that changes in the predictor v. can be used to predict changes in response v.
the reporting of linear regression test should include…
- symbol for parameter being tested ( a or b)
- observed parameter value
- observed t-score
- degrees of freedom
- p-value
4 main assumptions for linear regressions
- linearity
- independence
- normality
- homoscedasticity
linearity assumption of linear regressions
response v. should be well described by a linear combination of predictor v.
- relationship is assumed to be straight line
- violations of linearity appear as a curved (“frowny face”) pattern in the residual plot
independence assumption of linear regressions
residuals along the predictor v. should be independent of each other
normality assumption of linear regressions
residual variation should be normally distributed
- evaluated by looking at histograms
- if assumptions of normality is met, the histogram will look similar to reference normal distribution
- if not met, histogram will have fatter or skinnier tails
homoscedasticity assumption of linear regressions
residual v. should have same variance across the range of predictor v.
violations of independence assumption of linear regressions
can occur when there is repeated sampling of SU or when there is a spatial or temporal relationship among SU
violations of normality assumption on linear regression
can occur if…
1. if stat pop has a skewed or unusual distribution
2. if your data violate the assumption of linearity
Shapiro-Wilk test
evaluates the null hypothesis that the residuals are normally distributed
H0: residuals are normally distributed
HA: residuals are not normally distributed
heteroscedasticity
if the residuals have little variation along some parts of predictor v. and large amounts at others
analysis of variance
used to compare variance between two groups
f-tests
evaluates difference in variance between two groups
- done using ratio of variance
ratio of variance (f-score)
asks whether ratio is different from one
- ratio equals one if both groups have the same variance
null and alternative hypothesis for f-score
evaluate whether f-score is different from 1
H0: ratio of variances is 1
HA: ratio of variances is not 1
null distribution for f-score
sampling distribution from repeatedly sampling a stats pop where the variance was the same in both groups
- ratio of variances will never be negative
degrees of freedom for f-distribution
dfA=nA-1 for one group (group A)
dfB=nB-1 for another group (group B)
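A sketch of the variance-ratio F-score with its per-group degrees of freedom; the cost numbers are hypothetical:

```python
import statistics

def variance_ratio_f(a, b):
    # F = variance of group A / variance of group B; never negative
    f = statistics.variance(a) / statistics.variance(b)
    df_a, df_b = len(a) - 1, len(b) - 1
    return f, df_a, df_b

# Hypothetical operating costs for two EV models
f, df_a, df_b = variance_ratio_f([3, 5, 7, 9], [4, 5, 6, 5])
```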
statistical decision of f-tests
same as the others
- comparing observed and critical values or p and a
the reporting of an F-test should include…
- mean, standard deviation, sample sizes for each group
- observed F-score
- degrees of freedom for each group
- p-value
example of f-test
do different models of electric vehicles (categorical) differ in their operating costs (numerical)
single factor ANOVA test
used when there are more than two levels in a categorical variable
two sources of variation for ANOVA tests
- group variation
- residual variation
group variation
variation among means of categorical levels
- if means are same among groups, variation =0
- if means are different, variation = high
- mean sum of the squares
residual variation
variation among sampling units within a categorical level
- mean squared error
statistical model of anova
compares group variation to residual variation
- if group variation is same as residual, the means are not different
- if group is larger than residual, groups are different
f-tests for ANOVA
how do you calculate differences in means?
F-score = group variation divided by residual variation
- increased group variation = increased f-score
statement of means
H0=the means are same across all levels of categorical variable
HA=the means are different somewhere (different between at least two groups)
statement of F-tests
evaluate whether group variation is larger than residual variation
- null and alternative hypothesis are directional
null distribution of anova test
sampling distribution from repeatedly sampling a stats pop where the means are the same across all levels of categorical variable
- f-distribution
dfG=k-1
dfE=n-k
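The ANOVA F-score (group variation over residual variation, with dfG = k−1 and dfE = n−k) can be sketched as follows; the three groups are hypothetical:

```python
import statistics

def one_way_anova_f(groups):
    # Group variation: mean sum of squares among the group means (MS group)
    # Residual variation: mean squared error within groups (MS error)
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_group = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    ms_group = ss_group / (k - 1)     # dfG = k - 1
    ms_error = ss_error / (n - k)     # dfE = n - k
    return ms_group / ms_error, k - 1, n - k

# Hypothetical data: third group's mean is far from the other two
f, df_g, df_e = one_way_anova_f([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
```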
statistical decision for anova
same as before
- compares critical f-score to observed f-score or p to a
scientific conclusions for anova
- reject null and conclude there is evidence that at least 2 of the means are different
- fail to reject null and conclude there is no evidence that at least 2 of the means are different
reporting of an ANOVA should include…
- mean, standard deviation, and sample size for each group
- observed f-score
- degrees of freedom for group and residual variation (dfG and dfE)
- p-value
ANOVA post-hoc tests
secondary test used to identify what group means are different in an ANOVA
- only used if anova rejects null hypothesis
contrast statement of post-hoc tests
evaluate whether mean of any two groups is different
- multiple contrast statements
family of contrasts
a set of contrast statements
family-wise error rate
type I error rate for entire family of contrasts
TukeyHSD
type of post-hoc test = honest significant difference
- evaluates all possible combinations of categorical levels and compares means
- controls the family-wise error rate
two-factor anova test
considers two categorical factors and focuses on the interaction between them
- also evaluates the effect of two categorical factors on a numerical variable
3 questions two-factor anova tests answer
- main effects A
- main effects B
- interactions
main effects A
differences among the levels of factor A averaging across the levels of factor B
main effects B
differences among the levels of factor B averaging across the levels of factor A
interactions question
differences among levels of one factor within each level of other factor (cell-by-cell comparison)
interaction
deviation from additivity - levels don't add up as expected
additivity
when the effects of the levels are their simple sums (adding)
- use interaction plots to visualize
interaction plot
- if the categorical v. are additive (no interaction) = the lines will be parallel
- if the categorical v. are not additive (interaction) = the lines are not parallel
antagonistic interaction
lines cross
synergistic interaction
lines do not cross but are not parallel
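For a 2×2 design, deviation from additivity can be checked directly from the cell means; a sketch with hypothetical cell means:

```python
def interaction_deviation(cell_means):
    # For a 2x2 table of cell means, deviation from additivity is
    # (m11 - m12) - (m21 - m22): zero means the interaction-plot lines
    # are parallel (no interaction)
    (m11, m12), (m21, m22) = cell_means
    return (m11 - m12) - (m21 - m22)

additive = interaction_deviation([[4, 6], [7, 9]])   # parallel lines
crossed = interaction_deviation([[4, 6], [9, 7]])    # lines cross
```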
statement of means for main effect A
H0=means are same across all levels of factor A
HA=means are different somewhere in at least 2 groups of factor A
statement of means for main effect B
H0=means are same across all levels of factor B
HA=means are different somewhere in at least 2 groups of factor B
statement of means for interaction
H0=deviation of each cell relative to additivity is 0
HA=at least one cell has a non-zero deviation from additivity
null distribution for main effects A
means are the same across all levels of factor A
null distribution for main effects B
means are the same across all levels of factor B
null distribution for interaction
the cell means are additive
4 sources of data variation for f-tests for two-factor ANOVA
- group variation factor A
- group variation factor B
- AB interaction
- residual variation
group variation factor A
variation among the means of the levels for factor A
- total group variation/degrees of freedom for A
group variation factor B
variation among the means of the levels for factor B
- total group variation/degrees of freedom for B
AB interaction
amount of variation attributable to deviation from additivity
- total variation of cell deviations from additivity/degrees of freedom ab
residual variation
variation among SU within a cell
- total residual variation/degrees of freedom (ab(n-1))
f-tests for main effects A
factor A group variation relative to residual variation
- MsA/MsE
f-tests for main effects B
factor B group variation relative to residual variation
- MsB/MsE
f-tests for interactions
amount of variation attributable to deviations from additivity relative to residual variation
- MsAB/MsE
statistical decision for two-way anova tests
same as before
- compare observed f-score to critical f-score or p and a
scientific conclusions for two-way anova tests (interaction)
- reject null if there is evidence that at least one cell deviates from additivity
- fail to reject null if no evidence that at least one cell deviates from additivity
scientific conclusions for two-way anova tests (main effects)
- reject null if there is evidence that means of at least two levels are different in the factor
- fail to reject null if there is no evidence that means among levels are different in the factor
- only evaluate main effects if conclusion is to reject the null