02 Basics of Statistics 2 Flashcards
How do you test hypotheses?
build statistical models of the phenomenon of interest
simplest statistical model
mean
How do statistical models allow you to gain confidence in the alternative hypothesis?
- fits the data well
- explains a lot of variation in the scores
Most models are:
linear - based on a straight line
Types of fits for statistical models
- good fit
- moderate fit
- poor fit
inferential statistics
determines whether the alternative hypothesis is likely to be true
p-values
probability of obtaining a result at least this extreme by chance alone (that is, if the null hypothesis were true)
common threshold for confidence
95% confident that the result is genuine and not due to chance
What p-value is statistically significant?
p < 0.05
easiest way to assess statistical models
look at the difference between the data observed and the model fitted
measures of how well the model fits the actual data
- deviance
- sum of squared errors (SS)
- variance
- standard deviation
deviance
difference between the observed data and the model of the mean
deviance =
observed score − mean value (x-bar)
disadvantages to using deviance
- some values are negative and some positive
- can cancel themselves out
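A minimal sketch of why raw deviances cancel out, using made-up scores (the numbers are hypothetical, chosen only for illustration):

```python
scores = [2, 4, 4, 5, 10]  # hypothetical observed scores
mean = sum(scores) / len(scores)  # the model of the mean (x-bar)

# Deviance for each score: observed score minus the mean.
deviances = [x - mean for x in scores]

# Negative and positive deviances cancel, so the total is zero.
total_deviance = sum(deviances)
print(deviances, total_deviance)
```

Because the total is always zero around the mean, it says nothing about how far the scores are spread.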
sum of squared errors (SS)
square the difference between each observed score and the mean value, then add the squared differences together
disadvantage of using SS
SS value is dependent on the amount of data collected
With more data points, the SS value is
higher
variance (s²)
average squared error between the mean and the observed scores
variance equation
SS/(n-1)
What does variance build upon?
- SS value
- takes amount of collected data into account
standard deviation (s)
square root of variance
benefit to using standard deviation
ensures that measure of average error is in the same units as the original measure
standard deviation equation
√(SS/(n-1))
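A worked sketch of SS, variance, and standard deviation on made-up scores (hypothetical data; the variable names are illustrative only):

```python
import math
import statistics

scores = [2, 4, 4, 5, 10]  # hypothetical observed scores
n = len(scores)
mean = sum(scores) / n  # x-bar = 5.0

ss = sum((x - mean) ** 2 for x in scores)  # 9 + 1 + 1 + 0 + 25 = 36.0
variance = ss / (n - 1)                    # 36 / 4 = 9.0
sd = math.sqrt(variance)                   # 3.0, in the units of the scores

# Collecting the same scores twice doubles SS, but variance barely moves.
doubled = scores * 2
ss_doubled = sum((x - mean) ** 2 for x in doubled)  # 72.0
variance_doubled = ss_doubled / (len(doubled) - 1)  # 72 / 9 = 8.0

# Cross-check against the standard library.
print(statistics.variance(scores), statistics.stdev(scores))
```

The doubled data set shows why SS alone is hard to compare across studies: it grows with the amount of data, while variance and standard deviation adjust for it.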
What does a small s indicate?
- data points are close to the mean
- the model is a good fit
What does a large s indicate?
the mean is not an accurate representation of the data
Standard deviation provides information about
how well the mean represents the sample data
If you take several samples from the same population, the samples will ____________, so it is important to understand _______________.
- differ slightly
- how well the sample represents the population
How is standard error related to standard deviation?
standard error describes how well a sample represents the population, just as standard deviation describes how well the mean represents the sample
sampling variation
samples from the same population will vary slightly because they contain different members of the population
sampling distribution
frequency distribution of the sample means from the population
average of the sample means =
mean of the population
standard error of the mean
standard deviation of the sample means
What does standard error of the mean measure?
variability between the means of different samples of the population
standard error
approximated as s/√n (sample standard deviation divided by the square root of the sample size)
central limit theorem
as samples get larger, the sampling distribution approaches a normal distribution with a mean equal to the population mean
central limit theorem applies to
more than 30 people in a sample
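A small simulation sketch of the sampling distribution and the central limit theorem. The population here is hypothetical and deliberately skewed; the sample size of 40 and the 2,000 repeated samples are arbitrary choices for illustration:

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, clearly non-normal (right-skewed) population.
population = [random.expovariate(1.0) for _ in range(100_000)]
population_mean = statistics.fmean(population)

def sample_means(n, draws=2_000):
    """Draw many samples of size n and return the mean of each one."""
    return [statistics.fmean(random.sample(population, n)) for _ in range(draws)]

means = sample_means(n=40)

# The average of the sample means sits close to the population mean.
mean_of_means = statistics.fmean(means)

# The standard deviation of the sample means is the standard error of the mean.
se_empirical = statistics.stdev(means)
se_expected = statistics.pstdev(population) / math.sqrt(40)

print(population_mean, mean_of_means, se_empirical, se_expected)
```

Plotting `means` as a histogram would show a roughly bell-shaped curve even though the population itself is skewed.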
Because it’s impossible to collect hundreds of samples, you must rely on
approximations of standard error
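One common approximation divides the sample standard deviation by the square root of the sample size. A minimal sketch with made-up scores (hypothetical data):

```python
import math
import statistics

sample = [2, 4, 4, 5, 10]      # hypothetical sample
s = statistics.stdev(sample)   # sample standard deviation, uses n - 1
n = len(sample)

# Approximate standard error of the mean from a single sample.
standard_error = s / math.sqrt(n)
print(standard_error)
```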
What do confidence intervals provide?
another approach to assess the accuracy of the sample mean as an estimate of the population mean
confidence interval
range between two values within which the researchers think the population value falls
What do you need to calculate confidence intervals?
must know
- s
- x-bar
most common CIs
95%
99%
95% CI means
if many samples were taken, about 95% of the intervals calculated this way would contain the population mean
99% CI means
if many samples were taken, about 99% of the intervals calculated this way would contain the population mean
What lies at the center of the CI?
mean
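A sketch of a confidence interval around a sample mean. It assumes a normal approximation (a z critical value), which is only reasonable for larger samples; small samples would normally use a t critical value instead. The function name and the data are hypothetical:

```python
import math
import statistics
from statistics import NormalDist

def confidence_interval(sample, level=0.95):
    """Return (lower, upper) bounds: mean plus or minus z * standard error."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # needs s, x-bar, and n
    z = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for 95%
    return mean - z * se, mean + z * se

sample = [48, 50, 52] * 12  # hypothetical sample of 36 scores, mean 50

low95, high95 = confidence_interval(sample, 0.95)
low99, high99 = confidence_interval(sample, 0.99)
print((low95, high95), (low99, high99))
```

The sample mean sits at the center of both intervals, and the 99% interval is wider than the 95% interval because more confidence requires a wider range.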
narrow CI
sample mean is likely to be very close to the true mean
wide CI
sample mean could be very different from the true mean and thus may be a poor representation of the population
How can systematic variation be explained?
by the statistical model (IV)
Can unsystematic variation be explained by the statistical model?
- no
- not attributable to IV
test statistic
ratio of the variance explained by the model to the variance not explained by the model
examples of test statistics
- t statistic
- F statistic
- χ² statistic
larger test statistic
more unlikely it occurred by chance
larger test statistic =
- lower p-value
- more likely the test statistic is statistically significant
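A sketch of the link between the size of a test statistic and its p-value, using a z statistic as a stand-in (an assumption for simplicity; t, F, and χ² statistics are looked up in their own distributions). The function name is hypothetical:

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z statistic under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_small_statistic = two_tailed_p(1.0)  # about 0.317, not significant at 0.05
p_large_statistic = two_tailed_p(2.5)  # about 0.012, significant at 0.05
print(p_small_statistic, p_large_statistic)
```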
A hypothesis can be ________ or _________
- directional
- non-directional
directional hypothesis
one-tailed test
non-directional hypothesis
two-tailed test
The prediction of direction must be made ______
prior to collecting data
Which test has a statistical advantage: one-tailed or two-tailed?
one-tailed test
Why does the one-tailed test have a statistical advantage over a two-tailed test?
- researcher needs a smaller test statistic to find significant results
- must have research to support the use of a one-tailed test
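A sketch of the smaller cutoff a one-tailed test needs, again assuming a z statistic and a 0.05 threshold (both are illustrative assumptions):

```python
from statistics import NormalDist

alpha = 0.05

# One-tailed: all of alpha sits in the predicted tail.
critical_one_tailed = NormalDist().inv_cdf(1 - alpha)      # about 1.645

# Two-tailed: alpha is split between both tails.
critical_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.960

print(critical_one_tailed, critical_two_tailed)
```

The advantage only holds when the effect falls in the predicted direction, which is why the direction has to be justified and stated before data collection.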
different effect sizes
- Cohen’s d
- Pearson’s correlation coefficient
- response measures (MCID most common)
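A sketch of Cohen's d for two independent groups, using the pooled standard deviation (one common formulation; other variants exist). The group data and function name are hypothetical:

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Difference between group means in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

treatment = [6, 8, 8, 9, 14]  # hypothetical scores, mean 9, s = 3
control = [2, 4, 4, 5, 10]    # hypothetical scores, mean 5, s = 3

d = cohens_d(treatment, control)  # (9 - 5) / 3, about 1.33
print(d)
```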
Why do we need effect sizes?
A statistically significant finding does not mean that the finding is clinically useful or of a magnitude that is meaningful
When can effect sizes be calculated?
post-hoc to determine the magnitude of a statistically significant effect
power =
1 − β
How can power be useful?
- calculate sample size a priori
- calculate power of the study post-hoc
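A rough sketch of an a priori sample size calculation for comparing two independent groups. It assumes a two-tailed test and a normal approximation, so it is a ballpark figure only; dedicated power software works from the exact test distribution and may return a slightly larger number. The function name and default values are illustrative assumptions:

```python
import math
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size_d) ** 2)

n_medium = n_per_group(0.5)  # medium effect: 63 per group
n_large = n_per_group(0.8)   # larger effect needs fewer participants
print(n_medium, n_large)
```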
G*Power
free program that can be downloaded to determine sample size to achieve a desired level of power
higher test statistic = lower
p-value