02 Basics of Statistics 1 Flashcards
systematic variation
variation due to the experimenter doing something to all the participants in one condition and not in the other condition
variation due to an intervention
unsystematic variation
variation that results from random factors that exist between the experimental conditions
natural differences in ability, time of day, motivation, IQ
Statistics discovers _____________ and then determines __________
- how much variation exists in performance
- how much is systematic and how much is unsystematic variation
normal distribution
all data is distributed symmetrically around the center of all scores
bell curve
deviations from normality (2)
- skew
- kurtosis
skew
lack of symmetry
positively skewed
tail points toward higher or more positive scores
negatively skewed
tail points toward lower or more negative scores
kurtosis
degree to which scores cluster at the ends of the distribution (tails) and how pointy a distribution is
positive kurtosis
many scores in the tails and a pointy peak
negative kurtosis
few scores in the tails and tends to be flatter than normal
normal distribution means the values of skew and kurtosis are
0
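The skew and kurtosis cards above can be checked numerically. This is a minimal sketch, assuming the simple population-moment formulas and "excess" kurtosis (kurtosis minus 3, so that a normal distribution scores 0); statistics packages often apply small-sample corrections, so their values may differ slightly. The function name and both data sets are hypothetical.

```python
from statistics import fmean

def skew_and_excess_kurtosis(scores):
    """Return (skew, excess kurtosis) from population moments (assumed formulas)."""
    n = len(scores)
    m = fmean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n  # variance (2nd central moment)
    m3 = sum((x - m) ** 3 for x in scores) / n  # 3rd central moment
    m4 = sum((x - m) ** 4 for x in scores) / n  # 4th central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # subtract 3 so a normal curve gives 0
    return skew, excess_kurtosis

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]      # hypothetical, symmetric around 3
right_tailed = [1, 1, 1, 2, 2, 3, 10]         # hypothetical, tail toward high scores
left_tailed = [-x for x in right_tailed]      # mirror image, tail toward low scores

print(skew_and_excess_kurtosis(symmetric)[0])     # approximately 0
print(skew_and_excess_kurtosis(right_tailed)[0])  # positive skew
print(skew_and_excess_kurtosis(left_tailed)[0])   # negative skew
```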
measures of central tendency
- mean
- median
- mode
mode
score in the data set which occurs most frequently
median
middle score when the data is ranked in order of magnitude
mean
average score of data set
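The three measures of central tendency can be computed with Python's standard `statistics` module. A small sketch on a hypothetical set of scores:

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 4, 5, 7, 11]  # hypothetical data, already in rank order

central_mode = mode(scores)      # most frequent score
central_median = median(scores)  # middle score of the ranked data
central_mean = mean(scores)      # sum of scores divided by number of scores

print(central_mode, central_median, central_mean)  # 3 4 5
```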
Which levels of data use mode?
- nominal
- ordinal
Is median affected by extreme scores, skewness, or kurtosis?
no, it is relatively unaffected by all three
Which levels of data use median?
- ordinal
- interval
- ratio
µ =
average of a population
x-bar
average of a sample
disadvantages of using mean
- influenced by extreme scores
- affected by skewness and kurtosis
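To illustrate why extreme scores are a disadvantage for the mean, here is a minimal sketch comparing the mean and median before and after one score is replaced by an extreme value. The data are hypothetical.

```python
from statistics import mean, median

base = [2, 3, 4, 5, 6]           # hypothetical scores
with_extreme = [2, 3, 4, 5, 60]  # same scores, largest one replaced by an extreme value

# The mean is pulled toward the extreme score; the median stays put.
print(mean(base), median(base))                  # mean 4, median 4
print(mean(with_extreme), median(with_extreme))  # mean 14.8, median 4
```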
Which levels of data use mean?
- interval
- ratio
advantages of using mean
- uses every score in the data set
- stable in different samples
methods of quantifying dispersion in a data set
- range
- inter-quartile range
range
largest score - smallest score
disadvantage to using range
dramatically affected by extreme scores
inter-quartile range
range of the middle 50% of scores, which excludes values at the extremes of the distribution
disadvantage to using inter-quartile range
lose a lot of the data by excluding extremes of the distribution
quartiles
3 scores that split the sorted data into 4 equal parts
1st step to determining quartiles
determine median (2nd quartile)
2nd step to determining quartiles
determine median of each half of the data set
3rd step to determining quartiles
calculate interquartile range
lower quartile
median of the lower half of the data
upper quartile
median of the upper half of the data set
What is the interquartile range?
upper quartile - lower quartile
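The quartile steps above can be sketched as follows. This assumes the "median of each half" method from these cards, with the overall median left out of both halves when the number of scores is odd; other software uses different quartile conventions and may return slightly different values. The function name and data are hypothetical.

```python
from statistics import median

def quartiles(scores):
    """Return (lower quartile, median, upper quartile) via the median-of-halves method."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    q2 = median(ordered)                 # step 1: median = 2nd quartile
    lower_half = ordered[:half]          # step 2: split the data at the median
    # For odd n the median itself is excluded from both halves (assumed convention).
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), q2, median(upper_half)

scores = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 40]  # hypothetical data with one extreme score

q1, q2, q3 = quartiles(scores)
iqr = q3 - q1                           # step 3: upper quartile - lower quartile
full_range = max(scores) - min(scores)  # largest score - smallest score

print(q1, q2, q3)       # 4 8 15
print(iqr, full_range)  # 11 39
```

The extreme score of 40 inflates the range but leaves the interquartile range untouched, at the cost of ignoring the scores outside the middle half.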
What does probability distribution allow for?
calculating the probability of getting a particular score, based on how frequently that score occurs in a distribution with one of the common shapes
Which probability distributions have been calculated by statisticians?
normal distribution with
- mean of 0
- standard deviation of 1
Which data sets can be converted into a set with a mean of 0 and SD of 1?
any data set
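Converting a data set to a mean of 0 and SD of 1 means subtracting the mean from each score and dividing by the standard deviation. A minimal sketch, assuming the sample standard deviation and hypothetical scores:

```python
from statistics import fmean, stdev

scores = [2, 3, 4, 5, 60]  # hypothetical data; it does not need to be normal

m = fmean(scores)
s = stdev(scores)  # sample standard deviation (assumed choice)

# z = (score - mean) / standard deviation
z_scores = [(x - m) / s for x in scores]

print(fmean(z_scores))  # approximately 0
print(stdev(z_scores))  # approximately 1
```

The conversion changes the scale, not the shape: skewed data stay skewed after standardizing.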
Where can you use z-scores?
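only with normally distributed data (any data set can be converted to z-scores, but the probabilities attached to them hold only when the data are normal)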
z-scores
allow researchers to calculate the probability that a score will occur when the data is normally distributed
95% of z-scores are between
-1.96 and 1.96
99% of z-scores are between
-2.58 and 2.58
99.9% of z-scores are between
-3.29 and 3.29
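These cut-offs can be verified against the standard normal distribution (mean 0, SD 1) using `statistics.NormalDist` from the Python standard library. The helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage(z):
    """Proportion of a standard normal distribution lying between -z and +z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)

for z in (1.96, 2.58, 3.29):
    print(z, round(coverage(z), 4))

# Going the other way: the z-score that leaves 2.5% in the upper tail.
z_95 = standard_normal.inv_cdf(0.975)
print(round(z_95, 2))  # 1.96
```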