statistics Flashcards
mean weakness
one rogue score (large or small) can heavily influence it
mean strength
the most powerful measure of central tendency as it uses all of the data
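A minimal sketch of the mean's weakness and strength, using made-up scores (the numbers are illustrative, not from these cards): every score feeds into the mean, so a single rogue score shifts it a long way.

```python
from statistics import mean

# Hypothetical scores, chosen only for illustration
scores = [10, 11, 12, 13, 14]
with_rogue = scores + [100]      # the same scores plus one rogue score

mean_without = mean(scores)      # uses all five scores
mean_with = mean(with_rogue)     # the rogue score pulls the mean upwards

print(mean_without, round(mean_with, 2))
```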
mode strength
the best measurement if you want to know how often things occur
mode weakness
sometimes a data set has no most common value, and sometimes it has several
median strength
not influenced by extreme scores
median weakness
can be unrepresentative when used with small data sets
standard deviation strength
uses every value in the data set,
less distorted by extreme values than the range, and is the most sensitive measure of dispersion
standard deviation weakness
the most difficult of the measures of dispersion to calculate
range strength
simple to calculate, as it only uses the highest and lowest scores
range weakness
if either of the 2 scores is extreme, the range will be distorted. it tells us little about how spread out or clustered together the data are
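As a rough illustration of the range weakness, the sketch below uses two hypothetical data sets that differ in only one score. The helper name `value_range` is an assumption made for this example.

```python
def value_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical data: identical apart from one extreme score
clustered = [48, 49, 50, 51, 52]
one_extreme = [48, 49, 50, 51, 95]

range_clustered = value_range(clustered)
range_extreme = value_range(one_extreme)

print(range_clustered, range_extreme)
```

Four of the five scores are the same in both sets, yet the range changes sharply, which is why it says little about how the rest of the data are spread.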
how to work out median
the middle number after ordering from smallest to biggest (with an even number of scores, the mean of the two middle numbers)
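The ordering step can be sketched as follows. The function name `median_of` and the scores are assumptions for illustration. The even-count rule (averaging the two middle numbers) is the standard convention.

```python
def median_of(scores):
    """Order the scores, then take the middle one (or average the two middle ones)."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: mean of the two middle scores

odd_result = median_of([7, 3, 9, 1, 5])
even_result = median_of([7, 3, 9, 1])
extreme_result = median_of([1, 3, 5, 7, 1000])     # the extreme score does not move the median

print(odd_result, even_result, extreme_result)
```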
what is standard deviation
the spread of results around the mean - a measure of dispersion
what does it mean if the standard deviation is more than the mean
the scores are more varied
inconsistent
what does it mean if the standard deviation is less than the mean
less varied
more consistent
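A small sketch of how standard deviation reflects consistency, assuming the population formula (dividing by N; the sample version divides by N - 1). Both hypothetical data sets have the same mean, so only the spread differs.

```python
from math import sqrt

def standard_deviation(scores):
    """Population standard deviation: root of the mean squared distance from the mean."""
    m = sum(scores) / len(scores)
    variance = sum((x - m) ** 2 for x in scores) / len(scores)
    return sqrt(variance)

# Hypothetical scores, both with a mean of 10
consistent = [9, 10, 10, 10, 11]
varied = [2, 6, 10, 14, 18]

sd_consistent = standard_deviation(consistent)
sd_varied = standard_deviation(varied)

print(round(sd_consistent, 2), round(sd_varied, 2))
```

The larger value belongs to the more varied, less consistent set of scores.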
bar chart
the height of each bar represents the frequency
suitable for non-continuous data - space between bars = lack of continuity
use with categories
what level of measurement is mean
interval
universal - equal units
what level of measurement is median
ordinal
ranked - not equal units
what level of measurement is mode
nominal
categories
line graph
continuous data on x-axis
histogram
continuous data
cannot draw this type of graph if data is in categories
vertical axis = frequency - starts at 0
no gaps between bars
scattergram
represents data collected from correlations (naturally occurring)
it doesn’t matter which axis each co-variable goes on
negative skew
mean is lower than median and mode
normal distribution
bell-shaped curve
mean, median and mode are all at the exact midpoint
positive skew
mean is higher than median and mode
most of the data are on the left
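To see the positive skew pattern in numbers, here is a sketch with made-up scores where most values sit on the left and one high score stretches the tail to the right.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: bunched low, with one high score
scores = [1, 2, 2, 2, 3, 3, 4, 10]

skew_mean = mean(scores)
skew_median = median(scores)
skew_mode = mode(scores)

# In a positive skew the mean is dragged above the median and mode
print(skew_mode, skew_median, skew_mean)
```

Reversing the pattern (most scores high, a few very low) would pull the mean below the median and mode, giving a negative skew.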
co-variables
show a naturally occurring relationship (not manipulated)
variable
manipulated
correlation coefficient
the strength of the relationship
-1 = perfect negative
-0.5 = moderate negative
0 = no relationship
+0.5 = moderate positive
+1 = perfect positive
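One way a coefficient on this -1 to +1 scale can be computed is Pearson's r. The sketch below is a plain-Python version of that formula with hypothetical co-variable scores. The function name `pearson_r` is an assumption for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r: shared variation of x and y, scaled to lie between -1 and +1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

# Hypothetical co-variables
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # increases with hours
falling = [10, 8, 6, 4, 2]    # decreases as hours increase

r_positive = pearson_r(hours, rising)
r_negative = pearson_r(hours, falling)

print(round(r_positive, 2), round(r_negative, 2))
```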
negative correlation
one increases and the other decreases
positive correlation
both increase
hypothesis for a correlation guide
there will be a positive/negative relationship between …..
correlations v experiments - manipulation
experiment - researcher manipulates the IV and measures the DV
correlation - cannot manipulate as the variables are naturally occurring
correlations v experiments - EVs
experiment - control EVs
correlations - EVs are not controlled, so a third, untested variable may be causing the relationship between the 2 variables
correlation strength
P = relatively economical
E = unlike a lab study, there is no need for a controlled environment and can use secondary data
E = so correlations are less time-consuming than experiments
correlation weakness
P = no cause and effect
E = correlations are often presented as causal when they only show how 2 variables are related
E = this leads to false conclusions about causes of behaviour
inferential statistics
used to determine the likelihood that an ‘observed effect’ is due to chance
what does it mean when we refer to chance
whether something other than the independent variable has affected our results
one tailed tests
A one-tailed hypothesis is a directional hypothesis, as it predicts the direction of the effect
In a correlation - the words positive and negative indicate the hypothesis is one tailed
If the results go in the opposite direction to that predicted, the research has to be abandoned and a new hypothesis proposed
two tailed test
Predicts an effect but doesn’t state the direction
If a two-tailed test is employed at 5% significance, then there is double the probability that the differences could occur by chance
type 1 error
when there has been an incorrect interpretation of results
A ‘false positive’ - as a difference/correlation is found when it doesn’t actually exist
With this type, you reject the H0 and accept the H1 when actually the H0 is true
more likely if the level of significance is too lenient
type 2 error
more likely if the level of significance is too strict
‘false negative’
accept H0, reject H1 but in reality the H1 is true
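The two error types can be laid out as a simple decision check. This is only a sketch of the definitions on the cards. The function name `classify_decision` and its parameters are assumptions, and in real research we never know for certain whether H0 is true.

```python
def classify_decision(rejected_h0: bool, h0_is_true: bool) -> str:
    """Label a decision about H0 against what is actually the case."""
    if rejected_h0 and h0_is_true:
        return "type 1 error (false positive)"    # rejected H0 when H0 was true
    if not rejected_h0 and not h0_is_true:
        return "type 2 error (false negative)"    # accepted H0 when H1 was true
    return "correct decision"

outcome_a = classify_decision(rejected_h0=True, h0_is_true=True)
outcome_b = classify_decision(rejected_h0=False, h0_is_true=False)
outcome_c = classify_decision(rejected_h0=True, h0_is_true=False)

print(outcome_a, outcome_b, outcome_c, sep="\n")
```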
3 steps to choose test
- hypothesis: difference or association
- type of experimental design:
related - repeated measures or matched pairs
unrelated - independent groups
- type of measurement used:
nominal = categories
ordinal = ranked
interval = universal units
the 3 parametric tests
related t test
unrelated t test
Pearson’s r
3 criteria for choosing a parametric test
- data must be interval
- distribution must be normal or data must be drawn from population that’s expected to show normal distribution
- variances should be homogeneous - similar in each condition
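Putting the three steps and the three criteria together, a selection routine might look like the sketch below. The function name, argument names and string labels are assumptions for illustration. It only covers the three parametric tests named on these cards and returns `None` when the criteria are not met.

```python
def choose_parametric_test(hypothesis, design, level, normal, similar_variances):
    """Return one of the three parametric tests, or None if the criteria are not met.

    hypothesis: "difference" or "association"
    design: "related" or "unrelated" (ignored for an association)
    level: "nominal", "ordinal" or "interval"
    """
    # All three criteria must hold before a parametric test is considered
    if level != "interval" or not normal or not similar_variances:
        return None
    if hypothesis == "association":
        return "Pearson's r"
    if hypothesis == "difference" and design == "related":
        return "related t test"
    if hypothesis == "difference" and design == "unrelated":
        return "unrelated t test"
    raise ValueError("unrecognised hypothesis or design")

print(choose_parametric_test("difference", "related", "interval", True, True))
print(choose_parametric_test("association", None, "interval", True, True))
print(choose_parametric_test("difference", "unrelated", "ordinal", True, True))
```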
parametric tests - does it have normal distribution
do scores cluster around the mean?
calculate mean, median, mode - if similar = normal distribution
plot the data as a frequency distribution (histogram) - does it show a normal bell curve?
parametric tests - does the data have homogenous variances
deviation of scores is similar between conditions
related design - homogeneity of variance can usually be assumed, as the same (or similar) people are tested in each condition
unrelated design - the spread of scores may differ between groups - if there’s no homogeneity of variance then a parametric test shouldn’t be used
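A rough way to eyeball homogeneity of variance is to compute the variance of each condition and compare them. The scores below are hypothetical, and the cards do not say how large a difference is too large, so the sketch reports the ratio without applying a cut-off.

```python
from statistics import variance

# Hypothetical scores from two independent groups (same mean, different spread)
condition_a = [12, 14, 15, 15, 16, 18]
condition_b = [5, 9, 14, 16, 21, 25]

var_a = variance(condition_a)    # sample variance (divides by N - 1)
var_b = variance(condition_b)

# Ratio of larger to smaller variance: close to 1 suggests similar spread
variance_ratio = max(var_a, var_b) / min(var_a, var_b)

print(var_a, var_b, round(variance_ratio, 1))
```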