analysis and distributions Flashcards
what is a scatterplot?
displays the relationship between two quantitative variables
how to know if a relationship is strong or weak
- strong = points lie close to the line
- weak = points are widely scattered
what does it mean when variables are correlated?
variables are related
what is the purpose of a correlation analysis? (3)
determine
- whether the relationship is linear
- direction of relationship
- strength of relationship
what are correlation coefficients
a number between -1 and +1
tells us strength and direction of relationship
what does a positive correlation coefficient mean?
positive relationship
as one variable increases, the other tends to increase
what does a negative correlation coefficient mean?
negative relationship
as one variable increases, the other tends to decrease
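what is the difference between a linear and a non-linear relationship?
linear = points follow a straight line, so a correlation coefficient describes the relationship well
non-linear = points follow a curve, so a linear correlation coefficient does not describe the relationship properly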
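A minimal sketch of why a curved relationship defeats a linear coefficient. The data and the helper name `pearson_r` are made up for illustration. In the curved case y depends entirely on x, yet the coefficient comes out at zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from raw scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # straight-line relationship
curved = [x ** 2 for x in xs]      # U-shaped relationship

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(round(r_linear, 3), round(r_curved, 3))
```

This is why the scatterplot should be checked before a coefficient is trusted.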
what are the two types of correlation coefficient?
- Pearson's r
- Spearman's r (also written rho)
what is the Pearson r coefficient?
- calculated from raw scores
- suitable for interval or ratio data
- highly affected by outliers
- not suitable for skewed data
what is the Spearman r coefficient?
- calculated from the ranks of the raw scores
- suitable for ordinal data
- only marginally affected by outliers
- suitable for skewed data
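A rough sketch contrasting the two coefficients on made-up scores with one extreme outlier. The helper names (`pearson_r`, `ranks`, `spearman_r`) are illustrative, not from any library. Spearman is computed here as Pearson applied to the ranks, with tied scores sharing the average rank.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson coefficient from raw scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_r(xs, ys):
    """Spearman coefficient: Pearson r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 5, 7, 9, 60]   # hypothetical scores; the last one is an outlier

r_pearson = pearson_r(xs, ys)
r_spearman = spearman_r(xs, ys)
print(round(r_pearson, 2), round(r_spearman, 2))
```

Because the scores still rise in the same order, the rank-based coefficient stays at 1, while the raw-score coefficient is pulled well below 1 by the single outlier.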
what is a density curve?
a smooth curve that describes the distribution of ppt scores (an idealised histogram)
useful when there are lots of ppt
what do density curves show?
the overall pattern of a distribution
density curve + median
point that divides the area under the curve into two equal parts
density curve + quartiles
points that divide the area under the curve into quarters
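A small sketch of the median and quartiles as equal-area cut points, using a normal curve as the example density curve. The mean of 100 and standard deviation of 15 are assumed values for illustration only.

```python
from statistics import NormalDist

# Hypothetical density curve: normal with mean 100 and SD 15
scores = NormalDist(mu=100, sigma=15)

# quantiles(n=4) returns the three cut points that split the area into quarters
q1, q2, q3 = scores.quantiles(n=4)

# cdf(x) is the area under the curve to the left of x
area_below_q1 = scores.cdf(q1)
area_below_median = scores.cdf(q2)
area_below_q3 = scores.cdf(q3)
print(round(q1, 1), round(q2, 1), round(q3, 1))
```

The middle cut point `q2` is the median, with half the area on each side.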
density curve + mode
positioned at the peak of the curve
density curve + mean
balancing point of the curve
what does symmetrical density curve mean?
mean = median = mode
what does skewed density curve mean?
mean does NOT = median or mode (the mean is pulled furthest towards the long tail)
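A quick illustration with two made-up samples rather than true density curves: one symmetrical, one right-skewed. It uses only the standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical scores
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 8, 14]   # long tail to the right

sym_summary = (mean(symmetric), median(symmetric), mode(symmetric))
skew_summary = (mean(right_skewed), median(right_skewed), mode(right_skewed))

print(sym_summary)    # all three measures coincide
print(skew_summary)   # the mean sits furthest towards the tail
```

In the skewed sample the order is mode, then median, then mean, moving towards the long tail.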
how is a normal distribution described?
by a normal curve
what is a normal curve? (5)
- symmetrical
- single peaked
- tails approach the x axis and only meet it at infinity
- location determined by the mean
- shape (spread) determined by the standard deviation
how to check whether data is normally distributed?
statistical tests of normality (e.g. Shapiro-Wilk)
what to do if data is not normally distributed
use a non parametric test
what are z scores / standard scores?
allow us to compare values from different data sets
what is the standard score z of an observation x?
z = (x - mean) / standard deviation
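A minimal worked example of the z score, comparing scores from two different tests. The test means and standard deviations are invented for illustration.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

# Hypothetical: 70 on a test with mean 60, SD 5
z_test_a = z_score(70, 60, 5)
# Hypothetical: 80 on a test with mean 75, SD 10
z_test_b = z_score(80, 75, 10)

print(z_test_a, z_test_b)
```

Although 80 is the higher raw score, the score of 70 is further above its own mean in standard deviation units, which is what makes values from different data sets comparable.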
how to get a standard normal distribution
standardising all values of a normal distribution
what does a standard normal distribution do?
allows us to determine the proportion of observations that fall below, above or between given values
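A short sketch of reading proportions off the standard normal distribution, using `NormalDist` from Python's standard `statistics` module. The z values chosen are just common examples.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of observations below z = 1
below_1 = standard_normal.cdf(1.0)

# Proportion within one standard deviation of the mean
within_1 = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Proportion above z = 1.96
above_196 = 1 - standard_normal.cdf(1.96)

print(round(below_1, 4), round(within_1, 4), round(above_196, 4))
```

Any normally distributed value can be converted to a z score first and then looked up the same way.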
why are parametric & non parametric tests used?
test significant differences between data sets
what are 2 key things about parametric tests?
- make assumptions about population parameters
- require interval or ratio data
what does a violation of test assumptions lead to?
erroneous interpretations of the data
3 key things about non parametric tests
- make no assumptions about population parameters
- can use nominal data
- not as powerful as parametric tests
why is the chi square test for goodness of fit used? (4)
- used on unrelated data
- used to answer questions about proportions
- used to compare different levels of a variable
- compare sample proportions to population proportions
what are observed frequencies?
number of ppt measured in individual categories
what are expected frequencies
frequencies predicted by the null hypothesis
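A minimal sketch of how observed and expected frequencies combine into the chi square goodness of fit statistic, the sum of (observed - expected)² / expected over the categories. The counts, the null-hypothesis proportions and the function name `chi_square_gof` are all assumptions for illustration.

```python
def chi_square_gof(observed, null_proportions):
    """Return (statistic, expected frequencies, degrees of freedom)."""
    total = sum(observed)
    # Expected frequency = proportion predicted by the null hypothesis x sample size
    expected = [p * total for p in null_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, expected, df

# Hypothetical: 100 ppt across three categories,
# null hypothesis says the population splits 25% / 50% / 25%
observed = [30, 50, 20]
statistic, expected, df = chi_square_gof(observed, [0.25, 0.5, 0.25])

print(statistic, expected, df)
```

The statistic is then compared with the critical value for the given degrees of freedom (5.991 for df = 2 at the .05 level), so in this made-up case the null hypothesis would not be rejected.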