CH 4 Flashcards
Three different kinds of data you can have?
Univariate = one variable; Bivariate = two variables; Multivariate = more than two variables
3 questions to ask before starting a data analysis
1) What type of data do I have? (univariate, bivariate, multivariate)
2) What types of variables do I have? (nominal, ordinal, ratio, etc.)
3) What is my research aim?
Frequency
The number of times that a particular value or score on our variable of interest was obtained within our sample
-F(r)=F
Relative Frequency
the proportion of scores or values in our sample that take on a particular value
-F(r)=F/N
Percentage Frequency
The percentage of scores or values in our sample that take on a particular value
-F(r)= F/N x 100
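The three frequency formulas above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name frequency_table and the sample scores are made up for the example.

```python
from collections import Counter

def frequency_table(scores):
    """Map each value to (frequency F, relative frequency F/N, percentage F/N x 100)."""
    n = len(scores)              # N: total number of scores in the sample
    counts = Counter(scores)     # F: how many times each value occurs
    return {
        value: (f, f / n, 100 * f / n)
        for value, f in sorted(counts.items())
    }

# Hypothetical sample of N = 10 scores, for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
table = frequency_table(scores)
for value, (f, rel, pct) in table.items():
    print(value, f, rel, pct)
```

The relative frequencies always sum to 1 and the percentage frequencies to 100, which is a quick check on any hand-built table.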
Cumulative frequency
The number of scores or values in our sample that take on a value at or below a given value
Cumulative Relative frequency
the sum of the relative frequencies for all values that are less than or equal to the given value
Cumulative Relative Percentage
another way of expressing a frequency distribution: the cumulative frequency at or below a given value (or interval), expressed as a percentage of the sample size N (cumulative relative frequency x 100), much as percentage frequency expresses a frequency as a percentage of N.
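The cumulative versions can be sketched the same way, by running a total over the sorted values. The function name cumulative_table and the sample scores are assumptions made for this example.

```python
from collections import Counter
from itertools import accumulate

def cumulative_table(scores):
    """Map each value to (cumulative frequency, cumulative relative frequency,
    cumulative percentage) for all scores at or below that value."""
    n = len(scores)
    counts = sorted(Counter(scores).items())         # (value, F) pairs in ascending order
    values = [value for value, _ in counts]
    running = list(accumulate(f for _, f in counts)) # running total of F
    return {v: (cf, cf / n, 100 * cf / n) for v, cf in zip(values, running)}

# Hypothetical sample of N = 10 scores, for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
cumulative = cumulative_table(scores)
for value, row in cumulative.items():
    print(value, row)
```

The last row should always show a cumulative frequency of N, a cumulative relative frequency of 1, and a cumulative percentage of 100.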
What are the terms for the x axis and the y axis?
x axis- abscissa
y axis- ordinate
Symmetry
refers to whether the right and left sides of your data look the same: could you fold the distribution onto itself at the middle point, like a mirror image? (empirical distributions are rarely, if ever, perfectly symmetrical)
Skewness
describes the degree to which scores lean towards (or are piled up on) one end of a distribution
Kurtosis
describes how “peaked” a distribution is (or how heavy the tails are). It describes how “lean” or “boxy” they are
mesokurtic
‘normally’ distributed, peaks in middle, distributed evenly
Leptokurtic
more peaked than a normal distribution: bulk of frequency in the middle, with heavier (fatter) tails
Platykurtic
flatter than a normal distribution: frequencies spread more evenly across the scores, with lighter (thinner) tails
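Skewness and kurtosis can be put into numbers. The cards above give no formulas, so the sketch below assumes the common moment-based definitions (third and fourth central moments divided by powers of the variance); other textbooks and software use slightly different corrections. All function names and sample data are made up for the example.

```python
def central_moment(scores, k):
    """k-th central moment: average of (Xi - Xmean)^k over the N scores."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** k for x in scores) / n

def skewness(scores):
    """Assumed moment-based skewness: 0 if symmetric, > 0 if the long tail is on the right."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Assumed moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

# Hypothetical samples, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

Under this convention an excess kurtosis near 0 is mesokurtic, a positive value is leptokurtic, and a negative value is platykurtic.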
- Measures of location
- Measures of central tendency
summarize a data set into a single value at the aggregate level
- measures of location: describe the location of the data
- measures of central tendency: measures of location that describe the typical or central value
some measures of location are also measures of central tendency
Mean
sum of all values observed on a variable divided by the total number of values observed on that variable
Median
the value lying at the midpoint of a frequency distribution of observed values, such that half of the observations fall at or below it and half fall at or above it
Mode
most frequently occurring value on a variable.
- doesn't tell us anything about the location of scores that aren't at the mode
- not influenced by outliers
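A short sketch of the three measures using Python's standard statistics module. The sample scores are made up; the second sample adds one extreme score to show how each measure reacts to an outlier.

```python
import statistics

# Hypothetical samples, for illustration only.
scores = [2, 3, 3, 4, 5]
with_outlier = scores + [100]   # same scores plus one extreme value

for label, data in [("original", scores), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)        # sum of values / number of values
    median = statistics.median(data)    # midpoint of the sorted values
    modes = statistics.multimode(data)  # list of most frequent value(s)
    print(label, mean, median, modes)
```

In this example the mean jumps from 3.4 to 19.5 when the outlier is added, the median moves only from 3 to 3.5, and the mode stays at 3.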
Measures of dispersion
Deviation score (from the mean): Xi - Xmean
Sum of deviations: sum of (Xi - Xmean), which always equals 0
Average absolute deviation: [sum of |Xi - Xmean|] / N
Average squared deviation (variance): s^2 = [sum of (Xi - Xmean)^2] / N
Standard deviation: s = square root of s^2
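The dispersion measures above can be worked through step by step in Python. This is a minimal sketch that divides by N, as in the formulas on these cards (many texts divide by N - 1 when estimating from a sample). The function name dispersion_summary and the sample scores are made up for the example.

```python
import math

def dispersion_summary(scores):
    """Compute the dispersion measures listed above, dividing by N."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]          # Xi - Xmean for each score
    variance = sum(d ** 2 for d in deviations) / n   # average squared deviation
    return {
        "sum_of_deviations": sum(deviations),        # 0 (up to rounding) for any data
        "average_absolute_deviation": sum(abs(d) for d in deviations) / n,
        "variance": variance,
        "standard_deviation": math.sqrt(variance),
    }

# Hypothetical sample with mean 5, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = dispersion_summary(scores)
for name, value in summary.items():
    print(name, value)
```

Because the raw deviations cancel out to zero, the absolute value or the square is what makes the average deviation informative.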