WEEK #3 - research methods Flashcards
how many levels of measurement are there ?
4
what are the four levels of measurement ?
- nominal
- ordinal
- interval
- ratio
define nominal data :
- categorical data with no implicit ordering
- cannot be added, subtracted, multiplied or divided
- can be summarized using mode only
define ordinal data :
- categorical data with implicit (or explicit) ordering
- unequal distance between points
- cannot be added, subtracted, multiplied or divided
- can be summarized with median or mode
define interval data :
- continuous (equal distance between points)
- no meaningful zero
- can be added or subtracted
- cannot be multiplied or divided
- can be summarized with mean, median or mode
define ratio data :
- continuous (equal distance between points)
- meaningful zero
- can be added, subtracted, multiplied and divided
- can be summarized with mean, median, or mode
give an example of nominal data :
type of animal: out of “25 animals”, 10 are dogs and 15 are cats
give an example of ordinal data :
“positions in a race : 1st, 2nd, 3rd”
give an example of interval data :
temperature in Celsius
give an example of ratio data :
temperature in Kelvin
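A small sketch of why Celsius counts as interval data and Kelvin as ratio data. The two temperatures are made-up values and `celsius_to_kelvin` is a helper introduced only for this illustration.

```python
# Illustrative, made-up temperatures in degrees Celsius.
c_cold, c_warm = 10.0, 20.0

def celsius_to_kelvin(c):
    """Convert Celsius to Kelvin (hypothetical helper for this sketch)."""
    return c + 273.15

k_cold, k_warm = celsius_to_kelvin(c_cold), celsius_to_kelvin(c_warm)

# Differences are meaningful on both scales (equal distance between points).
diff_c = c_warm - c_cold
diff_k = k_warm - k_cold

# Ratios are only meaningful when zero is meaningful (Kelvin).
ratio_c = c_warm / c_cold  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = k_warm / k_cold  # about 1.035, the physically meaningful ratio

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```

The difference is the same 10 degrees on both scales, but only the Kelvin ratio says anything about how much hotter one temperature is than the other.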
which level of measurement has no meaningful zero ?
interval data
which level of measurement has a meaningful zero ?
ratio data
when might you be able to treat ordinal as interval data ?
- you are aggregating multiple items
- the underlying construct is continuous
- the measurement instrument is reliable
what are the three M’s of central tendency ?
mean, median and mode
what is mean ?
the arithmetic average of the data
what is median ?
the point that divides the data in half and the 50th percentile
what is mode ?
the most frequently occurring value
which central tendency is “total all the results and divide by the number of units, or “n”, of the sample” ?
mean
which central tendency is “the exact middle score in a data set: list all scores in numerical order, then locate the score in the centre of the sample” ?
median
does the median ignore outliers, compared to the mean ?
yes, the median is largely unaffected by outliers, while the mean is pulled toward them
which central tendency is “the most repeated score in the set of results: if 15 is the most repeated score, it is labelled the mode, and if you have a “tie” for “most repeated score”, you will have more than one mode” ?
mode
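The three M's above can be checked with Python's standard `statistics` module. This is only a sketch on made-up scores (chosen so that 15 is the most repeated score); it also shows how one outlier moves the mean far more than the median.

```python
import statistics

scores = [12, 15, 15, 18, 20]  # made-up scores for illustration

mean_score = statistics.mean(scores)      # total (80) divided by n (5)
median_score = statistics.median(scores)  # middle score once sorted
modes = statistics.multimode(scores)      # list of the most repeated score(s)

# Add one extreme value and recompute.
with_outlier = scores + [100]
mean_out = statistics.mean(with_outlier)      # pulled up strongly by 100
median_out = statistics.median(with_outlier)  # barely moves

# A tie for "most repeated score" gives more than one mode.
tied_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean_score, median_score, modes)
print(mean_out, median_out, tied_modes)
```

With an even number of scores, `statistics.median` averages the two middle scores.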
in regards to normality and central tendency, what does it mean if the distribution is normal ?
the mean, median and mode are all equal (bell-shaped)
what are the three factors of dispersion ?
- range
- standard deviation
- coefficient of variation
why do we use range :
good for an intuitive description of minimum and maximum values in a data set
why do we use coefficient of variation ?
a useful way of comparing standard deviations across populations with different means or units
why do we use standard deviation ?
a more accurate/detailed description of dispersion that takes “outliers” into account
define range :
the range is the difference between the highest and the lowest scores within a variable
define standard deviation :
a value that shows the relation the individual scores have to the mean of the sample
(if scores are said to be standardized to a normal curve, then there are several statistical techniques that can be used to analyze the data set)
TRUE OR FALSE
SD is calculated across all scores as the square root of the average squared deviation from the mean (the sum of the squared deviations, divided by the number of scores)
TRUE
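The calculation in the card above can be followed step by step. This is a sketch on made-up scores; dividing by the number of scores gives the population SD, and the sample version (dividing by n - 1) is included only for comparison and is not part of the card.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores for illustration
n = len(scores)

mean = sum(scores) / n                             # 40 / 8 = 5.0
squared_devs = [(x - mean) ** 2 for x in scores]   # squared deviations from the mean
population_sd = math.sqrt(sum(squared_devs) / n)   # divide by n, then square root

# Sample SD divides by n - 1 instead of n (shown for comparison only).
sample_sd = math.sqrt(sum(squared_devs) / (n - 1))

# The standard library gives the same answers.
assert math.isclose(population_sd, statistics.pstdev(scores))
assert math.isclose(sample_sd, statistics.stdev(scores))

print(population_sd, round(sample_sd, 3))
```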
what do we represent the population standard deviation with ?
the lowercase Greek letter sigma, σ (the uppercase Σ is the summation sign)
what do we represent the sample standard deviation with ?
the letter “s”
TRUE OR FALSE
the standard deviation of a measure is dependent upon its scale (the magnitude of the values within the data)
TRUE
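A sketch of the scale dependence described above, using made-up heights. It assumes the usual definition of the coefficient of variation (SD divided by the mean); `coefficient_of_variation` is a helper written for this example, not a library function.

```python
import math
import statistics

heights_cm = [150, 160, 170, 180, 190]     # made-up heights in centimetres
heights_m = [h / 100 for h in heights_cm]  # the same heights in metres

# The SD changes with the unit of measurement.
sd_cm = statistics.pstdev(heights_cm)
sd_m = statistics.pstdev(heights_m)

def coefficient_of_variation(data):
    """SD relative to the mean (assumed definition: SD / mean)."""
    return statistics.pstdev(data) / statistics.mean(data)

# The coefficient of variation is the same in both units.
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(round(sd_cm, 3), round(sd_m, 5))
print(round(cv_cm, 4), round(cv_m, 4))
print(math.isclose(cv_cm, cv_m))
```

Because it divides by the mean, the coefficient of variation only makes sense for ratio data with a meaningful zero.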
what is distributional shape ?
Measures of shape describe the distribution (or pattern) of the data within a dataset
what is normal distribution ?
- sometimes called a “bell curve”
- upper and lower halves perfectly symmetrical
- most common normal distribution is the standard normal distribution
TRUE OR FALSE
for a normal distribution we see a mean, median and mode of 0 and standard deviation of 1
TRUE (only the case in standard normal distributions)
what is the empirical rule ?
a statistical rule that states that almost all observed data for a normal distribution will fall within three standard deviations of the mean
describe the three points of the empirical rule :
- 68% of the data falls within 1 SD of the mean
- 95% of the data falls within 2 SD of the mean
- 99.7% of the data falls within 3 SD of the mean
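The three percentages above can be reproduced from the standard normal distribution with `statistics.NormalDist`. This is a minimal sketch; `share_within` is a helper named for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1, within_2, within_3 = (share_within(k) for k in (1, 2, 3))

# Roughly 68%, 95% and 99.7%.
print(f"{within_1:.1%}  {within_2:.1%}  {within_3:.1%}")
```

The same proportions hold for any normal distribution, since distances are measured in SD units.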
TRUE OR FALSE
correlation does equal causation
FALSE
it does not
describe the statement “correlation doesn’t equal causation”
it refers to the inability to legitimately deduce a cause-and-effect relationship between two events or variables solely on the basis of an observed association or correlation between them
what does SD stand for ?
standard deviation
what is skewness ?
a measure of the asymmetry of the distribution
what does skewness show on a graph ?
the extent to which one “tail” is longer than the other
what does a positive skew look like ?
right tail longer
what does a negative skew look like ?
left tail longer
generally with a - skew, what does that mean for the median and mean ?
- skew = median > mean
generally with a + skew, what does that mean for the median and mean ?
+ skew = mean > median
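A sketch of how the mean and median separate under skew, on made-up data. The `skewness` helper uses one common moment-based formula (the average cubed z-score), which is an assumption here; statistical packages may apply slightly different corrections.

```python
import statistics

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]     # made-up data, long right tail
left_skewed = [-x for x in right_skewed]     # mirror image, long left tail

def skewness(data):
    """Average cubed z-score (assumed population formula)."""
    m = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - m) / sd) ** 3 for x in data) / len(data)

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(
        name,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "skewness:", round(skewness(data), 2),
    )
```

The long tail drags the mean toward it, while the median stays near the bulk of the data.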
what is kurtosis ?
a measure of how peaked or flat the distribution is, compared to a normal distribution
what are the three types of kurtosis ?
- platykurtotic
- leptokurtotic
- mesokurtotic
what does “platykurtotic” mean ?
“flat” : highly negative kurtosis
what does “leptokurtotic” mean ?
“pointed” : highly positive kurtosis
what does “mesokurtotic” mean ?
“no” kurtosis - ‘normal’ distribution
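A sketch of "flat" versus "pointed" distributions on made-up data. The `excess_kurtosis` helper assumes one common definition (average fourth-power z-score minus 3, so a normal distribution scores about 0); other software may use a bias-corrected version.

```python
import statistics

def excess_kurtosis(data):
    """Average fourth-power z-score minus 3 (assumed population formula)."""
    m = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - m) / sd) ** 4 for x in data) / len(data) - 3

flat = list(range(1, 11))   # values spread evenly from 1 to 10
peaked = [5] * 8 + [0, 10]  # most values at the centre, two far out

flat_kurtosis = excess_kurtosis(flat)      # negative: "flat"
peaked_kurtosis = excess_kurtosis(peaked)  # positive: "pointed"

print(round(flat_kurtosis, 2), round(peaked_kurtosis, 2))
```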
describe in simple terms the difference between kurtosis and skewness :
skewness measures the asymmetry of the tails of the distribution, and kurtosis measures the peak
TRUE OR FALSE
kurtotic distributions have non-normal “peaks”
TRUE
what are outliers ?
values that fall substantially outside the range of most other values in the data
how does one identify outliers ?
recall that the empirical rule states that 99.7% of the data will fall within 3 SD of the mean, so values more than 3 SD from the mean can be flagged as potential outliers
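One way to apply the 3 SD idea is to convert each value to a z-score and flag anything beyond 3. This is a sketch on made-up values; the cut-off of 3 and the use of the population SD are assumptions, not a fixed rule.

```python
import statistics

# Made-up measurements: 19 values near 50 and one extreme value.
values = [48, 49, 50, 50, 50, 51, 51, 52, 49, 50,
          50, 51, 49, 50, 52, 48, 50, 51, 49, 95]

mean = statistics.mean(values)
sd = statistics.pstdev(values)

def z_score(x):
    """Distance from the mean in SD units."""
    return (x - mean) / sd

# Flag values more than 3 SD away from the mean.
outliers = [x for x in values if abs(z_score(x)) > 3]

print(outliers)
```

In very small samples no single value can reach a z-score of 3, because the outlier itself inflates the SD, so this check needs a reasonable number of observations.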
what are the three graphical summaries of data ?
- bar graphs and histograms
- line graphs
- box plots
what is a histogram ?
compares multiple measurements of the same variable (e.g. describing the age range in a sample)
what is a bar graph ?
compares multiple variables (e.g. the relative frequency of test usage within a group of practitioners)
in simplest terms, what's the difference between a histogram and a bar graph ?
a histogram shows a single variable, while a bar graph compares multiple variables
what is stem and leaf ?
- basically constructed as a vertical histogram
- shows raw data and gives rough idea of dispersion
- very old school
- used to print bar graphs back in the day
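A rough sketch of building a stem-and-leaf display in plain text. It assumes non-negative whole numbers, with the tens as the stem and the units as the leaf; the ages are made up and `stem_and_leaf` is a helper written for this example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return one text line per stem (tens digit) with its leaves (units)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]

ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]  # made-up ages
plot_lines = stem_and_leaf(ages)

for line in plot_lines:
    print(line)
```

Each row keeps the raw values readable, and the row lengths give a rough idea of dispersion, like the bars of a histogram.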
when do we use and see line graphs ?
often used to convey temporal information
TRUE OR FALSE
line graphs should not be used for discrete variables
TRUE