measurement terms Flashcards
convergent validity
does it at least partially measure the concept?
face validity:
does it seem plausible
discrimination validity
can it distinguish what you’re measuring from something related but different (e.g. government production speed vs. effectiveness)?
consensual validity
is it broadly accepted (consensus)
correlational validity:
good if you don’t have consensus (e.g. for a new measure): does it track with other accepted measures? (think convergent validity, but on a larger scale and slightly different)
can statistically compare it to other measures of the same concept, with somewhat similar results and a theoretical reason why
predictive validity:
can we use it to predict things we should be able to predict? (the measure says X; does X actually happen? are the predictions accurate?) if not, that’s a problem.
ex. of failure: the 2016 polls and Hillary Clinton
2 major threats to measurement reliability:
subjectivity: Chile example (same instructions given to 2 different people; the person deciding = subjectivity); desk effect; tests for intercoder reliability
lack of precision: samples are imprecise; build a measure of the sample’s imperfection into the analysis; limits our ability to predict, but is necessary
types of measures
objective, subjective
levels:
binary (0 or 1): dummy variables; interval: counts or continuous (the numbers we’re most familiar with); ordinal: 1st, 2nd, … (rankings, e.g. warmest; we don’t know the distance between levels, just their order relative to the others)
nominal: can’t do math on them: colors, names; variables stored as categories that are simply distinguished from one another
limitations of data
social desirability bias (racism, truth/lie spinner)
measurement-
assigning #s to phenomena for the purpose of analysis
theories, validity, reliability, types
measurement theories:
need to operationalise concepts (often need multiple measures)
almost always contentious (in political analysis)
usually assumed to contain error.
-more is always better in statistics (multiple indicators)
polity:
how democratic a society is, various measures (minority, contestability of elections)
- people have different opinions on what democracy is, how to measure it, etc.; highly contentious
measurement reliability
do we get the same measure every time (unless what is being measured actually changes)?
ex. of a reliability problem: measuring the same thing the same way and getting 2 different answers (the cause is either the people doing the measuring or the way it is being measured)
objective
(something you can point to that no one can disagree with, e.g. how many people like Trump as measured by who voted for him; it is an actual number, even if imperfect) vs. subjective (someone sits with people and evaluates the discussion to determine whether or not they like him; very subjective)
social desirability bias
(people don’t want to admit they’re racist) so questions may have to lead people to admit it
“the spinner”: give everyone a spinner with lie / tell-the-truth outcomes (truth being the larger share) that dictates how they answer; compare the statistical probability from the spinner to the responses
how to make frequency distributions
tally observations
define classes
consolidate and display
(use software) (see ex. on chalkboard); represent with frequency polygons, histograms, and cumulative frequency polygons
(plot classes along the bottom and the frequency / number of observations up the side)
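a minimal Python sketch (made-up numbers, not from class) of tally → define classes → consolidate and display:

```python
from collections import Counter

# hypothetical observations (e.g. respondent ages)
data = [19, 22, 22, 25, 31, 34, 34, 34, 40, 47, 51, 58]

# define classes (bins) -- cut-points chosen arbitrarily for illustration
bins = [(18, 29), (30, 39), (40, 49), (50, 59)]

# tally each observation into its class
tally = Counter()
for x in data:
    for lo, hi in bins:
        if lo <= x <= hi:
            tally[(lo, hi)] += 1

# consolidate and display the frequency distribution
for (lo, hi), freq in sorted(tally.items()):
    print(f"{lo}-{hi}: {'#' * freq} ({freq})")
```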
Measures of Central Tendency
values indicating where in the range the data tends to be
mean = average value
median= middle value
mode=most frequent value
Mean= (sum of all observations)/(# of observations)
can take unrealistic values (1.7 kids)
skewed BY OUTLIERS (CEO salaries)
not appropriate for some variables
median (middle value)
-value of whatever the middle observation in the range is
if the number of observations is even, take the mean of the 2 middle observations
-unlike mean, it is NOT WARPED BY OUTLIERS
-usually doesn’t take on unrealistic values
-may not mean much if the data is multimodal
Mode- the value that occurs most frequently
can have more than one mode
- often helpful to relax specificity of “most frequent”when discussing multimodal data and/or data with a wide range of values
- not very useful if data is evenly distributed
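quick Python check of all three with made-up data (the outlier shows why the median isn’t warped the way the mean is):

```python
import statistics

# hypothetical data: number of kids per household; 12 is an outlier
kids = [0, 1, 1, 2, 2, 2, 3, 3, 4, 12]

print(statistics.mean(kids))       # 3.0  -- pulled up by the outlier, can be "unrealistic"
print(statistics.median(kids))     # 2.0  -- middle value, not warped by the outlier
print(statistics.multimode(kids))  # [2]  -- most frequent value(s); may return several modes
```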
kurtosis
(tailedness)- more of the data resides in the tails
can calculate these with software
Standard Deviation*
measure of the average difference from the mean (μ is the mean; lowercase σ is the symbol for the s.d.)
-what can we infer from distance from mean? (interpret data)
if it’s small, the data is tightly clustered around the mean (the mean explains the data well)
if it’s large, the data is more spread out (the mean explains less of the data)
if it’s larger than the mean, the data is so far from the mean that the mean is not a good measure of centrality
always report standard dev. with the mean
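a small Python sketch (hypothetical numbers) of reporting the s.d. alongside the mean:

```python
import statistics

# hypothetical data
x = [2, 4, 4, 4, 5, 5, 7, 9]

mu = statistics.mean(x)        # 5.0
sigma = statistics.pstdev(x)   # population s.d. (divides by N) -> 2.0
s = statistics.stdev(x)        # sample s.d. (divides by n-1)

# always report the standard deviation with the mean
print(f"mean = {mu}, s.d. = {sigma}")
```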
basic law of probability:
given that all possible outcomes of a given event are equally likely, the probability of any specified outcome is the ratio of the number of ways that outcome can be achieved to the total number of ways all possible outcomes can be achieved.
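tiny illustration of the ratio with a fair die (example is mine, not from the notes):

```python
from fractions import Fraction

# fair six-sided die: all 6 outcomes equally likely
outcomes = [1, 2, 3, 4, 5, 6]
favorable = [x for x in outcomes if x % 2 == 0]   # ways to roll an even number

# probability = (# ways the outcome can occur) / (# of all possible outcomes)
p_even = Fraction(len(favorable), len(outcomes))
print(p_even)   # 1/2
```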
a priori:
probabilities calculated with full awareness of all possible outcomes (ex. dice, cards etc)
posterior:
probabilities estimated after the fact with limited knowledge of possible outcomes
EXPECTED VALUE:
-expected payout over many outcomes
for one outcome: the value x the prob. of it occurring
for the entire process: sum of all expected values of outcomes
(e.g. a game costs 75 cents to play but its expected value is 50 cents: not worth playing)
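the 75-cent game, with payouts invented so the expected value comes out to the 50 cents in the notes:

```python
# hypothetical game: pays $1.00 half the time and $0 otherwise
# (payouts/probabilities are assumptions chosen to match the notes' 50-cent EV)
outcomes = [(1.00, 0.5), (0.00, 0.5)]   # (value, probability)

# expected value of one outcome = value * probability of it occurring
# expected value of the whole process = sum over all outcomes
ev = sum(value * prob for value, prob in outcomes)

cost_to_play = 0.75
print(ev)                  # 0.50
print(ev - cost_to_play)   # -0.25 per play -> not worth it
```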
unions: P(A U B)
either A or B occurring
ex. the odds that it will rain or snow today
median voter theory:
ideology is normally distributed (left v. right)
Finding individual probabilities: Z scores
number of standard deviations a value lies from the mean
to calculate: subtract the mean from the value, then divide by the standard deviation
to figure out what percentage of the data lies between the mean and this value, consult a table!
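Python sketch with a made-up value, mean, and s.d.; the normal CDF stands in for the printed z table:

```python
from scipy.stats import norm

value, mean, sd = 130, 100, 15           # hypothetical, e.g. an IQ-style scale

# z = (value - mean) / s.d.
z = (value - mean) / sd                  # 2.0

# the normal CDF gives the same areas a z table would
pct_below = norm.cdf(z)                  # ~0.9772 of the data lies below this value
pct_mean_to_value = norm.cdf(z) - 0.5    # ~0.4772 lies between the mean and this value
print(z, pct_below, pct_mean_to_value)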
normal curve
68.26% are within 1 s.d.
95.44% are within 2 s.d.’s
99.72% are within 3 s.d.’s
based on continuous variable
centered on mean outcome
half greater, half smaller than mean
most values are close to the mean
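quick software check of the three percentages (they match the card up to rounding):

```python
from scipy.stats import norm

# share of a normal distribution within 1, 2, and 3 standard deviations of the mean
for k in (1, 2, 3):
    share = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} s.d.: {share:.2%}")   # ~68.27%, ~95.45%, ~99.73%
```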
Bernoulli Process:
Two, mutually exclusive, jointly exhaustive outcomes
Independent trials: P(r successes) = C(n,r) * p^r * (1-p)^(n-r)
we can use the normal distribution
When np > 10 and n(1-p) > 10
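sketch with made-up n and p, showing the binomial formula and the np > 10 / n(1-p) > 10 rule of thumb:

```python
from math import comb
from scipy.stats import binom

n, p = 50, 0.4          # hypothetical: 50 independent trials, P(success) = 0.4
r = 20

# binomial probability of exactly r successes: C(n, r) * p^r * (1-p)^(n-r)
by_hand = comb(n, r) * p**r * (1 - p)**(n - r)
print(by_hand, binom.pmf(r, n, p))        # same number

# rule of thumb from the notes: normal approximation is OK when np > 10 and n(1-p) > 10
print(n * p > 10 and n * (1 - p) > 10)    # True here
```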
what is a hypothesis
a question regarding some aspect of the world that we intend to investigate
framed as a FALSIFIABLE statement (can be proven wrong with means available to us)
Null Hypothesis :
contradiction to any research hypothesis
How to test: must have parameters for the entire pop. you can hypothesise about
TYPE 1
FALSE REJECTION OF NULL (ERROR)
TYPE 2
FALSE ACCEPTANCE OF NULL (ERROR)
Steps for a hypothesis test:
-find the mean, s.d., and s.e.; calculate the t score using the null as the mean; use the t score (or z if n > 30) to find the probability (how likely is it, relative to alpha?); describe the findings
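the steps on a made-up sample (the null mean of 50 and alpha of 0.05 are invented for illustration); scipy’s one-call version is shown for comparison:

```python
import statistics, math
from scipy import stats

# hypothetical sample; null hypothesis: the population mean is 50
sample = [52, 48, 55, 60, 47, 53, 58, 49, 51, 56]
null_mean = 50
alpha = 0.05

mean = statistics.mean(sample)
sd = statistics.stdev(sample)
se = sd / math.sqrt(len(sample))             # standard error
t = (mean - null_mean) / se                  # t score using the null as the mean

# probability of a t this extreme (two-tailed), compared against alpha
p = 2 * (1 - stats.t.cdf(abs(t), df=len(sample) - 1))
print(t, p, "reject null" if p < alpha else "cannot reject null")

# same test in one call
print(stats.ttest_1samp(sample, null_mean))
```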
independent samples
are samples where selection into one sample does not affect the odds of selection into another sample
dependent samples:
where the selection into 1 sample affects the odds of selection into another sample
ex. pair of surveys taken before and after election.
Difference in Variance:
variance is the sum of squared differences from the mean
- if the variances aren’t the same, it affects how likely the sample means are to be close to one another even if they are from the same population. variance becomes a particular concern when the samples are different sizes or one is very small (< 30)
s.e. = s.d. * sq.rt. (1/n1 + 1/n2)
pooled difference of means
equal variances
use pooled: pooled s.d. = sq.rt. {[(n1-1)s1ˆ2 + (n2-1)s2ˆ2] / (n1+n2-2)}
pooled s.e. = pooled s.d. * sq.rt. {1/n1 + 1/n2}
t = (mean 1 - mean 2) / s.e., d.f. = n1+n2-2
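the pooled formulas applied to two made-up samples:

```python
import statistics, math

# two hypothetical independent samples with (roughly) equal variances
g1 = [12, 15, 14, 10, 13, 14, 16, 12]
g2 = [11, 13, 12, 9, 12, 13, 11, 10]
n1, n2 = len(g1), len(g2)
s1, s2 = statistics.stdev(g1), statistics.stdev(g2)

# pooled s.d. = sqrt( ((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2) )
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# pooled s.e. = pooled s.d. * sqrt(1/n1 + 1/n2)
se = sp * math.sqrt(1 / n1 + 1 / n2)

# t = (mean1 - mean2) / s.e., with d.f. = n1 + n2 - 2
t = (statistics.mean(g1) - statistics.mean(g2)) / se
print(sp, se, t, n1 + n2 - 2)
```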
t score
needs 1 overall standard error
if the t score generated is less than the t score for alpha in the table,
we cannot reject the null and cannot accept the research hypothesis (we are within the null)
for proportions, the null is the opposite
(i.e. the opposite of > is < or =)
t score 2+
digression
n=
(z* s.d. / max error allowed )ˆ2 or (t* s.d. / max error allowed )ˆ2
s.d. of proportion is greatest when
the proportion is 0.5 (use this to estimate sample size with an unknown proportion); if the proportion is smaller, you don’t need as large a sample
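worked sample-size example (the 95% confidence level and 3-point max error are assumptions, not from the notes):

```python
import math

# hypothetical polling example: 95% confidence, unknown proportion, 3-point max error
z = 1.96                    # z for 95% confidence
p = 0.5                     # worst case: the s.d. of a proportion is greatest at p = 0.5
sd = math.sqrt(p * (1 - p))
max_error = 0.03

n = (z * sd / max_error) ** 2
print(math.ceil(n))         # ~1068 respondents needed
```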
difference between 2 groups:
testing whether the two samples came from the same population or not; i.e. failure to reject the null means we cannot say the population means are different (they may still actually be different). the t for these is the difference of the two sample means divided by the standard error
independent samples t test difference w/ unequal variances
most conservative (least likely to reject the null); makes Type 1 errors less likely (the d.f. formula is very complicated)
- mean, s.d., s.e. for each group
- overall s.e. = sq.rt. (s.e.1ˆ2 + s.e.2ˆ2)
- t score for the difference of means (d.f. comes from the complicated formula; let software calculate it)
- statistically significant at less than…
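the unequal-variances (Welch) steps on made-up groups; scipy’s equal_var=False handles the complicated d.f.:

```python
import statistics, math
from scipy import stats

# hypothetical samples with clearly different spreads and sizes
g1 = [23, 25, 21, 30, 28, 26, 24, 27, 22, 29]
g2 = [35, 18, 40, 12, 33, 20]

# mean, s.d., s.e. for each group
se1 = statistics.stdev(g1) / math.sqrt(len(g1))
se2 = statistics.stdev(g2) / math.sqrt(len(g2))

# overall s.e. = sqrt(se1^2 + se2^2)
se = math.sqrt(se1**2 + se2**2)
t = (statistics.mean(g1) - statistics.mean(g2)) / se

# scipy with equal_var=False uses the same t and the complicated (Welch) d.f.
print(t)
print(stats.ttest_ind(g1, g2, equal_var=False))
```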
t test independent equal variances
smaller s.e. and larger t scores (more Type 1 error if the variances are actually unequal); use the Levene test to check for equal variances
-mean and s.d. for each of the two groups being compared
-calc overall (pooled) s.d. = sq.rt. {[(n1-1)s1ˆ2 + (n2-1)s2ˆ2] / (n1+n2-2)}
(this is weighted avg of 2 s.d.)
- s.e. overall = s * sq.rt. (1/n1 + 1/n2)
-calc t = (mean 1 - mean 2) / s.e.
- if the t falls between values in the t table, report the corresponding probability from the table
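the Levene check plus the equal-variances test in scipy, on made-up groups:

```python
from scipy import stats

# hypothetical groups
g1 = [12, 15, 14, 10, 13, 14, 16, 12]
g2 = [11, 13, 12, 9, 12, 13, 11, 10]

# Levene test: null is that the variances are equal; a large p-value means
# the equal-variance (pooled) t test is reasonable here
print(stats.levene(g1, g2))

# pooled (equal variances assumed) independent-samples t test
print(stats.ttest_ind(g1, g2, equal_var=True))
```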
t test dependent samples
- pairwise subtractions
- mean, s.d. s.e. of those differences
- t score (2nd value is 0, to see if there is a difference)
- statistically significant at less than the value above (the level of significance)
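the pairwise-subtraction steps on made-up before/after scores:

```python
import statistics, math
from scipy import stats

# hypothetical before/after scores for the same 8 respondents
before = [54, 60, 52, 48, 61, 57, 50, 55]
after  = [57, 62, 55, 47, 65, 60, 54, 58]

# pairwise subtractions, then mean, s.d., s.e. of those differences
diffs = [a - b for a, b in zip(after, before)]
mean_d = statistics.mean(diffs)
se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))

# t score against a second value of 0 (i.e. "no difference")
t = (mean_d - 0) / se_d
print(t)
print(stats.ttest_rel(after, before))   # same test in one call
```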
proportions
- means, s.d.{s= sq.rt. [p (1-p)]} of experimental and control groups
- s.e. of each
- overall s.e.
- t of difference between experimental and control
- if the t score does not exceed the table value, the level of significance is used as the probability that the 2 samples could be from the same pop.
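the proportions steps on made-up experimental/control counts:

```python
import math
from scipy.stats import norm

# hypothetical: 120 of 300 in the experimental group vs 90 of 300 controls say yes
p1, n1 = 120 / 300, 300
p2, n2 = 90 / 300, 300

# s.d. of a proportion: s = sqrt(p(1-p)); s.e. of each group: s / sqrt(n)
se1 = math.sqrt(p1 * (1 - p1) / n1)
se2 = math.sqrt(p2 * (1 - p2) / n2)

# overall s.e. and the t of the difference between experimental and control
se = math.sqrt(se1**2 + se2**2)
t = (p1 - p2) / se

# with large samples this is compared against the normal table
p_value = 2 * (1 - norm.cdf(abs(t)))
print(t, p_value)
```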
nominal, ordinal, interval
can be used with: the mode for all levels, the mean only for interval, the median for ordinal and interval
statistical controls
3 or more variables’ relationships can be examined