Key Terms Flashcards
validity
extent to which a concept/measurement is well-founded and likely corresponds accurately to the real world
internal vs external validity
internal validity
the obtained effect of x on y for your sample is the correct effect for the sample
-> generalization of causal findings to all cases WITHIN the sample
how to obtain:
-empirical model is correctly specified, estimators are unbiased
-> changes in the dependent variable can be attributed to the independent variable and not to other factors (the challenge is to rule those other factors out)
external validity
the obtained effect of x on y in the sample is the correct effect of x on y in the population P
-> generalization of causal findings to other cases not included in the sample -> the overall population
how to obtain:
-enough cases
-sample represents the population in all relevant characteristics
why is validity important
-theory and findings need to show a causal effect for the research to be relevant
-stakeholders need to know whether it also holds for other cases
-in practice: experiments usually have high internal but low external validity, or are perfect in neither
validity vs reliability
reliability is the degree of precision with which a specific aspect is measured
validity, by contrast, is about whether the measurement corresponds accurately to the real world
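The contrast can be made concrete with a small simulation. This is a minimal Stata sketch under invented assumptions: the true body height, the two measuring instruments, all variable names and all numbers are hypothetical and not from the course material. Instrument A is precise but systematically off (reliable, not valid); instrument B is right on average but noisy (not reliable).

```stata
clear
* arbitrary seed so that the run is reproducible
set seed 12345
* 1000 hypothetical cases
set obs 1000

* hypothetical true value: body height in cm
generate true_height = rnormal(170, 10)

* instrument A: small random error, but always about 5 cm too high
generate measure_a = true_height + 5 + rnormal(0, 0.5)

* instrument B: no systematic error, but large random error
generate measure_b = true_height + rnormal(0, 8)

* measurement error of each instrument
generate error_a = measure_a - true_height
generate error_b = measure_b - true_height

* the mean error shows the bias, the standard deviation shows the imprecision
summarize error_a error_b
```

In the output, a mean error far from zero points to a validity problem, while a large standard deviation points to a reliability problem.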
advantages of scientific observation
systematic approach of observing and generating information
-objectivity as opposed to a selective set of observations
-avoidance of “filling in” information
-verifiability
population
all observational units to which the theory is assumed to apply
sample
a subset of the theoretically-defined population for which data is assessed
for reasons of validity, we want this subset to be representative of the population
descriptive statistics
inferential statistics
what is data
quantified information
information for one single case: data point
manifest variables: directly observable variables (e.g. body height)
latent variables: abstract concepts only observable through manifest indicators (e.g. democracy)
data types by source
source:
observable world -> observational data
field or lab experiment -> experimental data
an algorithm -> simulated data
data processing
-> to eliminate sources of error
processing includes:
-reduction of measurement error
-addressing inter-coder reliability
-elimination of missing data points
-identification of outliers
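Two of these processing steps can be sketched in Stata. The data below are made up for illustration (hypothetical variables `id` and `income`). Dropping incomplete cases and flagging outliers with the 1.5 × IQR rule are only one possible choice each; the cards do not prescribe a method.

```stata
clear
input id income
1 2100
2 2500
3 .
4 2300
5 98000
6 2400
end

* count the missing data points, then remove them
count if missing(income)
drop if missing(income)

* quartiles are returned in r(p25) and r(p75) after summarize, detail
quietly summarize income, detail
scalar iqr = r(p75) - r(p25)
scalar lower = r(p25) - 1.5 * iqr
scalar upper = r(p75) + 1.5 * iqr

* flag values outside the fences as outliers (1 = outlier, 0 = not)
generate outlier = income < lower | income > upper
list id income outlier
```

Missing values are dropped before the outlier flag is created because Stata treats a missing value as larger than any number, so it would otherwise be flagged as an outlier.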
how to measure data
measurements require
-measurement scale
-measurement unit
-measurement instrument
also includes
-counts
-quantifications
types of variables
can be described by three elements: instrument, measurement unit, scale
variables by scale
categorical variables
how are observations arranged?
nominal variables
-numerical values are used only as labels for types of attributes
-no intrinsic order between categories
-e.g. gender, party affiliation: SPÖ=1, ÖVP=2
ordinal variables:
-variables with two or more categories which can be ranked
-the values and the gaps between them are not interpretable
-e.g. smart (one category is not "twice as smart" as another)
variables by scale
metric variables
interval variables
-variables have a zero value, but it is arbitrary (usually without a clear meaning), e.g. temperature in °C
-equal distances between values have the same meaning everywhere on the scale
ratio variables
-zero means that there is nothing of this variable left (true zero point)
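How the scale affects what can be done with a variable can be sketched in Stata. The three observations and the variable names `party`, `smart` and `height_cm` are invented for illustration; only the coding SPÖ=1, ÖVP=2 comes from the cards.

```stata
clear
input party smart height_cm
1 2 172
2 3 181
1 1 165
end

* nominal: the numbers are only labels, there is no order
label define party_lbl 1 "SPÖ" 2 "ÖVP"
label values party party_lbl

* ordinal: categories can be ranked, but the gaps are not interpretable
label define smart_lbl 1 "low" 2 "medium" 3 "high"
label values smart smart_lbl

* frequencies are meaningful for nominal and ordinal variables
tabulate party
tabulate smart

* means and standard deviations are only meaningful for metric variables
summarize height_cm
```

Stata would also compute a mean for `party` if asked, but the result would have no substantive meaning because the codes are arbitrary labels.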
use the dataset thedata.dta
use thedata.dta
delete all variables and data
clear
summarize a dataset
describe, short
describe, simple
summarize
sum, detail
tabulate
list
codebook
import an Excel dataset
import excel "…"
remove var1 and var2
drop var1 var2
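The commands on the cards above can be tried together in one short session. This sketch uses the example dataset `auto` that ships with Stata instead of thedata.dta, so the variable names (`price`, `mpg`, `foreign`, `make`, `trunk`, `turn`) belong to that example dataset and are not part of the course data.

```stata
* load an example dataset shipped with Stata, replacing the data in memory
sysuse auto, clear

* overview of the dataset
describe, short
describe, simple

* summary statistics, first basic, then with percentiles
summarize price mpg
summarize price, detail

* frequency table of a categorical variable
tabulate foreign

* show the first five observations
list make price mpg in 1/5

* detailed documentation of one variable
codebook foreign

* remove two variables from the dataset in memory
drop trunk turn
```

`tabulate` needs at least one variable name, whereas `summarize`, `list` and `codebook` fall back to all variables when none is given.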