WEEK 4 Flashcards
key parts of a paper
intro, tables and figures; the rest is for reference
purpose of lit review and theory
demonstrate plausibility of hypothesis
purpose of research design
provide blueprint for replicability
purpose of results and conclusions
demonstrate researcher competency
to evaluate the causal relationship
look at predictions and probabilities
what do plots show: marginal effect plots
IV (x-axis), DV (y-axis)
regression lines and confidence intervals; predicted values of y given values of x
if y is a binary variable, the prediction is a probability
does not depict control variables, but accounts for them when computing the predicted values
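A minimal sketch of a marginal effect (predicted values) plot in Python, assuming simulated data and statsmodels/matplotlib; the variable names (x, control, y) are illustrative, not from the course:

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)          # IV of interest
control = rng.normal(size=n)    # control variable
y = 1 + 2 * x + 0.5 * control + rng.normal(size=n)  # DV

X = sm.add_constant(np.column_stack([x, control]))
model = sm.OLS(y, X).fit()

# Predict y over a grid of x, holding the control at its mean
x_grid = np.linspace(x.min(), x.max(), 50)
X_pred = sm.add_constant(np.column_stack([x_grid, np.full(50, control.mean())]))
pred = model.get_prediction(X_pred).summary_frame(alpha=0.05)

plt.plot(x_grid, pred["mean"], label="predicted y")
plt.fill_between(x_grid, pred["mean_ci_lower"], pred["mean_ci_upper"], alpha=0.3)
plt.xlabel("IV (x)")
plt.ylabel("Predicted DV (y)")
plt.legend()
plt.show()
```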
what do plots show: coefficient plots
variables (IV and controls) on one axis, estimated coefficient size on the other
estimated coefficient of the IV and of each control variable
confidence intervals for each coefficient
Useful for: OLS, logistic, and Poisson models
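A minimal coefficient-plot sketch, again on simulated data with illustrative variable names, plotting each estimated coefficient with its 95% confidence interval:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=200), "control": rng.normal(size=200)})
df["y"] = 1 + 2 * df["x"] + 0.5 * df["control"] + rng.normal(size=200)

model = smf.ols("y ~ x + control", data=df).fit()

params = model.params.drop("Intercept")   # estimated coefficients
ci = model.conf_int().drop("Intercept")   # 95% CI bounds (lower, upper)

plt.errorbar(params, range(len(params)),
             xerr=[params - ci[0], ci[1] - params], fmt="o", capsize=4)
plt.axvline(0, linestyle="--")            # reference line at zero effect
plt.yticks(range(len(params)), params.index)
plt.xlabel("Estimated coefficient (95% CI)")
plt.show()
```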
what to use when evaluating interaction effects
Marginal effects plot
what are coefficient plots commonly used in:
OLS Regression: coefficients are in original units (e.g., years of education → +0.5 on income)
Logistic Regression: coefficients are in log-odds
Poisson/Negative Binomial: coefficients are in log count or log IRR
logit and probit models
Both are types of regression models used when your dependent variable (DV) is binary (i.e., 0 or 1, like yes/no, vote/don’t vote, support/don’t support).
Logit coefficients are in log-odds; probit coefficients are on the standard-normal (z) scale
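A minimal sketch of fitting a logit model on simulated data (the binary DV `vote` and predictor `x` are made up for illustration); exponentiating the log-odds coefficients gives odds ratios:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({"x": rng.normal(size=500)})
p = 1 / (1 + np.exp(-(0.5 + 1.2 * df["x"])))   # true probability of voting
df["vote"] = rng.binomial(1, p)                # binary DV (0/1)

logit = smf.logit("vote ~ x", data=df).fit()
print(logit.params)            # coefficients in log-odds
print(np.exp(logit.params))    # exponentiate to get odds ratios
```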
what do tables show
Coefficients, standard errors, test statistics, and significance of each IV
The intercept (the value of y when every x equals 0), with its standard error and significance
Markers of significance and goodness of fit for entire model
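A minimal sketch of producing such a table with statsmodels on simulated data; `summary()` prints the coefficients, standard errors, test statistics, intercept, and model-fit markers described above:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
df = pd.DataFrame({"x": rng.normal(size=200), "control": rng.normal(size=200)})
df["y"] = 1 + 2 * df["x"] + 0.5 * df["control"] + rng.normal(size=200)

fit = smf.ols("y ~ x + control", data=df).fit()
print(fit.summary())   # coefficient table, intercept, R-squared, F-statistic
```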
histogram
shows the distribution of a variable (often an IV): how many observations fall at each value
they can skew in one direction or another
a leftward skew can mean that the bulk of the values is concentrated at a point closer to 0 than to 100
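A minimal histogram sketch on simulated data whose values bunch near 0 with a long tail toward larger values; the data are made up for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
values = rng.exponential(scale=10, size=1000)  # most values near 0, long tail of larger values

plt.hist(values, bins=30)
plt.xlabel("value of the variable")
plt.ylabel("number of observations")
plt.show()
```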
z-score
tells you how many standard deviations a value is from the mean
only gives a result for a single observation
can't be used for bivariate/multivariate hypotheses
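A minimal z-score sketch; the sample values and the observation being checked are made up for illustration:

```python
import numpy as np

sample = np.array([12, 15, 9, 18, 21, 14, 16])
value = 21
z = (value - sample.mean()) / sample.std(ddof=1)   # z = (x - mean) / sd
print(round(z, 2))   # how many standard deviations the value sits from the mean
```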
t-test
GOAL: compares variation across two groups (basically compares two groups)
can compare a model with and without a given variable
in a regression, the t statistic is the coefficient divided by its standard error
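A minimal t-test sketch: a two-group comparison with scipy, plus the coefficient-over-standard-error version from a regression table; all numbers are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
group_a = rng.normal(loc=50, scale=10, size=40)
group_b = rng.normal(loc=55, scale=10, size=40)

# Two-sample t-test: do the group means differ?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(t_stat, p_value)

# In a regression table, the t statistic for each coefficient is just
# coefficient / standard error (illustrative values below).
coef, se = 0.42, 0.15
print(coef / se)
```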
one-tailed vs. two-tailed test
when assessing the critical value of a t-test (comparing two samples), you have to specify whether it is a one-tailed or two-tailed test
one-tailed = tests whether the population mean is above or below the standard in one specified direction (rarely used)
two-tailed = tells you whether it is different from the mean, in either direction
a two-tailed test can determine significance even if the relationship is the opposite of the hypothesis
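A minimal sketch of how one- and two-tailed p-values come from the same t statistic; the t value and degrees of freedom are made up for illustration:

```python
from scipy import stats

t_stat, dof = 2.1, 98
p_one_tailed = 1 - stats.t.cdf(t_stat, dof)             # probability in one tail
p_two_tailed = 2 * (1 - stats.t.cdf(abs(t_stat), dof))  # probability in both tails
print(p_one_tailed, p_two_tailed)
```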
goodness of fit
Goodness of fit measures how well your statistical model explains the observed data.
can assess overall strength with the Pearson correlation
two common measures: the F-test and R-squared
r-squared
a goodness-of-fit measure
most straightforward with a single IV; with multiple IVs, adjusted R-squared is typically reported
multicollinearity
variables overlap (are highly correlated), so we can't tell which one is driving the outcome
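A minimal sketch of checking multicollinearity with variance inflation factors (VIF), using simulated data where x1 and x2 overlap almost completely; the cutoff in the comment is a common rule of thumb, not from the course:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly a copy of x1
X = sm.add_constant(np.column_stack([x1, x2]))

# VIFs well above ~5-10 are commonly read as problematic overlap
for i, name in enumerate(["const", "x1", "x2"]):
    print(name, variance_inflation_factor(X, i))
```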
confidence intervals
1.96 standard deviations is the critical value for 95% confidence
range of likely values is the coefficient ± 1.96 × the standard error
usually narrower with more observations
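A minimal sketch of the 95% confidence interval formula; the coefficient and standard error are made up for illustration:

```python
# 95% CI = coefficient +/- 1.96 * standard error
coef, se = 0.42, 0.15
lower, upper = coef - 1.96 * se, coef + 1.96 * se
print(lower, upper)
```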
RSS vs. TSS vs. SSE
total sum of squares (TSS): the summed squared distance of the observed y values from the average y
regression sum of squares: how much variability the model explains
residual sum of squares (SSE, "SS error"): what's left over, the unexplained variability (errors)
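A minimal sketch showing how the three sums of squares relate (TSS = regression SS + SSE, and R-squared = regression SS / TSS) on simulated data; variable names are illustrative:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.normal(size=100)
y = 1 + 2 * x + rng.normal(size=100)
fit = sm.OLS(y, sm.add_constant(x)).fit()

y_hat = fit.fittedvalues
tss = np.sum((y - y.mean()) ** 2)         # total variability in y
reg_ss = np.sum((y_hat - y.mean()) ** 2)  # variability the model explains
sse = np.sum((y - y_hat) ** 2)            # leftover (residual) variability

print(np.isclose(tss, reg_ss + sse))      # TSS = regression SS + SSE
print(reg_ss / tss, fit.rsquared)         # R-squared = regression SS / TSS
```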
root mean squared error (RMSE)
measure of how much the model misses its predictions on average
basically, the average distance of an observation from the regression line
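A minimal RMSE sketch with made-up observed and predicted values:

```python
import numpy as np

observed = np.array([3.0, 5.0, 7.0, 9.0])
predicted = np.array([2.8, 5.3, 6.6, 9.4])   # hypothetical fitted values
rmse = np.sqrt(np.mean((observed - predicted) ** 2))  # typical miss per observation
print(rmse)
```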