Week 1 Flashcards
Prior knowledge
Bayesian approach. “already known” information.
Posterior distribution
Prior knowledge is updated with the information in the data
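A minimal sketch of this updating step, assuming a Beta prior on a proportion and binomial data. This conjugate pair is chosen only for illustration, and the prior values and counts are hypothetical:

```python
def update_beta_prior(prior_a, prior_b, successes, failures):
    """Beta(a, b) prior + binomial data -> Beta(a + successes, b + failures) posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical example: weak prior centred on 0.5, then 7 successes and 3 failures.
prior = (2, 2)
posterior = update_beta_prior(*prior, successes=7, failures=3)

print("prior mean:    ", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

The posterior mean lies between the prior mean (0.5) and the sample proportion (0.7), which is the sense in which the prior is updated by the data.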
Advantages & disadvantages of prior knowledge
Pro: knowledge accumulates & analyses have more power
Con: results depend on the choice of prior
Frequentist Pr(data|H0)
P-value. Probability of observing the same or more extreme data, given that the null hypothesis is true
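As a rough illustration of "same or more extreme data given H0", the sketch below computes a one-sided exact binomial p-value. The coin-flip scenario (9 heads in 10 flips, H0: p = 0.5) and the function name are assumptions made for the example:

```python
from math import comb


def binomial_p_value_upper(k, n, p0=0.5):
    """P(X >= k) for X ~ Binomial(n, p0): the observed count or a more extreme one, under H0."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 9 heads in 10 flips, null hypothesis of a fair coin.
p_value = binomial_p_value_upper(k=9, n=10)
print(p_value)
```

Note that this is Pr(data|H0), not the probability that H0 is true.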
Bayesian Pr(Hj|data)
Posterior probability of hypothesis Hj, given the observed data
Frequentist probability
Relative frequency
Bayesian probability
Degree of belief
95% confidence interval (frequentist)
If we were to repeat the experiment many times and calculate a CI each time, 95% of the intervals would include the true parameter value
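The repeated-sampling interpretation can be checked with a small simulation. This is only a sketch: it assumes normally distributed data with a known standard deviation (so a z-interval applies), and the population values are hypothetical:

```python
import random
from statistics import NormalDist, mean


def ci_coverage(true_mean=100.0, sd=15.0, n=30, reps=2000, seed=1):
    """Share of 95% z-intervals (known sd) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    half_width = z * sd / n**0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = mean(sample)
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / reps


coverage = ci_coverage()
print(coverage)  # should be close to 0.95
```

Each single interval either contains the true mean or it does not. The 95% describes the long-run behaviour of the procedure.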
95% credible interval (bayesian)
There is a 95% probability that the true value lies in the credible interval, given the data and the prior
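A credible interval is read directly off the posterior distribution. The sketch below approximates an equal-tailed 95% interval by sampling. The Beta(9, 5) posterior is a hypothetical example, and sampling is used only to avoid extra dependencies:

```python
import random


def beta_credible_interval(a, b, level=0.95, draws=20000, seed=1):
    """Approximate an equal-tailed credible interval from Beta(a, b) posterior draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return lower, upper


# Hypothetical posterior for a proportion: Beta(9, 5).
lower, upper = beta_credible_interval(9, 5)
print(lower, upper)
```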
R squared
…% of the variance in y is explained by the regression model
Adjusted R squared
Corrects R squared for the number of predictors (adding predictors never lowers R squared, which rewards overfitting)
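Both quantities can be sketched in a few lines. The formulas are the standard ones (R squared = 1 - SS_residual / SS_total, and adjusted R squared = 1 - (1 - R squared)(n - 1)/(n - k - 1)); the observed and predicted values are hypothetical:

```python
def r_squared(y, y_hat):
    """Proportion of variance in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalize R squared for k predictors given n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observed scores and model predictions.
r2 = r_squared([1, 2, 3, 4], [1.5, 1.5, 3.5, 3.5])

# Hypothetical model: R squared of .50 with 5 predictors and 50 respondents.
adj = adjusted_r_squared(0.50, n=50, k=5)
print(r2, adj)
```

The adjusted value is always at or below R squared, and the gap grows as predictors are added relative to the sample size.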
Method enter (frequentist)
The data analyst decides what goes in the model (confirmatory)
Method stepwise (frequentist)
The best prediction model is determined based on results in this sample (exploratory)
B-value
The unstandardized regression coefficient B can be used to predict a score on the dependent variable
Beta value
The standardized regression coefficient can be used to determine the relative importance of the predictors
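The link between B and beta can be sketched as beta = B * sd(x) / sd(y). The example below assumes a simple regression with one predictor and uses hypothetical data. In multiple regression the same conversion applies per predictor:

```python
from statistics import mean, stdev


def simple_regression_slope(x, y):
    """Unstandardized slope B for y = a + B * x (least squares)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def standardize_coefficient(b, x, y):
    """Convert B to beta: beta = B * sd(x) / sd(y)."""
    return b * stdev(x) / stdev(y)


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b = simple_regression_slope(x, y)
beta = standardize_coefficient(b, x, y)
print(b, beta)
```

B is expressed in the units of the variables, so it is used for prediction. Beta is unit-free, so betas can be compared across predictors.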
Registered report
Develop idea > design study > stage 1 peer review > collect and analyze data > write report > stage 2 peer review > publish report
Cook’s distance
Used to check the assumption of no influential outliers. Cook’s distance indicates the overall influence of a respondent on the model. Values should be below 1
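A minimal sketch of Cook's distance for a simple regression, using the standard formula D_i = (e_i^2 / (p * MSE)) * h_i / (1 - h_i)^2, where e_i is the residual, h_i the leverage, and p the number of estimated coefficients. The data are hypothetical, with the last case constructed to be influential:

```python
def cooks_distances(x, y):
    """Cook's distance per case for a simple regression y = a + b * x."""
    n, p = len(x), 2  # p = number of estimated coefficients (intercept + slope)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    mse = sum(e**2 for e in residuals) / (n - p)
    leverages = [1 / n + (xi - mx) ** 2 / sxx for xi in x]
    return [
        (e**2 / (p * mse)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data: the last respondent breaks the pattern of the first five.
x = [1, 2, 3, 4, 5, 10]
y = [1, 2, 3, 4, 5, 2]
distances = cooks_distances(x, y)
flagged = [i for i, d in enumerate(distances) if d >= 1]
print(flagged)
```

Only the last case exceeds the cut-off of 1 here, because it combines an extreme predictor value with a large residual.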
Violation of absence of multicollinearity leads to…
Regression coefficients (B) are unreliable
Limits the magnitude of R (the correlation between Y and Ŷ)
The importance of individual independent variables can hardly be determined, if at all
Tolerance or VIF
Determining if multicollinearity is an issue:
Tolerance < .2: indicates a potential problem
Tolerance < .1: indicates a problem
VIF > 10: indicates a problem
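These cut-offs can be sketched in code. Tolerance for predictor j is 1 - R_j squared, where R_j squared comes from regressing predictor j on the other predictors, and VIF = 1 / tolerance. The example assumes just two predictors, so R_j squared is simply their squared correlation; the correlation of .95 and the function names are hypothetical:

```python
def tolerance_and_vif(r_squared_j):
    """Tolerance = 1 - R_j^2 and VIF = 1 / tolerance for predictor j."""
    tolerance = 1 - r_squared_j
    return tolerance, 1 / tolerance


def multicollinearity_flag(tolerance):
    """Apply the flashcard cut-offs (tolerance < .1 is the same as VIF > 10)."""
    if tolerance < 0.1:
        return "problem"
    if tolerance < 0.2:
        return "potential problem"
    return "no indication of a problem"


# Hypothetical case: two predictors that correlate .95 with each other.
r_between_predictors = 0.95
tolerance, vif = tolerance_and_vif(r_between_predictors**2)
print(tolerance, vif, multicollinearity_flag(tolerance))
```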
The value of the multiple correlation coefficient is the same as…
R