SOCI 3040 - Finals Review Flashcards
Controlling for Variables
Multiple regression accounts for the relationships among the independent variables and lets us sort out how much variation in the dependent variable is attributable to each independent variable separately.
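A minimal sketch of sorting out separate effects, using numpy's least squares on made-up data in which y depends exactly on two hypothetical IVs:

```python
# Fit y on two IVs at once; the data and variable names are invented.
import numpy as np

x1 = np.array([1., 2., 3., 4., 5., 6.])
x2 = np.array([2., 1., 4., 3., 6., 5.])
y = 1.0 + 2.0 * x1 + 3.0 * x2                    # exact linear relationship

X = np.column_stack([np.ones_like(x1), x1, x2])  # intercept + two IVs
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)                                      # ≈ [1., 2., 3.]
```

Each slope estimates the effect of one IV on y while holding the other IV constant.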
Interpreting Multiple Regression
1) analyzing the correlation and directionality of the data
2) estimating the model i.e., fitting the line, and
3) evaluating the validity and usefulness of the model.
Standardized Coefficients (BETA)
A way of converting a slope coefficient into standard-deviation units (BETA = b × sₓ/sᵧ), so that the effects of IVs measured on different scales can be compared.
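A quick sketch of the conversion on made-up numbers: multiply the unstandardized slope b by the ratio of the standard deviations of X and Y.

```python
# Standardize a slope coefficient; data and slope are hypothetical.
import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]        # y = 2x exactly
b = 2.0                     # unstandardized slope
beta = b * statistics.stdev(x) / statistics.stdev(y)
print(beta)                 # 1.0: a perfect linear relationship standardizes to 1
```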
Dummy Variables
A variable coded 0/1 that acts as a switch, turning a category's effect on or off in the regression equation.
Reference Category/Group
The omitted category that the dummy-coded categories are compared against. A category that is theoretically important in the researcher's view can be chosen as the reference group.
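A minimal sketch of dummy coding with a hypothetical employment-status variable, using "full-time" as the reference group (it gets no column of its own):

```python
# Dummy-code a categorical variable; categories are made up.
statuses = ["full-time", "part-time", "unemployed", "part-time"]
categories = ["part-time", "unemployed"]          # reference = "full-time"

dummies = [[1 if s == c else 0 for c in categories] for s in statuses]
print(dummies)   # [[0, 0], [1, 0], [0, 1], [1, 0]]
```

A case in the reference group is all zeros; each dummy's coefficient is then the difference from "full-time".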
Nested Regression
One way to build a regression model of the given data is to estimate a series of regressions in which the dependent variable Y remains the same and more independent variables are added at each step to improve the model, keeping all the previous IVs in place.
Parsimonious Factors
The simplest and the most efficient plausible explanations.
Regression Assumptions
(1) A regression model strives to achieve a high degree of explained variance, encompassed by R2.
(2) A good regression model should avoid the issue of collinearity (correlation) between the independent variables.
(3) The residuals should be normally distributed.
Centring Independent Variables
Rescaling an independent variable so it is centred around its mean instead of 0 (subtract the mean from every value); the intercept then describes a case at the mean of the IV.
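A one-line sketch of centring, with made-up values:

```python
# Subtract the mean from each value; the centred variable has mean 0.
x = [20, 30, 40, 50]
mean_x = sum(x) / len(x)                  # 35.0
x_centred = [v - mean_x for v in x]
print(x_centred)                          # [-15.0, -5.0, 5.0, 15.0]
```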
Hypothesis Testing in Regression
Used to test whether the beta coefficients in a linear regression model are significantly different from zero.
R-Squared/Coefficient of Determination
A statistic based on variation that summarizes how well the regression matches the data: the proportion of variation in the dependent variable explained by the model. Shows how well the relationship between the variables fits the straight line.
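A short sketch of R² as one minus the ratio of unexplained to total variation, with made-up observed and predicted values:

```python
# R^2 = 1 - SSE/SST; y and y_hat are hypothetical.
y     = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 6.5, 9.5]

mean_y = sum(y) / len(y)
sst = sum((v - mean_y) ** 2 for v in y)              # total variation
sse = sum((v - p) ** 2 for v, p in zip(y, y_hat))    # unexplained variation
r2 = 1 - sse / sst
print(round(r2, 3))                                  # 0.95
```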
Influential Cases
Cases that have a large effect on the model + introduce bias (may have a high residual + be an outlier).
- Have high leverage.
Zero-order/Partial Correlation
The correlation within the sample overall is called zero-order correlation, and the correlation for each subgroup in the sample is called partial correlation.
Spearman’s Rank-order Correlation
A non-parametric alternative used when Pearson's r is not valid. Pearson's r is popular, but requires several assumptions:
(1) Both variables must be continuous or at least count variables w/ a wide range of values
(2) The relationship must be linear
(3) The variables must be normally distributed
Spearman's correlation relaxes these by correlating the ranks of the values; it only requires a monotonic relationship.
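A sketch of Spearman's rho on made-up data, using the no-ties shortcut rho = 1 − 6·Σd²/(n(n² − 1)), where d is the difference between each pair of ranks:

```python
# Rank both variables, then apply the rank-difference formula (no ties assumed).
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]

def ranks(vals):
    order = sorted(range(len(vals)), key=lambda i: vals[i])
    r = [0] * len(vals)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

rx, ry = ranks(x), ranks(y)
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(rho)   # 0.8
```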
Reading Correlation Matrices
The diagonal cells always show a correlation of 1. The variables appear in the same order along the rows and the columns, so the matrix is symmetric.
Adjusted R-Squared
Adjusts R² for the negative impact of adding variables to the model: an IV that contributes little explanatory power lowers the adjusted statistic.
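The standard adjustment penalizes R² using the sample size n and the number of IVs k; a small sketch with invented values:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical model: R^2 = 0.50 from 30 cases and 3 IVs.
print(round(adjusted_r2(0.50, 30, 3), 3))   # 0.442
```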
Collinearity/Multicollinearity
Occurs when independent variables are highly correlated with each other, e.g. when age and year of birth carry nearly the same information about the dependent variable Y (e.g. weekly wages), so their separate effects cannot be reliably distinguished.
Variance Inflation Factor (VIF)
Gives an estimate of how much the variance of a slope coefficient is inflated because its independent variable is correlated with the other IVs in the regression.
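The VIF for one IV comes from regressing that IV on all the other IVs and plugging the resulting R² into 1/(1 − R²); a tiny sketch with a hypothetical R²:

```python
def vif(r2_j):
    """VIF = 1 / (1 - R^2), where R^2 is from regressing IV j on the other IVs."""
    return 1 / (1 - r2_j)

print(round(vif(0.80), 2))   # 5.0
```

A common rule of thumb flags VIF values above roughly 5 or 10 as a collinearity concern.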
Multiple Regression Model
an equation that describes the relationship between a dependent variable and more than one independent variable.
Monotonic Relationships
A consistent increase in one variable is associated with a consistent increase (or decrease) in the other variable. E.g., linear relationships
Limitations of Chi-square
It is sensitive to small sample sizes (small expected cell frequencies), where it is likely to give false positives.
Chi-square hypothesis testing
Compares the observed frequencies in a table to the expected frequencies we would see if the two variables were independent in the population
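A sketch of the comparison for a 2×2 table of hypothetical counts: each expected count is (row total × column total) / grand total, and the statistic sums (O − E)²/E over the cells.

```python
# Chi-square statistic for a made-up 2x2 table of observed counts.
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]          # [40, 60]
col_totals = [sum(col) for col in zip(*observed)]    # [50, 50]
n = sum(row_totals)                                  # 100

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n        # expected under independence
        chi2 += (o - e) ** 2 / e
print(round(chi2, 2))                                # 16.67
```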
Elaboration Model
Looks at how the relationship between variables X and Y changes after controlling for a third variable Z (replication, specification, distortion, etc).
CHI-Square Test of Independence
A statistical test in which both variables are categorical. This test generally tells us whether the distribution of participants across categories differs from what we would see if there were no association between the groups
Proportionate Reduction of Error Measure (PRE)
shows how much the error in predicting the attributes of the dependent variable can be reduced if the attributes of the independent variable are known.
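The general form is PRE = (E₁ − E₂)/E₁, where E₁ is the prediction error without the IV and E₂ the error with it; a sketch with invented error counts:

```python
def pre(e1, e2):
    """Proportionate reduction of error: (E1 - E2) / E1."""
    return (e1 - e2) / e1

# Hypothetical: 40 prediction errors without the IV, 10 with it.
print(pre(40, 10))   # 0.75 -- knowing the IV cuts prediction error by 75%
```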
Hypothesis Testing in ANOVA
Used to test whether the means of the groups differ.
Steps in Hypothesis Testing
(1) Set up the H₀ and Hₐ
(2) Choose the F-distribution and decide on the critical value. F𝒸ᵣ will depend on the α level of significance (usually 0.05) and the degrees of freedom.
(3) Calculate the F-statistic based on the variation between groups, the variation within groups, the total variation, and the degrees of freedom.
(4) Compare the p-value to α.
(5) Decide whether to reject the null hypothesis
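Step (3) can be sketched for three small made-up groups: F is the mean square between groups divided by the mean square within groups.

```python
# F-statistic from between- and within-group variation; data are hypothetical.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_vals = [v for g in groups for v in g]
grand_mean = sum(all_vals) / len(all_vals)
means = [sum(g) / len(g) for g in groups]

ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
ssw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

df_between = len(groups) - 1                 # k - 1
df_within = len(all_vals) - len(groups)      # N - k
f_stat = (ssb / df_between) / (ssw / df_within)
print(round(f_stat, 2))   # 13.0 -- compare to F_cr at alpha = 0.05 with (2, 6) df
```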
F-Distribution in ANOVA
a theoretical distribution used to compare differences between multiple means. Used to find the likelihood of randomly selecting a sample with the observed ratio of between-group variation to within-group variation
Post-Hoc Test
- E.g., the LSD (Least Significant Difference) test.
- A means of comparing all possible pairs of groups to determine which ones differ significantly from each other
Between Group
the difference between the group means
Within Group
shows how the cases distribute around the mean in each group
Residuals
Residuals that are not normally distributed indicate that one or more important IVs may have been omitted from the regression.
Pearson’s Correlation Coefficient (R)
Shows the strength and direction of the linear relationship between two variables. A coefficient of -1/+1 shows a perfect negative/positive relationship between the variables, and a coefficient of 0 indicates no linear relationship.
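A sketch of the computation on made-up data: r is the sample covariance divided by the product of the two standard deviations.

```python
# Pearson's r = cov(x, y) / (s_x * s_y); x and y are hypothetical.
import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
r = cov / (statistics.stdev(x) * statistics.stdev(y))
print(round(r, 3))   # 0.775: a fairly strong positive relationship
```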