SOCI 3040 - Finals Review Flashcards

1
Q

Controlling for Variables

A

Multiple regression accounts for the relationships among the independent variables and allows us to sort out how much variation in the dependent variable is attributable to each independent variable separately.
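
The idea of sorting out each IV's separate contribution can be sketched with ordinary least squares on made-up numbers (a minimal illustration, assuming NumPy is available; all variable names and values are hypothetical):

```python
import numpy as np

# Hypothetical data: predict weekly wages from education and experience (years)
education = np.array([12, 16, 12, 18, 14, 16, 12, 20], dtype=float)
experience = np.array([5, 3, 15, 2, 10, 8, 20, 1], dtype=float)
wages = np.array([620, 850, 740, 900, 760, 880, 800, 920], dtype=float)

# Design matrix with an intercept column; least squares fits
# wages = b0 + b1*education + b2*experience
X = np.column_stack([np.ones(len(wages)), education, experience])
b0, b1, b2 = np.linalg.lstsq(X, wages, rcond=None)[0]

# b1 is the effect of one extra year of education holding experience
# constant; b2 is the effect of experience holding education constant.
```

Each slope is the change in the dependent variable for a one-unit change in that IV, with the other IV held constant.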

2
Q

Interpreting Multiple Regression

A

1) analyzing the correlation and directionality of the data
2) estimating the model i.e., fitting the line, and
3) evaluating the validity and usefulness of the model.

3
Q

Standardized Coefficients (BETA)

A

A way of converting slope coefficients into standard-deviation units, so that the relative effect of each independent variable can be compared regardless of its original scale of measurement.
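
A minimal sketch of the conversion on hypothetical data: the unstandardized slope b is rescaled by the ratio of the standard deviations (NumPy assumed available).

```python
import numpy as np

# Hypothetical paired observations
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([10.0, 14.0, 15.0, 21.0, 24.0])

# Unstandardized slope from simple least squares: b = cov(x, y) / var(x)
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Standardized coefficient: beta = b * (sd_x / sd_y)
beta = b * (np.std(x, ddof=1) / np.std(y, ddof=1))

# In simple (one-IV) regression, the standardized slope equals Pearson's r
r = np.corrcoef(x, y)[0, 1]
```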

4
Q

Dummy Variables

A

Acts as a switch variable: coded 1 for cases in a given category and 0 for all other cases.
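
The switch idea can be sketched in a few lines (hypothetical categories and values):

```python
# A hypothetical categorical variable recoded into 0/1 dummy variables.
# "rural" is treated as the reference category, so it gets no dummy.
regions = ["urban", "rural", "suburban", "urban", "rural"]

urban_dummy = [1 if r == "urban" else 0 for r in regions]
suburban_dummy = [1 if r == "suburban" else 0 for r in regions]

# Each dummy "switches on" (equals 1) only for cases in its category;
# a rural case scores 0 on both dummies.
```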

5
Q

Reference Category/Group

A

The omitted category against which the dummy-variable coefficients are compared. A category that is theoretically important in the researcher's view can be chosen as the reference group.

6
Q

Nested Regression

A

One way to build a regression model of the given data is to generate a series of regressions where:
(1) the dependent variable Y remains the same, and
(2) more independent variables are added at each step to improve the model, keeping all the previous IVs in place.

7
Q

Parsimonious Factors

A

The simplest and most efficient plausible explanations.

8
Q

Regression Assumptions

A

(1) A regression model strives to achieve a high degree of explained variance, encompassed by R2.
(2) A good regression model should avoid the issue of collinearity (correlation) between the independent variables.
(3) The residuals should be normally distributed.

9
Q

Centring Independent Variables

A

Transforming an independent variable by subtracting its mean, so that it is centred around its mean instead of 0; the intercept then represents the predicted value for an average case.
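
A minimal sketch of centring with hypothetical values (NumPy assumed available):

```python
import numpy as np

# Hypothetical predictor: respondent age
age = np.array([25.0, 30.0, 45.0, 50.0, 60.0])

# Centring: subtract the mean so 0 represents the average case
age_centred = age - age.mean()

# The centred variable has mean 0; a regression intercept now gives the
# predicted Y for a respondent of average age rather than for age 0.
```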

10
Q

Hypothesis Testing in Regression

A

Used to test whether the beta coefficients in a linear regression model are significantly different from zero.

11
Q

R-Squared/Coefficient of Determination

A

A statistic based on variation that summarizes how well the regression matches the data; it shows how well the relationship between the variables fits the straight line.
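
The definition can be sketched from sums of squares on hypothetical observed and predicted values (NumPy assumed available):

```python
import numpy as np

# Hypothetical observed values and regression predictions
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_hat = np.array([3.5, 4.5, 7.0, 9.5, 10.5])

# R^2 = 1 - (residual sum of squares / total sum of squares),
# i.e. the proportion of variation in Y explained by the model
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```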

12
Q

Influential Cases

A

Cases that have a large effect on the model + introduce bias (may have a high residual + be an outlier).
- Have high leverage.

13
Q

Zero-order/Partial Correlation

A

The correlation within the sample overall is called zero-order correlation, and the correlation for each subgroup in the sample is called partial correlation.

14
Q

Spearman’s Rank-order Correlation

A

The rank-based alternative used when Pearson's r is not valid. Pearson's r is popular but requires several assumptions:
(1) both variables must be continuous, or at least count variables with a wide range of values;
(2) the relationship must be linear; and
(3) the variables must be normally distributed.
Spearman's correlation instead works on ranks and requires only a monotonic relationship.

15
Q

Reading Correlation Matrices

A

The diagonal cells always show a correlation of 1, and the order of the variables is always the same across rows and columns.

16
Q

Adjusted R-Squared

A

Adjusts R² for the number of independent variables, accounting for the negative impact of adding variables that do not improve the model.
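
The standard adjustment formula can be sketched with hypothetical numbers:

```python
# Adjusted R^2 penalizes adding predictors:
#   adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
# where n = sample size and k = number of independent variables.

def adjusted_r_squared(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical: the same raw R^2 = 0.50 with n = 30 cases
with_2_ivs = adjusted_r_squared(0.50, 30, 2)
with_10_ivs = adjusted_r_squared(0.50, 30, 10)
# More IVs with no gain in raw R^2 yield a lower adjusted value
```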

17
Q

Collinearity/Multicollinearity

A

Occurs when the linear relationship between one independent variable (e.g. age) and the dependent variable Y (e.g. weekly wages) is very similar to the relationship between another independent variable (e.g. year of birth) and Y.

18
Q

Variance Inflation Factor (VIF)

A

Gives an estimate of how much the variance of a slope coefficient is likely to be inflated because the independent variable is correlated with other IVs in the regression.
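
A minimal sketch of the computation (VIF for IV j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing Xⱼ on the other IVs); the data are simulated and NumPy is assumed available:

```python
import numpy as np

def vif(X, j):
    """VIF for column j: regress X_j on the other IVs, then 1 / (1 - R^2)."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    coef = np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    resid = X[:, j] - A @ coef
    r2 = 1 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
    return 1 / (1 - r2)

# Hypothetical IVs: x2 is nearly a linear function of x1 (collinear)
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)   # almost identical to x1
x3 = rng.normal(size=100)                   # independent of the others
X = np.column_stack([x1, x2, x3])
```

The collinear pair produces a large VIF, while the independent IV stays close to 1 (a common rule of thumb treats VIF above roughly 10 as problematic).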

19
Q

Multiple Regression Model

A

an equation that describes the relationship between a dependent variable and more than one independent variable.

20
Q

Monotonic Relationships

A

a consistent increase on one variable is associated with a consistent increase (or decrease) on the other variable. E.g., linear relationships

21
Q

Limitations of Chi-square

A

It is sensitive to sample size: small samples can violate the expected-frequency assumption, while large samples can make trivial differences appear significant (false positives).

22
Q

Chi-square hypothesis testing

A

Compares the observed frequencies in a table to the expected frequencies we would see if the two variables were independent in the population.
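
The comparison can be sketched on a hypothetical 2×2 table, using the standard expected-frequency formula (row total × column total / grand total):

```python
# Hypothetical observed counts: two groups by a yes/no outcome
observed = [
    [30, 20],   # group A: yes / no
    [10, 40],   # group B: yes / no
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum over cells of (O - E)^2 / E
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected
```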

23
Q

Elaboration Model

A

Looks at how the relationship between variables X and Y changes after controlling for a third variable Z (replication, specification, distortion, etc).

24
Q

CHI-Square Test of Independence

A

a statistical test in which both variables are categorical. This test generally tells us if the distribution of participants across categories is different from what would happen if there were no difference between the groups

25
Q

Proportionate Reduction of Error Measure (PRE)

A

shows how much the error in predicting the attributes of the dependent variable can be reduced if the attributes of the independent variable are known.
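
One common PRE measure, lambda, can be sketched on hypothetical counts: E1 is the prediction error without knowing the IV, E2 the error when the IV is known, and PRE = (E1 − E2) / E1.

```python
# Hypothetical crosstab: rows = IV categories, columns = DV categories
table = [
    [35, 15],   # IV category 1: DV counts
    [10, 40],   # IV category 2: DV counts
]

n = sum(sum(row) for row in table)
dv_totals = [sum(col) for col in zip(*table)]

# E1: errors from predicting the DV's modal category for everyone
e1 = n - max(dv_totals)
# E2: errors from predicting the modal DV category within each IV category
e2 = sum(sum(row) - max(row) for row in table)

lambda_pre = (e1 - e2) / e1
# Interpretation: knowing the IV reduces prediction error by this proportion
```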

26
Q

Hypothesis Testing (ANOVA)

A

Used to determine whether there is a significant difference between group means.

27
Q

Steps in Hypothesis Testing

A

(1) Set up the H₀ and Hₐ.
(2) Choose the F-distribution and decide on the critical value; F𝒸ᵣ will depend on the α level of significance (usually 0.05) and the degrees of freedom.
(3) Calculate the F-statistic based on the variation between groups, the variation within groups, the total variation, and the degrees of freedom.
(4) Compare the p-value to α.
(5) Decide whether to reject the null hypothesis.
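
Step (3) can be sketched on hypothetical groups (NumPy assumed available): the F-statistic is the between-group mean square divided by the within-group mean square.

```python
import numpy as np

# Hypothetical data: three groups of three cases each
groups = [
    np.array([4.0, 5.0, 6.0]),
    np.array([7.0, 8.0, 9.0]),
    np.array([5.0, 6.0, 7.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)          # number of groups
n = len(all_values)      # total number of cases

# Between-group sum of squares: group means vs. the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: cases vs. their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = (SS_between / (k - 1)) / (SS_within / (n - k))
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
```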

28
Q

F-Distribution in ANOVA

A

a theoretical distribution used to compare differences between multiple means. Used to find the likelihood of randomly selecting a sample with the observed ratio of between-group variation to within-group variation

29
Q

Post-Hoc Test

A
  • In practice, often the LSD (Least Significant Difference) test.
  • A means of comparing all possible pairs of groups to determine which ones differ significantly from each other.
30
Q

Between Group

A

the difference between the group means

31
Q

Within Group

A

shows how the cases distribute around the mean in each group

32
Q

Residuals

A

The differences between the observed values and the values predicted by the regression. Residuals that are not normally distributed indicate that one or more important IVs may have been omitted from the regression.

33
Q

Pearson’s Correlation Coefficient (R)

A

Shows the strength and direction of the linear relationship between two variables. A coefficient of -1/+1 shows a perfect negative/positive relationship between the variables, and a coefficient of 0 indicates no relationship.
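
A minimal sketch of the computation on hypothetical data (r is the covariance divided by the product of the standard deviations; NumPy assumed available):

```python
import numpy as np

# Hypothetical paired observations with a perfect positive relationship
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# r = cov(x, y) / (sd_x * sd_y)
r = np.cov(x, y, ddof=1)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
# Here y is an exact positive linear function of x, so r is +1
```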