6A Flashcards
What are the three main topics of this lecture?
- Regression with binary categorical independent variables. 2. Regression with non-binary categorical independent variables. 3. How to handle ordinal variables in regression.
What are the two key characteristics used to classify variables?
- Logical order – Do the values follow a natural ranking?
- Equal distances – Are the gaps between values consistent?
What are examples of variables with a logical order?
Monthly income, education level, left-right identification, immigration attitudes.
What are examples of variables without a logical order?
Vote choice (e.g., VVD, CDA, PVV), religion (e.g., Catholic, Protestant, Muslim).
What are examples of variables with equal distances?
Monthly income, left-right identification, immigration attitudes.
What are examples of variables without equal distances?
Educational level (e.g., primary, secondary, higher education).
What are the three types of variables?
- Categorical – No logical order (e.g., vote choice). 2. Ordinal – Logical order, but unequal distances (e.g., education level). 3. Continuous – Logical order with equal distances (e.g., income).
What are binary categorical variables?
Variables with only two categories, coded as 0 and 1 (e.g., voted = 1, did not vote = 0).
What are non-binary categorical variables?
Variables with more than two categories (e.g., religion: Catholic, Protestant, Muslim).
What are dummy variables (dichotomous variables)?
Binary categorical variables coded as 0 and 1 for use in regression.
Why should the dependent variable (Y) always be continuous in linear regression?
Linear regression assumes that Y follows a continuous distribution, allowing for meaningful interpretation of coefficients.
How can binary categorical variables be included in regression?
They are directly included as independent variables, coded as 0 and 1, and interpreted the same way as continuous variables.
What does the intercept represent in a regression with a binary independent variable?
The predicted value of Y when the binary variable equals 0 (i.e., the reference group).
What does the coefficient (b₁) of a binary independent variable represent?
The difference in the dependent variable (Y) between the 1 and 0 groups.
How is regression with a binary independent variable related to a t-test?
A simple regression with a binary independent variable produces the same results as an independent-samples t-test.
Why can non-binary categorical variables not be used directly in regression?
Regression requires numerical input, so categorical variables must be converted into multiple binary variables (dummy variables).
What is a reference category in dummy coding?
The category that is not assigned a dummy variable. All other categories are compared against it.
What happens if a respondent has 0s on all dummy variables?
They belong to the reference category.
How are dummy variables used in regression?
Each category (except the reference) gets a binary variable (1 = in category, 0 = not in category).
What is an example of a dummy variable transformation?
If religion has four categories (Catholic, Protestant, Muslim, Non-religious), three dummy variables are created: Catholic (1/0), Protestant (1/0), Muslim (1/0), with Non-religious as the reference category.
How is the regression coefficient of a dummy variable interpreted?
The coefficient represents the difference in the dependent variable (Y) between that category and the reference category.
What is the advantage of using SPSS’s automatic dummy variable coding?
- It is faster and easier than manual coding. 2. It provides a p-value for the entire categorical variable. 3. It provides means for each category, aiding interpretation.
What does the F-test in regression measure for categorical variables?
It tests whether the categorical variable as a whole (all dummy variables together) has a significant effect on Y.
What does R² measure in regression?
The proportion of variance in the dependent variable (Y) explained by the independent variables.