Final-lecture 12 Flashcards
What are dummy variables or indicator variables?
-nominal level variables
What is the only type of statistical model dummy variables can be used in?
-multiple regression models
What are dichotomous and polytomous variables?
- dichotomous: only two categories
- polytomous: three or more categories
How would sex be coded?
- 0 would be the reference category
- 1 would be the category of interest
What does the coding for dichotomous dummy variables tell us?
-it indicates the absence or presence of a characteristic/trait
When there are only two categories for a dummy variable, what can be determined?
-the extent of difference
What is the reference group?
-the category coded as 0, against which the other category is compared
What would a response of 1 indicate if 1=female and 0=male?
-the person is female, i.e. the characteristic is present; the codes are labels for presence or absence, not amounts
Can we measure the difference between dichotomous dummy variable response categories?
-Yes, since 0 is the reference category and 1 is the category of interest
What should categories be when coding for dummy variables?
-mutually exclusive
In SPSS, if male=1 and female=2, what would we need to do?
-recode the values to 0 and 1
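The recoding step can be sketched outside SPSS as well. The following is a minimal Python illustration, assuming hypothetical raw codes and treating male as the reference category; the variable names are made up for the example.

```python
# Hypothetical raw codes as they might arrive from a survey file: 1 = male, 2 = female.
raw_sex = [1, 2, 2, 1, 2]

# Map to a 0/1 dummy: male becomes the reference category (0),
# female becomes the category of interest (1).
RECODE = {1: 0, 2: 1}
female = [RECODE[code] for code in raw_sex]

print(female)
```

Naming the new variable after the category coded 1 (here `female`) makes the direction of the coding easy to read later.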
When the independent variable is a dichotomous dummy variable, what are the two variables and how do they relate to each other (use sex as the independent variable)?
- first variable is Y
- second is the dichotomous dummy variable which has two categories
- we ask if the mean of Y for females is different than the mean of Y for males
If the mean of Y differs between females and males in the sample, what do we need to do to determine whether the difference exists in the population?
- run a test of significance
- t-test
Is it possible to observe a difference in a sample even if there isn’t a difference in the population?
- Yes, it can be due to random chance
- sampling variability can lead to variation in sample means
What is hypothesis testing?
-assessing the likelihood that we would observe a sample difference if there were no difference in the population
What is the null hypothesis?
-population mean 1 is equal to population mean 2
To test the null hypothesis for difference between groups what do we use?
-a t-test
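As a rough sketch of what the t-test computes, the example below calculates an equal-variance (pooled) two-sample t statistic by hand. The data are hypothetical, and the pooled form is an assumption; the lecture may have used a different variant of the test.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data: days of work limited by health, split by sex.
males = [2, 4, 3, 5, 1]
females = [5, 7, 6, 8, 4]

def pooled_t(group0, group1):
    """Equal-variance two-sample t statistic for H0: the two population means are equal."""
    n0, n1 = len(group0), len(group1)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n0 - 1) * variance(group0) + (n1 - 1) * variance(group1)) / (n0 + n1 - 2)
    standard_error = sqrt(pooled_var * (1 / n0 + 1 / n1))
    return (mean(group1) - mean(group0)) / standard_error

t_stat = pooled_t(males, females)
degrees_of_freedom = len(males) + len(females) - 2

print(t_stat, degrees_of_freedom)
```

The statistic is then compared with the t distribution on the stated degrees of freedom to judge how likely such a sample difference would be under the null hypothesis.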
How can we control for other factors if we wanted to test sex and its impact on days of work limited by health?
-use multiple regression to control for other factors
Do we control for nominal variables in the same way as interval-ratio variables?
-No
What achieves the same test as a t-test of difference in means?
-bivariate OLS regression
Normally, what does OLS regression examine?
-the mean change in Y with a one unit change in X
What kind of statistic is the mean?
-least-squares statistic
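To see why the mean is called a least-squares statistic, the sketch below compares the sum of squared errors at the mean with the sum at a few nearby guesses. The values are hypothetical.

```python
from statistics import mean

# Hypothetical Y values for one group.
y = [2, 4, 3, 5, 1]

def sum_squared_error(values, guess):
    """Sum of squared deviations of the values from a single predicted value."""
    return sum((v - guess) ** 2 for v in values)

y_bar = mean(y)
sse_at_mean = sum_squared_error(y, y_bar)

# Predicting any other single value gives a larger sum of squared errors.
candidates = [y_bar - 1, y_bar - 0.5, y_bar + 0.5, y_bar + 1]
sse_elsewhere = [sum_squared_error(y, c) for c in candidates]

print(sse_at_mean, sse_elsewhere)
```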
If you have two categories within your X variable (dichotomous dummy variable) what does the prediction do?
- the prediction minimizes the error for the first group and is the mean of Y for that group
- same thing with the second group
What does the least squares line pass through with a dichotomous dummy independent variable?
-the mean of Y for each of the two groups, which minimizes the error
What does b become when testing a dichotomous dummy variable?
-the difference in group means
What does the t-test for b become when testing a dichotomous dummy variable?
-the t-test for b is a test of difference in group means
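In a regression with a dummy variable, what is the constant equal to?
-the mean of the reference category when the dummy is the only predictor; with other predictors in the model, it is the predicted Y for the reference category when those predictors equal 0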
What is the same as the coefficient for the independent variable in bivariate regression with dummy variables?
- the difference in group means observed in the t-test
- i.e. mean difference
What does a + b equal in a bivariate regression with a dichotomous dummy variable (use sex, with female=1)?
-it equals the mean for females (if females are coded as 1)
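The relationships among a, b, and the group means can be checked with a small hand-rolled regression. This is a minimal sketch on hypothetical data, with female=1 and male=0 as the reference category; `bivariate_ols` is an illustrative helper, not a library function.

```python
from statistics import mean

# Hypothetical data: female = 1, male = 0 (reference category).
female = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [2, 4, 3, 5, 1, 5, 7, 6, 8, 4]

def bivariate_ols(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

a, b = bivariate_ols(female, y)

# Group means computed directly, for comparison with the regression output.
mean_males = mean([yi for xi, yi in zip(female, y) if xi == 0])
mean_females = mean([yi for xi, yi in zip(female, y) if xi == 1])

print(a, b, a + b)
print(mean_males, mean_females - mean_males, mean_females)
```

The two printed lines should agree: the constant matches the male mean, the slope matches the difference in means, and a + b matches the female mean.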
How can one standardize dummy variables?
- standardizing Y
- creates a semi-standardized coefficient for X that indicates differences in means of standardized outcome
What do semi-standardized coefficients indicate?
-mean difference in groups in terms of standard deviations of the outcome
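A semi-standardized coefficient can be sketched by converting Y to z-scores and leaving the dummy as 0/1. The data below are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption about how Y is standardized.

```python
from statistics import mean, stdev

# Hypothetical data: female = 1, male = 0 (reference category).
female = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [2, 4, 3, 5, 1, 5, 7, 6, 8, 4]

# Standardize only the outcome; the dummy keeps its 0/1 coding.
y_bar, y_sd = mean(y), stdev(y)
z_y = [(yi - y_bar) / y_sd for yi in y]

def slope(x, y):
    """Least-squares slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

b_raw = slope(female, y)      # difference in group means, in units of Y
b_semi = slope(female, z_y)   # same difference, in standard deviations of Y

print(b_raw, b_semi)
```

Dividing the raw coefficient by the standard deviation of Y gives the same number as `b_semi`, which is why the coefficient reads as a mean difference in standard deviation units.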
What can dummy variables be used for in a multiple regression?
-take into account possible causes of spuriousness that affect the relationship between predictors and dependent variable
For dummy variables, what tests do you run and in what order?
- first, a t-test of the dependent variable across the categories of the independent variable
- then a regression of the dependent variable on the dummy variable, if the t-test was significant
If we control for a second independent variable in a regression of the dependent variable on a dichotomous dummy variable, what is it equivalent to studying? For example, education and mastery, controlling for age.
- it is equivalent to examining educational differences in mastery among individuals of the same age
- the constant has no substantive meaning, since it is the predicted value when age equals 0
What is the most important thing to observe when researching for spuriousness when controlling for a second independent variable?
-the slope for the independent variable becomes non-significant in model 2
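The two-model comparison can be illustrated with a deliberately artificial data set in which mastery depends only on age, while a hypothetical dummy `college` happens to be more common among younger people. The sketch below fits both models with a small hand-written solver. It shows only the change in the coefficient; standard errors and significance tests are not computed here, and all names and values are invented for the example.

```python
# Hypothetical data, built so that mastery = 10 - 0.1 * age exactly.
age = [20, 30, 40, 50, 60, 70]
college = [1, 1, 0, 1, 0, 0]   # 1 = college degree, 0 = reference category
mastery = [8, 7, 6, 5, 4, 3]

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(predictors, y):
    """Return [constant, b1, b2, ...] from the normal equations (X'X) beta = X'y."""
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

model1 = ols([college], mastery)        # Model 1: mastery on college only
model2 = ols([college, age], mastery)   # Model 2: adds age as a control

print("Model 1 college slope:", model1[1])
print("Model 2 college slope:", model2[1], "age slope:", model2[2])
```

In Model 1 the college slope is the raw difference in group means; once age is held constant in Model 2 it collapses to essentially zero, the pattern one would look for when checking for spuriousness.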