Lecture 7 Flashcards
what is a factor variable?
a type of variable that has 2 or more levels. each level acts as a category and has its own label.
eg:
VCE, uni, none
can factor variables be treated with numerical operations? eg: divide by 10
even though the levels of a factor variable are stored as numeric codes (1,2,3), they cannot be treated with numeric operations (will produce ‘NA’ indicating a non-meaningful result)
unless you use ‘as.numeric’ –> converts the factor variable to numeric format. note that this returns the internal level codes (1,2,3), not the labels
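A minimal R sketch of the point above. The `maths` and `scores` factors are made-up examples, not data from the lecture.

```r
# Hypothetical factor: prior maths experience for five students
maths <- factor(c("VCE", "uni", "none", "uni", "VCE"),
                levels = c("none", "VCE", "uni"))

levels(maths)       # the category labels: "none" "VCE" "uni"
as.integer(maths)   # the internal codes: 2 3 1 3 2

# Arithmetic on a factor is not meaningful: R returns NA (with a warning)
divided <- suppressWarnings(maths / 10)
divided             # NA NA NA NA NA

# as.numeric() returns the internal level codes, which can then be divided
as.numeric(maths) / 10

# If the labels themselves are numbers, convert via character first
scores <- factor(c("10", "20", "10"))
as.numeric(as.character(scores))   # 10 20 10
```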
what is dummy coding for?
transforms a categorical variable with g categories into a meaningful set of g-1 dummy variables
each dummy variable only takes the value 0 or 1
“dummy variables would either have values of 0 or 1” true or false?
true
if you have 10 categories, what is the maximum number of dummy variables you can have?
10-1 = 9
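A small sketch of the g-1 rule using R's built-in `contr.treatment()`. The three category names are borrowed from the factor example earlier in these cards, and the factor itself is hypothetical.

```r
# Hypothetical factor with g = 3 categories
maths <- factor(c("none", "VCE", "uni", "VCE"),
                levels = c("none", "VCE", "uni"))

# contr.treatment() builds the dummy coding:
# g rows (one per category), g - 1 columns (one per dummy variable)
coding <- contr.treatment(levels(maths))
coding
#      VCE uni
# none   0   0
# VCE    1   0
# uni    0   1

# With 10 categories there are 10 - 1 = 9 dummy variables
ncol(contr.treatment(10))   # 9
```

The row of zeros ("none" here) is the reference category, which is why only g-1 dummy variables are needed.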
what is the contrasts() command for?
shows (or sets) the coding scheme, eg: dummy coding, that will be used for a particular factor variable
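A short sketch of both uses of `contrasts()`, assuming R's default options (where unordered factors get dummy coding). The `maths` factor is hypothetical.

```r
maths <- factor(c("none", "VCE", "uni"), levels = c("none", "VCE", "uni"))

# Read the coding currently attached to the factor
# (assumption: default options, so this is dummy coding)
default_coding <- contrasts(maths)
default_coding

# Assign a different coding scheme to the same factor
contrasts(maths) <- contr.SAS(3)
new_coding <- contrasts(maths)
new_coding   # the last level ("uni") is now the row of zeros
```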
why is the intercept in dummy variable regression analysis meaningful, while with IVs on an arbitrary metric it’s not?
intercept = the predicted score on the DV when an individual scores 0 on ALL IVs in the linear equation
say RMHI is the group scored 0 on the dummy variable (the reference category). the intercept then corresponds to the predicted MEAN score of people in RMHI
THE INTERCEPT REPRESENTS THE MEAN OF THE REFERENCE CATEGORY
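A quick check of this in R, as a sketch only: the scores are invented, and RMHI is made the reference category by listing it first in `levels`. It assumes R's default dummy coding.

```r
# Hypothetical scores for two courses; RMHI is listed first, so it is the reference
course <- factor(c("RMHI", "RMHI", "RMHI", "ARMP", "ARMP", "ARMP"),
                 levels = c("RMHI", "ARMP"))
score  <- c(12, 14, 16, 13, 15, 20)

fit <- lm(score ~ course)

# Mean of the reference category, computed directly
ref_mean <- mean(score[course == "RMHI"])   # 14

# The intercept of the dummy-coded regression is the same number
intercept <- coef(fit)[["(Intercept)"]]
all.equal(intercept, ref_mean)              # TRUE
```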
what does the regression coefficient in a dummy coding analysis table correspond to?
the difference between the mean of the category given ‘1’ on that dummy variable and the mean of the reference category (given ‘0’ in the dummy coding)
if:
- in linear analysis the intercept is 14 and the regression coefficient is 0.1
- ARMP is the dummy variable (given ‘1’ in dummy coding)
- RMHI is the reference category (given ‘0’ in the dummy coding)
what can you infer (in terms of statistics) from this info?
- the estimated mean for RMHI is 14 –> intercept
- the estimated mean for ARMP is 14+0.1 = 14.1
- the difference between mean RMHI and mean ARMP is 0.1 –> regression coefficient
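The same arithmetic written as a prediction equation in R. The numbers come from the card above; `predict_mean` is a made-up helper name.

```r
# Numbers taken from the card above
intercept <- 14    # predicted mean for RMHI (dummy = 0)
slope     <- 0.1   # regression coefficient for the ARMP dummy

# Prediction equation: predicted DV = intercept + slope * dummy
predict_mean <- function(dummy) intercept + slope * dummy

mean_rmhi <- predict_mean(0)   # 14
mean_armp <- predict_mean(1)   # 14.1
mean_armp - mean_rmhi          # 0.1 (up to floating-point rounding)
```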
what is regression coefficient?
the expected change in DV for 1 unit change in IV
what does the regression coefficient value represent for a dummy variable?
dummy variable ONLY has value of 0 OR 1
hence, the regression coefficient represents the expected change from 0 to 1 (1 UNIT), which is a change on DV from RMHI (reference category) to ARMP (the dummy variable)
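A sketch showing that the dummy-variable coefficient equals the difference in group means. The scores are invented, and the coefficient name `courseARMP` assumes R's default dummy coding.

```r
# Hypothetical scores; RMHI is the reference category (coded 0), ARMP is coded 1
course <- factor(c("RMHI", "RMHI", "RMHI", "ARMP", "ARMP", "ARMP"),
                 levels = c("RMHI", "ARMP"))
score  <- c(12, 14, 16, 13, 15, 20)

fit <- lm(score ~ course)

# Coefficient for the ARMP dummy: expected change in the DV when the dummy goes 0 -> 1
slope <- coef(fit)[["courseARMP"]]

# Difference between the two group means, computed directly
mean_diff <- mean(score[course == "ARMP"]) - mean(score[course == "RMHI"])   # 16 - 14 = 2

all.equal(slope, mean_diff)   # TRUE
```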
what are alternatives to dummy coding?
contr.SAS - reverse dummy coding: makes the last level the reference category instead of the 1st level
contr.helmert - uses negative and positive integers (instead of just 0 and 1); the values within each coded variable add up to 0. this approach compares each level to the average of its PREVIOUS LEVELS.
and the coded variables are not labelled with the level names; they are labelled by number (eg: 1, 2, 3)
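The coding matrices can be printed directly to compare the schemes. A sketch for a factor with 3 levels (rows = levels, columns = coded variables):

```r
# Default dummy coding: level 1 is the reference (row of zeros)
treatment <- contr.treatment(3)
treatment

# contr.SAS: the last level is the reference instead
sas <- contr.SAS(3)
sas

# contr.helmert: each level is compared with the average of the previous levels
helmert <- contr.helmert(3)
helmert
#   [,1] [,2]
# 1   -1   -1
# 2    1   -1
# 3    0    2

colSums(helmert)   # each column sums to 0
```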
what does one-way analysis mean?
only one grouping variable (one factor classifies the groups)
what does between-subject mean?
groups are independent (each person belongs to only one group)
what is an omnibus approach?
- tests the assumption (null hypothesis) that the MEANS of all groups are the SAME.
- unfocussed, not informative RQ because if there is a difference, we don’t know which groups differ or in which direction.
- omnibus approach can only identify an INCONSISTENCY between data and the assumption that all means are the same
eg:
- is there a diff in statistical self efficacy according to prior experience in maths among RMHI and ARMP students?
- if significant: at least one (unidentified) group mean is different from at least one other group mean
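A sketch of an omnibus test in R, assuming a one-way between-subjects design. The self-efficacy scores and group sizes are invented for illustration.

```r
# Hypothetical self-efficacy scores for three prior-maths groups (4 people each)
maths <- factor(rep(c("none", "VCE", "uni"), each = 4),
                levels = c("none", "VCE", "uni"))
efficacy <- c(10, 12, 11, 13,   14, 15, 13, 16,   18, 17, 19, 20)

# One-way ANOVA: a single F test of "all group means are equal"
omnibus <- anova(lm(efficacy ~ maths))
omnibus

# The test has g - 1 = 2 numerator degrees of freedom and gives one p-value;
# it does not say which groups differ or in which direction
df_groups <- omnibus$Df[1]
p_value   <- omnibus$`Pr(>F)`[1]
```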
what is a focussed approach?
provides identifiable differences (we know which groups are being compared)
a set of focussed comparisons can account for everything the omnibus approach detects
eg:
- is there a diff among RMHI and ARMP between those with no experience in maths and those who have done either VCE or Uni?
- is there a diff between students with no experience in maths and those with uni maths experience among RMHI and ARMP students?
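A sketch of the two focussed comparisons as weighted sums of group means. It simplifies the questions above by ignoring the RMHI/ARMP split, uses invented scores, and only gives the estimated differences, not a significance test.

```r
# Hypothetical self-efficacy scores for three prior-maths groups (4 people each)
maths <- factor(rep(c("none", "VCE", "uni"), each = 4),
                levels = c("none", "VCE", "uni"))
efficacy <- c(10, 12, 11, 13,   14, 15, 13, 16,   18, 17, 19, 20)

group_means <- tapply(efficacy, maths, mean)   # none 11.5, VCE 14.5, uni 18.5

# Comparison 1: no experience vs. the average of VCE and uni
weights_1 <- c(none = -1, VCE = 0.5, uni = 0.5)
diff_1 <- sum(weights_1 * group_means[names(weights_1)])   # 16.5 - 11.5 = 5

# Comparison 2: no experience vs. uni only
weights_2 <- c(none = -1, VCE = 0, uni = 1)
diff_2 <- sum(weights_2 * group_means[names(weights_2)])   # 18.5 - 11.5 = 7
```

Each set of weights sums to 0, so each comparison is a difference between two identifiable (sets of) groups, with a clear direction.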