Week 3: Hypothesis Testing and Significance Flashcards
What command generates a new variable?
<gen>
How do you recode continuous variables into categorical variables?
Use the <recode> or <replace> command in combination with <if> statements to specify ranges and assign categories.
Example:
<recode varname (a/b=1) (c/d=2) (e/f=3), gen(newvarname)>
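A minimal sketch of the recode pattern on toy data. The variable name age and the cut-points are illustrative assumptions, not part of the original card.

```stata
* Toy data: age and its cut-points are hypothetical
clear
input age
23
37
45
58
71
end

* Bands: up to 39 -> 1, 40 to 59 -> 2, 60 and over -> 3
recode age (min/39=1) (40/59=2) (60/max=3), gen(age_group)

* Cross-tabulate the original against the new variable to check the bands
tab age age_group
```

Using min and max as range endpoints avoids hard-coding the lowest and highest observed values.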
Which command labels a variable?
<label variable varname "name">
How do you ensure the correct categories after recoding a variable?
Use frequency tables and crosstabs, such as <tab>.
What is the syntax to assign value labels to a variable?
<label define labelname value1 "label1" value2 "label2", replace>
<label values varname labelname>
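A short sketch of the two labelling steps, defining a set of value labels and then attaching it to a variable. The names smoker and yesno are hypothetical.

```stata
* Toy data: smoker and the label name yesno are hypothetical
clear
input smoker
0
1
1
0
end

* Describe the variable itself
label variable smoker "Current smoker"

* Define the value labels, then attach them to the variable
label define yesno 0 "No" 1 "Yes", replace
label values smoker yesno

* The frequency table now shows No/Yes instead of 0/1
tab smoker
```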
What does the <egen> command do?
Creates derived variables, such as sums or counts, across specified variables.
What is the purpose of <foreach> and <forvalues> loops?
They automate repetitive tasks, such as recoding multiple variables.
<foreach> is the more general loop: its list can contain strings, numbers, or variable names, and the items need not follow a pattern.
<forvalues> is the more specific loop: it accepts only numbers, and the list must follow a regular pattern (for example, 1/7).
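The two loop types can be sketched side by side. The variables q1 to q3 and flag1 to flag3 are hypothetical names used only for illustration.

```stata
* Toy data: q1-q3 are hypothetical survey items with negative non-response codes
clear
set obs 3
gen q1 = -1
gen q2 = 5
gen q3 = -9

* foreach: loop over an arbitrary list of variables
foreach v of varlist q1 q2 q3 {
    replace `v' = . if `v' < 0
}

* forvalues: loop over a regular numeric sequence
forvalues i = 1/3 {
    gen flag`i' = missing(q`i')
}
```

Here <foreach> would also work if the variables had unrelated names, whereas <forvalues> relies on the numeric suffix.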
How are missing values assigned in a new variable during coding?
Use <replace varname =. if condition>
How do you combine two categorical variables into one?
Use a combination of logical operators (|, &) and <recode> or <gen> to conditionally create a new variable.
Example:
<gen newvar=.>
<replace newvar=0 if var1==a | var2==b>
<replace newvar=1 if var1==b & var2==c>
etc...
<label variable newvar "Variable name">
<label define newvar1 0 "abcd" 1 "efgh">
<label values newvar newvar1>
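A worked sketch of combining two categorical variables. The variables smoker and drinker, their 0/1 coding, and the label name risk_lbl are all assumptions made for illustration.

```stata
* Toy data: smoker and drinker are hypothetical 0/1 variables
clear
input smoker drinker
0 0
1 0
0 1
1 1
. 1
end

* Start from an empty variable, then fill it in conditionally
gen risk = .
replace risk = 0 if smoker == 0 & drinker == 0
replace risk = 1 if (smoker == 1 | drinker == 1) & !missing(smoker, drinker)

label variable risk "Smokes or drinks"
label define risk_lbl 0 "Neither" 1 "Either or both", replace
label values risk risk_lbl

* Include missing values in the check
tab risk, missing
```

The !missing() condition keeps the last respondent missing rather than coding them from incomplete information.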
What is the syntax to count specific values across multiple variables?
<egen newvar = anycount(varlist), values(value)>
Example: To count, for each respondent, how many of the variables record asthma
<egen asthma = anycount(hedib01-hedib06), values(2)>
Here, we know that the value 2 denotes the presence of asthma in the initial variables.
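To see what anycount returns, here is a small sketch on toy data. The variables cond1 to cond3 are hypothetical stand-ins for the condition variables, with 2 as the code of interest.

```stata
* Toy data: cond1-cond3 are hypothetical; the value 2 is the code being counted
clear
input cond1 cond2 cond3
2 1 2
1 1 1
2 2 2
. 2 3
end

* For each row, count how many of the listed variables equal 2
egen n_asthma = anycount(cond1 cond2 cond3), values(2)
list
```

The result is a per-respondent count, so it ranges from 0 to the number of variables listed.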
How do you compute the sum of multiple variables into a single variable?
Use <egen newvar = rowtotal(varlist)>
Example:
<egen tot_heart_sum = rowtotal(heart_*)>
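A small sketch of rowtotal on toy data, using three hypothetical heart_ indicator variables. By default rowtotal treats missing values as 0. The missing option returns a missing total when every input is missing.

```stata
* Toy data: heart_1-heart_3 are hypothetical 0/1 indicators
clear
input heart_1 heart_2 heart_3
1 0 1
0 0 0
1 . 1
. . .
end

* Default behaviour: missing values count as 0
egen tot_heart_sum = rowtotal(heart_*)

* With the missing option, an all-missing row gets a missing total
egen tot_heart_miss = rowtotal(heart_*), missing
list
```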
What command is used to produce a frequency table?
<tab>
What does reverse coding ensure in survey data?
It aligns differently phrased questions for consistent interpretation in scales or indices.
How do you check for errors after creating a new variable?
Cross-tabulate the new variable with its original counterpart and examine discrepancies.
What does the <rowtotal> function in <egen> calculate?
The sum of specified variables across a row for each observation.
What is a “derived variable”?
A variable created from existing data, often through transformations or aggregations.
How can you handle negative or missing values during recoding?
Use conditional statements like <replace varname =. if varname <0>
What does the command <label define> do?
It creates a set of value labels that can be assigned to variables.
Why is it good practice to verify recoding with cross tabs?
To ensure that all values have been correctly categorised.
What is the first step to take in deriving a new variable?
Create a new ‘empty’ variable using the command <gen varname =.>
How do you delete a variable if you made a mistake in recoding?
<drop>
How would you calculate the mean of a continuous variable sorted by a categorical variable?
<mean contvar, over(catvar)>
Note: You can also restrict the sample with logical operators, e.g., <mean varname if cond1==2 | cond2==3>
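Both forms can be tried on a tiny dataset. The variables income and region are hypothetical.

```stata
* Toy data: income (continuous) and region (categorical) are hypothetical
clear
input income region
10 1
20 1
30 2
50 2
end

* Mean of income within each category of region
mean income, over(region)

* Mean of income for a subset chosen with logical operators
mean income if region == 1 | income > 40
```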
What character do you use to capture all variables that start with the same letters?
The asterisk wildcard <*>.
Example: If you want to capture all variables starting with hedia, use <hedia*>
How can you create a single variable showing the number of heart conditions a respondent has in the ELSA dataset? (hedia0-7)
To count the number of conditions a respondent has, we need to recode the seven hedia variables into seven new variables, each denoting the presence or absence of a heart condition of any type. The scores will be 1 for any heart condition and 0 for no heart condition.
Command:
<recode hedia01 (96=0) (1/8=1), gen(heart01)>
<recode hedia02 (96=0) (1/8=1), gen(heart02)>
etc…
How can you create a single variable showing the number of heart conditions a respondent has in the ELSA dataset? (hedia0-7) - using loops
<forvalues j=1/7 {
recode hedia0`j' (96=0) (1/8=1) (95=1) (min/-1=.), gen(heart_`j')
}>
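The loop approach can be sketched end to end on toy data. This is an illustration under assumptions, not the ELSA data: only three hedia-style variables are created, their values are made up, and the final rowtotal step and the name tot_heart are added here to produce the single count variable.

```stata
* Toy data standing in for hedia01-hedia03 (values are made up)
* 96 = no condition, 1-8 = a condition, negative = non-response
clear
input hedia01 hedia02 hedia03
96 96 96
3 96 96
2 5 96
-1 -1 -1
end

* Recode each variable into a 0/1 indicator, with negatives set to missing
forvalues j = 1/3 {
    recode hedia0`j' (96=0) (1/8=1) (95=1) (min/-1=.), gen(heart_`j')
}

* Sum the indicators; an all-missing row stays missing
egen tot_heart = rowtotal(heart_*), missing
tab tot_heart, missing
```

With the real data the loop would run over 1/7, as in the card above.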