Stata Commands Flashcards

1
Q

what command lets you observe the data

A

browse

2
Q

which command shows what variables we have

A

desc

3
Q

what command counts the number of observations satisfying a condition, let's say observations that are not foreign and also have miles per gallon less than 25

A

count if foreign != 1 & mpg < 25

4
Q

what command do you use to summarize the data

A

sum

5
Q

how to tabulate a variable only for observations that are foreign (foreign is a binary variable where foreign = 1 when true)

A

tabulate variablename if foreign == 1

6
Q

how to generate a new variable

A

gen newvarname = whateveryouneed

7
Q

what is the command for observing correlations

A

corr var1 var2 var3… (as many var as u want)

8
Q

what is the command for observing covariances

A

corr var1 var2, cov

9
Q

what is the command to do a t test on a variable’s mean being 0 as the null hypothesis

A

ttest varname == 0

the RHS will be whatever your null hypothesis is

10
Q

how to do a difference in means t test for h0 being 0
use willingness to spend as the variable and employed or unemployed as the binary variable
basically do a t test to test if the difference in willingness to spend is 0 across people who are employed or unemployed

A

ttest wts, by(employed)
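
A minimal sketch of both t tests, assuming Stata's built-in auto dataset stands in for the course data: mpg plays the role of wts and foreign the role of employed. The null value of 20 is just an illustration.

```stata
* Load the example dataset that ships with Stata (assumption: not the course data)
sysuse auto, clear

* One-sample test: H0 is that the mean of mpg equals 20
ttest mpg == 20

* Difference in means: H0 is that mean mpg is the same for domestic and foreign cars
ttest mpg, by(foreign)

* Same comparison without assuming equal variances in the two groups
ttest mpg, by(foreign) unequal
```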

11
Q

how to run an OLS regression for only 1 regressor
eg how does willingness to spend change with income

A

reg wts income

i.e. reg outcome independentvariable

12
Q

how to create predicted values and then see them

use the regression of willingness to spend on income as example

A

(directly after a reg command)
predict nameyourvariable
br wts nameyourvariable income

i.e.
predict nameyourvariable
br outcome nameyourvariable independentvar

13
Q

how to create predicted residuals

A

(directly after reg command)
predict residual, resid

n.b. the word residual is just what u name it not the command
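
A short sketch of fitted values and residuals together, assuming the built-in auto dataset (price and mpg stand in for wts and income). The names price_hat and price_resid are made up for illustration. list is used instead of br so it also runs in a do-file.

```stata
* Example data that ships with Stata (assumption: not the course data)
sysuse auto, clear

* predict uses the most recent regression, so run it straight after reg
regress price mpg

* Default predict gives fitted values (xb)
predict price_hat

* The residuals option gives outcome minus fitted value
predict price_resid, residuals

* Show the first five rows side by side
list price price_hat price_resid mpg in 1/5
```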

14
Q

how to create a scatter graph

A

twoway scatter outcome independentvar

15
Q

how to plot an OLS regression line

A

twoway lfit outcome independentvar

16
Q

how to make a scatter graph with a regression line through it

A

twoway (lfit outcome independentvar) (scatter outcome independentvar)

17
Q

How to name your graph

A

title("name") names the graph; xtitle("name") and ytitle("name") name the axes. add these after a comma at the end of the graphing command
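
A sketch of a titled scatter plot with a fitted line, assuming the built-in auto dataset. The title text is invented for illustration. The /// line continuation only works in a do-file; in the command window type it on one line.

```stata
* Example data that ships with Stata (assumption: not the course data)
sysuse auto, clear

* Scatter plus OLS line, with a graph title and axis titles as options after the comma
twoway (scatter price mpg) (lfit price mpg), ///
    title("Price against mileage") ///
    xtitle("Miles per gallon") ytitle("Price")
```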

18
Q

how to generate a regression table with standard errors to 4dp and beta to 4dp

A

run your regression first, e.g. reg wts income
then type:
outreg2 using reg_output.doc, sdec(4) bdec(4)

you can call the .doc file whatever u want
n.b. outreg2 is a user-written command, so install it once with: ssc install outreg2

19
Q

how to do a hypothesis test that a linear combination of variables has null hypothesis = 0

for example, 2*income - female + age = 0
which means 2 x coef of income - coef of female + coef of age

A

lincom 2*income - female + age

stata by default assumes null hypothesis to be 0 so dont specify

must be written after the reg command

20
Q

how to perform multiple hypothesis test (multiple hypotheses)

as example test income coef = 0.005, female coef = -2, age = 0

A

first do the reg command (reg wts income female age)
then do:
test (income = 0.005) (female = -2) (age = 0)

this is a joint test of all three restrictions at once, so rejecting it means at least one of them does not hold

21
Q

how to perform multiple hypothesis test that the coefficient for income and female are both 0

A

reg command first then:
test income female

dont need to define as =0 since stata does this auto
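
A sketch of lincom and test after one regression, assuming the built-in auto dataset. The regressors mpg, weight and foreign stand in for income, female and age, and the null values -50 and 3 are arbitrary illustrations.

```stata
* Example data that ships with Stata (assumption: not the course data)
sysuse auto, clear

* All three commands below refer back to this regression
regress price mpg weight foreign

* H0: 2*coef(mpg) - coef(foreign) + coef(weight) = 0
lincom 2*mpg - foreign + weight

* Joint H0 with specific values for two coefficients
test (mpg = -50) (weight = 3)

* Joint H0 that both coefficients are 0 (no need to write = 0)
test mpg weight
```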

22
Q

how to standardise a variable

A

sum variablename (so you can see the sd; this also stores it in r(sd))
gen variablename_standard = variablename/r(sd)

23
Q

how to fully standardise a variable (subtract the mean and divide by the sd)

A

egen varname_full_standard = std(varname)
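
A sketch comparing the two kinds of standardising, assuming the built-in auto dataset. The names mpg_scaled and mpg_std are made up for illustration.

```stata
* Example data that ships with Stata (assumption: not the course data)
sysuse auto, clear

* summarize must run first so that r(sd) holds the sd of mpg
summarize mpg
gen mpg_scaled = mpg / r(sd)

* egen std() subtracts the mean as well as dividing by the sd
egen mpg_std = std(mpg)

* mpg_scaled has sd 1 but keeps a non-zero mean; mpg_std has mean about 0 and sd 1
summarize mpg_scaled mpg_std
```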

24
Q

how to split up a variable that has parts to it e.g. a date like november 13 2022

A

split varname

25
Q

how to rename the split up variable eg the dates, lets assume split into 3 parts

A

rename date_sold1 month
rename date_sold2 day
rename date_sold3 year

26
Q

how to replace a comma in a split up variable where there is a comma e.g. 13, in a date

A

replace day = regexr(day, ",", "")

replace varname = regexr(varname…..)

27
Q

how to convert something stata thinks is a word into a number

A

destring varname, replace

28
Q

how to convert months into numbers that stata recognises

A

replace month = "1" if month == "January"

general form:
replace nameofvariable = "number associated" if nameofvariable == "Relevantmonth"

then:
destring month, replace
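
The date-cleaning cards above can be chained together. This is a sketch on two invented rows typed in with input, assuming the date is stored as text like "November 13, 2022" in a variable called date_sold. The last two lines, which build a proper Stata date with mdy(), are an extra not in the cards.

```stata
* Two made-up observations for illustration
clear
input str20 date_sold
"November 13, 2022"
"January 5, 2021"
end

* split breaks the string at spaces into date_sold1, date_sold2, date_sold3
split date_sold
rename date_sold1 month
rename date_sold2 day
rename date_sold3 year

* Remove the comma left on the day, e.g. "13," becomes "13"
replace day = regexr(day, ",", "")

* Turn month names into number strings (only the months in this toy data)
replace month = "11" if month == "November"
replace month = "1" if month == "January"

* Convert all three from text to numbers
destring month day year, replace

* Optional: combine into a single Stata date
gen sale_date = mdy(month, day, year)
format sale_date %td
list
```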

29
Q

how to run a regression over only a range of values of a variable
as example use year of house sold between 2016 and 2020

A

reg price bedrooms bathrooms if inrange(year, 2016, 2020), robust

general form:
reg outcome iv iv if inrange(varname, lowerbound, upperbound), robust

n.b. the if condition goes before the comma; options like robust go after it

30
Q

how to use the inrange function more generally

A

any command… then add if inrange(variable, lowerbound, upperbound) before the comma and any options

31
Q

how to restrict a command to only be applied to data in specific areas e.g. specific suburbs

A

add if inlist(varname, "suburb1", "suburb2") after the variable list, before the comma and any options

this will only do the command for data in those two suburbs
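
A sketch of if conditions with inrange and inlist, assuming the built-in auto dataset. The bounds and the two car names are arbitrary illustrations.

```stata
* Example data that ships with Stata (assumption: not the course data)
sysuse auto, clear

* inrange is inclusive: keeps observations with 15 <= mpg <= 25
regress price mpg weight if inrange(mpg, 15, 25), robust

* inlist with numbers: only cars whose repair record is 3 or 4
summarize price if inlist(rep78, 3, 4)

* inlist with strings: values need straight double quotes and must match exactly
count if inlist(make, "AMC Concord", "Buick Regal")
```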

32
Q

how to run a regression where one regressor is entered as a categorical variable (stata turns it into binary indicators)

A

reg outcome iv1 i.iv2, robust

33
Q

how to run a regression where the outcome is in log

A

gen ln_outcome = ln(outcome)
reg ln_outcome iv1 iv2, robust

34
Q

how to make a table which only shows a variable in the 25th percentile

use example of price and address i.e. show all addresses in the 25th percentile of price

A

sum varofpercentile, detail
tab varofinterest if varofpercentile == r(p25)

sum price, detail
tab address if price == r(p25)

35
Q

How to tell stata you are using panel data

A

egen panel_id = group(varname varname)

xtset panel_id timeidvariable

36
Q

how to create a binary variable for being treated (DiD regressions)

A

gen treated = inlist(varname, "name1", "name2")

e.g. gen treated = inlist(mktnam, "Lviv", "Rvine")
(Ukrainian cities)

37
Q

how to create a binary variable for being after treatment (DiD)

A

gen after = timeidvariable >=timeperiod

e.g.
gen after = days_since_2014 >= 730

38
Q

how to create an interaction term for DiD

A

gen interaction = treated*after

assuming u called your two binary variables treated and after

39
Q

how to run a DiD regression for beetroots in ukraine

A

reg price treated after interaction if varname == "beetroots", robust
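
A sketch of the whole DiD setup on simulated data, since the course dataset is not available here. The group sizes, the coefficients and the true effect of 3 are all invented for illustration.

```stata
* Simulated data: 400 observations, half treated, alternating before/after
clear
set seed 12345
set obs 400
gen treated = _n > 200
gen after = mod(_n, 2)
gen interaction = treated * after

* Outcome built so that the true DiD effect is 3
gen price = 10 + 2*treated + 1*after + 3*interaction + rnormal()

* The coefficient on interaction is the DiD estimate
regress price treated after interaction, robust

* Same model using factor-variable notation instead of a hand-made interaction
regress price i.treated##i.after, robust
```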

40
Q

How to create a graph for trends prior to treatment in DiD of control and treated

A

twoway (lfit outcome timeidvariable if treated == 1 & after == 0 & varname == "beetroots/whatever") (lfit outcome timeidvariable if treated == 0 & after == 0 & varname == "beetroots/whatever")

41
Q

how do you run a regression to check for parallel trends pre treatment in DiD

A

gen interact_timeidvar = treated*timeidvar

reg outcome timeidvar treated interact_timeidvar if after == 0 & varname == "beetroots", robust

42
Q

how to run a fixed effects regression

A

xtreg outcome interactionvarname if varname == "beetroots", fe robust

43
Q

how to run a fixed effects regression also controlling for time

A

xtreg outcome interactionvarname i.timeidvariable if varname == "beetroots", fe robust
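
A sketch of fixed effects with and without time dummies on a simulated panel, since the course data is not available here. The panel size, the variable names y, x, t and the true slope of 2 are invented for illustration.

```stata
* Simulated panel: 50 units observed for 4 periods each
clear
set seed 1
set obs 50
gen panel_id = _n
expand 4
bysort panel_id: gen t = _n

* Tell Stata which variable is the unit and which is time
xtset panel_id t

gen x = rnormal()
gen y = 2*x + rnormal()

* Unit fixed effects only
xtreg y x, fe robust

* Unit fixed effects plus a dummy for each time period
xtreg y x i.t, fe robust
```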

44
Q

how to run a first difference regression

A

reg diff_outcome diff_fake_regressor if varname == "beetroots", robust
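
The card assumes diff_outcome and diff_fake_regressor already exist. One possible way to build them, sketched on a simulated panel, is the D. difference operator, which needs xtset first. The data and the true slope of 2 are invented for illustration.

```stata
* Simulated panel: 50 units observed for 4 periods each
clear
set seed 2024
set obs 50
gen panel_id = _n
expand 4
bysort panel_id: gen t = _n
xtset panel_id t

gen fake_regressor = rnormal()
gen outcome = 2*fake_regressor + rnormal()

* D. takes this period minus the previous period within each unit
* (missing for each unit's first period)
gen diff_outcome = D.outcome
gen diff_fake_regressor = D.fake_regressor

regress diff_outcome diff_fake_regressor, robust
```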

45
Q

how to set up a regression discontinuity: what are the three variables to generate (only the gen commands, not the reg command)? prob not assessed, he will tell us the fake ones

A

gen fake_over = fake_regressor - 1

gen over_one = fake_regressor > 1

gen interact_RD = over_one*fake_over

46
Q

what is the code for running the regression discontinuity

A

reg outcome fake_over over_one interact_RD if varname == "beetroots", robust

47
Q

how to run a 2sls regression assuming we have the variables already z1, z2 and control

A

ivregress 2sls price (fake_regressor = z1 z2) control if varname == "beetroots", robust
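
A sketch of 2SLS on simulated data, since the course data is not available here. The variable u is an invented unobserved confounder that makes x endogenous, and the true coefficient on x is set to 2 for illustration.

```stata
* Simulated data: z1 and z2 are instruments, u is an unobserved confounder
clear
set seed 42
set obs 1000
gen z1 = rnormal()
gen z2 = rnormal()
gen control = rnormal()
gen u = rnormal()

* x depends on the instruments and on u, so OLS of y on x would be biased
gen x = z1 + z2 + 0.5*control + u + rnormal()
gen y = 1 + 2*x + control + u + rnormal()

* Endogenous regressor and its instruments go inside the parentheses
ivregress 2sls y control (x = z1 z2), robust

* First-stage diagnostics to check instrument strength
estat firststage
```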