Important codes Flashcards by Joy Silva

Do file

This is where we type all of our commands
Allows you to save your code

How to set up a sheet

clear all (removes any data and results left in memory from earlier work)
Right click the file, click on Properties and copy the location (this is how we tell Stata where we are saving the file)
cd "(location)"
use auto.dta, clear (opens the data set; clear means that if data are already open in Stata, they are cleared and replaced by the data we have just opened)
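
The setup steps above could look like the sketch below at the top of a do file. The folder path is hypothetical, and auto.dta is assumed to be saved in that folder.

```stata
* Start from a clean session
clear all

* Hypothetical folder: replace with the location copied from the file's properties
cd "C:/Users/yourname/Documents/stata"

* Load the data, discarding anything already in memory
use auto.dta, clear
```

If auto.dta is not in the folder, `sysuse auto, clear` loads the copy that ships with Stata.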

Run all code in the do file editor
To run only specific lines, highlight them and press the Execute (do) button
If no red text appears once you run the code, it ran without errors

Browse

Opens a spreadsheet view so we can see all of our data
browse (variable) shows specific variables only
br is the short form; list several variables after it to view them together

Code: Describe

Summarises the data set (variable names, storage types and labels)
Helps us understand the variables
int: integer storage type

Code: List

Shows us all of our observations one at a time
Negative: Gives us too much information

Code: list (VARIABLES)

Shows the observations for the specified variables only

Code: list (VARIABLES) in 1/(NUMBER OF OBSERVATIONS YOU WANT TO BE INCLUDED)

Restricts the list to a range of observations

EXAMPLES
- 1/l: lists every observation
- 70/l: lists from the 70th observation to the last
(Lowercase L denotes the final observation)
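
A short sketch of the list variants above, assuming the auto data set that ships with Stata (loaded here with sysuse); the variables chosen are only for illustration.

```stata
sysuse auto, clear

* First five observations of three variables
list make price mpg in 1/5

* From the 70th observation to the last (lowercase L)
list make price in 70/l
```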

Code: list (variables) if (outcome of interest)==1

Uses the if qualifier, so only observations that satisfy the condition are included
Needs 2 equal signs!!

List code with multiple conditions

!= (value): not equal to the value
&: means and
|: means or
== 0: equal to 0
list (anything, e.g. a letter)*: lists every variable whose name starts with that letter or text (* is a wildcard)
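
The operators above can be combined as in this sketch, which again assumes the bundled auto data set; the cut-off values are arbitrary.

```stata
sysuse auto, clear

* Single condition: two equal signs test for equality
list make price if foreign == 1

* and: both conditions must hold
list make price if foreign == 1 & price > 10000

* or: either condition may hold
list make mpg if mpg < 15 | mpg > 35

* not equal to (this also lists observations where rep78 is missing)
list make rep78 if rep78 != 3

* Wildcard: every variable whose name starts with m
list m* in 1/5
```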

Code: Count

Counts the number of observations satisfying the condition

Code: summarize or sum

Descriptive statistics of variables
Able to add if conditions
, detail: more detailed breakdown (e.g. median, skewness)
(In the example there is 1 fewer rep78 observation because one of the observations has a missing value)
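
A minimal sketch of the three uses described above, assuming the bundled auto data set.

```stata
sysuse auto, clear

* Basic descriptive statistics (missing values are left out of the count)
summarize price mpg rep78

* Restricted to observations that satisfy a condition
sum price if foreign == 1

* Detailed breakdown, including the median and skewness
sum price, detail
```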

Code: tabulate/tab (Variable of interest)

Breaks down each discrete variable into the values it takes, with the frequency, percentage and cumulative percentage
Able to apply if
With 2 variables it shows a cross tab (the first variable is on the rows and the second is on the columns)
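
For illustration, the one-way and two-way forms could be run like this on the bundled auto data set (an assumption, as are the variables chosen).

```stata
sysuse auto, clear

* One variable: frequency, percentage and cumulative percentage
tabulate rep78

* With an if condition
tab rep78 if foreign == 1

* Two variables: rep78 on the rows, foreign on the columns
tab rep78 foreign
```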

Code: gen (title of the new variable) = function

Makes a new variable
1 equal sign: assigns the variable to be equal to a certain expression
2 equal signs: checks whether equality holds
Variable names cannot contain spaces, so put _ where you want a space
Able to multiply variables by putting * between them (called an interaction)
Running the code again doesn't create duplicate variables because we clear everything at the beginning
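
A small sketch of gen, assuming the bundled auto data set; the new variable names are made up for the example.

```stata
sysuse auto, clear

* One equal sign assigns; underscores stand in for spaces in the name
gen price_thousands = price / 1000

* Interaction: multiply two variables with *
gen weight_foreign = weight * foreign

* Two equal signs would test equality; here > builds a 0/1 variable
gen expensive = (price > 10000)
```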

Code: Replace (variable name) = (function in which the variable changes)

If you make a mistake, run clear all and then generate the variable again

How to remove blank values when generating a new variable?
(Example: 1 if rep78 is less than or equal to 2, or 0 if rep78 is greater than 3)

Use replace, naming the newly generated variable and then a condition on the original variable (using 2 equal signs for this one), to set it to missing where the original is missing
Or you could add if (variable we want to code) != . when generating
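
Both approaches are sketched below on the bundled auto data set. The variable names low_rep and low_rep2 and the cut-off of 2 are assumptions made for the example.

```stata
sysuse auto, clear

* Stata treats missing as larger than any number, so a missing rep78
* would be coded 0 here rather than left blank
gen low_rep = (rep78 <= 2)

* Approach 1: set the new variable back to missing where rep78 is missing
replace low_rep = . if rep78 == .

* Approach 2: only code observations where rep78 is not missing
gen low_rep2 = (rep78 <= 2) if rep78 != .
```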

Code: corr (variables)

Tells us the correlation
Adding ", cov" at the end tells us the covariance

How to save the new version of data

save (name of the file), replace
- replace means that we are overwriting the previously saved data
- Never use the name auto, as we do not want to overwrite the original data, so give it a new name
Under cd write: capture log close
- If a log file is currently open, close it; if there is no log file, ignore the command
log using (log name).log, replace text
- Creates a log file in text form that we can read. If a file with the same name already exists, it is replaced
log close (at the bottom)
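
Put together, a do file using these commands might be laid out as in this sketch. The file names session1.log and auto_edited are hypothetical, and the bundled auto data set stands in for the data.

```stata
* Close any log left open by an earlier run (ignored if none is open)
capture log close

* Start a plain-text log, overwriting an older one with the same name
log using session1.log, replace text

sysuse auto, clear
gen price_thousands = price / 1000

* Save under a new name so the original auto data is never overwritten
save auto_edited, replace

* Last line of the do file
log close
```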

Running a t -test in which we test if a variable is equal to a certain value

ttest (variable we are testing) == 0
- Provides the standard deviation (measure of spread of all values)
- Provides the standard error (measure of precision of the average across samples)
- Provides the t value (observed mean minus hypothesised mean, divided by the standard error)
- Provides the confidence interval
- Ha: mean != 0: two-sided alternative; this shows the probability of getting the test statistic or one more extreme. If it's less than 0.05 then we reject the null hypothesis.
- Ha: mean < 0: one-sided alternative that the mean is negative

T-test for a difference in mean

ttest (variable we are testing), by(variable that separates the groups we want to compare, e.g. a binary variable)
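
Both kinds of t-test can be tried on the bundled auto data set, as sketched here. The hypothesised value of 20 and the variables are arbitrary choices for the example.

```stata
sysuse auto, clear

* One-sample test: is the mean of mpg equal to 20?
ttest mpg == 20

* Difference in means of price between the two groups of foreign
ttest price, by(foreign)
```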

Code: reg (outcome variable) (independent variables)

Constant is worked out through: average value of outcome minus (coefficient estimate for the independent variable * average value of the independent variable)
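
The formula for the constant can be checked by hand in the one-regressor case. This is only a sketch on the bundled auto data set; the scalar names mean_price and mean_weight are made up for the example.

```stata
sysuse auto, clear

* Outcome first, then the independent variable (no comma between them)
reg price weight

* Store the two averages
quietly summarize price
scalar mean_price = r(mean)
quietly summarize weight
scalar mean_weight = r(mean)

* Average outcome minus (coefficient * average regressor)
display mean_price - _b[weight] * mean_weight

* The constant reported by reg, for comparison
display _b[_cons]
```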

Code: predict (name of the new variable, e.g. y_hat)

Needs to be run after a reg command
predict residual, resid: saves the residuals; also needs to be underneath a reg command
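
A minimal sketch of both predict forms, assuming the bundled auto data set and an arbitrary regression.

```stata
sysuse auto, clear
reg price weight

* Fitted values from the regression just run
predict y_hat

* Residuals from the same regression
predict residual, resid

summarize y_hat residual
```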

Show properties of regressors

Code: twoway
Creating a graph

scatter (outcome variable) (regressor): scatter graph
lfit (outcome variable) (regressor): regression line
graphregion(color(white)): Change background colour to white (needs to be after a comma)
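
The pieces above fit together in one twoway command, sketched here on the bundled auto data set with arbitrary variables.

```stata
sysuse auto, clear

* Scatter plot with a fitted regression line on a white background
twoway (scatter price weight) (lfit price weight), graphregion(color(white))
```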

Difference between 1 equal sign and 2 equal signs?

=: Assign this value to be equal
==: Check if equality already holds

Omitted variable bias

Auxiliary regression: reg (omitted variable) (regressor of interest)
If you then type local coef = _b[variable], it will save the value of the coefficient for the variable. You then have to run your code all together, because a local only exists while that block of code is running

Saving in a document file Code: outreg

Run after reg. Typing replace tells Stata that we are creating a new document (not typing replace adds on a new column, typically done when adding a new control)
The document tells us if something is significant through asterisks (*)
sdec(4): standard errors with 4 decimal places
bdec(4): betas with 4 decimal places

Hypothesis test for coefficient

Code: lincom

Write the variable associated with the coefficient after the command

Code: test

Used when the null hypothesis contains multiple equal signs joined by the word "and", so we want to test multiple hypotheses at the same time
First need to run the regression to show Stata which regression we are talking about
Each hypothesis goes in its own bracket with the equal sign and the hypothesised value
Can use the F-test for a single hypothesis as well, with just a single bracket containing the equal sign and the hypothesised value
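
A sketch of lincom and test after a regression, assuming the bundled auto data set; the regressors and the hypothesised values of 0 are arbitrary.

```stata
sysuse auto, clear

* Run the regression first so Stata knows which one we mean
reg price weight length mpg

* Single coefficient: estimate, standard error and test against 0
lincom weight

* Joint F-test: one bracket per hypothesis
test (weight = 0) (length = 0)

* F-test for a single hypothesis
test (mpg = 0)
```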

Standardising

egen is used for standardising
r(sd) refers to the standard deviation from the previous sum command
When running a reg and interpreting the coefficient, say that a 1 standard deviation increase in the specific variable is associated with the outcome variable changing by the coefficient, holding the other variables fixed
We can also standardise the outcome
Makes it easier to understand
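
Two ways of standardising are sketched below on the bundled auto data set. The variable names weight_std and weight_std2 are made up, and subtracting the mean in the manual version is an assumption about how the standardisation is done.

```stata
sysuse auto, clear

* By hand: sum stores the mean and standard deviation in r(mean) and r(sd)
quietly sum weight
gen weight_std = (weight - r(mean)) / r(sd)

* Same result in one step with egen
egen weight_std2 = std(weight)

* Coefficient on weight_std: change in price for a
* 1 standard deviation increase in weight, holding mpg fixed
reg price weight_std mpg, robust
```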

What is normalising?

Transforming the variable to make it easier to read

Code: Egen

Creates a variable that is the average value, e.g. egen (new variable) = mean(variable)
If instead of mean there was sd, then it would be the standard deviation

Code: insheet

insheet using (name of data).csv, clear
- Opens a new dataset
- The data are in a spreadsheet (comma-separated) format, therefore the file needs a .csv name

Code: Split VARIABLE

Creates separate variables for the day, month and year
rename: changes the name of a variable
- rename (original variable name) (new variable name)
- replace VARIABLE = regexr(variable, "what we want to take out", "what we want to put in")

Code: destring (variable), replace

Tells Stata that a certain variable is actually a number
Used with the split command
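
The split, rename and destring steps can be tried on a tiny made-up data set. The variable date, its two values and the "/" separator are all hypothetical.

```stata
clear

* Two made-up dates stored as text
input str10 date
"12/03/2020"
"05/11/2021"
end

* Split at each "/" into date1, date2 and date3
split date, parse("/")

* Give the pieces readable names
rename date1 day
rename date2 month
rename date3 year

* The pieces are still text; convert them to numbers
destring day month year, replace

list
```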

When we browse, why are certain variables in red?

Stata thinks that this variable is text (a string) and not numbers (e.g. months)

Changing months into numerical values

Add destring code at the end

Code: Inrange

if inrange(variable, starting point number, ending point number)
Includes the first and last number

Code: Inlist

For words
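
A sketch of inrange and inlist inside an if condition, assuming the bundled auto data set; the ranges and car names are picked only for illustration.

```stata
sysuse auto, clear

* mpg between 20 and 25, both end points included
list make mpg if inrange(mpg, 20, 25)

* rep78 equal to any value in the list
list make rep78 if inlist(rep78, 1, 2)

* inlist with words: each word goes in quotes
list make price if inlist(make, "AMC Concord", "Audi 5000")
```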

What should you add at the end of any regression code?

, robust to use robust standard errors

Binary variables i.

i. creates a dummy variable for each possible value of the variable

Code: xi

Stata also creates the dummy variables in the data set, which you can browse
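
The difference between i. and xi is sketched below on the bundled auto data set, with rep78 as an arbitrary categorical variable.

```stata
sysuse auto, clear

* i. adds a dummy for each value of rep78 (one is left out as the base)
reg price weight i.rep78, robust

* xi does the same but also saves the dummies as new variables
xi: reg price weight i.rep78, robust

* The saved dummies appear in the variable list
describe _I*
```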

Functional form

Finding observation of percentiles

scalar (name of constant) = _b[coefficient name]

- Display needs to be after the reg command but not scalar

egen panel_id = group(variable variable)

Gives each possible combination of the variables a specific ID code

xtset panelidvariable timeidvariable

Informs Stata that we have panel data
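
A minimal panel setup on a made-up data set. The variables cname, year and price and all of the values are hypothetical.

```stata
clear

* Two units observed in two years
input str10 cname year price
"A" 2000 10
"A" 2001 12
"B" 2000 9
"B" 2001 9.5
end

* One ID number per distinct value of cname
egen panel_id = group(cname)

* Panel variable first, time variable second
xtset panel_id year
```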

Diff-in-diff Binary variable for treated

gen variablename = inlist(variable we are focusing on, values that we want our binary variable to equal 1 for)

Diff and Diff Binary variable for being after treated

Binary variable that is either 1 or 0 depending on if it’s after or before treatment

Diff and Diff regression:

reg outcomevariable treated after interaction if variable == "specific value/word that is the treated group", robust

Diff and Diff Generating the interaction term

Multiplying treated and after
gen interaction = treated*after

Diff and Diff Graph for the trends before treatment

twoway (lfit outcome (time variable) if treated == 1 & after == 0 & cname == "Treated group") (lfit outcome (time variable) if treated == 0 & after == 0 & cname == "Control group")
Shows the trend in the outcome (e.g. prices) for the treated group prior to treatment, as well as the trend for the control group
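
The diff-in-diff steps can be run end to end on a tiny made-up data set. Every name and number here is hypothetical: "North" is taken as the treated group and treatment is assumed to start in year 3.

```stata
clear

* Two groups observed over four periods
input str10 cname year price
"North" 1 10
"North" 2 11
"North" 3 15
"North" 4 16
"South" 1 8
"South" 2 9
"South" 3 10
"South" 4 11
end

* 1 for the treated group, 0 otherwise
gen treated = inlist(cname, "North")

* 1 from the treatment period onwards, 0 before
gen after = (year >= 3)

* Interaction term: its coefficient is the diff-in-diff estimate
gen interaction = treated * after

reg price treated after interaction, robust

* Trends before treatment for the treated and the control group
twoway (lfit price year if treated == 1 & after == 0) (lfit price year if treated == 0 & after == 0), graphregion(color(white))
```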