Important codes Flashcards

1
Q

Do file

A

This is where we type all of our code
Lets you save the code so it can be rerun later

2
Q

How to set up a sheet

A
  1. clear all (removes everything left in memory from previous work)
  2. Right click the file, click on Properties and copy the location (this is how we tell Stata where the file is saved)
  3. cd "(location)"
  4. use auto.dta, clear (opens the data set; clear means that if data are already open in Stata, they are cleared and replaced with the data we have just opened)
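
A minimal sketch of how the steps above could look at the top of a do-file. It assumes Stata's built-in auto example data, loaded with sysuse so that no folder path is needed; the cd path is hypothetical and is therefore left commented out.

```stata
* Start from a clean session
clear all

* Point Stata at the working folder (hypothetical path, edit before use)
* cd "C:/path/to/your/folder"

* Load the example data shipped with Stata.
* With your own saved copy you would instead type: use auto.dta, clear
sysuse auto, clear
```

Running the whole do-file from the top each time keeps the results reproducible.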
3
Q
A

Runs all the code in the do-file editor
To run only specific lines, highlight them and then press the button
If no red text appears once the code has run, it ran without errors

4
Q

Browse

A

Opens a spreadsheet view so we can see all of our data
browse (variable) shows only specific variables
br is the short form; list several variable names after it to view several variables

5
Q

Code: Describe

A

Shows an overview of our data set
Helps us understand the variables (name, storage type and label)
int: the variable is stored as an integer

6
Q

Code: List

A

Shows us all of our observations one at a time
Negative: Gives us too much information

7
Q

Code: list (VARIABLES)

A

Only shows observations for the specified variables

8
Q

Code: list (VARIABLES) in 1/(NUMBER OF OBSERVATIONS YOU WANT TO BE INCLUDED)

A

Restricts to certain observations

EXAMPLES
- 1/l: Lists every observation
- 70/l: Lists from the 70th observation to the last
(Lowercase L denotes the final observation)
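
A short sketch of list with variable names and an in range, assuming the built-in auto data (74 observations); make and price are variables in that data set.

```stata
sysuse auto, clear

* Only two variables, first five observations
list make price in 1/5

* From the 70th observation to the last one (lowercase L)
list make price in 70/l
```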

9
Q

Code: list (variables) if (outcome of interest)==1

A

if condition: only includes observations that satisfy the condition on the outcome of interest
Needs 2 equal signs!!

10
Q

List code with multiple conditions

A

!= (value): Not equal to
&: means and
|: means or
==0: Equal to 0
list (anything, e.g. a letter)*: Lists every variable whose name starts with that letter (or otherwise matches the pattern)
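
The operators above can be sketched as follows, again assuming the built-in auto data; the thresholds are arbitrary and chosen only for illustration.

```stata
sysuse auto, clear

* and: foreign cars that also cost more than 10,000
list make price foreign if foreign == 1 & price > 10000

* or: repair record equal to 1 or equal to 2
list make rep78 if rep78 == 1 | rep78 == 2

* not equal to: everything that is not a foreign car
list make foreign if foreign != 1 in 1/10

* wildcard: every variable whose name starts with m (make and mpg)
list m* in 1/5
```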

11
Q

Code: Count

A

Counts the number of observations satisfying the condition
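
A small sketch of count with and without a condition, assuming the built-in auto data.

```stata
sysuse auto, clear

* All observations
count

* Only observations satisfying a condition
count if foreign == 1

* Observations where rep78 is missing
count if missing(rep78)
```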

12
Q

Code: summarize or sum

A

Descriptive statistics of variables
Able to add if conditions
, detail: More detailed breakdown (e.g. median, skewness)
(In the example there is 1 fewer rep78 observation because one of the observations has a missing value)
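
A sketch of the summarize variants, assuming the built-in auto data. Stata leaves the results of the last summarize in r(), which is how the median is displayed here.

```stata
sysuse auto, clear

* Basic descriptive statistics for several variables
summarize price mpg rep78

* Restricted to a subgroup with if
summarize price if foreign == 1

* Detailed breakdown; the median is stored in r(p50)
summarize price, detail
display r(p50)
```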

13
Q

Code: tabulate/tab (Variable of interest)

A
  • Breaks down a discrete variable into the values it takes, with the frequency, percentage and cumulative percentage of each
  • Able to apply if
  • With 2 variables it shows a cross tab (the first variable is on the rows and the second is on the columns)
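
A sketch of one-way and two-way tabulations, assuming the built-in auto data.

```stata
sysuse auto, clear

* One variable: frequency, percentage and cumulative percentage
tabulate rep78

* Same table restricted with if
tabulate rep78 if foreign == 1

* Two variables: rep78 on the rows, foreign on the columns
tabulate rep78 foreign
```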
14
Q

Code: gen (title of the new variable) = function

A
  • Makes a new variable
  • 1 equal sign: assigns the variable to be equal to a certain function
  • 2 equal signs: checks whether an equality holds
  • Variable names cannot contain spaces (put _ instead)
  • Able to multiply variables by putting * between them (called an interaction)
  • Running the code again doesn't produce duplicated variables, because we clear all at the beginning
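
A sketch of gen with one and two equal signs, assuming the built-in auto data; the new variable names are made up for illustration.

```stata
sysuse auto, clear

* One equal sign assigns: price measured in thousands
gen price_thousands = price / 1000

* Two equal signs test: 1 where the car is foreign, 0 otherwise
gen is_foreign = (foreign == 1)

* Interaction: multiply two variables with *
gen weight_foreign = weight * foreign
```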
15
Q

Code: Replace (variable name) = (function in which the variable changes)

A
  • If you make a mistake, run clear all, reload the data and then generate the variable again
16
Q

How to remove blank values when generating a new variable?
(1 if rep78 is less than or equal to 2, 0 if rep78 is greater than 3)

A
  • Use replace, naming the newly generated variable and then the original variable (using 2 equal signs for this one)
  • Or you could add if (variable we want to code) != . when generating
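
Both approaches can be sketched as below, assuming the built-in auto data. For simplicity the dummy here is just "rep78 is at most 2", which is an assumption and not necessarily the exact coding used in the original example. The fix is needed because Stata treats a missing value as larger than any number, so a missing rep78 would otherwise be coded 0.

```stata
sysuse auto, clear

* Option 1: generate, then blank out rows where the original is missing
gen low_rep = (rep78 <= 2)
replace low_rep = . if rep78 == .

* Option 2: only generate where the original is not missing
gen low_rep2 = (rep78 <= 2) if rep78 != .
```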
17
Q

Code: corr (variables)

A
  • Tells us the correlation
  • Adding “, cov” at the end tells us the covariance
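
A brief sketch, assuming the built-in auto data.

```stata
sysuse auto, clear

* Correlation matrix
corr price mpg weight

* Covariance matrix instead
corr price mpg weight, cov
```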
18
Q

How to save the new version of data

A
  1. save (name of the file), replace
    - replace means that we are replacing the previous version of the saved data
    - Never use the name auto, as we do not want to overwrite the original data, so give the file a new name
  2. Under cd write: capture log close
    - If a log file is currently open, close it; if there is no log file, ignore the command
  3. log using (log name).log, replace text
    - Creates a log file in text form that we can read. If a file with the same name already exists, it is replaced
  4. log close (at the bottom)
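
Put together, the steps above might look like this in a do-file. The file names session_example.log and auto_modified are hypothetical, and both files are written to the current working folder.

```stata
* Close any log that is still open; do nothing if there is none
capture log close

* Start a readable text log, overwriting an older one with the same name
log using session_example.log, replace text

sysuse auto, clear
gen price_thousands = price / 1000

* Save under a new name so the original data are never overwritten
save auto_modified, replace

* Last line of the do-file
log close
```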
19
Q

Running a t -test in which we test if a variable is equal to a certain value

A

ttest (variable we are testing) == 0
- Provided with standard deviation (measure of spread of all values)
- Provided with standard error (measure of precision of the average across samples)
- Provided with t value (observed mean minus hypothesised mean, divided by standard error)
- Provides confidence interval
- Ha: mean != 0: Two sided alternative; the reported probability is that of getting the test statistic or one more extreme. If it's less than 0.05 then we reject the null hypothesis.
- Ha: mean < 0 is a one sided alternative that the mean is negative
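
A sketch of a one-sample t-test, assuming the built-in auto data and an arbitrary hypothesised value of 20 for mpg. The second display rebuilds the t value by hand from the stored results as (observed mean minus hypothesised mean) divided by the standard error.

```stata
sysuse auto, clear

* Null hypothesis: the mean of mpg equals 20
ttest mpg == 20

* t value reported by Stata and the same value computed by hand
display r(t)
display (r(mu_1) - 20) / r(se)

* Two-sided p-value
display r(p)
```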

20
Q

T-test for a difference in mean

A

ttest (variable we are testing), by(variable that separates the groups we are comparing, e.g. a binary variable)

Example: ttest for willingness to spend based on whether someone is employed or not
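
A sketch of the two-group version. The willingness-to-spend data are not available here, so this assumes the built-in auto data, with mpg as the tested variable and the binary variable foreign as the grouping variable.

```stata
sysuse auto, clear

* Difference in mean mpg between domestic and foreign cars
ttest mpg, by(foreign)
```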
21
Q

Code: reg (outcome variable) (independent variables)

A

Constant is worked out through: average value of the outcome minus (coefficient estimate for the independent variable * average value of the independent variable)
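
The formula for the constant can be checked by hand in a regression with one regressor. This sketch assumes the built-in auto data; the scalar names are made up for illustration.

```stata
sysuse auto, clear
reg price mpg

* Store the two averages
quietly summarize price
scalar mean_y = r(mean)
quietly summarize mpg
scalar mean_x = r(mean)

* Constant computed by hand, then the one Stata reports
display mean_y - _b[mpg] * mean_x
display _b[_cons]
```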

22
Q

Code: Predict (name of the new variable E.g. y_hat)

A
  • Needs to be run after a reg command; by default it saves the fitted values
  • predict residual, resid: saves the residuals; needs to be underneath a reg command
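
A sketch of both uses of predict, assuming the built-in auto data and an arbitrary regression of price on mpg and weight.

```stata
sysuse auto, clear
reg price mpg weight

* Fitted values from the regression just above
predict y_hat

* Residuals from the same regression
predict residual, resid

* Outcome = fitted value + residual, so the residuals average to about 0
summarize residual
```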
23
Q

Show properties of regressors

A
24
Q

Code: twoway
Creating a graph

A
  • scatter (outcome variable) (regressor): Scatter graph
  • lfit (outcome variable) (regressor): Regression line
  • graphregion(color(white)): Change background colour to white (needs to be after a comma)
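
A sketch that overlays the two plot types, assuming the built-in auto data with price as the outcome and mpg as the regressor.

```stata
sysuse auto, clear

* Scatter plot with the fitted regression line on top, white background
twoway (scatter price mpg) (lfit price mpg), graphregion(color(white))
```

Each plot sits in its own pair of brackets, and options for the whole graph go after the comma.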
25
Q

Difference between 1 equal sign and 2 equal signs?

A

=: Assigns this value to the variable
==: Checks whether equality holds

26
Q

Omitted variable bias

A

Auxiliary regression: reg (omitted variable) (regressor of interest)
If you then type local coef = _b[variable], it will save the value of the coefficient for the variable; because it is a local, you then have to run your code all together
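
One way to use the saved coefficients is to check the omitted variable bias formula: short coefficient = long coefficient + (coefficient on the omitted variable * slope from the auxiliary regression). This sketch assumes the built-in auto data, with weight playing the role of the omitted variable and mpg the regressor of interest; the local names are made up. It must be run in one go, since locals disappear once the run ends.

```stata
sysuse auto, clear

* Long regression: includes the variable we will later omit
quietly reg price mpg weight
local b_long = _b[mpg]
local b_omit = _b[weight]

* Auxiliary regression: omitted variable on the regressor of interest
quietly reg weight mpg
local delta = _b[mpg]

* Short regression: weight left out
quietly reg price mpg
display "short coefficient:  " _b[mpg]
display "long + bias term:   " `b_long' + `b_omit' * `delta'
```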

27
Q

Saving in a document file
Code: outreg

A

Type replace to tell Stata that we are creating a new document (not typing replace adds on a new column, typically done when adding a new control)
The document tells us if something is significant through asterisks (*)
sdec(4): standard errors with 4 decimal places
bdec(4): betas with 4 decimal places

28
Q

Hypothesis test for coefficient

A
29
Q

Code: Lincom

A

After lincom, write the name of the variable associated with the coefficient

30
Q

Code: test

A

Used when the null hypothesis involves multiple equal signs joined by the word "and" (a joint hypothesis)
First need to run the regression to show Stata which regression we are talking about
Used when we want to test multiple hypotheses at the same time.
Can use the F-test for a single hypothesis as well, with just a single bracket containing the equal sign and the hypothesised value
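
A sketch of a joint and a single hypothesis test, assuming the built-in auto data and an arbitrary regression.

```stata
sysuse auto, clear

* The regression has to come first
reg price mpg weight foreign

* Joint null: coefficient on mpg is 0 and coefficient on weight is 0
test (mpg = 0) (weight = 0)

* Single hypothesis: one bracket only
test (foreign = 0)
```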

31
Q

Standardising

A

egen can be used for standardising
r(sd) refers to the standard deviation from the previous sum command.
When running a reg and interpreting the coefficient, say that a 1 standard deviation increase in the specific variable is associated with the outcome variable changing by the coefficient, holding the other variables fixed.
We can also standardise the outcome.
Makes the coefficients easier to understand
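
Two ways of standardising can be sketched as follows, assuming the built-in auto data. The first uses r(mean) and r(sd) from the previous summarize; the second uses egen's std() function. The variable names are made up.

```stata
sysuse auto, clear

* By hand: subtract the mean and divide by the standard deviation
quietly summarize mpg
gen mpg_std = (mpg - r(mean)) / r(sd)

* Same thing in one step with egen
egen mpg_std2 = std(mpg)

* Coefficient now reads as the effect of a 1 standard deviation increase in mpg
reg price mpg_std, robust
```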

32
Q

What is normalising?

A

Transforming the variable to make it easier to read

33
Q

Code: Egen

A

egen (new variable) = mean(variable) creates a variable that is the average value
If instead of mean there was sd, then it would be the standard deviation
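
A sketch of egen with mean() and sd(), assuming the built-in auto data. The group-wise version with bysort is an extra illustration that the cards do not mention.

```stata
sysuse auto, clear

* Same value in every row: the overall average and standard deviation
egen mean_price = mean(price)
egen sd_price = sd(price)

* Average within each group (domestic and foreign separately)
bysort foreign: egen mean_price_by_origin = mean(price)
```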

34
Q

Code: Insheet

A

insheet using (name of data).csv, clear
- Opens a new dataset
- The data come from Excel, so the file needs to be saved with a .csv name

35
Q

Code: Split VARIABLE

A

Creating separate variables for the day, month and year
rename: changes the name of a variable: rename (original variable name) (new variable name)
- replace VARIABLE = regexr(variable, "what we want to take out", "what we want to put in")

36
Q

Code: Destring variable, replace

A

Command tells stata that a certain variable is actually a number
Used with split command
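
A sketch of split, rename and destring working together. The two dates are made-up example data typed in with input, and the separator "/" is an assumption about how the dates are written.

```stata
clear
input str10 date_text
"12/03/2020"
"05/11/2021"
end

* Cut the text at every "/": creates date_text1, date_text2, date_text3
split date_text, parse("/")

* Give the pieces readable names
rename date_text1 day
rename date_text2 month
rename date_text3 year

* The pieces are still text; turn them into numbers
destring day month year, replace
```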

37
Q

When we browse, why are certain variables in red?

A

Stata thinks that this variable is just words and not numbers (E.g. Months )

38
Q

Changing months into numerical values

A

Add destring code at the end
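
A sketch of one way this could be done, assuming month names are stored as three-letter text; the data are made up. Each name is swapped for its number with regexr, and destring at the end turns the text into a number.

```stata
clear
input str3 month
"Jan"
"Feb"
end

* Replace each month name by its number (still stored as text)
replace month = regexr(month, "Jan", "1")
replace month = regexr(month, "Feb", "2")

* Tell Stata the variable is actually a number
destring month, replace
```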

39
Q

Code: Inrange

A

if inrange(variable, starting point number, ending point number)
Includes the first and last number

Regression which uses inrange
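
A sketch of inrange in a count and in a regression, assuming the built-in auto data and an arbitrary range of 20 to 25 for mpg. Both end points are included.

```stata
sysuse auto, clear

* Observations with mpg between 20 and 25, end points included
count if inrange(mpg, 20, 25)

* Regression restricted to the same observations
reg price weight if inrange(mpg, 20, 25)
```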
40
Q

Code: Inlist

A

Checks whether a variable matches any value in a list; works for words
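
A sketch of inlist with numbers and with words, assuming the built-in auto data; the two car names are values of the make variable in that data set.

```stata
sysuse auto, clear

* Numbers: repair record equal to 1 or 2
list make rep78 if inlist(rep78, 1, 2)

* Words: the text has to go in double quotes
list make price if inlist(make, "AMC Concord", "Volvo 260")
```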

41
Q

What should you add at the end of any regression code?

A

, robust to use robust standard errors

42
Q

Binary variables i.

A

i. before the variable name
Creates a dummy variable for each possible value of the variable
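
A sketch of the i. prefix in a regression, assuming the built-in auto data. One category is left out as the base group, and observations with a missing rep78 are dropped.

```stata
sysuse auto, clear

* One dummy per value of rep78, with robust standard errors
reg price i.rep78, robust
```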

43
Q

Code: xi

A

Stata also creates the dummy variables in the data set, so you can browse them

44
Q

Functional form

A
45
Q

Finding observation of percentiles

A
46
Q

Scalar name of constant =_b[coefficient name]

A
  • Display needs to be after the reg command but not scalar
47
Q

egen panel_id = group(variable variable)

A

Gives each possible combination of each variable a specific ID code

48
Q

Xtset panelidvariable timeidvariable

A

Informs stata that we have panel data
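
A sketch of building a panel ID and declaring panel data. The small data set is made up: each country and product pair is one panel unit, observed in two years.

```stata
clear
input str2 country str1 product year price
"UK" "A" 2000 10
"UK" "A" 2001 11
"UK" "B" 2000 20
"UK" "B" 2001 22
"FR" "A" 2000 9
"FR" "A" 2001 12
end

* One ID number per combination of country and product
egen panel_id = group(country product)

* Tell Stata which variable identifies the unit and which the time period
xtset panel_id year
```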

49
Q

Diff-in-diff
Binary variable for treated

A

gen variablename = inlist(variable we are focusing on, values that we want our binary variable to equal 1 for)

50
Q

Diff and Diff
Binary variable for being after treated

A

Binary variable that is either 1 or 0 depending on if it’s after or before treatment

51
Q

Diff and Diff regression:

A

reg outcomevariable treated after interaction if variable == "specific value/word that is the treated group", robust

52
Q

Diff and Diff
Generating the interaction term

A

Multiplying treated and after
gen interaction = treated*after
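
A sketch of the full diff-in-diff regression on made-up data with two observations per cell. The coefficient on interaction equals (treated after minus treated before) minus (control after minus control before), which here is (27 - 21) - (12 - 11) = 5.

```stata
clear
input treated after outcome
0 0 10
0 0 12
0 1 11
0 1 13
1 0 20
1 0 22
1 1 26
1 1 28
end

* Interaction term: 1 only for treated units after treatment
gen interaction = treated * after

* The coefficient on interaction is the diff-in-diff estimate
reg outcome treated after interaction, robust
display _b[interaction]
```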

53
Q

Diff and Diff
Graph for the trends before treatment

A

twoway (lfit outcome (variable for time) if treated == 1 & after == 0 & cname == "Treated group") (lfit outcome (variable for time) if treated == 0 & after == 0 & cname == "Treated group")

Shows the trend in the outcome for the treated group prior to treatment, as well as the trend in prices for the control group

54
Q

Checking for parallel trends

A
55
Q

Fixed effect estimators

A
56
Q

Time fixed effect

A
57
Q

Creating fake data

A

gen fake_regressor = rnormal()
Distributed with mean 0 and standard deviation of 1
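
A sketch of generating a fake regressor from scratch. The seed and the number of observations are arbitrary choices; setting a seed just makes the random draws the same on every run.

```stata
clear
set seed 12345
set obs 1000

* Standard normal draws: mean 0, standard deviation 1
gen fake_regressor = rnormal()

* Sample mean should be close to 0 and sample sd close to 1
summarize fake_regressor
```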

58
Q

Time fixed effects

A

Sort panel_id timevariable

59
Q

First difference regression

A
60
Q

Regression discontinuity

A

Image

61
Q

2SLS

A

ivregress 2sls price (treatment that might have bias = instruments) control variables if product == "treated group", robust
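
A sketch of the syntax only, assuming the built-in auto data. Treating mpg as the biased regressor and displacement as its instrument is purely an assumption to show where each piece goes; it is not a claim that displacement is a valid instrument.

```stata
sysuse auto, clear

* Outcome, control, then (biased regressor = instrument) in brackets;
* the if condition goes before the comma, options after it
ivregress 2sls price weight (mpg = displacement) if foreign == 0, robust
```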

62
Q

Panel Data Regression

A