Data and Sampling Flashcards

1
Q

What are data?

A

Observations/measurements of some phenomenon of interest
A dataset is a collection of realisations of random variables

2
Q

What can the index of a data point indicate?

A

The unit (e.g. an individual or firm) for cross-sectional data, the time period for time series data, or both for panel data

3
Q

What are the types of data?

A

Numeric, grouped, or categorical
Data sets can be cross-sectional, time series, or a combination of the two (panel data track the same group over time; repeated cross-sectional data take a different random sample each period)

4
Q

What can be included in summary statistics?

A

Measures of central tendency, measures of dispersion, other measurements of the distribution, measures of relationship between the variables

5
Q

What measures of central tendency are there?

A

Mean, median (50th percentile), mode, geometric mean (nth root of the product of the data, doesn’t work with negative data)
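
As a rough illustration, the four measures can be computed with Python's standard `statistics` module. The data values below are made up for the example and are all positive so that the geometric mean is defined.

```python
import statistics

data = [2, 4, 4, 8]  # illustrative values only (assumed, all positive)

mean = statistics.mean(data)                # arithmetic mean
median = statistics.median(data)            # 50th percentile
mode = statistics.mode(data)                # most frequent value
geo_mean = statistics.geometric_mean(data)  # nth root of the product of the data

print(mean, median, mode, geo_mean)
```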

6
Q

What measures of dispersion are there?

A

Standard deviation, range, interquartile range
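
A minimal sketch of the three measures using the standard `statistics` module, on made-up data. Quartile conventions differ between textbooks and software, so the interquartile range here reflects the module's default method rather than a universal definition.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative values only (assumed)

std_dev = statistics.stdev(data)               # sample standard deviation (divisor n - 1)
data_range = max(data) - min(data)             # largest value minus smallest value
q1, q2, q3 = statistics.quantiles(data, n=4)   # quartile cut points (default method)
iqr = q3 - q1                                  # interquartile range

print(std_dev, data_range, iqr)
```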

7
Q

What are some other measures of a sample distribution?

A

Sample skewness measures the asymmetry of a distribution; sample kurtosis measures its ‘tailedness’ or ‘peakedness’, i.e. how much of the variability is due to large deviations from the mean

8
Q

What measures of relationship between variables are there?

A

Sample covariance (positive if high x values are associated with high y values, negative if high x values are associated with low y values, and 0 if there is no linear relationship) and the correlation coefficient
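
The sketch below computes both measures by hand on made-up data. It assumes the n - 1 divisor for the sample covariance and the usual definition of the correlation coefficient as the covariance divided by the product of the two standard deviations; the function names are hypothetical.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1 (an assumption of this sketch)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance scaled by both standard deviations, so it lies in [-1, 1]."""
    s_x = math.sqrt(sample_covariance(xs, xs))
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

x = [1, 2, 3, 4, 5]    # illustrative values only
y = [2, 4, 6, 8, 10]   # exactly 2 * x, a perfect positive linear relationship

cov_xy = sample_covariance(x, y)
corr_xy = correlation(x, y)
print(cov_xy, corr_xy)
```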

9
Q

What is the formula for the population variance?

A

$\sigma_n^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

10
Q

What is the formula for the sample variance?

A

$s_{n-1}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
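
To make the difference between the divisor-n and divisor-(n - 1) variance formulas concrete, here is a small sketch on made-up data. The helper `variance` and its `ddof` parameter are hypothetical names chosen for this example; the results are compared against `statistics.pvariance` and `statistics.variance` from the standard library.

```python
import statistics

def variance(xs, ddof=0):
    """Sum of squared deviations from the mean, divided by n - ddof.

    ddof=0 gives the divisor-n formula, ddof=1 the divisor-(n - 1) formula.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - ddof)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

var_n = variance(data)                  # divisor n
var_n_minus_1 = variance(data, ddof=1)  # divisor n - 1

print(var_n, statistics.pvariance(data))
print(var_n_minus_1, statistics.variance(data))
```

The divisor-(n - 1) value is always the larger of the two, and the gap shrinks as n grows.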

11
Q

What is the formula for sample skewness?

A

$\frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}{s_n^3}$

12
Q

What is the formula for sample kurtosis?

A

$\frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4}{s_n^4}$
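
The skewness and kurtosis formulas can be sketched as follows. This assumes that s_n denotes the standard deviation computed with divisor n; the function names and data are hypothetical.

```python
def central_moment(xs, k):
    """k-th central moment with divisor n."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** k for x in xs) / n

def skewness(xs):
    s_n = central_moment(xs, 2) ** 0.5  # standard deviation with divisor n (assumed)
    return central_moment(xs, 3) / s_n ** 3

def kurtosis(xs):
    s_n = central_moment(xs, 2) ** 0.5  # standard deviation with divisor n (assumed)
    return central_moment(xs, 4) / s_n ** 4

symmetric = [1, 2, 3, 4, 5]      # illustrative, symmetric about 3
right_skewed = [1, 1, 2, 2, 10]  # illustrative, one large value in the right tail

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_skewed), kurtosis(right_skewed))
```

A symmetric data set gives a skewness of zero, while a long right tail gives a positive value.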

13
Q

What is one common reason why correlation does not equal causation?

A

A third (confounding) variable, often time, can be causally related to both variables

14
Q

What is a sample survey?

A

A fraction of the total population that is observed in order to make statistical inferences about the population from which it is drawn (instead of using a census)

15
Q

Why are samples needed?

A

To estimate parameters for the probability distribution used in the analysis of events

16
Q

What is the effect of an unrepresentative or non-random sample?

A

Estimates based on it aren’t useful
Can’t make population generalisations

17
Q

What is random sampling and why is it not always done?

A

Create a list of all the possible units that could be sampled and then use a random number generator to select from the list
This isn’t always feasible (not always able to list every unit)
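
When a full list of units is available, the procedure above can be sketched with Python's `random` module. The sampling frame, its size, and the sample size are all made up for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the example is reproducible

# Hypothetical sampling frame: a complete list of every unit that could be sampled
sampling_frame = [f"unit_{i}" for i in range(1000)]

# Draw 50 distinct units, each subset of size 50 being equally likely
sample = rng.sample(sampling_frame, k=50)

print(sample[:5])
```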

18
Q

When is stratified sampling used?

A

When the population consists of identifiable sub-groups
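
One common variant is proportional stratified sampling, where the same fraction is drawn at random from each sub-group. The sketch below assumes that variant; the function name, the urban/rural strata, and the 10% fraction are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same fraction of units, at random, from every stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(fraction * len(members))  # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(0)  # fixed seed so the example is reproducible

# Hypothetical population of 600 urban and 400 rural units
population = [("urban", i) for i in range(600)] + [("rural", i) for i in range(400)]

sample = stratified_sample(population, lambda unit: unit[0], 0.1, rng)
urban_count = sum(1 for unit in sample if unit[0] == "urban")
rural_count = len(sample) - urban_count

print(urban_count, rural_count)
```

Each sub-group appears in the sample in the same proportion as in the population, which a simple random sample only achieves on average.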

19
Q

How can sampling uncertainty be reduced?

A

Take averages from random samples and use larger samples
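
A small simulation can illustrate the effect of sample size. The population here is made up (skewed, exponentially distributed values), and sampling is done with replacement for simplicity; both are assumptions of this sketch.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the example is reproducible

# Hypothetical skewed population of 10,000 values
population = [rng.expovariate(1.0) for _ in range(10_000)]

def spread_of_sample_means(n, replications=1000):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [
        statistics.fmean(rng.choices(population, k=n))  # sampling with replacement
        for _ in range(replications)
    ]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(160)

print(spread_small, spread_large)
```

The sample means from the larger samples cluster much more tightly around the population mean.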

20
Q

What is the tradeoff to sampling?

A

Larger samples reduce sampling uncertainty and improve precision but are more expensive

21
Q

What are some issues with the data used for microeconomic analysis?

A

Confidentiality; systematic non-response, which introduces bias; and attrition, which can affect cohort studies through survivorship bias

22
Q

What makes a sample random?

A

It is a collection of IID random variables

23
Q

What is an observation?

A

The known (realised) value taken by each random variable in the sample

24
Q

What is a statistic?

A

A function of the sample; it is itself a random variable and so has a sampling distribution

25
Q

What is the distribution of the sample mean of observations of IID normal random variables?

A

$\bar{X} \sim N(\mu, \sigma^2/n)$
Uses the property that a sum of independent normal random variables is itself normally distributed
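
This result can be checked numerically with a simulation. The parameter values (μ = 5, σ = 2, n = 25) and the number of replications are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the example is reproducible

mu, sigma, n = 5.0, 2.0, 25  # assumed parameters and sample size
replications = 5000

# Each entry is the mean of one sample of n IID N(mu, sigma^2) draws
sample_means = [
    statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
    for _ in range(replications)
]

empirical_mean = statistics.fmean(sample_means)
empirical_var = statistics.variance(sample_means)
theoretical_var = sigma ** 2 / n  # sigma^2 / n = 0.16

print(empirical_mean, empirical_var, theoretical_var)
```

The simulated sample means should centre on μ with a variance close to σ²/n.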