1. Introduction, Frequentist inference Flashcards

1
Q

What is descriptive/algorithmic statistics?

A
  • Characterising datasets
  • Dataset statistics:
    Mean, standard deviation, median,
    etc.
  • Reveals facts about in-sample
    distribution
  • “Is this what I expect to see?”
2
Q

What is inductive/inferential statistics?

A
  • Uses samples to draw conclusions
    about populations
  • Reveals probable facts about
    out-of-sample distribution
  • We have the sample (the data) and we want to know about the population.
3
Q

What usually motivates statistical inquiry? What categories are there?

A

A business question:
- Prediction: regression
- Decision problems: hypothesis testing
- Experimental design: not covered much in this class

4
Q

Statistical model contains:

A

An assumption about a distribution with certain parameters
Some kind of assumption about the function

Data and evaluation are outside the model

5
Q

Explain an experiment and an event:

A
  • An experiment is any process, real or hypothetical, in which the possible outcomes can
    be identified ahead of time.
  • An event is a well-defined set of possible outcomes of the experiment.

Throwing a die is an experiment; rolling a 1 would be an event.

6
Q

Sample space:

A

The sample space is the set of all possible outcomes. Casting one die gives a sample space of 6 different outcomes.
An event is a subset of the sample space.

7
Q

Random variable

A

A real-valued function X : S → ℝ defined on the sample space.
Example: the sum of two dice. Casting two dice that each show one eye gives X = 2
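The two-dice example can be made concrete with a small sketch. The names `sample_space`, `total` and `probability` are made up for illustration, and the sketch assumes fair dice, so that all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space S for casting two six-sided dice: 36 ordered outcomes.
sample_space = list(product(range(1, 7), repeat=2))

def total(outcome):
    """Random variable X: maps an outcome (a pair of faces) to their sum."""
    return outcome[0] + outcome[1]

def probability(event):
    """Probability of an event (a subset of S), assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# Event "the sum is 2": only the outcome (1, 1) belongs to it.
sum_is_two = [s for s in sample_space if total(s) == 2]

print(len(sample_space))        # 36
print(sum_is_two)               # [(1, 1)]
print(probability(sum_is_two))  # 1/36
```

The random variable is just a function on the outcomes; the event is the set of outcomes on which that function takes the value 2.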

8
Q

What do we want to know about random variables?

A

Full specification:
- What values can they assume, and how likely are they?
We are aiming to predict the distribution

Descriptive statistics
- What is their value typically?
Central tendency (where does the prob. tend to cluster)
- How certain are we about this typical value?
We are aiming to see how “spread out” the values are
Dispersion

Central tendency : what is it going to be
Dispersion : how certain are we that it is going to be that

9
Q

Central tendency: Expectation

A

The expectation or mean is the sum of all possible values of a random variable,
weighted by their probability.
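As a minimal sketch of this definition for a discrete random variable, the function below sums values weighted by their probabilities. The fair die and the name `expectation` are assumptions chosen for illustration.

```python
from fractions import Fraction

def expectation(pmf):
    """Sum of the possible values, each weighted by its probability."""
    return sum(value * prob for value, prob in pmf.items())

# Hypothetical example: a fair six-sided die.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(fair_die))  # 7/2
```

Note that 7/2 is not itself a possible outcome of the die: the expectation describes where the probability tends to cluster, not a value we will observe.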

10
Q

Dispersion: Variance and standard deviation

A
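For reference, the standard textbook definitions of variance and standard deviation, written in terms of the expectation from the previous card (the notation is an assumption, not taken from the lecture):

```latex
\operatorname{Var}(X) = \mathbb{E}\big[(X - \mathbb{E}[X])^2\big]
                      = \mathbb{E}[X^2] - (\mathbb{E}[X])^2,
\qquad
\operatorname{sd}(X) = \sqrt{\operatorname{Var}(X)}
```

The standard deviation is in the same units as X, which is why it is usually the quantity reported.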
11
Q

How many parameters do we need for a categorical choice between 10 alternatives?

A

Each choice needs a probability parameter, so 10 (said in class)

Christian said: you can get away with 9, because the probabilities must sum to 1, so the last one is determined by the other nine

11
Q

Parameters

A

A characteristic or combination of characteristics that determines the joint
distribution for the random variables of interest. For example, for the normal distribution the mean is set at …

12
Q

Are parameters random variables?

A

Frequentist would say no
Bayesian would say yes

13
Q

What questions should we ask when making a statistical model?

A

Scope: What do we want to model? Identify the random variables of interest
Structure: How does it hang together? Specify the joint distribution of the random variables
Parameters: How can we fit it? Identify the parameters of the distribution that are assumed unknown
Optional (Bayesian): Specify a joint distribution for the unknown parameters

14
Q

What does the frequentist approach to statistics assume?

A
  • experiments are infinitely repeatable, and
  • the underlying parameters remain constant.
15
Q

In frequentism, probabilities can then be estimated as:

A

relative frequencies in a long run of experiments.
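A small simulation can illustrate the idea, assuming a coin whose true probability of heads is known to the simulator (here 0.5). The function name `relative_frequency` and the seed are made up for this sketch.

```python
import random

def relative_frequency(n_trials, p_heads=0.5, seed=0):
    """Estimate P(heads) as the share of heads in n_trials simulated coin flips."""
    rng = random.Random(seed)
    heads = sum(rng.random() < p_heads for _ in range(n_trials))
    return heads / n_trials

# The longer the run of experiments, the closer the relative frequency
# tends to be to the underlying probability.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```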

16
Q

A typical quantity of interest is the standard error of an estimate. What is it?

A

This is the standard deviation of the sampling distribution of a statistic.
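The following sketch approximates a sampling distribution by brute force: it draws many samples, computes the sample mean of each, and takes the standard deviation of those means. Normal data, the sample size and the function name are assumptions made for illustration.

```python
import math
import random
import statistics

def simulated_standard_error(n, sigma=2.0, repetitions=5_000, seed=1):
    """Standard deviation of the sample mean across many repeated samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)

n, sigma = 25, 2.0
print(simulated_standard_error(n, sigma))  # simulation result
print(sigma / math.sqrt(n))                # known formula for the sample mean
```

For the sample mean the two numbers should be close, since its standard error is sigma divided by the square root of n.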

17
Q

If we sample over and over again, we get the sampling distribution of the statistic, and the standard deviation of the sampling distribution tells us how far our estimate typically is from the true value.

A
18
Q

What is theta?

A

A constant: a number we don't know, but want to know

  • A number that we can't observe directly, but want to know

Constant

19
Q

What is theta hat

A

Some sort of sample statistic (a quantity that we have computed; any value we can compute from the data) that we use as an estimate of theta

  • This is a number we can know, and hopefully it is close to theta

Constant

20
Q

What is capital theta hat?

A

The estimator, which is a random variable

A function of the random sample that gives us theta hat when applied to the observed data. Because the sample is random, the estimator capital theta hat is itself a random variable

21
Q
A

In Bayesian statistics, the parameter is itself a random variable

22
Q

What does bias tell us? Write it out

A

Bias: How far off is the estimator on average?

We look at the estimator's behaviour over all possible samples, not at a single estimate
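Written out with the notation of the earlier cards (theta for the unknown constant, capital theta hat for the estimator), the standard definition of bias is:

```latex
\operatorname{Bias}(\hat{\Theta}) = \mathbb{E}[\hat{\Theta}] - \theta
```

An estimator is called unbiased when this quantity is zero.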

23
Q

What does variance tell us? Write it out

A

Variance: What is the spread of our estimates?

This is more in terms of the individual datasets: how much the estimate changes from one sample to the next
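Bias and variance can both be approximated by simulation when the true parameter is known. The sketch below is an illustration under assumed settings (normal data, sample size 10, a made-up helper `simulate_estimator`): it compares two estimators of the population variance, one dividing by n and one dividing by n - 1.

```python
import random
import statistics

def simulate_estimator(estimator, n=10, sigma=1.0, repetitions=10_000, seed=2):
    """Apply an estimator to many samples; return (bias, variance) of its estimates."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(repetitions)
    ]
    true_value = sigma ** 2  # the parameter theta being estimated
    bias = statistics.fmean(estimates) - true_value  # how far off on average
    variance = statistics.pvariance(estimates)       # spread of the estimates
    return bias, variance

# Two estimators of the population variance:
# pvariance divides by n (biased), variance divides by n - 1 (unbiased).
print("divide by n:    ", simulate_estimator(statistics.pvariance))
print("divide by n - 1:", simulate_estimator(statistics.variance))
```

In this setting the divide-by-n estimator is off on average but has the smaller spread, which shows that bias and variance are separate properties of an estimator.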

24
Q

Explain frequentism in practice

A

The probabilistic properties of a procedure of interest are derived and then applied
verbatim to the procedure’s output for the observed data.

25
Q

What is the problem with frequentism?

A

In deriving the inference procedures, we assume we know the data distribution F, but in reality we don’t.

26
Q

What is the Plug-in principle? (not required to know in detail)

A

The frequentist accuracy estimate is itself estimated from the observed data. For example, we use the information we already have from the observed coin flips to calculate the standard error of the coin's estimated probability of heads.
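For the coin, a minimal sketch of the plug-in principle could look as follows. The data and the function name `plug_in_standard_error` are hypothetical; the formula sqrt(p(1 - p)/n) is the standard error of a sample proportion.

```python
import math

def plug_in_standard_error(flips):
    """Plug-in standard error of the estimated heads probability.

    The true standard error is sqrt(p * (1 - p) / n) with p unknown;
    the plug-in principle replaces p by its estimate p_hat.
    """
    n = len(flips)
    p_hat = sum(flips) / n
    return p_hat, math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical data: 1 = heads, 0 = tails (14 heads in 20 flips).
flips = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1]
p_hat, se = plug_in_standard_error(flips)
print(p_hat, se)
```

The same data are used twice: once to estimate the parameter and once to estimate how accurate that estimate is.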

27
Q

What are the delta method and Taylor-series approximations?

A

If you have some results about your parameters (for example a standard error) and you apply a function to them, what is the distribution of the resulting random variable going to be like?

The delta method approximates the transformation by a straight line, whose slope is the derivative (a first-order Taylor-series approximation). This tells you that if the initial value has a certain spread, then where the transformation is very steep the transformed value will have a correspondingly larger spread.

TAKE AWAY: This is a type of reasoning the frequentists would apply.
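A sketch of the reasoning, under assumed settings: take the share of heads in n coin flips, apply g(p) = log(p), and compare the straight-line approximation of the standard error with a simulation. The helper name `delta_method_se` and all numbers are made up for illustration.

```python
import math
import random
import statistics

def delta_method_se(derivative_at_estimate, se_estimate):
    """First-order (straight-line) approximation: se(g(X)) ~ |g'(x)| * se(X)."""
    return abs(derivative_at_estimate) * se_estimate

# Hypothetical setting: p_hat is the share of heads in n flips, g(p) = log(p).
p, n = 0.3, 400
se_p = math.sqrt(p * (1 - p) / n)
approx = delta_method_se(1 / p, se_p)  # g'(p) = 1 / p

# Check against a simulation of log(p_hat).
rng = random.Random(3)
logs = []
for _ in range(2_000):
    heads = sum(rng.random() < p for _ in range(n))
    logs.append(math.log(heads / n))
simulated = statistics.stdev(logs)

print(approx, simulated)
```

The steeper g is at the estimate (here, the smaller p is), the more the spread is magnified by the transformation.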

28
Q

What is a pivotal statistic?

A

A pivotal statistic is one whose distribution does not depend on the underlying distribution F. For example, the t-statistic is normalised by the sample standard deviation, so its distribution does not depend on sigma. This removes the problem of having to know the underlying distribution.
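The t-statistic example can be checked numerically. The sketch below assumes normally distributed data and uses made-up helper names. It simulates the t-statistic for two very different values of sigma and compares an upper quantile of the two simulated distributions.

```python
import math
import random
import statistics

def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)); dividing by s removes the scale sigma."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(n))

def simulated_t_quantile(sigma, seed, q=0.95, n=10, repetitions=10_000):
    """Empirical q-quantile of the t-statistic for normal data with the given sigma."""
    rng = random.Random(seed)
    stats = sorted(
        t_statistic([rng.gauss(0.0, sigma) for _ in range(n)], mu=0.0)
        for _ in range(repetitions)
    )
    return stats[int(q * repetitions)]

# Very different sigmas, independent simulations, similar quantiles.
print(simulated_t_quantile(sigma=1.0, seed=4))
print(simulated_t_quantile(sigma=50.0, seed=7))
```

Both values should land near the 95% quantile of a t-distribution with 9 degrees of freedom (about 1.83), whatever sigma is.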

28
Q

What steps do we take when we do simulation and the bootstrap

A
  • Modern computer power allows us to simulate the “infinite” sequence of
    experiments numerically. (This was a bold assumption back in the day, when it was done something like 5 times by hand.)
  • Create B new bootstrap samples from your existing data sample by resampling
    with replacement.
  • Run the inference procedure for each of the bootstrap samples.
  • Draw conclusions from the empirical distribution of the estimates.
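The steps above can be sketched in a few lines. This is a minimal illustration with a hypothetical data sample and a made-up function name, not a full bootstrap implementation (there are no confidence intervals or bias corrections).

```python
import random
import statistics

def bootstrap_standard_error(data, statistic, n_boot=2_000, seed=5):
    """Bootstrap estimate of the standard error of `statistic` on `data`."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(data, k=len(data))
        # Run the inference procedure on the bootstrap sample.
        estimates.append(statistic(resample))
    # Summarise the empirical distribution of the estimates.
    return statistics.stdev(estimates)

# Hypothetical data sample.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
print(bootstrap_standard_error(data, statistics.median))
print(bootstrap_standard_error(data, statistics.fmean))
```

The same function works for any statistic that can be computed from a sample, including ones like the median for which no simple standard-error formula is available.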
29
Q

What is the problem with these methods? (plug-in, Taylor series, bootstrap, pivotal statistics)

A

There are some cases where these methods are just not applicable.

30
Q

Frequentist theory shows:

A

That certain procedures are asymptotically optimal
under certain assumptions.
- In parametric probability models, the maximum-likelihood estimate
is the optimum estimate in terms of asymptotically minimum standard error.
- Neyman-Pearson lemma:
The likelihood ratio test is uniformly most powerful (has lowest type-II error)
among hypothesis tests with a given type-I error rate.

This is nice because you get the guidelines on what to do under these assumptions.