Inferential Statistics Flashcards

1
Q

Normal distribution:

A
  • Mean = median = mode
  • Changing the mean shifts the graph to the left or right
  • A lower standard deviation gives a taller peak and thinner tails
2
Q

Standard normal distribution:

A

Also known as the Z-distribution

Mean = 0
Standard deviation = 1

3
Q

To standardize any normal distribution:

A

Z = (x - mean) / standard deviation
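
The formula can be sketched in a few lines of Python. The function name `z_score` and the example numbers (mean 100, standard deviation 15) are assumptions chosen for illustration, not part of the card.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Standardize x: how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

# Hypothetical example: a value of 130 from a normal distribution
# with mean 100 and standard deviation 15.
z = z_score(130, 100, 15)
print(z)  # 2.0
```

A Z of 2.0 means the value lies two standard deviations above the mean.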

4
Q

Central limit theorem:

A

No matter the underlying distribution, the sampling distribution of the sample mean approximates a normal distribution once the sample is large enough. As a rule of thumb, we need at least 30 observations for the CLT to apply.
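
A small simulation can make this concrete. The sketch below assumes a Uniform(0, 1) population, samples of size 30, and 2000 repetitions; all of these choices, and the helper name `sample_means`, are illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_means(n: int, repetitions: int) -> list:
    """Draw `repetitions` samples of size n from Uniform(0, 1); return their means."""
    return [
        statistics.fmean(random.random() for _ in range(n))
        for _ in range(repetitions)
    ]

means = sample_means(n=30, repetitions=2000)

# The sample means cluster around the population mean 0.5 ...
print(statistics.fmean(means))
# ... with a spread close to sqrt(1/12) / sqrt(30), roughly 0.053.
print(statistics.stdev(means))
```

Although the population is flat rather than bell-shaped, a histogram of `means` would look approximately normal.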

5
Q

Standard error:

A

Measures how accurately a sample represents the population, based on the standard deviation. (It is used because there is uncertainty associated with the sample.)

The standard error has an exact value that depends on the proportion in the population. In practice we do not know p (the population proportion), so we instead estimate the standard error from the estimated sample proportion p̂.
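
As a minimal sketch, the estimated standard error of a proportion can be computed with the usual textbook formula sqrt(p̂(1 - p̂)/n). The card does not state this formula; it and the example sample (40 successes out of 100) are assumptions added here.

```python
import math

def standard_error_proportion(p_hat: float, n: int) -> float:
    """Estimated standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical sample: 40 successes out of 100 observations.
p_hat = 40 / 100
se = standard_error_proportion(p_hat, 100)
print(round(se, 4))  # 0.049
```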

6
Q

Alpha and confidence level:

A

Confidence level = 1 - alpha
For a confidence level of 95%, alpha is 5%. This means there is a 2.5% (= alpha/2) chance that the mean is below the lower bound and a 2.5% chance that it is above the upper bound.
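
The link between the confidence level, alpha, and the two-sided critical Z value can be sketched with Python's standard library. The function name `two_sided_critical_z` is an assumption made for this example.

```python
from statistics import NormalDist

def two_sided_critical_z(confidence_level: float) -> float:
    """Critical Z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(two_sided_critical_z(0.95), 2))  # 1.96
```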

7
Q

T-statistic:

A
  • Works for smaller sample sizes and when the population’s standard deviation is unknown (so we use the sample standard deviation, hence the fatter tails, i.e. more uncertainty).
  • Follows a Student’s t-distribution: it looks like a normal distribution but has fatter tails.
  • Has degrees of freedom. Usually we use n-1 degrees of freedom. Ex.: for 20 observations, we use 19 degrees of freedom. (However, if we have two variables: n-2, etc.)
  • The degrees of freedom and alpha determine the t-score.
  • After 30 degrees of freedom, the t-table is almost identical to the Z-table.
8
Q

Margin of error:

A
  • Margin of error = critical value (Z or t) × standard deviation / √n. The confidence interval for the mean is the mean ± the margin of error, so a smaller margin of error gives a narrower CI.
  • To get a smaller margin of error: reduce the Z or t statistic, reduce the standard deviation, or increase the sample size.
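
A minimal sketch of a confidence interval for the mean, assuming the standard form margin of error = critical value × standard deviation / √n with a Z critical value. The function names and the example numbers (mean 100, standard deviation 15, n = 36) are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(critical_value: float, std_dev: float, n: int) -> float:
    """Margin of error = critical value * standard deviation / sqrt(n)."""
    return critical_value * std_dev / math.sqrt(n)

def confidence_interval(mean: float, critical_value: float, std_dev: float, n: int):
    """Confidence interval for the mean: mean +/- margin of error."""
    moe = margin_of_error(critical_value, std_dev, n)
    return mean - moe, mean + moe

z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
low, high = confidence_interval(mean=100, critical_value=z, std_dev=15, n=36)
print(round(low, 2), round(high, 2))  # 95.1 104.9
```

Quadrupling the sample size to 144 halves the margin of error, which shows why a larger sample narrows the interval.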
9
Q

When do we accept and when do we reject, based on the Z-score?

A

If the calculated Z-score is between -1.96 and 1.96, we accept (fail to reject) the null hypothesis (for a significance level of 0.05 and a two-sided test). If the Z-score is in the rejection region, we reject.

For a one-sided test with the rejection region on the left-hand side of the Z-distribution (for example, null hypothesis: mean > 300), the Z-score has to be lower than about -1.65 before we can reject the null hypothesis.

When testing a hypothesis, you compare the Z-score (the value calculated from the sample) with the critical value z.
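
The decision rule can be sketched as follows. The sketch assumes a known population standard deviation and a significance level of 0.05; the function names and the sample values (sample mean 295, n = 64, standard deviation 20) are made up for illustration.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean: float, null_mean: float, std_dev: float, n: int) -> float:
    """Z-score of the sample mean under the null hypothesis."""
    return (sample_mean - null_mean) / (std_dev / math.sqrt(n))

def reject_two_sided(z: float, alpha: float = 0.05) -> bool:
    """Two-sided test: reject when |z| exceeds the critical value (about 1.96)."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def reject_left_tailed(z: float, alpha: float = 0.05) -> bool:
    """Left-tailed test: reject when z is below the critical value (about -1.645)."""
    critical = NormalDist().inv_cdf(alpha)
    return z < critical

z = z_statistic(sample_mean=295, null_mean=300, std_dev=20, n=64)
print(z)                      # -2.0
print(reject_two_sided(z))    # True
print(reject_left_tailed(z))  # True
```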

10
Q

P-value:

A

The smallest level of significance at which we can still reject the null hypothesis.

Reject the null hypothesis if p < alpha (the significance level).
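
As a sketch, a two-sided p-value for a Z-score can be computed from the standard normal distribution and compared with a chosen significance level. The function name `two_sided_p_value` and the example Z-score of -2.0 are assumptions for illustration.

```python
from statistics import NormalDist

def two_sided_p_value(z: float) -> float:
    """Probability of a Z-score at least as extreme as z in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(-2.0)
print(round(p, 4))  # 0.0455

alpha = 0.05  # assumed significance level
print(p < alpha)  # True
```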

11
Q

The difference between the formulas for the Z- and t-statistic

A

The only difference in the formula for CI for Z-statistics and t-statistics is that we use t-score and sample standard deviation instead of Z-score and population standard deviation.

12
Q

Which tests exist and what are they used for?

A

All explanatory variables are categorical.

Z-test: two proportions; response variable: categorical

T-test: two means; response variable: quantitative

Chi^2-test: two or more proportions; response variable: categorical

ANOVA test: two or more means; response variable: quantitative

13
Q

What are type 1 and type 2 errors?

A

Type 1 error: a true null hypothesis is rejected
α equals the risk of committing a type 1 error
P(Type 1 error) = α

Type 2 error: a false null hypothesis is accepted
The risk of committing a type 2 error is called beta
P(Type 2 error) = beta

Reducing the risk of one type of error increases the risk of the other. The only way to lower both risks is to bring in more information, i.e. a larger sample.

14
Q

The 5 steps of a hypothesis test

A

  • Assumptions
  • Hypotheses: null hypothesis (H0) and alternative hypothesis (Ha)
  • Test statistic
    • t-statistic (t-score)
    • z-statistic (z-score)
  • P-value
    • The probability, assuming the null hypothesis is true, of a test statistic at least as extreme as the one observed
  • Conclusion
    • The hypotheses and the P-value are summarized in a conclusion that can be communicated to non-statisticians

15
Q

Power of the hypothesis test

A

Power = 1 - P(Type 2 error) = the probability of rejecting the null hypothesis when it is false

The higher the power, the better the test.
In practice it is ideal if the hypothesis test has high power and a relatively small α (significance level).
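
Power can be illustrated for one specific case. The sketch below assumes a left-tailed Z-test with a known standard deviation; the function name and the numbers (null mean 300, true mean 295, standard deviation 20, n = 64) are hypothetical.

```python
import math
from statistics import NormalDist

def power_left_tailed_z(null_mean: float, true_mean: float, std_dev: float,
                        n: int, alpha: float = 0.05) -> float:
    """Power of a left-tailed z-test: P(reject H0) when the true mean is true_mean."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(alpha)  # reject H0 when z < critical
    shift = (true_mean - null_mean) / (std_dev / math.sqrt(n))
    # Under the true mean, the z-statistic is normal with mean `shift` and sd 1.
    return std_normal.cdf(critical - shift)

power = power_left_tailed_z(null_mean=300, true_mean=295, std_dev=20, n=64)
beta = 1 - power  # probability of a type 2 error
print(round(power, 2))  # about 0.64
```

Raising n to 256 in the same setup pushes the power close to 1, which matches the point that a larger sample is the way to reduce both error risks.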

16
Q

Assumptions:

A

Z-test:
Observations are independent.
At least 15 successes and 15 failures.

t-test:
Observations are independent, randomized, and approximately normally distributed.
N must be greater than 30.

Chi^2-test:
Observations are independent.
The expected number of observations in each cell of the contingency table is greater than 5.

ANOVA test:
Normally distributed within each group.
Equal standard deviations.

17
Q

F-test:

A

ANOVA

F = variation between groups / variation within groups

Follows the F-distribution.

The ratio of numerator to denominator in the test statistic shows that the larger the variation between groups is relative to the variation within groups, the larger the value of F.

The larger the value of the F test statistic, the stronger the evidence against H0.
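
The ratio can be sketched in Python. The sketch assumes the usual one-way ANOVA definitions (between-group and within-group mean squares, with k - 1 and N - k degrees of freedom), which the card does not spell out. The function name and the toy data are made up.

```python
from statistics import fmean

def f_statistic(groups: list) -> float:
    """One-way ANOVA F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    # Variation between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation within groups: how far each observation is from its group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

far_apart = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
close_together = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(f_statistic(far_apart))       # approximately 13
print(f_statistic(close_together))  # approximately 3
```

Both data sets have the same spread within each group, so the larger F for `far_apart` comes entirely from the larger differences between the group means.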