M8 - Flashcards
What's correlation?
The association between two quantitative variables.
perfect positive correlation = 1
no correlation = 0
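A minimal sketch of how the two reference values above arise, using the Pearson coefficient as one common measure of correlation (the notes do not name a specific coefficient, so this choice is an assumption). The data and the names `hours`, `score_up` and `score_flat` are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]            # hypothetical data
score_up = [10, 20, 30, 40, 50]    # rises exactly linearly with hours
score_flat = [2, 1, 4, 1, 2]       # no linear trend with hours

r_up = pearson_r(hours, score_up)      # perfect positive correlation
r_flat = pearson_r(hours, score_flat)  # no (linear) correlation
```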
Central Limit Theorem
- def
- implication
- The distribution of the mean values of samples of size N drawn from the population converges towards a normal distribution as N increases
- For samples of size above N = 30, probabilities for estimates of the mean can be quantified (using the normal distribution)
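The theorem can be illustrated with a small simulation. This is only a sketch under assumptions: the population is taken to be uniform on [0, 1] (clearly not normal), and the number of samples and the seed are arbitrary choices, not values from the notes.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the sketch is reproducible

N = 30              # size of each sample
NUM_SAMPLES = 2000  # number of samples drawn (arbitrary choice)

# Population: uniform on [0, 1]; its mean is 0.5 and its SD is sqrt(1/12)
POP_MEAN = 0.5
POP_SD = sqrt(1 / 12)

# Draw many samples of size N and record the mean of each one
sample_means = [
    statistics.mean([random.random() for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = POP_SD / sqrt(N)  # standard error predicted by theory

# Share of sample means that lie within one standard error of the population mean
share_within_1se = sum(
    abs(m - POP_MEAN) <= expected_se for m in sample_means
) / NUM_SAMPLES
```

If the sample means are roughly normally distributed, `sd_of_means` should be close to `expected_se` and `share_within_1se` should come out near 0.68.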
Estimates of the mean - sigma
- with a probability of 68%, the mean of a random sample is in the range
μ ± σ
(μ = mean value in the population, σ = standard error of the sample mean)
- with a probability of 95.5%, the mean of the sample is in the range
μ ± 2σ
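The two percentages can be checked against the standard normal distribution. A short sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `prob_within` is made up here.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a normal variable lies within k standard errors of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

p_one_sigma = prob_within(1)  # roughly 0.68
p_two_sigma = prob_within(2)  # roughly 0.955
```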
Standardization
- when is it standardized?
- a random variable is called standardized when its expected value is 0 and its variance is 1.
- define a new random variable Sn = x1 + x2 + ... + xn
- expected value of Sn: n·μ; variance: n·s² (μ = expected value and s² = variance of each xi)
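Putting the points above together, the sum Sn can be standardized by subtracting its expected value and dividing by its standard deviation. A minimal sketch, assuming the xi are independent and share the same mean and variance; the function name and the numbers are hypothetical.

```python
from math import sqrt

def standardize_sum(s_n, n, mu, s):
    """Standardize a sum S_n of n independent draws, each with mean mu and SD s.

    E[S_n] = n * mu and Var(S_n) = n * s**2, so the result has
    expected value 0 and variance 1.
    """
    return (s_n - n * mu) / (s * sqrt(n))

# Hypothetical numbers: n = 25 draws, each with mean 10 and SD 2,
# so the sum has expected value 250 and SD 2 * sqrt(25) = 10
z_at_expectation = standardize_sum(250, n=25, mu=10, s=2)  # sum equal to its expected value
z_one_sd_above = standardize_sum(260, n=25, mu=10, s=2)    # sum one SD above it
```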
Hypotheses
- def
- criteria for scientific hypotheses
- are assumptions about structural properties among phenomena
they point beyond an individual situation and can be disproved by empirical data
- general validity
- falsifiable
- phrased as conditional clause / equality / inequality
general validity
is the extent to which a concept, conclusion or measurement is well-founded and corresponds accurately to the real world.
universal proposition / generalizable
falsifiable
it must be possible to imagine events that are in conflict with conditional clauses
Null hypothesis
Alternative hypothesis
H0 contains the statement to be disproved
–> typically claims “there is no difference / effect” –> a statement that can be falsified by finding a significant difference
H1 claims the opposite
–> typically “there is a difference/effect” –> cannot be falsified, since failing to find a difference may simply mean that the difference was too small to detect
z-test
- def
- distribution of z
- what if p-value is less than the given significance level
compare sample mean with population mean
z-test: tells us whether the b coefficient for that IV (independent variable) is significantly different from zero
- -> if b is significantly different from zero, then this IV is making a significant contribution to the prediction of the outcome.
- normally distributed
- for N > 30
- KNOWN SD
- z is normally distributed
- if the p-value is less than the given significance level, H0 must be rejected
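The one-sample case (sample mean against population mean, known SD) can be sketched as follows. The formula z = (sample mean − population mean) / (SD / √N) is the usual one; the function name, the two-sided p-value, the significance level of 0.05 and all numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """One-sample z-test with known population SD; returns (z, two-sided p-value)."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # z is (approximately) standard normal under H0, so p comes from the normal CDF
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

ALPHA = 0.05  # chosen significance level (assumption)

# Hypothetical example: population mean 100, known SD 15, sample of N = 36
z_small, p_small = z_test(sample_mean=103, pop_mean=100, pop_sd=15, n=36)
z_large, p_large = z_test(sample_mean=106, pop_mean=100, pop_sd=15, n=36)

reject_small = p_small < ALPHA  # p above alpha: H0 is not rejected
reject_large = p_large < ALPHA  # p below alpha: H0 is rejected
```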
one-tailed hypothesis
two-tailed hypothesis
- two-tailed: H1: avg ≠ 10; it is either smaller or bigger, both directions are possible
- one-tailed: H1: avg > 20; it is bigger but not smaller
difference one and two-tailed tests
- p-value
z-value
ONE (one-sample test): compares the sample mean with the population mean
TWO (two-sample test): compares 2 (in)dependent samples
- with equal z-values, the p-value of the one-sided test is half as large as that of the two-sided test
- with equal significance levels, the critical z-value of the one-sided test is smaller than that of the two-sided test
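Both relationships can be checked numerically with the standard normal distribution. A small sketch; the observed z-value of 1.8 and the significance level of 0.05 are arbitrary example values.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Same z-value: the one-sided p-value is half the two-sided one
z = 1.8  # hypothetical observed z-value
p_one_sided = 1 - std_normal.cdf(z)
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))

# Same significance level: the one-sided critical z is smaller
ALPHA = 0.05
z_crit_one_sided = std_normal.inv_cdf(1 - ALPHA)      # about 1.645
z_crit_two_sided = std_normal.inv_cdf(1 - ALPHA / 2)  # about 1.960
```

With these example values the observed z lies between the two critical values, so it is significant in the one-sided test but not in the two-sided test.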
t-test
- purpose?
- p-value
ONE: compares sample mean with population mean
TWO: compares (in)dependent samples
- for N < 30
- variables normally distributed
- approximately equal variances
- UNKNOWN SD!
- the p-value states the smallest significance level at which we can reject the null hypothesis
t-test: in linear regression, shows the contribution of individual predictors to how well the model fits the data
–> linear regression
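For the ONE case (sample mean against population mean, SD unknown), the t statistic replaces the known SD with the sample SD. A minimal sketch with made-up data; the Python standard library has no t distribution, so the decision is made against a critical value taken from a standard t table (2.262 for 9 degrees of freedom, two-tailed, significance level 0.05) instead of computing a p-value.

```python
from math import sqrt
import statistics

def one_sample_t(data, pop_mean):
    """One-sample t statistic and degrees of freedom (SD estimated from the sample)."""
    n = len(data)
    sample_sd = statistics.stdev(data)  # uses n - 1 in the denominator
    standard_error = sample_sd / sqrt(n)
    t = (statistics.mean(data) - pop_mean) / standard_error
    return t, n - 1

# Hypothetical small sample (N = 10 < 30), tested against a population mean of 5.0
data = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.2, 5.5, 5.9]
t, df = one_sample_t(data, pop_mean=5.0)

T_CRIT = 2.262  # two-tailed critical t for df = 9 at alpha = 0.05 (from a t table)
reject_h0 = abs(t) > T_CRIT
```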