02_Basic statistic characteristics Flashcards

1
Q

What are descriptive statistics, inductive statistics and hypothesis testing?

A

Descriptive: distinguishing the different scale levels, understanding the analysis constraints each one imposes, and calculating the corresponding measures

Inductive: the concept of sampling error, fundamental characteristics of theoretical distributions, estimating and testing

Hypothesis testing: univariate and bivariate parametric and non-parametric statistical tests

2
Q

What are the two things needed to analyze a relationship?

A

Operationalization: the development of scales for measuring the characteristic values of a particular concept/variable

Scale of measurement: defines the mathematical characteristics of a scale and thereby of the data to be gathered

3
Q

What are the 4 measurement scales?

A
  1. Nominal scale: Assignment of objects to categories (Sex: male, female)
  2. Ordinal scale: Ranking (counting and ordering)
  3. Interval scale: constant units –>inferences about distances possible (no natural zero point)
  4. Ratio scale: constant units and a fixed (natural) zero point –>ratios and multiplication are meaningful
4
Q

What is the Likert scale?

A

Likert scale: ordinal scale, usually with 5 to 7 scale points
(from "fully disagree" to "fully agree")

5
Q

What is a quasi-metric ordinal scale?

A

Ordinal scale with the assumption of equal distances between scale points –>treated just like an interval scale (typically 5 to 7 scale points), so that measures such as mean and variance are meaningful

6
Q

What is a percentile?

A

Percentiles are generalizations of the median: observations are arranged according to their size, and a percentile divides them into two groups.

–>The pth percentile: the value such that p percent of the observations fall at or below it
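The "at or below" definition above can be sketched in code. This is a minimal illustration using the nearest-rank convention, which is only one of several percentile conventions (others interpolate between observations). The function name `percentile_nearest_rank` and the `scores` data are hypothetical.

```python
import math


def percentile_nearest_rank(values, p):
    """Smallest observation with at least p percent of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)                    # arrange observations by size
    rank = math.ceil(p * len(ordered) / 100)    # 1-based position in the ordered data
    return ordered[rank - 1]


# Hypothetical data, made up for illustration only
scores = [12, 15, 11, 20, 18, 14, 16, 19, 13, 17]
print(percentile_nearest_rank(scores, 50))  # nearest-rank "median"
print(percentile_nearest_rank(scores, 90))
```

With an even number of observations, the nearest-rank 50th percentile returns an actual data point rather than the midpoint of the two middle values, so it can differ slightly from the usual median.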

7
Q

What is the mode?

A

= the value that most frequently occurs in a data set

8
Q

Interpretation of standard deviation

A

Standard deviation measures the amount of variation around the mean

low value: data points are close to the mean, the mean is informative
high value: data points are further away from the mean, the mean is less informative

9
Q

What is the intuition behind the "coefficient of variation"?

A

Coefficient of variation (CV = standard deviation / mean): a measure that expresses the variability of a set of data points relative to their mean (average)

–>independent of the scale of the data, thus makes comparison between two variables on different scales possible

When comparing CVs, a smaller value implies greater consistency relative to the mean, while a larger value implies greater variability
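As a small sketch of why the CV allows comparisons across scales, the snippet below computes it for two made-up variables. The function name `coefficient_of_variation` and both data sets are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since the card does not say which version is meant.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (undefined for a zero mean)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical data: same standard deviation (10), different means
heights_cm = [160, 170, 180]
weights_kg = [60, 70, 80]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"CV height: {cv_height:.3f}, CV weight: {cv_weight:.3f}")

# Changing the unit (cm to m) leaves the CV unchanged
heights_m = [h / 100 for h in heights_cm]
print(f"CV height in metres: {coefficient_of_variation(heights_m):.3f}")
```

Both variables have the same standard deviation, but the weights vary more relative to their mean, which the CV makes visible.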

10
Q

What is the meaning of Skewness?

A

Measure of the symmetry of a distribution

Skewness = 0 –>symmetric
Skewness < 0 –>left-skewed
Skewness > 0 –>right-skewed
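The sign rule can be checked with a short sketch. It assumes the moment-based skewness coefficient (third central moment divided by the 1.5th power of the second central moment); statistical software often applies a small-sample correction on top of this. The function name `skewness` and the three data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population (1/n) moments."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical data sets
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 10]   # long tail to the right
left_skewed = [1, 9, 9, 10, 10]   # long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name}: {skewness(data):+.3f}")
```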

11
Q

What is the sampling error?
(Why do we have one)

A

By taking samples from a population, we introduce uncertainty, because many different samples are possible and each yields a slightly different estimate

Standard error: quantifies the sampling error as the standard deviation of a statistic (e.g. the mean of a variable) across repeated samples of size n

Standard error = 0.17 –>If several samples of size n were drawn, the standard deviation of their means of variable x would be 0.17
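The idea of "drawing several samples" can be illustrated with a small simulation. This is a sketch under assumptions: the population is standard normal, the sample size is 25, and the standard error of the mean is estimated as s / sqrt(n). All names and numbers are hypothetical and unrelated to the 0.17 on the card.

```python
import math
import random
import statistics


def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 25                   # size of each sample (assumption)
population_sd = 1.0      # known here only because we simulate the population

# Draw many samples and record the mean of each one
sample_means = []
for _ in range(2000):
    sample = [rng.gauss(0, population_sd) for _ in range(n)]
    sample_means.append(statistics.mean(sample))

spread_of_means = statistics.stdev(sample_means)
theoretical_se = population_sd / math.sqrt(n)
print(f"SD of the sample means: {spread_of_means:.3f}")
print(f"sigma / sqrt(n):        {theoretical_se:.3f}")
```

The two printed values should be close: the standard error is the standard deviation of the sample means, not of the individual observations.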

12
Q

Confidence interval interpretation
CI (95%) = (1.6, 2.26)

A

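Confidence interval: if sampling were repeated many times, a share of (1-alpha) of the intervals constructed this way would contain the true parameter

Ex: with 95% confidence, the average CS lies between 1.6 and 2.26; 95% of the intervals built this way from repeated samples would cover the true average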
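A sketch of how such an interval can be computed, assuming a normal approximation (mean plus/minus z times the standard error). The inputs are assumptions, not values stated on the card: a sample mean of 1.93 (the midpoint of the interval) and a standard error of 0.17 were chosen because they reproduce (1.6, 2.26) after rounding. The function name is hypothetical.

```python
from statistics import NormalDist


def normal_confidence_interval(mean, standard_error, confidence=0.95):
    """Two-sided CI under a normal approximation: mean +/- z * SE."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    margin = z * standard_error
    return mean - margin, mean + margin


# Assumed inputs (not given on the card)
lower, upper = normal_confidence_interval(mean=1.93, standard_error=0.17)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

For small samples, a t-distribution quantile would normally replace z; the standard library has no t-distribution, so the normal quantile is used here for simplicity.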

13
Q

What is the margin of error?

A

Margin of error = 1/2 of the width of the confidence interval
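As a quick arithmetic check, the margin of error can be read off the endpoints of a symmetric interval. The sketch below uses the interval (1.6, 2.26) from the confidence interval card; the function names are hypothetical.

```python
def margin_of_error(lower, upper):
    """Half the width of a symmetric confidence interval."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    return (upper - lower) / 2


def point_estimate(lower, upper):
    """Midpoint of a symmetric confidence interval."""
    return (lower + upper) / 2


moe = margin_of_error(1.6, 2.26)
center = point_estimate(1.6, 2.26)
print(f"estimate {center:.2f} +/- {moe:.2f}")
```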

14
Q

What does the confidence interval depend on?

A
  1. Confidence level (1-alpha): higher confidence level (smaller alpha) –>wider CI; larger alpha –>narrower CI
  2. Sample size:
    - Larger –>lower standard error –>narrower CI
    - Smaller –>higher standard error –>wider CI
  3. Standard deviation: larger –>higher standard error –>wider CI
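How the three drivers act on the interval can be sketched with the width of a normal-approximation CI for a mean, 2 * z * sd / sqrt(n). The function name `ci_width` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def ci_width(sd, n, confidence):
    """Width of a two-sided normal-approximation CI for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sd / math.sqrt(n)


baseline = ci_width(sd=1.0, n=100, confidence=0.95)
higher_confidence = ci_width(sd=1.0, n=100, confidence=0.99)  # wider
larger_sample = ci_width(sd=1.0, n=400, confidence=0.95)      # narrower
larger_sd = ci_width(sd=2.0, n=100, confidence=0.95)          # wider

print(f"baseline:          {baseline:.3f}")
print(f"99% confidence:    {higher_confidence:.3f}")
print(f"n = 400:           {larger_sample:.3f}")
print(f"sd = 2:            {larger_sd:.3f}")
```

Because the width scales with 1 / sqrt(n), quadrupling the sample size only halves the interval.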

15
Q

What is the intuition behind the H0 and H1 hypothesis?

A

H0: the observed result is completely explained by sampling error (chance)
H1: even after accounting for sampling error, the result is still significant (there is a real effect)

16
Q

What are the two error types?

A

Alpha-error (type 1): Reject H0 even though H0 is true

Beta-error (type 2): Do not reject H0 even though H0 is false –>linked to statistical power (1-beta)

17
Q

What is the statistical power?

A

The statistical power (1-beta) of a significance test is the long-term probability of rejecting H0, given the population effect size, the significance criterion and the sample size.

–>Statistical power: the probability that a statistical test will correctly reject a false null hypothesis, or in other words, the probability of detecting a true effect

18
Q

What are the three drivers of statistical power?

A

Drivers of statistical power:
- Effect size expressed in the alternative hypothesis: stronger effects are easier to detect
- Chosen significance level: a decrease in the alpha error decreases statistical power (1-beta)
- Sample size: a larger n increases the power of the test (1-beta)
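The three drivers can be made concrete with a sketch for one simple case. It assumes a one-sided, one-sample z-test with known standard deviation, where the effect size is expressed in standard deviation units; other tests have different power formulas. The function name `z_test_power` and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power (1 - beta) of a one-sided one-sample z-test with known sd."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)   # rejection threshold under H0
    shift = effect_size * math.sqrt(n)         # mean of the test statistic under H1
    return 1 - std_normal.cdf(critical - shift)


base = z_test_power(effect_size=0.3, n=30, alpha=0.05)
stronger_effect = z_test_power(effect_size=0.5, n=30, alpha=0.05)
stricter_alpha = z_test_power(effect_size=0.3, n=30, alpha=0.01)
larger_sample = z_test_power(effect_size=0.3, n=60, alpha=0.05)

print(f"baseline power:        {base:.3f}")
print(f"stronger effect:       {stronger_effect:.3f}")
print(f"stricter alpha (0.01): {stricter_alpha:.3f}")
print(f"larger sample (n=60):  {larger_sample:.3f}")
```

With an effect size of zero, the function returns alpha itself: when H0 is true, the rejection probability is the type 1 error rate.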

19
Q

Possible outcomes: Test and interpretation –>No rejection of H0 (related to statistical power)

A

No rejection of H0

High statistical power:
- Evidence for H0
- Refutation of the substantive hypothesis being tested (H1)
–>the study avoids a Type I error (false positive) by not incorrectly rejecting a true null hypothesis

Low statistical power:
- Inconclusive status, neither support for H1 nor for H0
- Danger of seemingly contradictory research findings –>risk of a type 2 error

20
Q

Possible outcomes: Test and Interpretation: rejection of H0 (high/low statistical power)

A

Rejection of H0:

High statistical power:
- Danger that very small effects will be statistically significant
- Practical relevance of the findings needs to be established

Low statistical power:
- Support for H1

21
Q

What does the MAD measure?

A

The MAD (Mean Absolute Deviation) = average of the absolute deviations from a measure of central tendency (mean or median)

The MAD around the mean is never smaller than the MAD around the median, because the median minimizes the mean absolute deviation
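A short sketch of both variants, using a made-up data set with one outlier so the difference is easy to see. The function name `mean_absolute_deviation` and the data are hypothetical.

```python
import statistics


def mean_absolute_deviation(values, center):
    """Average absolute distance of the observations from a chosen center."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(abs(x - center) for x in values) / len(values)


# Hypothetical data with one extreme value
data = [1, 2, 3, 4, 100]

mad_mean = mean_absolute_deviation(data, statistics.mean(data))
mad_median = mean_absolute_deviation(data, statistics.median(data))
print(f"MAD around the mean:   {mad_mean:.1f}")
print(f"MAD around the median: {mad_median:.1f}")
```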

22
Q

What are the 3 assumptions of the t-test for two population means?

A

T-test for two population means

Assumptions:
- independent samples
- both variables are normally distributed in the population
- the variances of the observed variables are equal
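Under these assumptions the test statistic uses a pooled variance. The sketch below computes only the t statistic and the degrees of freedom; it does not compute a p-value, since the standard library has no t-distribution. The function name `pooled_t_statistic` and both groups are hypothetical.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """t statistic and degrees of freedom for two independent samples,
    assuming equal population variances (pooled variance)."""
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    var1 = statistics.variance(group_a)
    var2 = statistics.variance(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the mean difference
    if se == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df


# Hypothetical data
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t_value, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_value:.2f}, df = {df}")
```

The t value would then be compared against the critical value of the t-distribution with the given degrees of freedom.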

23
Q

When can we use the paired t-test?

A

When the data are not independent –>two related groups or repeated measurements

Cannot be used with aggregated data, because the test works on the individual paired differences
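The reason individual pairs are needed becomes visible in a sketch of the statistic: it is built from the difference within each pair, which group means alone do not provide. The function name `paired_t_statistic` and the before/after values are hypothetical, and no p-value is computed.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for paired observations."""
    if len(before) != len(after):
        raise ValueError("paired data need the same number of observations")
    diffs = [a - b for b, a in zip(before, after)]  # one difference per pair
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    se = sd_diff / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical repeated measurements on the same four subjects
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]
t_value, df = paired_t_statistic(before, after)
print(f"t = {t_value:.3f}, df = {df}")
```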

24
Q

For the chi-square goodness-of-fit test, what are the prerequisite and the H0 and H1?

A

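Example (one sample): are the preferences of males equally distributed across all 3 functions?

H0: the observed frequencies follow the expected distribution (here: equal shares); H1: they do not

Prerequisite: for every category i, the expected frequency n*p_i >= 5. If not:

merge adjacent categories
ignore the corresponding category (if merging is not possible)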
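A sketch of the goodness-of-fit statistic together with the expected-frequency check. The observed counts are made up for illustration and are not the data behind the card; the function name `chi_square_gof` is hypothetical. The sketch refuses to run when an expected frequency is below 5, leaving the merging or dropping of categories to the analyst.

```python
def chi_square_gof(observed, expected_probs):
    """Chi-square goodness-of-fit statistic and degrees of freedom."""
    if len(observed) != len(expected_probs):
        raise ValueError("need one expected probability per category")
    if abs(sum(expected_probs) - 1) > 1e-9:
        raise ValueError("expected probabilities must sum to 1")
    n = sum(observed)
    expected = [n * p for p in expected_probs]  # n * p_i for every category i
    if any(e < 5 for e in expected):
        raise ValueError("expected frequency below 5: merge or drop categories first")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts: 60 respondents choosing among 3 functions
observed = [30, 20, 10]
equal_shares = [1 / 3, 1 / 3, 1 / 3]   # H0: preferences equally distributed
statistic, df = chi_square_gof(observed, equal_shares)
print(f"chi-square = {statistic:.2f}, df = {df}")
```

The statistic is then compared with the critical value of the chi-square distribution for the given degrees of freedom (5.991 for df = 2 at the 5% level); a larger statistic leads to rejecting H0.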