02_Basic statistic characteristics Flashcards
What are descriptive statistics, inductive statistics, and hypothesis testing?
Descriptive: distinguishing the different scale levels, understanding the analysis constraints each one imposes, and calculating different measures
Inductive: concept of sampling error, fundamental characteristics of theoretical distributions, estimating and testing
Hypothesis testing: univariate and bivariate parametric and non-parametric statistical tests
What are the two things needed to analyze a relationship?
Operationalization: the development of scales for measuring the characteristic values of a particular concept/variable
Scale of measurement: defines the mathematical characteristics of a scale and thereby of the data to be gathered
What are the 4 measurement scales?
- Nominal scale: assignment of objects to categories (e.g. sex: male, female)
- Ordinal scale: ranking (counting and ordering)
- Interval scale: constant units –> inferences about distances (no natural zero point)
- Ratio scale: constant units and a fixed (natural) zero point –> multiplication and ratios are meaningful
What is the Likert scale?
Likert scale: ordinal scale with mostly 5 to 7 scale points
(from "fully disagree" to "fully agree")
What is a quasi-metric ordinal scale?
Ordinal scale with the assumption of equal distances between scale points –> treated just like an interval scale (typically 5 to 7 scale points), so that measures such as mean and variance are meaningful
What is a percentile?
Percentiles are generalizations of the median: observations are arranged according to their size and a percentile divides them into two groups.
–> The pth percentile: the value such that p percent of the observations fall at or below it (the median is the 50th percentile)
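A minimal Python sketch of this definition, using the nearest-rank convention. This is only one of several conventions in use; many statistics packages interpolate between observations and can return slightly different values. The function name `percentile_nearest_rank` and the data are illustrative and not from the source.

```python
import math

def percentile_nearest_rank(values, p):
    """Smallest observation with at least p percent of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)                   # arrange observations by size
    rank = math.ceil(p / 100 * len(ordered))   # 1-based position in the ordering
    return ordered[rank - 1]

data = [15, 20, 35, 40, 50]  # hypothetical observations
median = percentile_nearest_rank(data, 50)
p90 = percentile_nearest_rank(data, 90)
print(median, p90)
```

With five observations, the 50th percentile is the third value in the ordering, which is also the median.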
What is the mode?
= the value that occurs most frequently in a data set
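A short Python illustration using the standard `statistics` module. The data are hypothetical. `multimode` (available from Python 3.8) is useful when several values tie for the highest frequency, and it also works on nominal data.

```python
import statistics

ratings = [3, 4, 4, 5, 2, 4, 3]                    # hypothetical survey answers
colors = ["red", "blue", "red", "green", "blue"]   # nominal data with a tie

single_mode = statistics.mode(ratings)      # the one most frequent value
all_modes = statistics.multimode(colors)    # every value tied for most frequent

print(single_mode, all_modes)
```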
Interpretation of standard deviation
Standard deviation measures the amount of variation of the data around the mean
Low value: data points lie close to the mean, so the mean is informative
High value: data points lie further away from the mean, so the mean is less informative
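To make the contrast concrete, here is a small Python sketch with two hypothetical data sets that share the same mean but differ in spread. It uses the sample standard deviation (division by n - 1), which is an assumption, since the cards do not say which version is meant.

```python
import statistics

tight = [48, 49, 50, 51, 52]   # hypothetical data, values close to the mean
wide = [10, 30, 50, 70, 90]    # hypothetical data, same mean, larger spread

mean_tight = statistics.mean(tight)
mean_wide = statistics.mean(wide)
sd_tight = statistics.stdev(tight)   # sample standard deviation (n - 1)
sd_wide = statistics.stdev(wide)

print(f"tight: mean={mean_tight}, sd={sd_tight:.2f}")
print(f"wide:  mean={mean_wide}, sd={sd_wide:.2f}")
```

Both means are 50, but only in the first data set does 50 describe a typical observation well.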
What is the intuition behind the "coefficient of variation"?
Coefficient of variation (CV): a measure of the relative variability of a set of data points compared to their mean (standard deviation divided by the mean)
–> independent of the scale of the data, which makes comparisons between two variables on different scales possible
When comparing CVs, a smaller value implies greater consistency relative to the mean, while a larger value implies greater variability
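A minimal Python sketch of the CV as standard deviation divided by mean. The helper name `coefficient_of_variation` and the data are hypothetical. Expressing the same heights in centimetres and in metres gives the same CV, which illustrates the scale independence; the CV is only sensible for ratio-scaled data with a non-zero mean.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean must be non-zero)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean

heights_cm = [160, 170, 180]     # hypothetical heights in centimetres
heights_m = [1.60, 1.70, 1.80]   # the same heights in metres
weights_kg = [55, 70, 85]        # hypothetical weights, a different unit

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
cv_kg = coefficient_of_variation(weights_kg)

print(f"CV heights (cm): {cv_cm:.3f}")
print(f"CV heights (m):  {cv_m:.3f}")
print(f"CV weights (kg): {cv_kg:.3f}")
```

In this made-up example the weights vary more relative to their mean than the heights do, even though the two variables are measured in different units.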
What is the meaning of skewness?
Measure of the symmetry of a distribution
–> symmetric: skewness = 0
< 0 –> left-skewed
> 0 –> right-skewed
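The sign rule can be checked with a small Python sketch. It assumes the moment-based definition (third central moment divided by the second central moment to the power 1.5); other skewness formulas exist and may give different magnitudes, though the sign interpretation is the same. The function name and data are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 10]        # long tail to the right
left_skewed = [-10, -3, -2, -2, -1]    # mirror image, long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name}: {skewness(data):.3f}")
```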
What is the sampling error?
(Why do we have one?)
Because we draw a sample from a population, there is uncertainty: many different samples are possible, and each gives a slightly different result
Sampling error: quantified by the standard error, i.e. the standard deviation of a statistic (e.g. the mean of a variable) across several samples of size n
Standard error = 0.17 –> if several samples were drawn, the standard deviation of their means of variable x would be 0.17
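A simulation sketch of this idea in Python. All population values are assumptions chosen for illustration: a normal population with standard deviation 1.7 and samples of size 100, so that the textbook formula sd / sqrt(n) gives 0.17. The source does not state how its 0.17 was obtained.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

POP_MEAN = 2.0    # assumed population mean (hypothetical)
POP_SD = 1.7      # assumed population standard deviation (hypothetical)
N = 100           # size of each sample
N_SAMPLES = 2000  # number of repeated samples

# Formula for the standard error of the mean.
theoretical_se = POP_SD / math.sqrt(N)

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(N_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

# The standard deviation of those means estimates the standard error.
empirical_se = statistics.stdev(sample_means)

print(f"theoretical SE: {theoretical_se:.3f}")
print(f"empirical SE:   {empirical_se:.3f}")
```

The two numbers should be close, which is the sense in which the standard error is "the standard deviation of the sample means".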
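Confidence interval interpretation
CI (95%) = (1.6, 2.26)
Confidence interval: an interval constructed so that, over repeated samples, a share of (1-alpha) of the resulting intervals contains the true parameter
Ex: if the sampling were repeated many times, about 95% of the intervals constructed this way would contain the true average CS; (1.6, 2.26) is the interval obtained from this sample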
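The repeated-sampling reading of a 95% confidence interval can be checked with a small Python simulation. The population parameters are hypothetical, and the interval uses the normal approximation (mean plus or minus z times the estimated standard error), which is an assumption about how the interval is built.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

TRUE_MEAN = 2.0   # assumed true parameter (hypothetical)
POP_SD = 1.7      # assumed population standard deviation (hypothetical)
N = 100           # sample size
REPEATS = 2000    # number of repeated samples
Z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, POP_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)   # estimated standard error
    lower, upper = mean - Z * se, mean + Z * se
    if lower <= TRUE_MEAN <= upper:
        covered += 1

coverage = covered / REPEATS
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

Each individual interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds across repetitions.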
What is the margin of error?
Margin of error = half the width of the confidence interval, i.e. the distance from the point estimate to either bound
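A short Python calculation using the interval (1.6, 2.26) from the confidence interval card. It also compares the result with z times the standard error, under the assumption that the standard error of 0.17 from the sampling error card belongs to the same example; the source does not say so explicitly.

```python
import statistics

ci_lower, ci_upper = 1.6, 2.26               # interval from the CI card
half_width = (ci_upper - ci_lower) / 2       # margin of error read off the CI
point_estimate = (ci_lower + ci_upper) / 2   # centre of the interval

# Assumption: the standard error of 0.17 belongs to the same example.
standard_error = 0.17
z = statistics.NormalDist().inv_cdf(0.975)   # about 1.96 for a 95% interval
margin_from_se = z * standard_error

print(f"point estimate: {point_estimate:.2f}")
print(f"half width:     {half_width:.2f}")
print(f"z * SE:         {margin_from_se:.2f}")
```

Under that assumption both routes give a margin of about 0.33.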
What does the confidence interval depend on?
- Confidence level (1-alpha): larger –> wider CI (a larger significance level alpha –> narrower CI)
- Sample size:
  - Larger –> lower standard error –> narrower CI
  - Smaller –> higher standard error –> wider CI
- Standard deviation: larger –> higher standard error –> wider CI
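These dependencies can be sketched in Python for a normal-approximation confidence interval of a mean, whose width is 2 * z * sd / sqrt(n). The function name `ci_width` and all numbers are hypothetical.

```python
import math
import statistics

def ci_width(sd, n, confidence=0.95):
    """Width of a normal-approximation CI for a mean: 2 * z * sd / sqrt(n)."""
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sd / math.sqrt(n)

base = ci_width(sd=1.7, n=100, confidence=0.95)
higher_confidence = ci_width(sd=1.7, n=100, confidence=0.99)  # wider
larger_sample = ci_width(sd=1.7, n=400, confidence=0.95)      # narrower
larger_sd = ci_width(sd=3.4, n=100, confidence=0.95)          # wider

print(f"base:              {base:.3f}")
print(f"99% confidence:    {higher_confidence:.3f}")
print(f"n = 400:           {larger_sample:.3f}")
print(f"sd doubled:        {larger_sd:.3f}")
```

Because the sample size enters through its square root, quadrupling n only halves the width of the interval.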
What is the intuition behind the H0 and H1 hypotheses?
H0: the observed result is completely explained by sampling error (chance)
H1: even after accounting for sampling error, the result is still significant (it is unlikely to be due to chance alone)
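As an illustration of this logic, here is a Python sketch of a two-sided one-sample z-test, which is just one example of a parametric test. The sample mean, the null values, and the standard error are hypothetical numbers chosen for the example, and `two_sided_z_test` is an illustrative helper name.

```python
import statistics

def two_sided_z_test(sample_mean, null_mean, standard_error):
    """Return the z statistic and two-sided p-value under a normal approximation."""
    z = (sample_mean - null_mean) / standard_error
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value

ALPHA = 0.05  # significance level

# Hypothetical case 1: the sample mean is far from the H0 value.
z_far, p_far = two_sided_z_test(sample_mean=1.93, null_mean=1.50, standard_error=0.17)

# Hypothetical case 2: the sample mean is close to the H0 value.
z_near, p_near = two_sided_z_test(sample_mean=1.93, null_mean=1.80, standard_error=0.17)

for label, z, p in [("far", z_far, p_far), ("near", z_near, p_near)]:
    decision = "reject H0" if p < ALPHA else "do not reject H0"
    print(f"{label}: z={z:.2f}, p={p:.3f} -> {decision}")
```

In the first case the difference is too large to be attributed to sampling error alone, so H0 is rejected; in the second case chance remains a sufficient explanation.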