Maths 2 Flashcards
Types of variables
Categorical/Qualitative: observation belongs to one of a set of categories (numeric or non-numeric)
Quantitative: observation takes numeric value representing magnitudes (how much or how many)
Discrete vs. Continuous
Discrete: possible values are separate numbers, finite or countable
Continuous: possible values form an interval, infinite, measurable
Level of measurement
Non-metric
Nominal: categorical, can be distinguished (e.g.: first name)
Ordinal: qualitative, can be ordered, distinguish and compare (e.g.: grades)
Metric:
Interval: quantitative, zero point is arbitrary, distinguish, compare and meaningful differences (e.g.: temperature)
Ratio: zero point is absolute, distinguish, compare, meaningful differences and ratios (e.g.: income)
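Example (interval vs. ratio): a minimal sketch of why ratios are not meaningful on an interval scale. The two temperatures are made-up values; the same readings are converted from degrees Celsius (arbitrary zero) to kelvin (absolute zero).

```python
# Interval scale: degrees Celsius has an arbitrary zero point.
c_low, c_high = 10.0, 20.0             # hypothetical readings
difference_c = c_high - c_low          # meaningful: 10 degrees warmer
naive_ratio = c_high / c_low           # 2.0, but NOT "twice as hot"

# The same temperatures in kelvin, a ratio scale with an absolute zero.
k_low, k_high = c_low + 273.15, c_high + 273.15
difference_k = k_high - k_low          # still about 10: differences are preserved
true_ratio = k_high / k_low            # about 1.035

print(difference_c, difference_k, naive_ratio, round(true_ratio, 3))
```

The difference survives the change of scale, the ratio does not.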
Measures of central tendency and dispersion
Central tendency: mean, mode, median, percentiles
Dispersion: deviation, variance, standard deviation, Range and IQR
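Example (central tendency and dispersion): a short sketch using Python's standard statistics module. The data list is invented for illustration, and the sample versions of variance and standard deviation (dividing by n - 1) are an assumption about which definition is intended.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)            # arithmetic mean
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value

variance = statistics.variance(data)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)        # square root of the sample variance
value_range = max(data) - min(data)     # largest minus smallest value

# Quartiles; "inclusive" interpolates between the minimum and maximum of the data.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                           # spread of the middle 50%

print(mean, median, mode, variance, value_range, iqr)
```

Other quartile conventions exist and can give slightly different IQR values for small samples.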
Interpret frequency distributions
- Shape (peaks, skewed), centre (central tendency), spread (dispersion)
- Look for outliers (striking deviations from the pattern)
- Make statements
Analysis of single variables
Nominal: frequency table, bar chart, mode
Ordinal: frequency table, bar chart, mode, median, range, IQR
Metric: frequency table, bar chart, median, mean, Std. Dev., range, IQR
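Example (nominal variable): a small sketch of a frequency table with absolute and relative frequencies, plus the mode. The eye-colour data are hypothetical.

```python
from collections import Counter

# Hypothetical nominal data: eye colours.
colours = ["brown", "blue", "brown", "green", "brown", "blue"]

frequency_table = Counter(colours)      # absolute frequency per category
n = len(colours)

# Print category, absolute frequency and relative frequency.
for category, count in frequency_table.most_common():
    print(f"{category:<6} {count:>2} {count / n:6.1%}")

# The mode is the category with the highest frequency.
mode = frequency_table.most_common(1)[0][0]
```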
Cross tables (contingency tables)
Describe joint distribution of two variables
For all levels of measurement
Illustrate absolute and relative frequencies (also conditional frequencies)
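Example (cross table): a sketch of joint absolute, relative and conditional frequencies for two categorical variables. The observations (gender, preferred transport) are invented for illustration.

```python
from collections import Counter

# Hypothetical observations: (gender, preferred transport).
observations = [
    ("f", "bike"), ("f", "car"), ("f", "bike"),
    ("m", "car"), ("m", "car"), ("m", "bike"),
    ("f", "bike"), ("m", "car"),
]

absolute = Counter(observations)        # joint absolute frequencies
n = len(observations)
relative = {cell: count / n for cell, count in absolute.items()}

# Conditional relative frequencies of transport given gender (row shares).
row_totals = Counter(row for row, _ in observations)
conditional = {
    (row, col): count / row_totals[row]
    for (row, col), count in absolute.items()
}

print(absolute[("f", "bike")], relative[("f", "bike")], conditional[("f", "bike")])
```

Here 3 of 8 observations are ("f", "bike"), which is 37.5% of all observations but 75% of the "f" row.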
Scatterplots
Quantitative variables
plotted symbols illustrate combinations of values (of the two variables)
Correlation easily visible
Covariance and Correlation
Covariance
Positive –> positive correlation between variables
Negative –> negative correlation
Correlation
[0.1;0.6) –> weak positive
[0.6;1] –> strong positive
…
Pearson: two metric variables
Spearman: one or both ordinal variables
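Example (covariance and correlation): a minimal sketch, written by hand rather than with a library. It assumes the sample covariance (dividing by n - 1) and data without tied values for the ranking step. The function names and the data are illustrative only.

```python
import math

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

def ranks(values):
    """Ranks starting at 1; this simple version assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y grows with x, but not linearly.
hours = [1, 2, 3, 4, 5]
score = [1, 4, 9, 16, 25]

print(covariance(hours, score), pearson(hours, score), spearman(hours, score))
```

The covariance is positive, Pearson is slightly below 1 because the relationship is not perfectly linear, and Spearman is 1 because the order of the values agrees perfectly.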
OLS
Ordinary least squares –> linear regression
Residual: difference between observed value and fitted (predicted) value
Best line minimises the sum of the squared residuals (squared vertical distances between points and line)
Goodness of Fit
How well the regression line fits the data
R^2 falls between 0 and 1 –> the higher, the better the fit
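Example (OLS and R^2): a sketch of simple linear regression with one predictor, using the standard closed-form formulas for slope and intercept. The data and the function names are made up for illustration.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = ols_fit(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]  # observed minus fitted
r2 = r_squared(x, y, a, b)

print(a, b, r2)
```

For these values the fitted line is roughly y = 2.2 + 0.6x with R^2 of about 0.6, and the residuals sum to (approximately) zero.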
Definitions: probability, random experiment, random variable, probability distribution
Probability: relative possibility that an event occurs
Random experiment: process leading to occurrence of one of all possible outcomes
Random variable: variable whose value is a numerical outcome of a random phenomenon
Probability distribution: function that relates each value of random variable to its probability
Binomial Distribution
Fixed number n of observations
Two possible outcomes
Probability remains the same for each observation
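Example (binomial distribution): a sketch of the binomial probability formula P(X = k) = C(n, k) * p^k * (1 - p)^(n - k), assuming independent observations. The values n = 10 and p = 0.5 (e.g. ten fair coin tosses) are chosen only for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent observations with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5

# Probabilities for 0, 1, ..., n successes; together they sum to 1.
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(binomial_pmf(5, n, p))   # probability of exactly 5 successes
print(sum(probabilities))
```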
Normal distribution
Continuous variable: number of possible values approaches infinity –> probability of any single exact value is zero
Area under the curve corresponding to an interval is probability that variable assumes a value within this interval
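Example (normal distribution): a sketch using NormalDist from Python's standard statistics module. The mean of 100 and standard deviation of 15 are hypothetical parameters.

```python
from statistics import NormalDist

# Hypothetical variable: normal with mean 100 and standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

# P(85 <= X <= 115) is the area under the curve between 85 and 115,
# computed as the difference of two cumulative probabilities.
p_interval = dist.cdf(115) - dist.cdf(85)

# A single exact value is an interval of width zero, so its area is zero.
p_single = dist.cdf(100) - dist.cdf(100)

print(round(p_interval, 4), p_single)
```

The interval covers one standard deviation on each side of the mean, which gives a probability of about 0.68.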
Population, Sample, Sampling methods, Finite vs infinite
Population: set of all elements of interest of a study
Sample: subset of the population
Simple random sampling, stratified random sampling, cluster sampling, systematic sampling, convenience sampling, judgemental sampling
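Example (sampling methods): a sketch of three of the methods above using Python's random module. The population of 100 element IDs, the two strata and the sample sizes are all invented for illustration.

```python
import random

random.seed(0)  # fixed seed only so the sketch is reproducible

# Hypothetical population of 100 element IDs, split into two strata.
population = list(range(100))
strata = {"A": population[:60], "B": population[60:]}

# Simple random sampling: every element has the same chance of selection.
simple = random.sample(population, 10)

# Stratified random sampling: draw separately within each stratum
# (here proportionally, 10% of each stratum).
stratified = [
    unit
    for members in strata.values()
    for unit in random.sample(members, len(members) // 10)
]

# Systematic sampling: random start, then every k-th element.
k = 10
start = random.randrange(k)
systematic = population[start::k]

print(simple, stratified, systematic, sep="\n")
```

Cluster, convenience and judgemental sampling depend on how groups or units are chosen in practice and are not sketched here.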