Week 4 Flashcards
Descriptive vs Inferential statistics
Descriptive statistics summarize a sample through specific measures such as the mean, mode, and variance.
Inferential statistics infer the properties of a population from measures calculated on a sample drawn from that population.
3 measures of central tendency in descriptive statistics
Mean, median, mode
Mean vs median vs mode
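Mean: arithmetic average of the sample (sum of the values divided by the number of values)
Mode: most frequently occurring value in sample
Median: middle value of an ordered sample (average of the two middle values when the sample size is even)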
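A minimal sketch of the three measures using Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample, chosen only for illustration
sample = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(sample)      # sum of the values / number of values
median = statistics.median(sample)  # middle of the ordered sample (even n: average of the two middle values)
mode = statistics.mode(sample)      # most frequently occurring value

print("mean:", mean, "median:", median, "mode:", mode)
```

Here the mean (5) and median (4.0) differ because the large value 10 pulls the mean upward, while the mode (3) only reflects which value repeats most.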
3 measures of variability in descriptive statistics
Range, variance, std deviation
Range vs variance vs std deviation
Range: max - min
Variance: ∑(xi - x̄)² / N (population variance; divide by N - 1 instead for a sample variance)
Std dev: sqrt(variance)
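The definitions above can be sketched with Python's standard `statistics` module. The data set is hypothetical. `pvariance` and `pstdev` divide by N (population form), while `variance` divides by N - 1 (sample form).

```python
import statistics

# Hypothetical data set, chosen only for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(data) - min(data)          # max - min
pop_variance = statistics.pvariance(data)    # sum((xi - mean)^2) / N
pop_std = statistics.pstdev(data)            # sqrt(population variance)
sample_variance = statistics.variance(data)  # sum((xi - mean)^2) / (N - 1)

print(value_range, pop_variance, pop_std, sample_variance)
```

The sample variance comes out slightly larger than the population variance because it divides the same sum of squared deviations by N - 1 rather than N.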
T-tests
Aka Student’s t-test: a parametric inferential test that checks whether there is a significant difference between the means of 2 groups and describes their difference
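As a rough sketch, the t statistic for two independent groups can be computed by hand. This assumes the classic equal-variance (pooled) form of the test, which the card does not specify. The function name `pooled_t_statistic` and both groups are hypothetical. Turning t into a p-value needs the t distribution, which is not in the standard library, so that step is left out here.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for two independent samples,
    assuming both groups share a common variance (an assumption of this sketch)."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.mean(a), statistics.mean(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)  # sample variances (N - 1)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, n1 + n2 - 2

# Hypothetical groups, chosen only for illustration
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t, df = pooled_t_statistic(group_a, group_b)
print("t =", t, "df =", df)
```

The sign of t shows which group has the larger mean, and its magnitude is compared against the t distribution with `df` degrees of freedom to judge significance.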
ANOVA
Analysis of variance: tests whether the means of more than 2 samples differ significantly
- Extension of t-tests
- Assumes each sample is normally distributed
- Assumes samples are random and independent
- Assumes each group has a common variance
- Assumes observations within each group are independent
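A minimal sketch of the one-way ANOVA F statistic, assuming the simplest case of a single factor. The function name `one_way_anova_f` and the three groups are hypothetical. F compares the variation between group means with the variation inside the groups; a p-value would additionally need the F distribution, which is omitted here.

```python
import statistics

def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Between-group sum of squares: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical groups, chosen only for illustration
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(groups)
print("F =", f_stat)
```

A large F means the group means differ by more than the within-group spread would suggest, which is evidence against all means being equal.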
When is regression analysis used?
It is used to estimate the relationship between a dependent variable and one or more independent variables in a data set
Dependent vs independent variable in linear regression?
Dependent variable: the variable being predicted, aka “response variable” or “outcome variable”, usually labeled Y
Independent variable: the variable said to influence the dependent variable, aka “explanatory variable” or “predictor variable”, usually labeled X
e.g. How does the number of hours studied (X: predictor) affect the student’s test score (Y: response)?
Which plot is useful for visualizing linear trends in data sets?
Scatter plot
What is the ordinary least squares equation and what is it used for?
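The least squares equation is used to fit a line (y = mx + b) to a data set by choosing the slope m and intercept b that minimize the sum of squared vertical distances between the data points and the line. The fitted line can then be used to estimate unknown values of y.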
The equation is defined as:
m = ∑(x - x̄)(y - ȳ) / ∑(x - x̄)²
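The slope formula can be sketched in a few lines of Python. The function name `ols_fit` and the hours/scores data are hypothetical. The intercept is not given on the card; this sketch uses the standard result b = ȳ - m·x̄, which makes the fitted line pass through the point (x̄, ȳ).

```python
def ols_fit(xs, ys):
    """Return (slope m, intercept b) of the least squares line y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of (x - x_bar) * (y - y_bar)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: sum of (x - x_bar)^2
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    b = y_bar - m * x_bar  # standard intercept, not stated on the card
    return m, b

# Hypothetical data: hours studied (X) and test score (Y)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
m, b = ols_fit(hours, scores)
print("slope:", m, "intercept:", b)

# Estimate the score for a hypothetical student who studies 6 hours
predicted = m * 6 + b
print("predicted score:", predicted)
```

With this data the slope is about 4.1, meaning each extra hour studied is associated with roughly 4.1 more points on the test.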