439 midterm Flashcards
population & parameter
- pop: the entire set of things of interest
- par: A property or number descriptive of the population (a fixed number, but in practice, we do not know its value)
sample & statistic
- sample: A part of the population. Typically, this provides the data that we will examine to gather information
- stat/estimate: A property or number that describes a sample (use a statistic to estimate an unknown parameter)
descriptive & inferential statistics
- descriptive: Summarize/describe the properties of samples (or populations when they are completely known)
- inferential: Draw conclusions/make inferences about the properties of populations from sample data
types of variables
- nominal (classifies/identifies objects, can be dichotomous or multi-categorical) and ordinal (ranking data): categorical (discrete/qualitative)
- interval (rating data with equal distances) and ratio (Special kind of interval scale with a meaningful zero point): continuous (numerical/quantitative)
univariate and multivariate
- uni: one DV, can have multiple IVs (linear, logistic regression)
- multi: multiple DVs regardless of the number of IVs (dimension reduction, cluster analysis)
normal distribution
- Y is continuous and normally distributed in the population
- mean = median = mode
- Y ~ N(μ, σ), i.e., normal with mean μ and standard deviation σ
- 68% of scores within 1SD of the mean, 95% within 2SDs, 99.7% within 3SDs
z (standard) score
- we can convert Y scores to z scores that follow the standard normal distribution (z ~ N(0,1))
- the deviation of a score from the population mean divided by the population standard deviation: z = (Y − μ)/σ (dividing by σ scales the deviation into SD units)
- to determine how extreme any score is based on standard normal distribution
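A minimal Python sketch with made-up numbers (not from the course) of converting a score to z using the formula above and checking how extreme it is under the standard normal:

```python
import scipy.stats as st

# hypothetical population: mean 100, SD 15; one observed score of 130
mu, sigma = 100, 15
y = 130

z = (y - mu) / sigma          # deviation from the mean in SD units -> 2.0
p_above = 1 - st.norm.cdf(z)  # area of the standard normal above this z
print(z, p_above)             # 2.0, about 0.023 (a fairly extreme score)
```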
types of statistical inference
- significance tests (computing a p value)
- confidence intervals
- both types are based on sampling distributions of statistics
sampling distribution of statistics
- the distribution of the values taken by a statistic (like a mean) in all possible samples of size N from the same population
- the sampling distribution of the mean is normal if the population is normal
- if the pop is not normally distributed, we use the central limit theorem
central limit theorem
- as N increases, the distribution of a sample mean becomes closer to a normal distribution. This is true no matter how the population is distributed, as long as it has mean μ and standard deviation σ
- X̄ (sample mean) ~ N(μ, σ/√N)
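A quick simulation sketch (made-up exponential population, numpy assumed) of the claim above: sample means cluster around μ with spread close to σ/√N even though the population is skewed:

```python
import numpy as np

rng = np.random.default_rng(0)

# skewed (exponential) population with mean mu = 2 and SD sigma = 2
mu = sigma = 2.0
N = 50                                    # sample size
sample_means = rng.exponential(scale=mu, size=(10_000, N)).mean(axis=1)

print(sample_means.mean())                # close to mu (about 2)
print(sample_means.std())                 # close to sigma / sqrt(N), about 0.28
```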
Null Hypothesis Significance Testing
- State the null and alternative hypotheses (statements about a parameter; we test the hypothesis of no effect even if we think there is an effect)
- Calculate the value of an appropriate test statistic (how far the data are from the null; for one parameter in the null we use a t-test, for more than one an F-test, for frequency distributions a chi-square test)
- Find the p-value for the observed data (the test statistic value).
- State a conclusion (the significance level alpha sets the area of extreme scores that would be unlikely if the null were true; the cutoff value of the statistic based on alpha is the critical value)
what is a p-value
- a conditional probability (probability based on the condition that the null is true)
- the probability of obtaining a test statistic at least as extreme as the one you computed, if the null is true
- the smaller the value, the less compatibility between your data and the null (support for the alternative)
- if the p-value is smaller than the alpha level, there is a statistically significant effect and we reject the null
- just because we have a statistically significant effect doesn’t mean it’s meaningful; a p-value is highly affected by sample size
effect sizes
- magnitude of a treatment effect
- Pearson’s r, correlation squared (R2), Cohen’s d, omega or omega squared
- small effect: r = 0.1, r2 = 0.01, d = 0.25
- medium: r = 0.3, r2 = 0.06, d = 0.5
- large effect: r = 0.5, r2 = 0.15, d = 0.8
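A small sketch of one common version of Cohen's d (the pooled-SD form; the data and this particular formula are illustrative assumptions, not from the course):

```python
import numpy as np

# hypothetical scores for a treatment and a control group
treat = np.array([5.1, 6.0, 5.8, 6.4, 5.5])
ctrl = np.array([4.2, 4.9, 5.0, 4.4, 4.7])

# pooled-SD version of Cohen's d: mean difference in pooled-SD units
n1, n2 = len(treat), len(ctrl)
s_pooled = np.sqrt(((n1 - 1) * treat.var(ddof=1) + (n2 - 1) * ctrl.var(ddof=1))
                   / (n1 + n2 - 2))
d = (treat.mean() - ctrl.mean()) / s_pooled
print(d)   # magnitude of the treatment effect, not driven by sample size
```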
types of errors in NHST
- Type I: reject the null when it is true (false positive) = alpha level
- Type II: fail to reject the null when it is false (false negative) = beta level
- 1 - beta = probability of correctly rejecting a false null
- alpha and beta are related to each other (increase alpha = increase power and Type I error rate = decrease beta and Type II error rate)
z-test
- purpose: to test whether a sample mean differs from a population mean
- assumption 1: the population is normally distributed
- assumption 2: the population’s SD is known
- assumption 3: independence of observations (simple random sample)
- if your z-score is greater (in absolute value) than 1.96, we reject the null at alpha = .05
- limitation: knowing the SD of the population
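A minimal sketch of the z-test above (made-up scores, population SD assumed known):

```python
import numpy as np
import scipy.stats as st

sample = np.array([108, 112, 97, 105, 120, 101, 110, 99, 115, 103])
mu0, sigma = 100, 15               # null population mean; known population SD

z = (sample.mean() - mu0) / (sigma / np.sqrt(len(sample)))
p = 2 * (1 - st.norm.cdf(abs(z)))  # two-tailed p-value
print(z, p)                        # reject H0 at alpha = .05 only if |z| > 1.96
```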
t-test
- used when we don’t know the pop SD, but want to test whether a sample mean differs from a population mean
- assumptions: population is normal, independence of observations
- you use the sample SD in the formula, so the test statistic follows the t distribution rather than the standard normal distribution
- t distribution changes shape based on N (with a large enough N (df>30), the t distribution approximates the standard normal distribution, so you can use critical value 1.96)
- t distribution has fatter tails than normal, but still bell-shaped and symmetrical
- if t is large, either the numerator (the mean difference) is large or the denominator (the standard error) is small (a large N shrinks the standard error, so you can get a large t statistic even if the numerator is small)
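The same made-up sample run as a one-sample t-test (scipy assumed), now using the sample SD instead of a known population SD:

```python
import numpy as np
import scipy.stats as st

sample = np.array([108, 112, 97, 105, 120, 101, 110, 99, 115, 103])
mu0 = 100

t, p = st.ttest_1samp(sample, popmean=mu0)   # t distribution with df = N - 1
print(t, p)

# same t by hand: (sample mean - mu0) / (sample SD / sqrt(N))
t_manual = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(len(sample)))
print(t_manual)
```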
important p-value information
- a p-value doesn’t tell you the probability that the null is true
- it is the probability, computed under the null (the condition), of getting a test statistic at least as extreme as the one from your sample data
- = P(data at least this extreme | H0 is true), NOT P(H0 is true | data)
- you’re not testing whether the null is true
- p-value doesn’t give any information about the size or importance of your effect
- p-values indicate how incompatible your data are with a statistical model as long as the underlying assumptions hold
- p-values should not be the deciding factor in making a conclusion, should not be reported selectively
p-hacking
- trying different statistical methods until p<.05
- conducting multiple tests on subsets of the sample or controlling for different covariates
- collecting more data until p<.05
- excluding some observations
- dropping one of the conditions
scatterplot
- displays the form (linear, nonlinear), direction (positive, negative), strength (weak, strong) of a relationship between two quantitative variables measured on the same individual
- we usually need a numerical measure to supplement the graph (a correlation)
correlation
- measures the direction and strength of the linear relationship between two quantitative variables (Pearson correlation coefficient r)
- a correlation treats both variables as equals (shows a symmetric linear relationship)
pearson r
- standardized covariance
- covariance indicates the degree to which two variables vary together, but its raw size is hard to interpret because it is scale dependent
- standardizing the covariance (dividing by the standard deviations of the two variables) puts it between -1 and 1, so we can compare the relationships between variables
covariance
- the degree to which X and Y vary together
- positive Cov = moving in the same direction, negative = moving in opposite directions, 0 = no linear relationship (independent variables have zero covariance, but zero covariance does not guarantee independence)
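A short sketch (made-up paired scores) showing that Pearson's r is just the covariance divided by the two standard deviations:

```python
import numpy as np
from scipy.stats import pearsonr

x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.5, 3.0, 4.5, 6.0, 8.5])

cov_xy = np.cov(x, y, ddof=1)[0, 1]            # scale-dependent covariance
r = cov_xy / (x.std(ddof=1) * y.std(ddof=1))   # standardized covariance = r
print(r, pearsonr(x, y)[0])                    # the two values match
```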
when should we not use r?
- if two variables have a nonlinear relationship (there could be a nonlinear association, but r won’t pick up on it)
- if observations aren’t independent (there will be existing correlations between them)
- if there are outliers (very sensitive to outliers, will pull the line)
- if homoscedasticity is violated (if one variable has unequal variability across the range of the other variable)
- if the sample size is very small (N=3-6)
- if one or both variables are not continuous
other correlation coefficients
- point-biserial: binary & continuous variables
- phi coefficient: two binary variables
- Spearman rank order: two ordinal variables (like Judges’ ranks)
- Kendall’s tau: two ordinal variables (but N is small and there are many tied ranks)
why could two variables be correlated
- by chance
- two variables could be mutually affecting each other (price and demand)
- relationship could be driven by an underlying cause (a confounder)
lurking factors
- potential causes for a relationship that aren’t measured
partial correlation
- the relationship between 2 variables after removing the influence of another variable
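One standard formula for a first-order partial correlation (controlling for a single variable z); the data here are simulated just to show the idea:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
z = rng.normal(size=100)                 # the variable whose influence we remove
x = 0.6 * z + rng.normal(size=100)
y = 0.6 * z + rng.normal(size=100)

r_xy = pearsonr(x, y)[0]
r_xz = pearsonr(x, z)[0]
r_yz = pearsonr(y, z)[0]

# partial correlation of x and y after removing z from both
r_xy_given_z = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(r_xy, r_xy_given_z)   # r_xy is driven by z; the partial r is near 0
```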
statistical test for a correlation coefficient
- H0: ρ = 0, H1: ρ ≠ 0 (ρ is the population correlation)
- use a t-test (one parameter in the null) when the two variables are jointly (bivariate) normal (check normality with a Shapiro-Wilk test)
- general form: t = (sample statistic − null parameter)/SE; for r this is t = r√(N−2)/√(1−r²), with df = N−2
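A sketch (made-up data) checking that scipy's built-in test of H0: ρ = 0 matches the t formula above:

```python
import numpy as np
from scipy.stats import pearsonr, t as t_dist

x = np.array([2.0, 4.0, 5.0, 7.0, 9.0, 3.0, 6.0, 8.0])
y = np.array([1.5, 3.0, 4.5, 6.0, 8.5, 2.0, 5.5, 7.0])
N = len(x)

r, p = pearsonr(x, y)                           # built-in test of H0: rho = 0

t_stat = r * np.sqrt(N - 2) / np.sqrt(1 - r**2) # same test by hand, df = N - 2
p_manual = 2 * t_dist.sf(abs(t_stat), df=N - 2)
print(p, p_manual)                              # the two p-values agree
```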
linear regression
- used if we have a directional hypothesis (how X affects Y), can show an asymmetric linear relationship between predictor and outcome variables
- the effect of X on Y is beta (regression coefficient/slope)
- there will also be error that accounts for some variation in Y (not just X)
simple linear regression
- only one X (how DV changes when IV changes)
- assumption of linearity
- used as a mathematical model summarizing the relationship by fitting a straight line/regression line (Ŷ) to the data that predicts values of Y from X: Ŷ = alpha + beta·X
interpret alpha and beta for simple linear regression
- alpha: intercept (average value of Y when X = 0)
- beta: slope (amount by which Y changes on average when X changes by one unit)
method of least squares
- finding the best fitting regression line
- minimizing the vertical distance between a data point and the line (minimizing the residuals Yi - Ŷi)
- compute all the residuals for all data points, square them, sum the squares
- we square the residuals to avoid the negative and positive residuals cancelling out to zero
- this method is problematic when you have outliers because squaring large values makes them larger
- B = Cov(X, Y)/Var(X)
- alpha = (mean of Y) − B·(mean of X)
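A sketch (made-up x and y) computing the least-squares estimates from the two formulas above and checking them against numpy's line fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

B = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # B = Cov(X,Y)/Var(X)
alpha = y.mean() - B * x.mean()                      # alpha = Ybar - B*Xbar

print(B, alpha)
print(np.polyfit(x, y, deg=1))                       # [slope, intercept] -> same
```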
statistical significance of the slope
- H0: B = 0, H1: B ≠ 0
- use a t-test
- assumptions of normality and independence
- df = N − 2 (two parameters, the intercept and the slope, are estimated)
- t = (B − 0)/SE(B)
- if |t| > the critical value or p < .05, the slope is significantly different from 0, which suggests a significant effect of X on Y
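A sketch of the slope test using scipy's linregress (same made-up data as the previous block):

```python
import numpy as np
from scipy.stats import linregress

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

res = linregress(x, y)              # simple linear regression
t = res.slope / res.stderr          # t = (B - 0) / SE(B), df = N - 2
print(t, res.pvalue)                # two-sided p-value for H0: B = 0
```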
partitioning of variance in simple linear regression for getting the F ratio
- SS(Total) = SS(Regression) + SS(Error)
- SS(Regression): The variation in Y explained by the regression line.
- SS(Error): The variation in Y unexplained by the regression line (residuals).
- df(Reg) = number of IVs in the model (= 1 here)
- df(E) = N-2
- divide each SS by its corresponding df to get MS
- F = MS(Reg)/MS(E)
- compare F ratio to F distribution using df(reg) and df(E)
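A sketch (same made-up data) that partitions the variance, builds the F ratio, and also gives R2; with one IV this F equals the squared t from the slope test, matching the next card (t² = F):

```python
import numpy as np
from scipy.stats import f as f_dist

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])
N = len(x)

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

ss_total = np.sum((y - y.mean()) ** 2)       # total variation in Y
ss_reg = np.sum((y_hat - y.mean()) ** 2)     # variation explained by the line
ss_err = np.sum((y - y_hat) ** 2)            # residual (unexplained) variation

F = (ss_reg / 1) / (ss_err / (N - 2))        # MS(Reg) / MS(Error)
p = f_dist.sf(F, dfn=1, dfd=N - 2)
R2 = ss_reg / ss_total                       # coefficient of determination
print(F, p, R2)                              # F here equals t**2 from above
```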
how are t and F related in simple linear regression
- they’re testing the same thing ONLY in simple linear reg
- t^2 = F
goodness-of-fit of the regression model
- coefficient of determination (R2)
- proportion of variation in Y accounted for by the model (R2 = SS(Reg)/SS(Total))
- ranges between 0 and 1
- in simple linear reg, |r| = sqrt(R2) (the sign of r matches the sign of the slope)
simple linear regression in JASP
- model summary gives R2
- the ANOVA table gives the F test for the overall significance of the regression model
- coefficients table: the unstandardized values are the estimates (divide each estimate by its SE to get t)
- in simple linear regression, both ANOVA table and t values in coefficients are testing the same thing so you get the same conclusion about whether to reject the null
multiple linear regression
- one DV with multiple IVs (which IV matters most for the DV?)
- describes how the DV (Y) changes as multiple IVs (Xj, j ≥ 2) change
- with two IVs, the data points sit in 3D space and the model fits a regression plane
- equation of regression plane: Ŷ = alpha + B1X1 + B2X2 + … + BjXj
- IVs can be continuous or discrete (will change the interpretation)
- alpha: Average value of Y when all the Xj = 0
- beta: Amount by which Y changes on average when Xj changes by one unit, holding the other IVs constant (partialled out/controlled).
- we can still use least squares method to estimate the intercept and slope
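A minimal sketch of multiple regression by least squares with numpy (two simulated IVs; a column of 1s gives the intercept):

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=100)   # simulated DV

X = np.column_stack([np.ones_like(x1), x1, x2])        # intercept column + IVs
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)    # approximately [alpha, B1, B2] = [1.0, 2.0, -0.5]
```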
standardized regression coefficients
- effect of a standardized IV on the standardized DV (z-scores)
- the number of standard deviations by which the DV changes, on average, when Xj changes by one standard deviation, holding the other IVs constant
- Used to compare the effects of IVs on the DV, when the IVs are measured in different units of measurement
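One way to get standardized coefficients, sketched on the same simulated data: z-score every variable and refit, so the slopes are in SD units and comparable across IVs:

```python
import numpy as np

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=100)

# regress z-scored y on z-scored IVs (plus an intercept, which will be ~0)
Z = np.column_stack([np.ones(len(y)), zscore(x1), zscore(x2)])
betas, *_ = np.linalg.lstsq(Z, zscore(y), rcond=None)
print(betas[1:])   # standardized betas: SD change in y per 1-SD change in each IV
```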