Chapter 6: The Beast of Bias Flashcards
Sources of bias
- outliers
- violations of assumptions (additivity/linearity, normality, homogeneity/homoscedasticity, independence)
things that can be affected by bias
- parameter estimates (including effect sizes)
- standard errors and CIs
- test statistics and p-values
- conclusions
there are methods of reducing bias
linear model and parameters
we can use the linear model to test theories or for prediction. in both cases, our interest is in estimating parameters
estimators
- estimation is the process of estimating parameters from sample data
- an estimator is a procedure, rule, or criterion that is used to estimate the parameters
- the results of estimation are estimates of the parameters
- estimates can fall below or above the actual parameter value. a value above is called an overestimate, and a value below is an underestimate.
- in practice, we never know whether our estimates are above or below the parameter
qualities that make a good estimator
- unbiasedness: on average, it gives you the population parameter; its sampling distribution doesn't lean to one side or the other
- consistency: as the sample gets bigger, the estimates become more precise
- efficiency: the estimates aren't too spread out (little sampling error). the mean is the most efficient, the median somewhat efficient, and the mode inefficient
a biased estimator is sometimes the preferred option; if the estimator is consistent, its bias can be overcome with a bigger sample size
bias does not mean bad. a biased estimator is a method whose estimates, on average, do not equal the parameter
bias is a property of an estimator, not an estimate
estimators and mean, median, mode
- on average, the mean gives you estimates that match the population parameter, so it is unbiased. the expected value of the sample means is the parameter
- the median is unbiased as long as the population is normally distributed
- the mode is unbiased as long as the population is normally distributed
mean is the best estimator because it is unbiased, consistent, and efficient (the simulation sketch below illustrates this)
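A minimal simulation sketch (not from the chapter; the population values mu = 100, sigma = 15 and the sample size are arbitrary assumptions) comparing the mean and median as estimators: both center on the population parameter, but the mean's sampling distribution is tighter, i.e., more efficient.

```python
# Sketch: compare mean vs. median as estimators of a normal population's center.
# Population values (mu=100, sigma=15) and sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n, reps = 100.0, 15.0, 50, 10_000

samples = rng.normal(mu, sigma, size=(reps, n))   # reps samples of size n
means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

# Unbiasedness: both sampling distributions center on the parameter (100)
print(f"average sample mean:   {means.mean():.2f}")
print(f"average sample median: {medians.mean():.2f}")

# Efficiency: the mean's sampling distribution has the smaller standard error
print(f"SE of the mean:   {means.std(ddof=1):.3f}")    # ~ sigma/sqrt(n) = 2.12
print(f"SE of the median: {medians.std(ddof=1):.3f}")  # ~ 1.25 * sigma/sqrt(n)
```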
OLS method
- gives estimates of the parameters that make the sum of squared residuals as small as possible (sketched below)
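A hedged sketch (simulated data; the true values b0 = 2 and b1 = 0.5 are assumptions) showing the closed-form OLS estimates for a simple regression, which are exactly the values that minimize the sum of squared residuals:

```python
# Sketch: closed-form OLS for y = b0 + b1*x on simulated data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)  # assumed true b0=2, b1=0.5

# These formulas minimize the sum of squared residuals
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
ss_resid = np.sum((y - (b0 + b1 * x)) ** 2)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, residual SS = {ss_resid:.1f}")
```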
what is an outlier?
- a score that is very different from the other scores
- there are different kinds
- outliers affect parameter estimates
- they affect the parameter estimates, but have an even bigger effect on the sum of squared errors (SS)
- the bias cascades: biased SS → biased SD → biased SE → biased CI (making CIs much wider, which is an issue for significance testing; see the sketch below)
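A small illustration of that cascade (the scores are made up): adding one outlier moves the mean a little but inflates the SS, SD, and SE a lot.

```python
# Sketch: one outlier nudges the mean but blows up SS (and hence SD and SE).
import numpy as np

scores = np.array([4, 5, 5, 6, 6, 7, 7, 8])
with_outlier = np.append(scores, 25)

for label, data in [("without outlier", scores), ("with outlier", with_outlier)]:
    ss = np.sum((data - data.mean()) ** 2)       # sum of squared errors
    se = data.std(ddof=1) / np.sqrt(len(data))   # standard error of the mean
    print(f"{label}: mean={data.mean():.2f}, SS={ss:.1f}, SE={se:.2f}")
```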
overview of assumptions
- if assumptions are violated, you can't trust the test statistic
- assumption violations vary by degree
- assumptions are about the characteristics of the data
- some statistical tests are robust to violations of an assumption, meaning the results are usually still valid even when the assumption is violated
- parametric tests: statistical tests that make assumptions about the distribution
- nonparametric tests: don't require assumptions about the distribution to be met
additivity and linearity
assumption
- the relationship between X and Y can be represented by a line
- linear relationship between the predictors and the outcome
- it's important that this is met, because fitting a linear model to nonlinear data would be inappropriate (see the residual sketch below)
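One informal check (a sketch with simulated data; splitting the residuals into thirds is just a crude stand-in for eyeballing a residual plot): force a straight line onto curved data and the residuals show a systematic pattern instead of hovering around zero.

```python
# Sketch: fit a line to deliberately nonlinear data, then inspect residuals.
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 200)
y = 0.3 * x**2 + rng.normal(0, 1, 200)   # truly curved relationship

b1, b0 = np.polyfit(x, y, 1)             # force-fit a straight line
residuals = y - (b0 + b1 * x)

# With a truly linear relationship these chunk means would all hover near 0;
# here the U-shaped pattern (+, -, +) reveals the violated assumption.
for chunk in np.array_split(residuals, 3):
    print(f"mean residual: {chunk.mean():+.2f}")
```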
normality
assumption
- the residuals of the model / the sampling distribution of the parameters (b’s) must be normally distributed
- for CIs around a parameter estimate to be accurate, the estimate must have a normal sampling distribution
- for significance tests of models to be accurate, the sampling distribution of what’s being tested must be normal
- in a linear model, we assume the residuals are normally distributed; the normality assumption also matters when choosing an estimation method. if the assumption is met, use the OLS method (a sketch of checking residuals follows below)
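One way to probe the assumption directly (a sketch; scipy's Shapiro-Wilk test applied to simulated "residuals", whose distributions are chosen purely for illustration):

```python
# Sketch: Shapiro-Wilk test of normality on two sets of simulated residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
residuals_ok = rng.normal(0, 1, 200)        # roughly normal residuals
residuals_skewed = rng.exponential(1, 200)  # clearly non-normal residuals

for label, res in [("normal", residuals_ok), ("skewed", residuals_skewed)]:
    w, p = stats.shapiro(res)
    print(f"{label}: W={w:.3f}, p={p:.4f}")  # small p suggests non-normality
```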
central limit theorem
- describes the relationship between a population of individual scores and the sampling distribution of the means (estimates)
- as the sample size increases, the shape of the sampling distribution approaches normality, no matter the shape of the individual-score distribution (parent distribution)
- rule of thumb: samples of about 30 are usually large enough for the sampling distribution to be approximately normal
- the sampling distribution depends on the sample size (simulated in the sketch below)
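A quick CLT simulation (the exponential parent distribution and the sample sizes are arbitrary choices): the skew of the sampling distribution of the mean shrinks toward zero, i.e., toward normality, as n grows.

```python
# Sketch: sampling distribution of the mean from a skewed parent distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
for n in (5, 30, 200):
    # 10,000 samples of size n from a right-skewed exponential population
    sample_means = rng.exponential(1.0, size=(10_000, n)).mean(axis=1)
    print(f"n={n:3d}: skew of sampling distribution = {stats.skew(sample_means):.3f}")
```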
homoscedasticity/homogeneity of variance
assumption
- homogeneity of variance: the assumption that different groups come from populations with exactly the same variance
- homoscedasticity: the same assumption, but for a continuous predictor (the variance of the errors is constant across all values of the predictor)
- if the assumption is violated, consider estimating the parameters using the weighted least squares (WLS) method
- CIs and NHST are considerably biased if the assumption is not met
- funneling in a plot of residuals against fitted values indicates heteroscedasticity / a violation of homogeneity (a Levene-test sketch follows below)
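For the grouped (homogeneity) version, a standard check is Levene's test; here's a sketch with simulated groups whose SDs (5 vs. 15) are deliberately, and artificially, unequal:

```python
# Sketch: Levene's test for equality of group variances on simulated groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
group_a = rng.normal(50, 5, 40)    # SD = 5
group_b = rng.normal(50, 15, 40)   # SD = 15: variances clearly unequal

stat, p = stats.levene(group_a, group_b)
print(f"Levene W={stat:.2f}, p={p:.4f}")  # small p -> variances differ
```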
independence
assumption
- the error terms in your model are unrelated to one another
- cannot trust CIs or NHST if this assumption is violated
- use robust methods or HLM if the assumption is violated
- if the errors aren't independent, the SE is underestimated, which affects CIs and NHST (a Durbin-Watson sketch follows below)
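A sketch of one common independence diagnostic, the Durbin-Watson statistic, computed by hand on simulated errors (values near 2 suggest independent errors; values near 0 suggest strong positive autocorrelation):

```python
# Sketch: Durbin-Watson statistic on independent vs. autocorrelated errors.
import numpy as np

rng = np.random.default_rng(9)
independent = rng.normal(0, 1, 500)
autocorrelated = np.cumsum(rng.normal(0, 1, 500)) * 0.1  # related errors

def durbin_watson(e):
    # Sum of squared successive differences over sum of squared errors
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

print(f"independent errors:    DW = {durbin_watson(independent):.2f}")   # ~2
print(f"autocorrelated errors: DW = {durbin_watson(autocorrelated):.2f}")  # ~0
```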