extras Flashcards
what to remember when describing a distribution
- Centre - median (need to SAY "median")
- Spread - IQR (the middle 50% of scores lie between x and y), plus max and min
- Shape - peaks and distribution of scores + skewness
- any outliers?
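A minimal sketch of that checklist in Python, using made-up scores. The 1.5 x IQR outlier fences are a common rule of thumb assumed here, not something these notes specify, and quartile values depend on the convention used (this uses the `statistics.quantiles` default).

```python
import statistics

# Hypothetical scores, made up for illustration only
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 30]

median = statistics.median(scores)
# quantiles(n=4) returns the three quartile cut points; the default
# "exclusive" method is one of several quartile conventions
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Assumed rule of thumb (not from the notes): flag values more than
# 1.5 * IQR beyond the quartiles as possible outliers
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in scores if x < low_fence or x > high_fence]

print(f"Centre: median = {median}")
print(f"Spread: middle 50% of scores lie between {q1} and {q3} (IQR = {iqr})")
print(f"Min = {min(scores)}, max = {max(scores)}")
print(f"Possible outliers: {outliers}")
```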
Outliers can occur because of?
sampling error
participant error
researcher error
random chance
probability density functions
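Hypothetical population distributions are defined using mathematical formulas known as probability density functions (pdfs). A pdf gives the relative likelihood (density) of each value of a variable; the probability of observing a value within a range is the area under the curve over that range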
total area under the curve defined by a probability density function always equals 1
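A rough numerical check of the area-equals-1 property, assuming a hypothetical normal population with mean 100 and sd 15 (values chosen only for illustration). It also shows a probability computed as an area over a range.

```python
from statistics import NormalDist

# Hypothetical normal population: mean 100, sd 15 (assumed for illustration)
population = NormalDist(mu=100, sigma=15)

# Approximate the area under the pdf with a midpoint Riemann sum
# over mean +/- 8 sd, which covers essentially all of the curve
lo, hi, steps = 100 - 8 * 15, 100 + 8 * 15, 10_000
width = (hi - lo) / steps
total_area = sum(population.pdf(lo + (i + 0.5) * width) for i in range(steps)) * width

# Probability of a range = area under the curve over that range
p_85_to_115 = population.cdf(115) - population.cdf(85)

print(f"Total area under the curve: {total_area:.6f}")
print(f"P(85 <= x <= 115): {p_85_to_115:.4f}")
```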
normal distribution is a…
hypothetical population distribution
should you describe a sample as normal?
No, it approximates a normal distribution
standard normal distribution
Normal distribution with μ = 0 and σ = 1
z-score: if x is an observation from a normal distribution, the z-score of x is
z = (x - μ)/σ
Z scores follow what kind of distribution…
The standard normal distribution (μ = 0 and σ = 1), provided the original observations come from a normal distribution
sampling distribution
We can imagine collecting an infinite number of samples of N = 40 Peabody scores, leading to an infinite number of sample means and standard deviations.
If each of these samples came from the same population, then each sample mean is an estimate of the same population mean, μ, and each sample standard deviation is an estimate of the same population standard deviation, σ.
Because of sampling error (not “bias”!), very few, if any, of these mean and standard deviation estimates will exactly equal the true population mean and standard deviation.
Imagine creating a frequency distribution table or graph for the collection of sample means obtained from repeatedly collecting different samples of size N = 40 from the same population. This collection of sample means would form the sampling distribution of the mean.
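The thought experiment can be approximated by simulation. This is only a sketch: the population mean (100) and sd (15) are hypothetical stand-ins, the population is assumed normal, and 5,000 samples stand in for "infinitely many".

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population values (assumed for illustration; not from the notes)
POP_MEAN, POP_SD, N = 100, 15, 40

def sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))

# We cannot draw infinitely many samples, so 5,000 stands in for "many"
sample_means = [sample_mean(N) for _ in range(5_000)]

print(f"Mean of the sample means: {statistics.fmean(sample_means):.2f}")
print(f"SD of the sample means:   {statistics.stdev(sample_means):.2f}")
print(f"Means exactly equal to {POP_MEAN}: {sum(m == POP_MEAN for m in sample_means)}")
```

The sample means cluster around the population mean, but essentially none of them equals it exactly, which is sampling error at work.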
A sampling distribution is the distribution of a …
statistic
Sampling distributions are blank blank distributions
theoretical population distributions
Central limit theorem
Describes the sampling distribution of the mean
also applies to sample regression slope estimates
Central limit theorem - for means calculated from samples of size N drawn from any parent population with mean μ and (finite) standard deviation σ, the sampling distribution of the mean converges to a normal distribution with mean μ and standard deviation σ/√N as N approaches infinity.
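A simulation sketch of the theorem, assuming a strongly right-skewed parent population (exponential with mean 1 and sd 1, chosen only as an example). As N grows, the sd of the sample means should track σ/√N and the skewness of the sample means should shrink toward 0.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Right-skewed parent population: exponential with mean 1 and sd 1
# (an assumed example; any population with a finite sd would do)
SIGMA = 1.0

def skewness(values):
    """Simple moment-based skewness: 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

results = {}
for n in (2, 10, 50):
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(4_000)]
    results[n] = (statistics.stdev(means), skewness(means))
    print(f"N={n:>2}: sd of means = {results[n][0]:.3f} "
          f"(sigma/sqrt(N) = {SIGMA / n ** 0.5:.3f}), skewness = {results[n][1]:.2f}")
```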
standard error is what
The standard error of a statistic is the standard deviation of that statistic's sampling distribution
For the mean, the standard error is σ/√N, often written σ_xbar
Roughly the average amount that a sample mean xbar is expected to differ from the population mean μ
Z score for individual
z = (x - μ)/σ
zscore for a sample mean
z = (xbar - μ)/(σ/√N)
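The two z formulas side by side, as a small sketch. The function names and the numbers (population mean 100, sd 15, N = 25) are hypothetical. The same raw value of 106 is unremarkable for one individual but much more extreme for a sample mean, because the sample mean is divided by the smaller standard error.

```python
import math

def z_individual(x, mu, sigma):
    """z-score for a single observation: (x - mu) / sigma."""
    return (x - mu) / sigma

def z_sample_mean(xbar, mu, sigma, n):
    """z-score for a sample mean: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / math.sqrt(n))

# Hypothetical values: population mean 100, sd 15, sample of N = 25
print(z_individual(106, 100, 15))        # 0.4
print(z_sample_mean(106, 100, 15, 25))   # 2.0
```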
point estimate
single value used as an estimate of a population parameter
what are point estimates influenced by?
point estimates are calculated using data from random samples drawn from a much larger population so they are influenced by sampling error
variation of a point estimate from one sample to another represents the extent of sampling error
Sampling error and sample size
smaller samples have more sampling error than larger samples
Point estimates from small samples carry more sampling error
Standard error of the mean formula: the bigger N gets, the smaller the standard error gets, so there is less sampling error with larger N
CIs from small samples reflect more sampling error than CIs from larger samples, so they are wider
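A quick sketch of how the standard error of the mean shrinks as N grows, assuming a hypothetical population sd of 15. Because N sits under a square root, quadrupling N only halves the standard error.

```python
import math

SIGMA = 15  # hypothetical population sd, assumed for illustration

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

for n in (10, 40, 160, 640):
    print(f"N = {n:>3}: standard error = {standard_error(SIGMA, n):.2f}")
```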
Confidence interval does what?
Conveys the degree of sampling error around a point estimate by presenting a range of plausible or reasonable values for the population parameter of interest.
CI is a range of values or an interval that is expected to capture a population parameter of interest with some prespecified level of confidence.
gives the precision of a point estimate
What does the Central Limit Theorem tell us about sample means?
Sample means can be treated as observations from an (approximately) normal distribution, provided N is large enough.
Interpretation of a confidence interval
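We are 95% confident that this interval captures μ: if we repeated the study many times, about 95% of the intervals constructed this way would capture μ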
Factors affecting the width of a confidence interval that are under the researcher’s direct control:
level of confidence
sample size
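A sketch of how both factors change the width. To stay within the Python standard library this uses a z-based interval, which assumes the population sd is known and the sampling distribution of the mean is normal; the function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, level=0.95):
    """CI for the mean, assuming sigma is known and the CLT applies."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_crit * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical numbers: sample mean 102, population sd 15
for n, level in [(40, 0.95), (40, 0.99), (160, 0.95)]:
    lo, hi = z_confidence_interval(102, 15, n, level)
    print(f"N = {n:>3}, {level:.0%} CI: [{lo:.2f}, {hi:.2f}], width = {hi - lo:.2f}")
```

Raising the confidence level widens the interval; increasing the sample size narrows it.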
Type I error
is the rejection of a true null hypothesis. The probability of a Type I error is alpha (α), given that the correct statistical model has been used to test H0.
Type II error
is the failure to reject a false null hypothesis. The probability of a Type II error is beta (β).
Power
Power is the probability of rejecting a false null hypothesis. Power is the complement of the probability of a Type II error (power = 1 - β).
What is power greater for?
larger sample sizes and for larger effect sizes
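One way to see both effects is to compute power directly. This sketch uses a two-sided one-sample z-test, which assumes a known population sd and a normal sampling distribution; that is a simplification chosen so the example needs only the standard library. The function name and the effect sizes are hypothetical.

```python
import math
from statistics import NormalDist

def power_z_test(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect_size is the true mean difference in sd units, (mu - mu0) / sigma.
    Assumes a normal sampling distribution with known sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n)  # centre of the z statistic when H0 is false
    # Probability that the z statistic lands in either rejection region
    return std_normal.cdf(-z_crit - shift) + 1 - std_normal.cdf(z_crit - shift)

for d, n in [(0.2, 20), (0.2, 80), (0.5, 20), (0.5, 80)]:
    power = power_z_test(d, n)
    print(f"effect size = {d}, N = {n}: power = {power:.2f}, beta = {1 - power:.2f}")
```

When the effect size is 0 the null hypothesis is true, and the function returns alpha, the Type I error rate.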
statistical model
represents the value of a dependent variable (often symbolized with the letter y) as a function of one or more parameters plus an error term.
General Linear Model
All models we examine are general linear models: each expresses the dependent variable as a linear function of the parameter(s).
Error variance
Represents the extent to which observed values differ from the model's predicted values, e.g. the extent to which professor salaries differ from the mean salary
In an intercept-only model, the error variance is equivalent to the variance of the dependent variable
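A small check of that equivalence, using made-up salaries. In an intercept-only model the least-squares estimate of the intercept is the sample mean, so the errors are just deviations from the mean.

```python
import statistics

# Hypothetical salaries in dollars, made up for illustration
salaries = [98_000, 105_000, 112_000, 120_000, 135_000]

# Intercept-only model: y_i = b0 + e_i, where the least-squares b0 is the mean
b0 = statistics.fmean(salaries)
errors = [y - b0 for y in salaries]

# One parameter (b0) is estimated, so divide by N - 1
error_variance = sum(e ** 2 for e in errors) / (len(salaries) - 1)
variance_of_y = statistics.variance(salaries)

print(f"b0 (mean salary): {b0:.2f}")
print(f"Error variance:   {error_variance:.2f}")
print(f"Variance of y:    {variance_of_y:.2f}")
```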
t distribution is used when
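the standard error of the mean is estimated from the sample (the population standard deviation σ is unknown)
The t distribution has higher kurtosis (heavier tails) than the standard normal distribution, reflecting the added uncertainty due to estimating the standard error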
The particular T distribution used depends on what?
the degrees of freedom
When df = infinity, t distribution =
standard normal distribution
t stat formula
t = (ybar - μ0)/s_ybar
μ0 = population mean value given by the null hypothesis; s_ybar = estimated standard error of the mean
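The formula as a short sketch. The data and the null value μ0 = 10 are invented for illustration, and `one_sample_t` is a hypothetical helper name. A p-value would also need the t distribution with the matching degrees of freedom, which the Python standard library does not provide, so only t and df are computed here.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    ybar = statistics.fmean(sample)
    s_ybar = statistics.stdev(sample) / math.sqrt(n)  # estimated standard error of the mean
    return (ybar - mu0) / s_ybar, n - 1

# Hypothetical data and null value, made up for illustration
sample = [12, 15, 11, 14, 13, 16, 10]
t, df = one_sample_t(sample, mu0=10)
print(f"t({df}) = {t:.2f}")
```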
One sample T test report
The mean nine-month salary for professors was M = $113,706.46 (SD = $30,289.04), with 95% CI [$110,717.90, $116,695.10]. A one-sample t-test indicated that this mean differs significantly from the U.S. population median salary, t(396) = 41.76, p < .001.
Effect size
magnitude of the association
difference between two means
Assumptions for a one-sample t-test
- independent observations
- sample data come from a normally distributed population