Statistics Flashcards
population
set of all individuals of interest in a study; a population is described by a parameter
parameter
numerical value that describes a population; can be a single measurement or a set of measurements
sample
set of individuals selected from a population, intended to be representative of it in a study; a sample is described by a statistic
statistic
numerical value that describes a sample; can be a single measurement or a set of measurements
descriptive statistics
statistical procedures used to summarize, organize, and simplify data - make raw scores meaningful e.g. mean, median, mode
inferential statistics
techniques that allow us to study samples then make generalizations about the population - infer sample -> population
sampling error
discrepancy/amount of error that exists between a sample statistic and the population parameter - important to consider in inferential statistics
construct
internal attributes/characteristics that cannot be directly observed but are useful for describing and explaining behavior - hypothetical e.g. happiness
operational definition
defines a construct in terms of observable behaviors e.g. intelligence defined as performance on an IQ test
nominal scale
categorical organization - can only measure qualitative differences e.g. gender, country of origin, hair color
ordinal scale
categories organized in a certain sequence, differences are quantitative - amount between one person and the next is not consistent e.g. class rank, rating scale
interval scale
ordered categories that are intervals of exactly same size with an arbitrary zero point - 0 does not mean the absence of the construct being measured e.g. celsius scale, temp
ratio scale
interval scale with absolute zero point - can describe differences between categories in terms of ratios (one thing is 3 times larger than another) e.g. weight, height, speed
discrete variables
separate, indivisible categories - whole numbers or specific categories - no decimals e.g. 3 goals scored
continuous variables
infinite number of possible values that fall between any two observed values - divisible into infinite number of fractional parts e.g. height
real limits
boundaries of intervals for scores that are represented on a continuous number line - each score has two limits, halfway between scores (upper real limit, lower real limit) e.g. an observed value of 8 actually represents the range 7.5 - 8.5 (kind of like rounding)
correlational method
two variables observed to see if there is a relationship between the two
experimental method
establishes cause and effect relationship between variables - must manipulate one variable, observe second - controlled research situation
non-experimental method
variable determines group (those that have depression) - don’t manipulate
independent variable
manipulated variable - 2+ treatment conditions
dependent variable
observed for changes to assess effect
control
does not receive manipulated experimental treatment, baseline for comparison
quasi-independent variable
groups not created by manipulating an independent variable - participant variable (male/female) - time variable (before/after)
summation notation
a way to represent the sum of scores: ∑ xi from i = 1 to n, where i = the starting point of the scores and n = the stopping point
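In code, the sigma is just a sum over an index range (a sketch; the scores are made up):

```python
# Summation notation: sum of x_i from i = 1 to n.
scores = [3, 1, 4, 1, 5]   # hypothetical scores x_1 ... x_n
n = len(scores)

# sum(...) plays the role of sigma: add each x_i from the first to the nth
total = sum(scores[i] for i in range(n))
```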

µ
population mean
x̄ (also written M)
sample mean
σ
population standard deviation
s
sample standard deviation
σ²
population variance
s²
sample variance
SS/N for a population; SS/(n-1) = SS/df for a sample
P
population portion that have particular attributes
p
sample proportion that have particular attributes
ρ
population correlation coefficient
r
sample correlation coefficient
N
population number of elements
n
sample number of elements
H0
null hypothesis
H1
alternative hypothesis
α
alpha probability of a type 1 error
β
beta probability of a type 2 error
type 1 error
incorrect rejection of a null hypothesis
false positive
thinking there is an effect when there isn't
type 2 error
incorrectly retaining a false null
false negative
thinking there isn't an effect when there is one
frequency distribution
organized tabulation of the number of individual scores located in each category on the scale of measurement - takes disorganized scores and places them in order from highest to lowest - see entire set of scores at a glance - categories based on the measurement scale - can be a graph or table
grouped frequency distribution
when the data covers a wide range of values and it is unrealistic to list individual scores - rule 1: ~10 class intervals - rule 2: relatively simple width (2, 5, 10) - rule 3: interval starts with a score that is multiple of the width - rule 4: all intervals should be the same width
bar graph
uses horizontal or vertical bars to show comparisons among categories - nominal/ordinal
ogive
curve of the cumulative frequency distribution or cumulative relative frequency distribution - express simple frequency as a percentage of total frequency - cumulate and plot these percentages (e.g. lowest score makes up 5%, next score makes up 6% but the cumulative frequency is 11%, so that is what is plotted for score 2)

polygon
a line drawn to join all the midpoints of the top bars of a histogram - like an ogive, but does not use cumulative frequencies or smooth lines - to convert to ogive, add up percentages before each bar

histogram
an area diagram -> bars portray frequencies of possible values of a variable - continuous variables (this is why the bars touch) - set of rectangles along the intervals between class boundaries - areas proportional to the frequencies in corresponding classes

population distributions
can't find absolute frequencies but can find relative frequencies e.g. don't know how many fish make up the population in a lake -> don't know how many trout or salmon, but after research can say that there are twice as many trout as salmon
percentile
score point below which a specified % of the scores in a distribution fall
- compute percent × N
- round this figure so that it ends in .0 or .5, whichever is closer
- if the rounded value ends in .5, the desired centile is the next higher value; if it ends in .0, split the difference with the next higher score
percentile rank
percent of cases which fall below a specific point in the distribution
- write down the exact limits of the interval which contains the score whose rank is to be obtained
- interpolate between the cumulative percents to find the desired percentile rank
exact limit / cum %
Y / A
X / B
Z / C
(X - Z)/(Y - Z) = (B - C)/(A - C), solve for B
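The interpolation can be solved for B (the unknown percentile rank) as B = C + (X - Z)/(Y - Z) · (A - C); a sketch with made-up limits and cumulative percents:

```python
def percentile_rank(score, lower_limit, upper_limit, cum_pct_below, cum_pct_above):
    """Linearly interpolate between the cumulative percents at an interval's exact limits."""
    fraction = (score - lower_limit) / (upper_limit - lower_limit)
    return cum_pct_below + fraction * (cum_pct_above - cum_pct_below)

# Hypothetical example: score 7.0 in the interval 6.5-8.5,
# with cum % 40 at the lower limit and 60 at the upper limit
pr = percentile_rank(7.0, 6.5, 8.5, 40.0, 60.0)
```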
central tendency
descriptive statistical measure to determine a single score that defines the center of a distribution
goal: find one score that is most representative of the group
most common method of summarizing/describing distribution
mean
average; sum of scores divided by number of scores
appropriate when… no extreme outliers, no nominal scales
∑X/N
median
the score that divides the distribution of scores exactly in half
appropriate when… there are extreme outliers, no nominal scales, skewed distribution
position of the median in the ordered scores: (N + 1)/2
mode
score or category that has the greatest frequency
appropriate when… you want answer to be correct as often as possible, nominal scales, discrete variables (hair color frequency)
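The three measures can be compared on a small made-up data set with an outlier (a sketch using Python's statistics module):

```python
from statistics import mean, median, mode

scores = [1, 2, 2, 3, 10]   # hypothetical; 10 is an extreme outlier
m = mean(scores)            # pulled toward the outlier -> 3.6
md = median(scores)         # resistant to the outlier -> 2
mo = mode(scores)           # most frequent score -> 2
```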
how is the mean affected when adding/removing a new score?
will change mean, unless score is the same as the mean
how is the mean affected when adding/subtracting a constant to every score?
same constant is added/subtracted to the mean
e.g. 1,2,3 M = 2; now add 2 to each score: 3,4,5 M = 4
how is the mean affected when scores are multiplied/divided by a constant?
mean changes in the same way
e.g. 1, 2, 3 M = 2; now multiply all scores by 2: 2, 4, 6 M = 4
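The constant rules above, checked on the example scores:

```python
scores = [1, 2, 3]
mean = sum(scores) / len(scores)            # M = 2

shifted = [x + 2 for x in scores]           # add a constant to every score
mean_shifted = sum(shifted) / len(shifted)  # mean shifts by the same constant

scaled = [x * 2 for x in scores]            # multiply every score by a constant
mean_scaled = sum(scaled) / len(scaled)     # mean is multiplied the same way
```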
central tendency and its relation to symmetrical and skewed distributions
when choosing which measure is most valuable…
normal dist: all equal
skewed dist: median
negatively skewed: mean < median < mode
positively skewed: mode < median < mean

variability
quantitative measure of the degree to which scores in a distribution are spread out or clustered together
no variability: no difference between scores
small variability: small difference
large variability: large difference
range
the distance between the largest score and the smallest score
must compute in terms of real limits
problem: solely determined by the two extreme scores of the distribution
calculate: subtract the lowest score from the highest score
inter-quartile range
ignores any extreme outlier scores -> measures the range covered by the middle 50% of the distribution
separates scores into 4 equal parts with “cuts” either between or on certain scores
interquartile range is the distance between Q1 (25th percentile) and Q3 (75th percentile)
calculate: order from least to greatest, find the median/middle number, calculate the median of the first half (Q1), calculate the median of the second half (Q3), subtract Q1 from Q3
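The calculation steps above, sketched in Python (this follows the card's median-of-halves method; other IQR conventions exist):

```python
def median(xs):
    xs = sorted(xs)
    n = len(xs)
    mid = n // 2
    return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2

def interquartile_range(xs):
    # Median-of-halves convention; the middle score is excluded when n is odd
    xs = sorted(xs)
    n = len(xs)
    lower = xs[: n // 2]
    upper = xs[(n + 1) // 2 :]
    return median(upper) - median(lower)

iqr = interquartile_range([1, 2, 3, 4, 5, 6, 7, 8])   # Q1 = 2.5, Q3 = 6.5
```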

semi-interquartile range
half of the inter-quartile range
measures half the distance covered by the middle 50% of the distribution
calculate: divide the interquartile range in half
standard deviation (SD)
most commonly used and most important measure of variability
takes into account all values of a variable
mean = reference point; measures variability by considering distance between each score and the mean
determines whether scores are generally near or far from mean, how much they deviate from the mean
SS (sum of square deviations) - population
∑(X - µ)²
find the deviation score: X - µ
compute this for each score, be mindful of +/-
square each deviation score: (X - µ)²
add up all the squared deviation scores: ∑(X - µ)²
this is SS
variance - population
take SS divide by N
∑(X - µ)² / N
large score = more variability = more scores are spread out = BAD
standard deviation - population
take square root of variance
SS/N = σ² <- this is variance
√σ² = σ <- standard deviation
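The SS -> variance -> standard deviation chain for a population, as a sketch with made-up scores:

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical population
N = len(scores)
mu = sum(scores) / N                      # population mean

ss = sum((x - mu) ** 2 for x in scores)   # SS: sum of squared deviations
variance = ss / N                         # sigma squared = SS / N
sd = math.sqrt(variance)                  # sigma = square root of variance
```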
SS ( sum of square deviations) - sample
find deviation score X - M
compute for each score
square each deviation score (X - M)²
add up all squared deviation scores ∑(X - M)² <- this is SS
variance - sample
take SS divide by n-1
∑(X - M)² / (n - 1) = s²
standard deviation - sample
square root variance for standard deviation
√s² = s
unbiased statistic - how to correct?
unbiased statistic is an accurate representation of the population
n - 1 in sample variance will correct for bias in sample variability
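A sketch of the n - 1 correction (sample values are illustrative):

```python
sample = [2, 4, 6]
n = len(sample)
M = sum(sample) / n                      # sample mean

ss = sum((x - M) ** 2 for x in sample)   # SS for the sample
biased = ss / n                          # dividing by n tends to underestimate
unbiased = ss / (n - 1)                  # s squared = SS / (n - 1) = SS / df
```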
z-score
provides a precise description of a location in a distribution
describes number of SDs from the mean
describes how common/exceptional a score is compared to others
positive z-score = above the mean, negative z-score = below the mean
transforming z-scores
z = (X - µ)/σ; transform back with X = µ + zσ
standardizing distributions
compare scores across test forms
same shape as original distribution (scores renamed, but same location)
e.g. z-score distribution
when transforming x scores to z-scores, new M = 0, new s = 1
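A sketch of transforming raw scores to z-scores (the scores are made up):

```python
import math

scores = [2, 4, 6, 8]
M = sum(scores) / len(scores)
sd = math.sqrt(sum((x - M) ** 2 for x in scores) / len(scores))

# z = (x - M) / sd; the transformed distribution has mean 0 and SD 1
z_scores = [(x - M) / sd for x in scores]
z_mean = sum(z_scores) / len(z_scores)
```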

probability
likelihood that something will happen
way to quantify randomness
smaller # -> less likely
over the long run
p = (# of certain outcome)/(#of all possible outcomes)
probability is similar to finding a percentile rank: the probability of having an IQ below 120 is the same as the percentile rank of x = 120
experiment (probability)
act of flipping a coin or rolling a die
mutually exclusive events
cannot happen at the same time - rolling a 2 and a 6 on one die can't happen simultaneously
independent random sampling
probability of being selected is independent of the individuals already selected
each individual in population has equal chance of being selected
ensures that the probability of particular outcome does not depend on previous outcomes
sampling with replacement
returning selections back to the population
probability of picking out a red m&m is 1/10 - pick out an m&m, then replace it; the probability is still 1/10 instead of 1/9, 1/8, etc.
Unit normal table for probabilities in a normal distribution
transform score to a z-score (z = (x - M)/s) (x = M + zs)
look up in unit normal table - proportions are always positive, even if z-score is negative
negative z-score: tail is on the left, body on the right
positive z-score: tail on the right, body on the left
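Instead of a printed unit normal table, the same proportions can be computed with the error function; the IQ numbers below are an assumed example (M = 100, s = 15):

```python
import math

def normal_cdf(z):
    """Proportion of a standard normal distribution below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical question: what proportion of IQ scores fall below x = 120?
z = (120 - 100) / 15
p_below = normal_cdf(z)    # the "body" (left of a positive z)
p_above = 1 - p_below      # the "tail" (right of a positive z)
```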

distribution of sample means
set of means from all possible random samples (w/ replacement) of n from a population
the larger the n, the smaller the st. error of the mean (means from multiple trials) -> because there is less error between the sample mean and the population mean.
the more people in the study, the less error between the sample and the population
- sample means should be centered around the population mean
- the expected value of M is µ
- the sample mean is an unbiased estimator of the population mean
- distribution of sample means will approach a normal distribution even if the original dist. is skewed (central limit theorem)
standard error of the mean
σM = σ/√n
sample mean relationship in distribution of mean
each sample mean, M, has a location in the distribution of sample means
can be described in a z-score
calculate: z = (M - µ)/σM
(sample mean minus the population mean, divided by the standard error of the mean)
hypothesis testing
determining whether the sample is representative of the population or merely the result of chance
null hypothesis
suggests that there is no difference between groups
no effect
assume null hypothesis is true unless data prove otherwise
alternative hypothesis
suggests there IS a difference between groups
there is an effect
test statistic
# of standard errors the sample value is removed from the null value
used to determine whether to reject the null
compares your data with what is expected under the null
e.g. z-score
alpha level
probability of making a type 1 error
decreasing significance level -> decreases chance for type 1 error but increase chance for type 2
critical region
composed of the extreme sample values that are very unlikely to be obtained if the null is true
boundaries determined by alpha level
if sample data fall in the critical region, null is rejected
calculate:
- define alpha
- use the unit normal table to find the z-score boundaries of the critical region, larger (+) or smaller (-) than the critical values
hypothesis testing steps
- state hypothesis (one tailed or two tailed - e.g. lowers response vs. has an effect)
- set the criteria
- alpha level
- find critical regions
- collect data and evaluate
- calculate standard error
- calculate z-score
- make a decision
- reject null -> sample data in critical region, tx had an effect
- fail to reject null -> sample data not in critical region, treatment doesn't have an effect
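The steps above as a one-sample z-test sketch (all numbers are hypothetical):

```python
import math

# Hypothetical setup: known population mu = 100, sigma = 15;
# a treated sample of n = 36 has M = 106; two-tailed test at alpha = .05
mu, sigma, n, M = 100, 15, 36, 106
z_critical = 1.96                        # critical value for alpha = .05, two-tailed

standard_error = sigma / math.sqrt(n)    # sigma_M = sigma / sqrt(n)
z = (M - mu) / standard_error

reject_null = abs(z) > z_critical        # in the critical region -> reject H0
```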

effect size
magnitude of the treatment effect
Cohen’s D
.2 = small effect
.5 = medium effect
.8 = large effect
calculate: (µtx - µno tx) / σ
power
probability that the test will correctly reject the null hypothesis
helps determine # of participants needed
related to effect size -> higher effect size = higher chance of rejecting the null (both provide magnitude of tx effect)
decrease standard error between two distributions -> increase # of subjects
factors that affect power: sample size, alpha level, 1 tailed vs. 2 tailed
R2
another way to calculate effect size - the amount of variability/percentage of variance accounted for
.01 = small effect
.09 = medium effect
.25 = large effect
t - statistic
z stat is used to test a hypothesis about an unknown population mean when the population standard deviation is known
t stat is used to test a hypothesis about an unknown population mean when the standard deviation is unknown
only difference between t and z is the estimated standard error
calculate: t = (M - µ) / sM
difference between sample mean and population mean, divided by the difference expected by chance (estimated standard error)
hypothesis testing using t - stat
- set up hypotheses H0: µ1 = µ2; H1: µ1 ≠ µ2
- set the criteria
- set alpha
- find critical region
- collect data and evaluate
- calculate variance or SD (s² = SS/(n-1) = SS/df)
- calculate estimated standard error (sM = s/√n)
- calculate t-stat (t = (M - µ)/sM)
- make a decision
percentage of variance explained - r2
r² = t²/(t² + df)
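A one-sample t and its r² computed by hand on made-up data:

```python
import math

# Hypothetical sample; population mean under H0 is mu = 10
sample = [12, 14, 10, 12, 12]
mu = 10
n = len(sample)

M = sum(sample) / n
ss = sum((x - M) ** 2 for x in sample)
s2 = ss / (n - 1)                        # sample variance, SS / df
sm = math.sqrt(s2 / n)                   # estimated standard error
t = (M - mu) / sm

df = n - 1
r_squared = t ** 2 / (t ** 2 + df)       # percentage of variance explained
```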
independent measures t test
comparing means of 2 independent groups
uses separate sample for each of the tx populations compared
examine difference between population means of 2 independent groups
assumptions
- independent observations -> one observation doesn't affect the probability of other observations
- normal distribution
- populations have equal variance -> homogeneity of variance

hypothesis test for independent measure t-test
- state H0 and H1
- H0: µ1 = µ2 OR µ1 - µ2 = 0
- H1: µ1 ≠ µ2 OR µ1 - µ2 ≠ 0
- identify critical regions based on alpha
- calculate total df (df = df1 + df2)
- find critical region boundaries in the t distribution table
- evaluate assumptions
- compute statistics
- pooled variance
- estimated standard error
- independent samples t statistic
- make decision regarding H0
- independent measures t test gives us the total amount of error involved in using 2 sample means to estimate 2 population means
- tells the average distance between the sample mean difference and the population mean difference
- estimate the standard error using the sample standard deviation or variance; since there are two samples, we must average the two sample variances
pooled variance
weighted average of the two sample variances, weighting each by its degrees of freedom
s²p = (SS1 + SS2) / (df1 + df2)
estimated standard error
s(M1-M2) = √(s²p/n1 + s²p/n2)
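The pooled variance and estimated standard error as a sketch (SS and n values are hypothetical):

```python
import math

def pooled_variance(ss1, ss2, df1, df2):
    # weighted average of the two sample variances
    return (ss1 + ss2) / (df1 + df2)

def estimated_standard_error(sp2, n1, n2):
    # standard error of the difference between two sample means
    return math.sqrt(sp2 / n1 + sp2 / n2)

# Hypothetical values: SS1 = 50, SS2 = 30 with n1 = n2 = 6 (df = 5 each)
sp2 = pooled_variance(50, 30, 5, 5)
se = estimated_standard_error(sp2, 6, 6)
```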
estimated Cohens D - t-test
measures treatment effect
mean difference divided by the sample standard deviation (estimated, because it's a t-test)
d = (M - µ)/s
repeated measures design
repeatedly measures same individuals to assess change (within-subjects)
- same sample, test twice, before/after tx
- same subjects are being tested under different conditions
hypothesis testing repeated measures t - test
difference score (D) - change in an individual's score between the two measures
- state null and alternative
H0: µD = 0
H1: µD ≠ 0
- select alpha and critical values
- compute the t statistic
(do not have to compute pooled variance because it is one group)
- estimated standard error
- dependent sample t statistic
- make your decision

dependent sample t statistic
t = (MD - µD)/sMD, where MD = mean of the difference scores and sMD = sD/√n
r2 for repeated measures
r² = t²/(t² + df), same as for other t-tests
repeated measures (adv./disad.)
advantages
- allows researcher to exclude effects of individual differences (subjects serve as their own control group)
- requires fewer participants -> easier to recruit
- study individuals over time
disadvantages
- order effects
- variance reduced
- other things can affect results -> history, maturation, attrition, testing, instrumentation
independent measures (adv./disad.)
advantages
- order effects is not a problem
- does not require as many materials as repeated measures because different people are being studied, so you can reuse materials
disadvantages
- individual differences
correlation
measures and describes a relationship between two variables

pearsons correlation
sum of products (SP)
definitional method:
- calculate the mean for X and Y
- find deviation scores (X - MX) and (Y - MY)
- multiply deviation score X by deviation score Y for each pair
- add these products
computational formula: SP = ∑XY - (∑X)(∑Y)/n
r = SP/√(SSX · SSY)
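The computational formula, plus the Pearson r it feeds into (r = SP/√(SSX·SSY)); the data are made up:

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Computational formulas: SP = sum(XY) - (sum X)(sum Y)/n, and likewise for SS
sp = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
ss_x = sum(a ** 2 for a in x) - sum(x) ** 2 / n
ss_y = sum(b ** 2 for b in y) - sum(y) ** 2 / n

r = sp / math.sqrt(ss_x * ss_y)   # Pearson correlation
```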
spearman correlation
spearman uses ranks, one or both variables are ordinal
d = difference in rank scores
tied scores?
- list scores smallest to highest
- assign rank
- if tied, compute the mean of their ranked positions and assign this value as the final rank for each score
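The tie-handling rule as a sketch (the function name is illustrative):

```python
def rank_with_ties(scores):
    """Rank scores smallest to largest; tied scores share the mean of their positions."""
    ordered = sorted(scores)
    ranks = {}
    for value in set(scores):
        # 1-based positions this value occupies in the ordered list
        positions = [i + 1 for i, v in enumerate(ordered) if v == value]
        ranks[value] = sum(positions) / len(positions)
    return [ranks[v] for v in scores]

ranks = rank_with_ties([10, 20, 20, 40])   # tied 20s share positions 2 and 3
```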

linear equation
line of best fit
y = mx + b
- m = slope of the line
- b = y-intercept
least squared error solution
approach in regression to find the approximate solution of overdetermined systems (sets of equations with more equations than unknowns)

linear regression equation
all you need is slope and the y-intercept to create a line of best fit
y = bX + a
b = SP/SSX
a = MY - bMX
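A sketch of fitting b and a on made-up, perfectly linear data:

```python
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]        # hypothetical data lying exactly on y = 2x + 1
n = len(x)
Mx, My = sum(x) / n, sum(y) / n

sp = sum((a - Mx) * (b - My) for a, b in zip(x, y))   # sum of products
ss_x = sum((a - Mx) ** 2 for a in x)                  # SS for X

b = sp / ss_x               # slope
a = My - b * Mx             # y-intercept
```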
ANOVA
used to evaluate the differences between two or more sample means; compares variances
ANOVA is used because running multiple t-tests inflates Type 1 error
compares between-tx variance with within-tx variance
advantage: performs all tests with one hypothesis and one alpha, avoiding the problem of inflated experiment-wise alpha
hypotheses: null = all means are equal, alternative = there is at least one mean difference among the populations
ANOVA factors
number of independent variables
between subjects = different subjects used for different levels of the factor
within subjects = same subjects used for the different levels of the factor
ANOVA levels
number of conditions
ANOVA between tx variance
measures differences caused by
- systematic tx effects
- random, unsystematic factors
ANOVA within tx variance
measures differences caused by
- random, unsystematic factors
when are post hoc tests necessary for ANOVAs
post hoc tests are used when significant results are found and additional exploration of the differences among means is needed
provide specific info on which means are significantly different from each other
ANOVA effect size
r² = SSbetween/SStotal
- this is the percentage of variance accounted for by the treatment
Chi-square test
determines association between 2 categorical variables
- when scores violate assumptions of a parametric test
-> not normally distributed
-> unusually high variance
- undetermined or infinite scores
- this test determines how well the obtained sample proportions fit the population proportions specified by the null hypothesis
e.g. relationship between personality and color preference
hypothesis test for chi-square goodness of fit
hypotheses
H0: equal proportions or no difference from a known population
Example: Men 50%, women 50%
H1: unequal proportions or a difference from known population
fo = observed frequency
- represents real individuals
- always whole numbers
fe = expected frequency (proportion times n)
- predicted from the proportions in the null hypothesis and the sample size
- defines an ideal, hypothetical sample distribution that would be obtained if the sample proportions were in perfect agreement with the proportions specified in the null
chi-square stat
χ² = ∑(fo - fe)²/fe
df = C - 1 (C = # of categories)
use table to determine if the stat is in the critical region
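A goodness-of-fit sketch under a null of equal proportions (the counts are made up):

```python
# Observed counts across C = 3 categories; H0 says equal proportions
observed = [30, 10, 20]                    # hypothetical sample, n = 60
n = sum(observed)
expected = [n / len(observed)] * len(observed)   # 20 per category under H0

# chi-square = sum of (fo - fe)^2 / fe over all categories
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = len(observed) - 1                     # C - 1
```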

differences between fo and fe
small
- small value for chi-square
- conclude there is a good fit between data and hypothesis
- fail to reject null
large
- large chi-square
- reject the null
- want a large value for chi-square (to show an effect)!
chi square for independence
variables are independent when there is no consistent, predictable relationship between them
- two variables independent -> frequency distribution for one variable has the same shape at every level of the second variable
- if there is no relationship between the 2 variables (null) -> distributions have equal proportions
each individual classified on each of the 2 variables
- frequency distribution for sample tests hypothesis about corresponding frequency distribution for population
- H0: distributions are the same (no differences, no relationship)
phi-coefficient
φ = √(χ²/n); effect size for a 2×2 chi-square
.1 = small
.3 = medium
.5 = large

Cramér's V
df   small   medium   large
1    .10     .30      .50
2    .07     .21      .35
3    .06     .17      .29
