final Flashcards
-Different participants are observed onetime in each group or at each level of a factor
between-subjects design
Levels of a between-subjectsfactor are manipulated, then different participants are randomlyassigned to each group or to each level of that factor and observedone time
between-subjects experimental design
Type of factor in which different participants areobserved in each group or at each level of the factor
between subjects factor
A statistical procedure used to test hypotheses concerning thedifference between two population means, where the variance in oneor both populations is unknown.
* Making comparisons between two independent groups
two independent sample t test
refers to the selection of participants such that different participants areobserved one time in each sample or group
independent samples
categorical variable (school) with two levels(Towson and JMU)
IV
continuous variable (IQ)
DV
The inferential statistic you use to look at ___between groups is the independentsamples t-test
mean differences
assume data in each population are normally distributed
normality
assume that the data were obtained using a randomsampling procedure
random sampling
each measured outcome is independent; one outcome isnot influenced by any other
independence
assume variance in each population is equal to each other. This is satisfied when larger variance is not greater than two times that of the smaller
equal variances (homogeneity of variances)
how to determine homogeneity v hetero. of variance?
levene’s test
if we reject null, SPSS reports t-test that does not assume…
equal variance
The null states that there is no difference and the alternative states that the difference is different than ___
0
Remember that we will only really report an effect size IF we ___
reject the null
place the difference between two sample means in the numerator and the pooled sample standard deviation in the denominator
estimated cohen’s d
results of t-test
- The value for the test statistic
- Degrees of freedom
- p value
Cohen’s d is most often reported with ___tests
t
participants in each group or sampleare related
related or dependent sample
they are observed in more than one group
repeated measures
They are matched, experimentally or naturally, based on the common characteristics or traits that they share
matched pairs design
repeated measures design
- A research design in which the same participants are observed in each sample.
- This is the most common related samples design
type of repeated measures design in which researchers measure a dependent variable for participants before (pre) and following(post) some treatment
pre-post design
type of repeated measures design where researchersobserve the same participants across many treatments but not necessarilybefore and after a treatment
within-subjects design
Matching through ___ is typical for experiments in which the researcher manipulates the traits or characteristics used to match theparticipants
experimental manipulation
Matching through __ is typical for quasi-experiments in which participants are matched based on their preexisting traits or characteristics
natural occurrence
a statistical procedure used to test hypotheses concerning two related samples selected from populations in which the variance in one or both populations is unknown
* Comparing mean difference between pairs of scores in population to those observed in a sample
related samples t test
score or value obtained by subtracting two scores. In a related samples t test, this is obtained prior to computing the test statistic
difference score
the larger the value, the ___ likely a sample mean difference could occur if the null hypothesis were true
less
The degrees of freedom for the related-samples t test
n-1
we assume thatdifference scores were obtained from different individualswithin each group or treatment
independence w/in groups
An estimate of the proportion of variance in a dependent variablethat can be explained by a treatment
proportion of variance
in hypothesis testing, we are often interested in ___ than two groups
more
is a statistical procedure used to test hypotheses for one or more factors concerning the variance among two or more group means, where the variance in one or more populations is unknown
An Analysis of Variance (ANOVA), also called the F test
one factor tested
one way anova
two factors tested
two way anova
research design in which we select independent samples,meaning that different participants are observed at each level of a factor
between-subjects design
of groups
k
of participants per group
n
of total participants in study
N
any variation that can be measured in a study. In the one-waybetween-subjects ANOVA, there are two sources of variation
source of variation
variance of group means
between groups variation
variation attributed to error
within groups (error) variation
the distribution of possible outcomes for the test statistic is ___ skewed
positively
The distribution, called the F distribution, is derived from a sampling distribution of F ___
ratios
The variances measured in an ANOVA test are computed as ___ squares, or variance
mean
It is computed as mean square (or variance) betweengroups divided by the mean square (or variance) withingroups
F statistic
variance between groups/variance within groups
F obtained
As M1, M2, M3 deviatefrom one another (getfurther apart), their variance will ___
increase
As variances within-conditions ___, discrepancy of the means is less meaningful
increases
As variances within-conditions ___, discrepancy of the means is more meaningful
decrease
statistical procedure computed following a significant ANOVA todetermine which pair or pairs of group means significantly differ
post hoc test
These tests are necessary when k > 2 because multiple comparisons are needed
post hoc
All post hoc tests control for experimentwise ___
alpha
Bonferonni is far too ___
conservative
Tukey’s HSD is most -__
conservative
Fisher’s LSD is most ___
liberal
a statistical procedure used to test hypotheses for one factor with two or more levels concerning the variance among group means. This test is used when the same participants are observed at each level of a factor and the variance in any one population is unknown
one-way within-subjects ANOVA
An ANOVA determines whether ___significantly vary in the population
group means
estimates how much of the variability in the dependent variable can be accounted for by the levels of the factor
proportion of variance
The within-subjects design is associated with ___ power to detect aneffect than the between-subjects design because some of the error in thedenominator of the test statistic is removed
more
The power of the one-way within-subjects ANOVA is largely based upon the assumption that observing the same participants across groups will result in more ___ responding, or changes in the dependent variable, between groups
consistent
statistical technique that describes the linear relationship between two variables.
correlation
Variables are usually observed in ___ environment, with no manipulation by the researcher;
natural
however, correlation methods can be applied to ___ data as well
experimental
statistical procedure used to describe the strength and direction of thelinear relationship between two factors
correlation
In a correlation, we treat each factor like a ___ variable and measure the relationship between the pair.
dependent
In behavioral research, we mostly describe the ___ (or straight line) relationshipbetween two factors
linear
Correlations are typically illustrated through a ___ plot
scatter
used to measure the strength and direction of the linear relationship, or correlation, between two factors (-1 to +1)
correlation coefficient
Values closer to ___ indicate stronger correlations
±1.0
indicates only the direction or slope of the correlation
the sign of the correlation coefficient
the values of the two variables move in the same direction
positive correlation
the values of the two variables move in the opposite direction
negative correlation
reflects how consistently scores for each factor change
strength of correlation
When plotted in a graph, scores are more consistent the closer they fall to a ___ line
regression
means there is no linear pattern between two factors
zero correlation
occurs when each data point falls exactly on a straight line
perfect correlation
-the best fitting straight line to a set of data points. A best fitting line is the line that minimizes the distance of all data points that fall from i
regression line
linear vs. non-linear
form
Slopes that “flatten out” at the top are common in the social sciences
monotonic
used to measure the direction and strength of the linear relationship of two factors in which the data for both factors are measured on an interval or ratio scale of measurement
pearson correlation coefficient
The value in the ___ reflects the extent to which values on the x-axis (X) and y-axis (Y) vary together
numerator
the extent to which the values of two factors vary together
covariance
The extent to which values of X and Y vary independently, or separately, is placed in the ___
denominator
correlations
describe the degree of linear relationship between 2 variables
* Does not explain why the variable are related
* Does NOT indicate cause and effect relationship
is just an index, not a measurement scale
correlation coefficient
A correlation can be used as a ___ statistic
descriptive and inferential
mathematically equivalent to eta-squared, and is used to measure the proportion of variance of one factor (Y) that can be explained by known values of a second factor (X)
coefficient of determination
01 = small; .09 medium; .25 large
r^2
assumption that there is an equal (“homo”)variance or scatter(“scedasticity”) of datapoints dispersed along the regression line
Homoscedasticity
assumption that the best way to describe a pattern of data is using a straight line
linearity
is a problem that arises when the direction of causality between two factors can be in either direction
reverse causality
or third variable, is an unanticipated variable that could be causing changes in one or more measured variables
confounding variable
Outliers can obscure the relationship between two factors by altering the ___of an observed correlation
direction and strength
A problem that arises when the range of data fo rone or both correlated factors in a sample is limited or restricted, compared to the range of data in the population from which the sample was selected
restriction of range
All correlation coefficients are derived from the ___ correlation
pearson
To summarize correlations, report the ___for each correlation coefficient
strength, direction, and p value
means that additional factors do not enter into or confound the XY relationship
nonspuriousness
SS regression + SS residual
total variance
or known variable (X)–the variable with values that are known and can be used to predict values of another variable
predictor variable
or to-be-predicted variable (Y)–the variable with unknown values that can be predicted or estimated, given known values of the predictor variable
criterion variable
linear relationship between two variables (x and y)can be described by the equation y = a + b(x) + e
linear equation
This parameter that we estimate tells us the value of y when x equals 0
y-intercept
The ___(formed by the equation y = a + b(x)) is the one line that best describes theset of data
regression line or line of best fit
SS(xy)/SS(x)
slope
statistical procedure used to test hypotheses for one or more predictor variables to determine whether the regression equation for a sample of data points can be used to predict values of the criterion variable (Y) given values of predictor variable (X) in the population
analysis of regression, or regression analysis
variance that is related to changes in X
regression variation
Variance that is not related to changes in X
residual variation
Analysis of regression measures only variance in ____, because that is what we want to predict
Y
an estimate of the standard deviation or distance that a set of data points falls from the regression line. The standard error of estimate equals the square root of the mean square residua
standard error of estimate
the regression line allows us to make predictions but it doesn’t provide information about the ___ of the predictions
accuracy
Provides a measure of the typical distance between a regression line (predicted scores) and the actual data points
standard error of estimate
As the correlation increases (gets closer to 1.00 or -1.00), the data points are more tightly clustered around the regression line, and thus we will have better prediction (___ prediction error)
less
As the correlation gets smaller (approaches 0), the data points arespread further out from the regression line, and thus we will haveworse prediction (____ prediction error)
greater
parameter estimate based on the sample will not, on average, equal true value of regression coefficient in population
bias
statistical method that includes two or more predictor variables in the equation of a regression line to predict changes in a criterion variable
multiple regression
One advantage of including multiple predictors in the regression equation is that we can detect the extent to which two or more ___ variables interact
predictor
A family of statistical procedures that do not rely on the assumptions of parametric tests. In particular, they do not assume that the sampling distribution is normally distributed
nonparametric tests
statistical procedure used to test hypotheses about the discrepancy between the observed and expected frequencies for the levels of a single categorical variable or two categorical variables observed together
chi square test
Indicates how well a set of observed frequencies fits with what was expected
chi square
The null hypothesis for the chi-square goodness-of-fit test is that the expected frequencies are ____
correct
The alternative hypothesis is that the expected frequencies are ____
not correct
The larger the discrepancy between the observed and expected frequencies, the more likely we are to ___ the null hypothesis (chi square)
reject
positively skewed distribution of chi-square test statistic values for all possible samples when the null hypothesis is true
chi square distribution
The df for each chi-square distribution are equal to the number of levels ofthe categorical variable (k) minus 1: df =____
k - 1
Because the chi-square distributionis positively skewed, the rejectionregion is always placed in the ___ tail
upper
(1) is not interpreted in terms of differences between categories* (2) can be used to confirm that a null hypothesis is correct
chi square test
(chi) We compare the discrepancy between observed and expected frequencies at each level of the ____ variable, thereby making a total of k comparisons.
* Do not compare across ___ of the categorical variable
categorical; levels
(chi) Because the test statistic does not compare differences between the discrepancies, there is no statistical basis for identifying which discrepancies are actually ___
significant
When a chi-square goodness-of-fit test is significant,we mostly speculate as to which ___ frequencies were significantly different from the expected frequencies.
observed
The chi-square goodness-of-fit test is one of the few hypothesis tests used to confirm that a null hypothesis is ____
correct
Decision to ___ the null is the goal of the chi-square goodness-of-fit test
.* It is a rare example of a test used for this purpose.
retain
A key assumption for the chi-square goodness-of-fit test is that the observed frequencie sare recorded ____
independently
(chi) One restriction using this test is that the size of an expected frequency should never be smaller than ____ in a given category
5
statistical procedure used to determine whether frequencies observed at the combination of levels of two categorical variables are similar to frequencies expected
chi square test for independence
The chi-square test for independence is interpreted similar to a ____
correlation
If two categorical variables are ____, they are not related or correlated
independent
If two categorical variables are ____, they are related or correlated
dependent
The phi correlation coefficient can be used in place of a _____ chi-squaretest for independence
2x2
The _____coefficient and the chi-square test statistic are related in that we can use the value of one to compute the other
phi correlation
measures of effect size for chi-square of independence
- Phi coefficient
- Cramer’s V
To summarize the chi-square tests for independence, also report the ____
effect size