week 15: hypo testing III Flashcards
inferential statistics
- infers characteristics of a total population based on a sample
- the validity of the inference depends on the quality of the sample
- generalization requires a representative sample
- no sample will be a perfect replica of a population
- an imperfect sample does not negate the value of a study, but it limits the conclusions that may be drawn from it
what is the formal definition of inferential stats
determine the probability that the null hypothesis accurately represents the conditions in the population; if the statistical test is significant, then the researcher will reject the null hypothesis
what does it mean if a statistic is significant at 0.05 level
the probability of making an error by rejecting the null hypothesis is less than 5 in 100
what does it mean if stats are not significant
the probability of error in rejecting the null hypothesis is unacceptably high, usually because the chance of error is greater than 5 in 100
type 1 error
reject the null hypo when it is correct
- saying something is significant when it is not
- could be from sampling issues
type 2 error
occurs when a researcher fails to reject the null hypothesis when it is incorrect
- saying no significant difference when there is
- –could be from lack of power, so recruit more participants to fix this
how to choose a statistical test
depends on the level of measurement (nominal, ordinal, interval, or ratio)
parametric statistics
recommended for interval or ratio level measurements
*observations that fit certain assumptions about the distribution of data around the mean (normal distribution) and larger sample sizes
nonparametric statistics
recommended for ordinal level measurements, situations in which you are uncertain about the distribution of the data, and have smaller sample sizes
t-test or analysis of variance (anova)
- parametric statistics–probability distribution
- require continuous (interval or ratio) data or suitably transformed data
- powerful and convenient for testing Ho
- student’s t test (leptokurtic)
- –used when n is less than 30; the distribution depends on the degrees of freedom
- z-test if n is bigger than 30
- all assume random sample
chi-squared test
usually the best choice for categorical variables
- nonparametric stats–a test of independence
- typically used to analyze data that are too weak to analyze with a t-test or anova
regression
evaluate the relationship between variables and allow prediction
test for the difference between two samples
- t test is the most common for analyzing the difference between two sets of data when you have interval or ratio level measures
- –specifically tests the difference between means to determine if the two groups are significantly different
3 factors affecting whether you will find a significant difference with a t test
- magnitude of the difference between means
- amount of variability of the data
- sample size
related t test
two measures are performed on the same participant
—this is a paired t-test
independent t test
independent samples (between subjects) *perform an independent t test
paired samples designs
within the same subject
- if the observed difference is not attributable to chance, the Ho is rejected
- examples of designs: pretest/posttest or same subjects designs with two conditions
- assumptions:
- –continuous data
- –random samples
- –two sets of scores are correlated
- –the difference scores distributions are normal
- wilcoxon signed-rank test for paired data is the nonparametric test
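A minimal sketch of the paired t test described above, using only the Python standard library. The scores are made-up illustration data, and the critical value is an assumption read from a standard t table (two-tailed, alpha = 0.05, df = 4).

```python
import math
from statistics import mean, stdev

# hypothetical pretest/posttest scores for the same five participants
pretest = [10, 12, 9, 11, 13]
posttest = [12, 15, 10, 14, 14]

# a paired t test works on one difference score per participant
differences = [post - pre for pre, post in zip(pretest, posttest)]
n = len(differences)

# t = mean difference / standard error of the mean difference
standard_error = stdev(differences) / math.sqrt(n)
t_observed = mean(differences) / standard_error
degrees_of_freedom = n - 1

# assumption: two-tailed critical value for alpha = 0.05 and df = 4 (t table)
T_CRITICAL = 2.776
reject_null = abs(t_observed) > T_CRITICAL

print(f"t = {t_observed:.3f}, df = {degrees_of_freedom}, reject Ho: {reject_null}")
```

Because each participant serves as their own control, only the variability of the difference scores enters the standard error.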
independent samples design
includes 2 groups of subjects
- –known as between subjects design
- independent samples t test tells if the observed difference between two groups is attributable to chance
- z-test if samples are above 30 and student’s t if below 30
- assumptions
- –continuous data
- –random samples
- –2 independent and unrelated groups
- –normal distribution
- –groups have equal variance
confidence interval
a confidence interval establishes the precision of a result and helps in deciding whether to reject the Ho
- a way of estimating the margin of error associated with the study
- –larger groups have smaller CI and the more the CI overlap for each group the less definitive the group differences
- if the CI for the difference between means contains 0, the Ho is not rejected
- the width of the CI indicates the precision of measurement, which depends on the sample size
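A rough sketch of how sample size drives the width of a confidence interval for the difference between two group means. It assumes equal group sizes, a common standard deviation, and a normal (z) approximation; the function name `diff_ci` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def diff_ci(mean_diff, sd, n_per_group, confidence=0.95):
    """CI for a difference between two means (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    standard_error = sd * math.sqrt(2 / n_per_group)
    return mean_diff - z * standard_error, mean_diff + z * standard_error

# same observed difference and variability, two different sample sizes
small_lo, small_hi = diff_ci(mean_diff=2, sd=10, n_per_group=50)
large_lo, large_hi = diff_ci(mean_diff=2, sd=10, n_per_group=200)

small_contains_zero = small_lo <= 0 <= small_hi
large_contains_zero = large_lo <= 0 <= large_hi

print(f"n = 50:  ({small_lo:.2f}, {small_hi:.2f}) contains 0: {small_contains_zero}")
print(f"n = 200: ({large_lo:.2f}, {large_hi:.2f}) contains 0: {large_contains_zero}")
```

With four times as many participants per group the interval is half as wide, and in this made-up case it no longer contains 0.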
what is the outcome of a t-test and what do you do with it
- outcome is known as the t-ratio using the pooled variance
- to test the null hypo, the observed value for t is compared to the critical value
- –if the observed value is greater than the critical value the null hypo is rejected and the alt hypo is accepted
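The t-ratio and the comparison with the critical value can be sketched as follows. The data are invented, `pooled_t` is a hypothetical helper name, and the critical value is an assumption taken from a standard t table (two-tailed, alpha = 0.05, df = 8).

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Independent samples t-ratio using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# hypothetical scores for two independent groups
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_observed, df = pooled_t(group_a, group_b)

# assumption: two-tailed critical value for alpha = 0.05 and df = 8 (t table)
T_CRITICAL = 2.306
reject_null = abs(t_observed) > T_CRITICAL

print(f"t = {t_observed:.3f}, df = {df}, reject Ho: {reject_null}")
```

The absolute value is used so that the same rule works for a two-tailed test regardless of which group has the larger mean.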
ANOVA
analysis of variance; often used when there are more than two groups or two conditions
- –gives an f statistic or f ratio
- assumes
- –continuous data
- –random sampling
- –normal distributions
- –equal variances
- analyzes many groups at one time
- is related to the t statistic: with two groups, F = t squared
one way anova
- separates variance into two distinct parts
- –within group variability which is the portion of variability that cannot be explained by the research design; this is known as the mean square for error
- –between group variability is the portion of variance attributable to group membership (effect) this is known as the mean square for effect
- the two variances are compared in order to test whether the ratio of variances is significantly greater than 1
- Kruskal-Wallis is nonparametric
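A small sketch of the one way anova partition, splitting variance into between group (effect) and within group (error) parts. The three groups are invented data, and the critical value is an assumption from a standard F table (alpha = 0.05, df = 2 and 6).

```python
from statistics import mean

# hypothetical scores for three independent groups
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
all_scores = [score for group in groups for score in group]
grand_mean = mean(all_scores)

# between group variability: attributable to group membership (effect)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
df_between = len(groups) - 1
ms_effect = ss_between / df_between

# within group variability: not explained by the design (error)
ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
df_within = len(all_scores) - len(groups)
ms_error = ss_within / df_within

# the F ratio compares the two mean squares
f_ratio = ms_effect / ms_error

# assumption: critical F for alpha = 0.05 with df = (2, 6), from an F table
F_CRITICAL = 5.14
reject_null = f_ratio > F_CRITICAL

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}, reject Ho: {reject_null}")
```

An F ratio near 1 would mean the groups differ no more than scores within a group do.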
factorial designs
- investigates two or more independent variables in the same study
- can use two way anova, 3 way anova, or more but rare with more than 4-5 factors
- two way anova enables you to determine if there are differences associated with main effects as well as any interactions between the levels
two way anova
- a significant interaction means that the outcome for one of the independent variables was different depending on the level of another independent variable
- –useful to plot the individual means on a line graph to show the interaction between the main effects
- nonparametric = friedman’s test
- –does not evaluate interaction effects
repeated measures anova
available for situations in which researchers obtained several measures from the same participants
- these kinds of designs involve participants being observed under 3+ experimental conditions
- could be one way or two way depending on the number of independent variables
mixed models anova
- analysis of randomized pretest/posttest design
- deemed “mixed model” because it has both repeated measure factor and between group factor
- sources of variance:
- –participants
- –between group factor
- –repeated measures factor
- –interaction (treatments by time)
- –error term
post-hoc comparisons
- when multifactor anova identifies significant effects, the location of the effect is not always evident
- –significant main effect simply indicates the presence of at least one significant difference between means
- –to identify the specific pairs of means that are significantly different, researchers use a post hoc comparison
measures of association
used when researchers are interested in investigating the relationship between two or more sets of scores
- gives us info about the strength of the relationships as well as the direction of the relationship
- commonly used:
- –correlation coefficient
- –regression models
- –chi-squared analysis
- –contingency coefficients
correlation coefficient
- pearson product-moment correlation coefficient
- spearman rank-ordered correlation coefficient
- interpretation:
- –direction of the relationship
- –magnitude of the correlation
- –the possibility that the observed relationship between the variables occurred because of random chance
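A minimal sketch of the pearson product-moment correlation computed from deviation scores, on invented data. The helper name `pearson_r` is hypothetical; the sign gives the direction and the absolute value gives the magnitude.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation from deviation scores."""
    mean_x, mean_y = mean(x), mean(y)
    dev_x = [value - mean_x for value in x]
    dev_y = [value - mean_y for value in y]
    sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_xx = sum(dx ** 2 for dx in dev_x)
    sum_yy = sum(dy ** 2 for dy in dev_y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
direction = "positive" if r > 0 else "negative"

print(f"r = {r:.3f} ({direction})")
```

Whether the relationship could be due to chance still requires a significance test on r, which this sketch does not perform.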
regression models
allow prediction of values
*simple regressions with one independent and one dependent variable or multiple regression
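A simple regression with one independent and one dependent variable can be sketched with ordinary least squares. The data and the helper names `fit_line` and `predict` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# hypothetical data: x is the independent variable, y the dependent variable
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)

def predict(new_x):
    """Predicted value of y for a new x."""
    return intercept + slope * new_x

print(f"y = {intercept:.2f} + {slope:.2f} * x, prediction at x = 6: {predict(6):.2f}")
```

Multiple regression follows the same idea with more than one independent variable, but needs matrix methods that are outside this sketch.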
another term for categorical data
count data
*used for nominal measures
chi squared tests
- analysis of count data
- –do the observed frequencies (counts) differ from the frequencies expected by chance?
- these tests have many limitations and assumptions:
- –individual observations must be independent of each other
- –observations for analysis must be count data
- –the sum of the expected freq must equal the sum of the observed freqs
- –the categories must be exclusive of one another
- –the expected values for any one cell must not be too small
contingency tables
the rows-by-columns tables that are used to organize the data for X2 analysis
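A short sketch of a chi squared test of independence on a 2 x 2 rows-by-columns table. The counts are invented, and the critical value is an assumption from a standard chi squared table (alpha = 0.05, df = 1).

```python
# hypothetical count data: rows = groups, columns = outcome categories
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# frequency expected by chance for each cell: row total * column total / N
expected = [[r * c / total for c in col_totals] for r in row_totals]

# chi squared = sum over cells of (observed - expected)^2 / expected
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# assumption: critical value for alpha = 0.05 and df = 1 (chi squared table)
CHI2_CRITICAL = 3.841
reject_null = chi_squared > CHI2_CRITICAL

print(f"chi squared = {chi_squared:.2f}, df = {df}, reject Ho: {reject_null}")
```

The expected frequencies sum to the same total as the observed counts, which matches one of the assumptions listed for these tests.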
how is the practical significance of a statistical outcome expressed
effect size
- statistical significance does not equal practical significance
- effect size is important and should always be reported!
simple effect size
the raw difference between treatment means
- useful for easily interpretable units of measurement
- not good when there is a lot of variability involved in the measurement, and it cannot be compared between studies as it is not a standard unit
effect-size correlation
the correlation between the independent variable and the individual scores
*r is good for the directionality and size and r2 is the proportion of variance explained
the standardized effect size
most commonly used
- accounts for variability and provides a statistic that can be used to compare with the results of other studies
- many measures
- –cohen’s d which is best used for large samples
- –Hedges’ g which is better for small samples
- –eta squared which is best for anova and is similar to r squared
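A sketch of two standardized effect sizes on invented data. Cohen's d divides the raw mean difference by the pooled standard deviation; the Hedges' g shown here uses the common approximate small-sample correction, which is an assumption about which version of g is intended. The function names are hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def hedges_g(group_a, group_b):
    """Cohen's d shrunk by an approximate small-sample correction factor."""
    total_n = len(group_a) + len(group_b)
    correction = 1 - 3 / (4 * total_n - 9)
    return cohens_d(group_a, group_b) * correction

# hypothetical scores for two small groups
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

simple_effect = mean(group_a) - mean(group_b)  # raw difference between means
d = cohens_d(group_a, group_b)
g = hedges_g(group_a, group_b)

print(f"raw difference = {simple_effect}, d = {d:.3f}, g = {g:.3f}")
```

The correction factor approaches 1 as the total sample grows, so d and g converge for large samples.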
the problem of unequal sample sizes
- affects power and efficiency of experimental design
- the amount of loss in power and efficiency is identified by comparing the arithmetic and harmonic means of the sample sizes
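The comparison of arithmetic and harmonic means can be sketched as below. The group sizes are invented, and reading the ratio as a rough index of retained efficiency is an assumption for illustration, not a formal power calculation.

```python
from statistics import mean, harmonic_mean

# hypothetical unequal group sizes
group_sizes = [10, 30]

arithmetic_n = mean(group_sizes)          # simple average group size
harmonic_n = harmonic_mean(group_sizes)   # pulled toward the smaller group

# assumption: the ratio is read as a rough index of efficiency retained
relative_efficiency = harmonic_n / arithmetic_n

print(f"arithmetic mean n = {arithmetic_n}, harmonic mean n = {harmonic_n:.1f}")
print(f"relative efficiency = {relative_efficiency:.2f}")
```

With equal group sizes the two means coincide and the ratio is 1, so any value below 1 reflects the cost of the imbalance.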