Lecture 9 Flashcards
Statistics are an _____ way of interpreting a collection of _____. In other words, it is how we ____ data we have collected.
- objective
- observations
- analyze
2 types of statistics:
- descriptive techniques
- inferential techniques
5 purposes in selecting tools for analysis:
- to describe
- to compare
- to associate
- to predict
- to explain
What question is asked when selecting tools for analysis to describe?
What are the characteristics of some group or groups of people? (eg. standard deviation)
What question is asked when selecting tools for analysis to compare?
are two or more groups the same or different on some characteristic? (eg. t-test)
What question is asked when selecting tools for analysis to associate?
are 2 variables related and what is the strength of this relationship? (eg. correlation coefficient)
What question is asked when selecting tools for analysis to predict?
can measures be used to predict something in the future? (eg. regression)
What question is asked when selecting tools for analysis to explain?
given some outcome or phenomenon, why does it occur? (eg. structural equation modeling)
2 broad types of numerical data:
- continuous
- discrete
Continuous data:
measurement theoretically possible at any point along a continuum
Discrete data:
limited to a specific number of values
Descriptive stats is used to:
- organize
- simplify
- summarize the collected data
2 ways of describing data numerically:
- central tendency
- variation
Central tendency consists of:
- arithmetic mean
- median
- mode
Variation consists of:
- range
- interquartile range
- variance
- standard deviation
- coefficient of variation
Measures of central tendency indicate the _____ around which scores tend to be _____.
- points
- concentrated
Mean:
- the sum of scores divided by the number of scores
- used with interval or ratio data
Most common measure of central tendency:
mean
Median:
- middle score
- used with ordinal data
Mode:
- most frequent score
- used with nominal/categorical data
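Worked example (not from the lecture): a minimal sketch of the three measures of central tendency using Python's standard `statistics` module. The scores are made up for illustration.

```python
import statistics

# Hypothetical scores, invented for this example.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores divided by the number of scores
median = statistics.median(scores)  # middle score (average of the two middle scores when n is even)
mode = statistics.mode(scores)      # most frequent score

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```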
Measures of variability describe data in terms of its _____ or _______.
- spread
- heterogeneity
Easiest measure of variability:
range
Range:
- difference between the highest and lowest score
- ignores the distribution of data and is sensitive to outliers
Interquartile range:
- eliminates some outlier issues
- eliminates the high- and low-valued observations and calculates the range of the middle 50% of the data
Variance:
average of squared deviations of values from the mean
Standard deviation:
- square root of the variance
- shows variation about the mean
- has the same units as the original data
Most commonly used measure of variation:
standard deviation
Coefficient of variation:
- measures relative variation
- always in %
- shows variation relative to mean
- can be used to compare 2 or more sets of data measured in different units
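Worked example (not from the lecture): a rough sketch of the five measures of variation on made-up scores. It assumes the population form of the variance (dividing by N), which matches the "average of squared deviations" definition above. Many courses use the sample form (dividing by n - 1) instead, available as `statistics.variance`. Quartile conventions also differ between textbooks, so the interquartile range may not match a hand calculation exactly.

```python
import statistics

# Hypothetical scores, invented for this example.
scores = [4, 8, 6, 5, 3, 7, 9, 6]

value_range = max(scores) - min(scores)          # highest score minus lowest score

q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartile cut points (library default method)
iqr = q3 - q1                                    # range of the middle 50% of the data

variance = statistics.pvariance(scores)          # average of squared deviations from the mean
std_dev = statistics.pstdev(scores)              # square root of the variance, same units as the data

# Coefficient of variation: standard deviation relative to the mean, as a percentage.
cv_percent = std_dev / statistics.mean(scores) * 100

print("range:", value_range)
print("IQR:", iqr)
print("variance:", variance)
print("standard deviation:", round(std_dev, 3))
print("coefficient of variation (%):", round(cv_percent, 1))
```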
Data typically consist of a set of scores called a ______. These scores result from the ______ taken.
- distribution
- measurements
The original measurements or values in a distribution are called ____ ____.
raw scores
How do we organize raw data?
- frequency distributions (graphing)
- the normal curve
3 steps in data analysis:
- select the appropriate statistical technique
- apply the technique
- interpret the result
In inferential statistics, there are techniques that allow us to ____ samples and then make _____ back to the target population.
- study
- generalizations
Inferential statistics are a key part of ____ _____ as they are used to test ______.
- scientific research
- hypotheses
Inferential stats techniques can be divided into 2 categories:
- those used to test relationships between or among variables in descriptive research
- those used to test differences between means in experimental research
What are the 2 techniques that are used to test relationships between or among variables in descriptive research?
- correlation
- regression and multiple regression
What are the 2 techniques that are used to test differences between means in experimental research?
- t-test
- analysis of variance
5 steps to hypothesis testing:
- state the hypothesis
- select the probability level
- calculate the test statistic
- determine the value needed to reach statistical significance (critical value)
- accept (fail to reject) or reject Ho
Stating the hypothesis consists of:
- Ho
- H1
Selecting the probability level consists of:
the probability of a chance occurrence that a researcher is willing to accept and is set before the study
Alpha probability level:
0.05
The test statistic is either ___ or ____.
- t
- r
For a relationship question, this suggests that 2 variables are ____ ____.
truly related
For a difference question, this suggests a ____ _____ exists between ….
- real difference
- groups, measures over time, treatments etc.
4 possible outcomes of hypothesis testing:
- Ho is actually true, test correctly fails to reject it
- Ho is actually false, hypothesis test correctly reaches this conclusion
- Ho is actually true, but hypothesis test incorrectly rejects it (type I error)
- Ho is actually false, but the hypothesis test incorrectly fails to reject it (type II error)
Errors in decisions are labeled ____ and _____ error.
- type I
- type II
Type I:
- researchers make the decision that a manipulation or treatment has been successful when in fact it has not been
- false positive
Type II:
- researchers make the decision that the manipulation has failed, when in reality it actually did work
- false negative
Statistics that test differences among groups allow you to determine:
- are the means significantly different?
- how meaningful is this difference? (strength of association between IV and DV)
Characteristics of t-tests:
- require interval or ratio level scores
- used to compare 2 mean scores
3 types of t-tests:
- one-group t test
- two independent groups t test
- two dependent groups (correlated) t test
One group t test:
t test between a sample and population mean
Two independent groups t test:
compares the mean scores of 2 independent samples
Two dependent groups (correlated) t test:
compares 2 mean scores from a repeated measures (pre to post test) or matched pairs design
4 steps of hypothesis testing (t-tests):
- “critical values of t” table
- level of significance : alpha (0.05)
- degrees of freedom (dependent, independent)
- one-tailed or two-tailed test (depends on your hypothesis)
Dependent degrees of freedom:
df = n - 1
Independent degrees of freedom:
df = n1 + n2 - 2
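Worked example (not from the lecture): a minimal sketch of a two independent groups t test computed by hand. It assumes the pooled-variance (equal variances) form of the test, and the group scores are made up. The critical value 2.306 is the standard table value for a two-tailed test at alpha = 0.05 with df = 8.

```python
import math
import statistics

# Hypothetical scores for two independent groups, invented for this example.
group_a = [12, 15, 11, 14, 13]
group_b = [9, 10, 12, 8, 11]

n1, n2 = len(group_a), len(group_b)
mean1, mean2 = statistics.mean(group_a), statistics.mean(group_b)

# Sample variances (divide by n - 1), as used in the t test.
var1, var2 = statistics.variance(group_a), statistics.variance(group_b)

# Independent degrees of freedom: df = n1 + n2 - 2
df = n1 + n2 - 2

# Pooled variance and standard error of the difference between means
# (assumes the two groups have roughly equal variances).
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Test statistic
t_stat = (mean1 - mean2) / std_error

# Critical value from a "critical values of t" table:
# two-tailed, alpha = 0.05, df = 8.
critical_t = 2.306

reject_h0 = abs(t_stat) > critical_t
print("df:", df)
print("t:", round(t_stat, 3))
print("reject Ho:", reject_h0)
```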
ANOVA =
analysis of variance
ANOVA is a commonly used statistical test that may be considered a _____ _____ of the t test.
logical extension
ANOVA tests allow researchers to compare …
two or more groups in one test rather than across several separate tests
ANOVA requires ____ or ____ level scores.
- interval
- ratio
ANOVA is used for comparing 2 or more ____ _____.
mean scores
Factorial ANOVA:
also tests an interaction effect
Repeated measures ANOVA:
- used when researchers have 2 or more time points
- same people measured a few or more times
- baseline assessment, post-intervention, follow up
ANCOVA:
- analysis of covariance
- used when researchers have control variables that they have identified as important in the analysis
Inferential statistics is used to determine the _____ between 2 or more variables, and when your research question is….
- relationship
- whether or not 2 or more variables are related to one another
Correlation is a family of statistical techniques that is used to _____ the ______ between 2 or more variables.
quantify the relationship
Correlation coefficient can range from ____ to ____.
- -1.0 to +1.0
What is used as a graphic illustration of the relationship between 2 variables in correlational statistics?
scatterplot
2 correlational techniques:
- Pearson product-moment correlation (r)
- Spearman rank order (ρ) correlation (rs)
Pearson product-moment correlation (r):
- requires interval or ratio scores
- each participant has scores on 2 variables
- most frequently used
Spearman rank order (ρ) correlation (rs):
- nonparametric technique for use with ordinal scores
- each participant has scores on 2 variables
A correlation coefficient can tell you:
- whether a significant relationship exists between 2 variables
- whether the relationship is pos. or neg. (direction)
- the magnitude of the correlation (strength)
- how meaningful the relationship is
Correlation coefficients do not imply a ____ ____ ____ relationship between variables.
cause and effect
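Worked example (not from the lecture): a minimal sketch of the Pearson product-moment correlation computed from its deviation-score definition. The paired scores and the helper name `pearson_r` are made up for illustration. The squared coefficient (r squared) is included as one common way to judge how meaningful a relationship is. That interpretation is an addition here, not something stated on the cards.

```python
import math

# Hypothetical data: each participant has a score on 2 variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def pearson_r(xs, ys):
    """Pearson r: sum of cross-products of deviations over the
    square root of the product of the sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sum_xx = sum((a - mean_x) ** 2 for a in xs)
    sum_yy = sum((b - mean_y) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


r = pearson_r(x, y)
r_squared = r ** 2  # proportion of shared variance

print("r:", round(r, 3))                    # the sign gives the direction
print("r squared:", round(r_squared, 3))    # the size gives the strength
```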
Chi-square test is used to examine the _____ in _____ between groups or a comparison between the ______ or ____ ____ ____ and what would be expected by chance.
- discrepancy in frequencies
- frequency
- ranked data outcome
Chi-square test compares observed frequencies of scores to either:
- expected frequencies
- each other
Chi-square test: frequencies compared to expected frequencies:
- single sample, one-way, or goodness of fit
- are the observed and expected frequencies in agreement with each other?
Chi-square test: frequencies compared to each other:
two-way or contingency table
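Worked example (not from the lecture): a minimal sketch of a one-way (goodness of fit) chi-square test computed by hand. The observed counts are made up, and the expected counts assume all four categories are equally likely by chance. The critical value 7.815 is the standard table value for alpha = 0.05 with df = 3.

```python
# Hypothetical observed frequencies across 4 categories (100 observations),
# invented for this example.
observed = [18, 22, 20, 40]

# Expected frequencies if every category were equally likely by chance.
expected = [25, 25, 25, 25]

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for a goodness of fit test: number of categories - 1
df = len(observed) - 1

# Critical value from a chi-square table: alpha = 0.05, df = 3.
critical_value = 7.815

reject_h0 = chi_square > critical_value
print("chi-square:", round(chi_square, 2))
print("df:", df)
print("reject Ho:", reject_h0)
```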
A meta-analysis is a common and emerging technique for ____ ____.
research synthesis
Meta-analysis is a technique for research synthesis that involves:
- the ID of a problem to address
- a methodology that explains decisions for the lit review and analysis
- an analysis that integrates findings from a number of studies
- quantifies the findings in a standard metric (effect size)
5 basic steps of meta-analysis:
- define variables of interest and formulate research questions
- search the literature, and ID adequate empirical studies in a systematic way
- code previous studies and select appropriate index of effect size
- analyze the data collected from previous empirical studies
- interpret results and draw conclusions
Regardless of the way you express your data or phenomenon, a narrative (both written and verbal) of findings includes:
- talking about the data
- time series of the data
Talking about the data:
is the magnitude of your findings something a stakeholder group, public policy makers, or society as a whole find important?
Time series of the data:
was it an anomaly that subsided or is the phenomenon something that persisted?