Lecture 9 Flashcards
Statistics is an _____ way of interpreting a collection of _____. In other words, it is how we ____ the data we have collected.
- objective
- observations
- analyze
2 types of statistics:
- descriptive techniques
- inferential techniques
5 purposes in selecting tools for analysis:
- to describe
- to compare
- to associate
- to predict
- to explain
When the purpose of analysis is to describe, what question is being asked?
What are the characteristics of some group or groups of people? (e.g., standard deviation)
When the purpose of analysis is to compare, what question is being asked?
Are two or more groups the same or different on some characteristic? (e.g., t-test)
When the purpose of analysis is to associate, what question is being asked?
Are 2 variables related, and what is the strength of this relationship? (e.g., correlation coefficient)
When the purpose of analysis is to predict, what question is being asked?
Can measures be used to predict something in the future? (e.g., regression)
When the purpose of analysis is to explain, what question is being asked?
Given some outcome or phenomenon, why does it occur? (e.g., structural equation modeling)
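Worked example (a minimal sketch, not part of the lecture): the "associate" and "predict" purposes can be illustrated with a Pearson correlation coefficient and a least-squares regression line. The helper names pearson_r and fit_line and the hours/scores data are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient: strength of the linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, scores)                # associate: how strongly related?
slope, intercept = fit_line(hours, scores)  # predict: fit a line to the data
predicted = intercept + slope * 6           # predicted score for 6 hours

print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted, 2))
```

A value of r close to 1 or -1 indicates a strong linear relationship; a value close to 0 indicates a weak one.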
2 broad types of numerical data:
- continuous
- discrete
Continuous data:
measurement theoretically possible at any point along a continuum
Discrete data:
limited to a specific number of values
Descriptive statistics are used to:
- organize
- simplify
- summarize the collected data
2 ways of describing data numerically:
- central tendency
- variation
Central tendency consists of:
- arithmetic mean
- median
- mode
Variation consists of:
- range
- interquartile range
- variance
- standard deviation
- coefficient of variation
Measures of central tendency indicate the _____ around which scores tend to be _____.
- points
- concentrated
Mean:
- the sum of scores divided by the number of scores
- used with interval or ratio data
Most common measure of central tendency:
mean
Median:
- middle score
- used with ordinal data
Mode:
- most frequent score
- used with nominal/categorical data
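Worked example (a minimal sketch using Python's standard statistics module): the three measures of central tendency computed on one set of scores. The scores are made up for illustration.

```python
import statistics

# Hypothetical raw scores, made up for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores divided by number of scores
median = statistics.median(scores)  # middle score (mean of the two middle scores when n is even)
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

Here the mean (5) is pulled above the median (4) by the high score of 10, which is one reason the median is often reported for skewed data.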
Measures of variability describe data in terms of its _____ or _______.
- spread
- heterogeneity
Easiest measure of variability:
range
Range:
- difference between the highest and lowest score
- ignores the distribution of data and is sensitive to outliers
Interquartile range:
- eliminates some outlier issues
- drops the high- and low-valued observations and calculates the range of the middle 50% of the data
Variance:
average of squared deviations of values from the mean
Standard deviation:
- square root of the variance
- shows variation about the mean
- has the same units as the original data
Most commonly used measure of variation:
standard deviation
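Worked example (a minimal sketch using Python's standard statistics module): range, interquartile range, variance, and standard deviation on one set of scores. Two assumptions: the population forms (pvariance, pstdev) are used because they match the "average of squared deviations" definition above, whereas the sample forms divide by n - 1; and the quartiles use the module's default method, so other quartile conventions can give a slightly different interquartile range. The scores are made up for illustration.

```python
import statistics

# Hypothetical raw scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: difference between the highest and lowest score.
data_range = max(scores) - min(scores)

# Interquartile range: spread of the middle 50% of the data.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Variance: average of squared deviations from the mean (population form).
variance = statistics.pvariance(scores)

# Standard deviation: square root of the variance, in the original units.
std_dev = statistics.pstdev(scores)

print(data_range, iqr, variance, std_dev)
```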
Coefficient of variation:
- measures relative variation
- always expressed as a percentage (%)
- shows variation relative to mean
- can be used to compare 2 or more sets of data measured in different units
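Worked example (a minimal sketch): the coefficient of variation taken as the standard deviation divided by the mean, times 100. The function name coefficient_of_variation and the height and weight data are made up for illustration, and the population standard deviation is an assumption; a sample standard deviation could be used instead.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    return statistics.pstdev(values) / mean * 100


# Hypothetical data measured in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(f"height CV: {cv_height:.1f}%")
print(f"weight CV: {cv_weight:.1f}%")
```

Both data sets have the same standard deviation, but the weights vary more relative to their mean, which is the comparison the coefficient of variation makes possible.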
Data typically consist of a set of scores called a ______. These scores result from the ______ taken.
- distribution
- measurements
The original measurements or values in a distribution are called ____ ____.
raw scores
How do we organize raw data?
- frequency distributions (graphing)
- the normal curve
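Worked example (a minimal sketch): organizing raw scores into a frequency distribution and printing it as a simple text histogram. The raw scores are made up for illustration.

```python
from collections import Counter

# Hypothetical raw scores, made up for illustration.
raw_scores = [3, 5, 3, 4, 5, 5, 2, 4, 5, 3]

# Frequency distribution: how many times each score occurs.
frequencies = Counter(raw_scores)

# Text histogram: one asterisk per occurrence, scores in ascending order.
for score in sorted(frequencies):
    print(f"{score}: {'*' * frequencies[score]}")
```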
3 steps in data analysis:
- select the appropriate statistical technique
- apply the technique
- interpret the result
In inferential statistics, there are techniques that allow us to ____ samples and then make _____ back to the target population.
- study
- generalizations
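Worked example (a minimal sketch): studying a random sample and using its mean to estimate the mean of the target population. The population of values 1 to 1000, the sample size of 50, and the fixed seed are all assumptions made for illustration; in practice the population mean is unknown, which is why the sample is needed.

```python
import random
import statistics

# Hypothetical target population, made up for illustration.
population = list(range(1, 1001))
population_mean = statistics.mean(population)

# Study a random sample instead of the whole population.
random.seed(0)  # fixed seed so the sketch is repeatable
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

# The sample mean serves as an estimate of the population mean.
print(f"sample mean: {sample_mean:.1f}, population mean: {population_mean:.1f}")
```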