Ch2: Frequency Distributions Flashcards
The basic ingredients of a data set:
2.1 Frequency Distributions
- **Raw scores:** data that have not yet been transformed or analyzed
- Organized into a frequency distribution: describes the pattern of a set of numbers by displaying a count or proportion for each possible value of a variable
Different ways to organize data in a frequency distribution
2.1 Frequency Distributions
- Frequency tables
- Grouped frequency tables
- Histograms
Frequency tables
2.1 Frequency Distributions
A visual depiction of data that shows how often each value occurred; that is, how many scores were at each value
Steps to organizing frequency tables:
2.1 Frequency Distributions
1: determining the range of raw scores (highest & lowest)
* EX: for each country, we can count how many volcanoes it has (lowest: 1; highest: 81)
* Outlier: an extreme score that is either very high or very low in comparison with the rest of the scores in the sample (would make frequency tables too long)
2: create two columns; label the first with the variable name (EX: number of volcanoes), and label the second “frequency”
3: list the full range of values that encompasses all the scores in the data set, from highest to lowest. Include all values in the range, even those for which frequency is 0
4: count the number of scores at each value, and write those numbers in the frequency column
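The four steps above can be sketched in Python. This is a minimal illustration, assuming whole-number scores; the `scores` list and the `frequency_table` helper are hypothetical and not part of the course material.

```python
from collections import Counter

def frequency_table(scores):
    """Return (value, frequency) rows from the highest value to the lowest."""
    counts = Counter(scores)
    # Step 1: find the range of raw scores.
    lowest, highest = min(scores), max(scores)
    # Steps 3-4: list every value in the range, highest first,
    # including values whose frequency is 0.
    return [(value, counts[value]) for value in range(highest, lowest - 1, -1)]

# Hypothetical raw scores (illustrative only).
scores = [1, 3, 3, 5, 1, 1, 6]
table = frequency_table(scores)

# Step 2: two columns, the variable name and "Frequency".
print(f"{'Score':>5} | Frequency")
for value, frequency in table:
    print(f"{value:>5} | {frequency}")
```

The values 4 and 2 still appear in the table with a frequency of 0, as step 3 requires.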
Grouped frequency tables:
2.1 Frequency Distributions
- allow researchers to depict data visually by reporting the frequencies within a given interval, rather than the frequencies for a specific value
- i.e. the table does not list a single value such as "1 country with 17 volcanoes"
- Instead of reporting every single value in the range, we can report intervals, or ranges of values
Steps to organize a grouped frequency table
2.1 Frequency Distributions
- find the lowest and highest scores in the frequency distribution
- get the full range of data
- determine the number of intervals and the best interval size - divide range by the number of intervals we want
- figure out the number that will be the bottom of the lowest interval: we want the bottom of the lowest interval to be a multiple of the interval size
- finish the table by listing the intervals from highest to lowest, and then counting the numbers of scores in each
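A rough sketch of these steps in Python, assuming whole-number scores and an interval size the researcher has already chosen. The data and the `grouped_frequency_table` helper are hypothetical.

```python
def grouped_frequency_table(scores, interval_size):
    """Return (bottom, top, frequency) rows, highest interval first.

    Assumes whole-number scores and a whole-number interval size.
    """
    lowest, highest = min(scores), max(scores)
    # Bottom of the lowest interval: the largest multiple of the
    # interval size that does not exceed the lowest score.
    bottom = (lowest // interval_size) * interval_size
    rows = []
    while bottom <= highest:
        top = bottom + interval_size - 1
        frequency = sum(1 for score in scores if bottom <= score <= top)
        rows.append((bottom, top, frequency))
        bottom += interval_size
    rows.reverse()  # list the intervals from highest to lowest
    return rows

# Hypothetical volcano counts per country (illustrative only).
scores = [1, 3, 4, 7, 12, 17, 18, 23, 36, 81]
rows = grouped_frequency_table(scores, interval_size=10)
for bottom, top, frequency in rows:
    print(f"{bottom:>2}-{top:<2} | {frequency}")
```

Choosing the interval size (range divided by the number of intervals we want) is left to the caller here, since it is a judgment call rather than a fixed rule.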
Histograms
2.1 Frequency Distributions
- A graph that looks like a bar graph but depicts just one variable, usually based on scale data, with the **values of the variable on the x-axis** and the frequencies on the y-axis (can be made from a frequency table or a grouped frequency table)
- Each bar reflects the frequency for a value or interval
Difference between bar graphs and histograms:
2.1 Frequency Distributions
- Bar graphs typically provide scores for nominal data relative to another variable, whereas histograms typically provide frequencies for one scale variable
- In a histogram, the bars are stacked one against the next, with the intervals meaningfully arranged from lower numbers to higher numbers
How to organize a histogram - from a frequency table:
2.1 Frequency Distributions
- Draw the x-axis and label it with the variable of interest and the full range of values for this variable
- Draw the y-axis, label “frequency”, and include the full range of frequencies for this variable
- Draw a bar for each value, centering the bar on that value on the x-axis and drawing the bar as high as the frequency for that value, as represented on the y-axis
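To make the drawing steps concrete, here is a small text-only sketch in Python. It is an illustration under assumptions, not a plotting recipe: the frequency table is hypothetical, and `text_histogram` is an invented helper that draws bars with `#` characters.

```python
def text_histogram(table):
    """Build a vertical text histogram from (value, frequency) pairs.

    Values run from low to high along the x-axis; each bar is as tall
    as the frequency for its value, and neighboring bars touch.
    """
    table = sorted(table)  # lower values on the left
    tallest = max(frequency for _, frequency in table)
    lines = []
    # y-axis: one row per frequency level, from the tallest bar down to 1.
    for level in range(tallest, 0, -1):
        bars = "".join("###" if frequency >= level else "   "
                       for _, frequency in table)
        lines.append(f"{level:>2} |{bars}")
    # x-axis: a baseline, then each value centered under its bar.
    lines.append("   +" + "-" * (3 * len(table)))
    lines.append("    " + "".join(f"{value:^3}" for value, _ in table))
    return lines

# Hypothetical frequency table: (value, frequency).
table = [(1, 3), (2, 0), (3, 2), (4, 0), (5, 1), (6, 1)]
lines = text_histogram(table)
print("\n".join(lines))
```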
How to organize a histogram - from a grouped frequency table:
2.1 Frequency Distributions
- Determine the MIDPOINT for every interval
- Draw the x-axis and label it with the variable of interest and the midpoint of each interval (include 0 unless impractical)
- Draw the y-axis, label it “frequency”, and include the full range of frequencies for this variable (include 0 unless impractical)
- Draw a bar for each midpoint, centering the bar on that midpoint on the x-axis and drawing the bar as high as the frequency for that interval, as represented on the y-axis
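The midpoint step can be sketched as follows. One assumption to flag: the midpoint is taken as halfway between the bottom of an interval and the bottom of the next one (so 0-9 has a midpoint of 5); other conventions exist. The table and the `interval_midpoints` helper are hypothetical.

```python
def interval_midpoints(rows, interval_size):
    """Return (midpoint, frequency) pairs, lowest interval first.

    Assumed convention: the midpoint lies halfway between the bottom of
    an interval and the bottom of the next one, so 0-9 has midpoint 5.0.
    """
    return sorted((bottom + interval_size / 2, frequency)
                  for bottom, frequency in rows)

# Hypothetical grouped frequency table: (bottom of interval, frequency).
rows = [(20, 1), (10, 3), (0, 4)]
bars = interval_midpoints(rows, interval_size=10)
for midpoint, frequency in bars:
    print(f"bar centered at {midpoint:g}, height {frequency}")
```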
Normal distributions
2.2 Shapes of Distributions
a specific frequency distribution that is a bell-shaped, symmetric, unimodal curve
Skewed distributions
2.2 Shapes of Distributions
Distributions in which one of the “tails” of the distribution is pulled away from the center (AKA lopsided, off-center)
Positively skewed distribution
- the tail of the distribution extends to the RIGHT, in a positive direction
- Can occur when there is a floor effect: a situation in which a constraint prevents a variable from taking values below a certain point
Negatively skewed distribution
- the tail of the distribution extends to the LEFT, in a negative direction
- Can occur when there is a ceiling effect: a situation in which a constraint prevents a variable from taking on values above a given number
Summary data could be misleading if researchers don’t address…
- extreme data points
- if extreme data points are ignored, any conclusions drawn from inferential statistics could be wrong
Researchers concerned with data ethics examine each _ data point, usually by looking at a graph, before analyzing their data
individual
Dot plot
- a graph that displays each data point in a sample, with the range of scores along the x-axis and a dot for each data point above the appropriate value (similar to histograms - but show individual points)
- Dot plots DON’T HAVE A Y-AXIS (no count or frequency scale); instead, an additional dot is stacked for each data point at the same value on the x-axis
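A text-only sketch of a dot plot in Python, assuming whole-number scores. The data and the `dot_plot` helper are hypothetical; note that the output has an x-axis of values but no frequency axis.

```python
from collections import Counter

def dot_plot(scores):
    """Return the lines of a text dot plot: one '*' per data point."""
    counts = Counter(scores)
    values = range(min(scores), max(scores) + 1)
    tallest = max(counts.values())
    lines = []
    # Stack one dot per data point above its value; no y-axis is drawn.
    for level in range(tallest, 0, -1):
        row = "".join(" * " if counts[value] >= level else "   "
                      for value in values)
        lines.append(row.rstrip())
    # x-axis: the full range of scores.
    lines.append("".join(f"{value:^3}" for value in values).rstrip())
    return lines

# Hypothetical raw scores (illustrative only).
scores = [1, 3, 3, 5, 1, 1, 6]
lines = dot_plot(scores)
print("\n".join(lines))
```

Counting the dots recovers the sample size, because every data point is shown individually.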
LECTURE
Proportion
Frequency tables
Frequency of a value / total sample size
Cumulative frequency
Frequency tables
- Running total: add each frequency to the sum of the frequencies before it
- Original frequencies (OG): 76, 82, 33
- Cumulative frequencies (CF): 76, 158, 191
Cumulative proportion
Frequency tables
- Running total of the proportions: add each proportion to the sum of the proportions before it
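- Original proportions (OG): 0.39, 0.42
- Cumulative proportions (CP): 0.39, 0.82 (the rounded proportions sum to 0.81; the 0.82 likely comes from the unrounded values)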
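Proportion, cumulative frequency, and cumulative proportion can be computed together. The sketch below uses the three frequencies listed under cumulative frequency above; treating them as the whole sample (n = 191) is an assumption made only for illustration.

```python
from itertools import accumulate

# Frequencies from the cumulative frequency card.
frequencies = [76, 82, 33]
# Assumption: these three values make up the entire sample.
total = sum(frequencies)

# Proportion = frequency of a value / total sample size.
proportions = [frequency / total for frequency in frequencies]
# Cumulative frequency = running total of the frequencies.
cumulative_frequencies = list(accumulate(frequencies))
# Cumulative proportion = cumulative frequency / total sample size.
cumulative_proportions = [cf / total for cf in cumulative_frequencies]

print("freq  prop   CF   CP")
for f, p, cf, cp in zip(frequencies, proportions,
                        cumulative_frequencies, cumulative_proportions):
    print(f"{f:>4}  {p:.3f}  {cf:>3}  {cp:.3f}")
```

Computing cumulative proportions from the cumulative frequencies, rather than by summing already-rounded proportions, avoids small rounding drift. The last cumulative proportion should always come out to 1.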
How to distinguish histograms (3 ways)
- Interval/ratio data on x-axis
- Frequency on y-axis
- Visually - bars are touching
Frequency polygon:
- Instead of bars, there are points
- An extra category with a frequency of 0 is added on either side of the x-axis range, so the line is always “anchored” to the x-axis
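The anchoring idea can be sketched in Python: take the points that would sit at the top of each histogram bar and add one zero-frequency point on each side. The data and the `frequency_polygon_points` helper are hypothetical, and evenly spaced x values are assumed.

```python
def frequency_polygon_points(table):
    """Return (x, frequency) points for a frequency polygon.

    Assumes at least two evenly spaced x values (or interval midpoints).
    """
    points = sorted(table)
    step = points[1][0] - points[0][0]
    # Anchor the line to the x-axis: one extra category with a
    # frequency of 0 below the lowest value and one above the highest.
    left_anchor = (points[0][0] - step, 0)
    right_anchor = (points[-1][0] + step, 0)
    return [left_anchor] + points + [right_anchor]

# Hypothetical (midpoint, frequency) pairs.
table = [(5, 4), (15, 3), (25, 1)]
points = frequency_polygon_points(table)
print(points)
```

Connecting these points in order with straight lines gives the polygon.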
Cumulative proportions…
- Sampling distributions foreshadow p values
- Capture the area under the curve of a histogram
Why make frequency distributions?
- Might use in work
- May need to interpret
- Chance to practice working with numbers
- Spend time thinking about FD of scores in table and graph format