Chapter 2 - Frequency Distributions Flashcards
What are raw scores?
The data exactly as gathered from participants: all the numbers that have not yet been organized, graphed, or cleaned up.
WHY not use raw data?
* Finding a pattern in raw data is difficult
* We want to visualize and summarize the data
* We also need to inspect for outliers and data entry errors
What are the steps to create a frequency distribution table and a grouped frequency table?
- Frequency Distribution Table = a visual depiction of data that shows how often each value occurred (how many scores fell at a given value: how many students got exactly 7 hours of sleep? 5 hours?). See the picture below for how it’s done.
- Grouped Frequency Table = a table that groups the data into intervals. 2 reasons to use one:
1. When the data have a large range of potential values (like IQ going from 70 to 149); see the table on the next card.
2. When the data have decimal points (are continuous).
Principles to keep in mind for a Grouped Table:
a) Determine the full range of the data and include the values that have zero frequency (Top Value - Bottom Value, then + 1: 8 - 3.5 + 1 = 5.5).
b) Aim for roughly 5-10 intervals (no fewer than 5, no more than 15).
c) For continuous data, use lower and upper limits (the lowest and highest possible values for each interval).
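The steps above can be sketched in Python. This is only an illustration: the sleep scores, the function names, and the interval width of 1 are all made up for the example and do not come from the cards.

```python
from collections import Counter
import math

# Hypothetical hours-of-sleep scores, invented for illustration only.
scores = [7, 5, 6, 7, 8, 7, 6, 5, 7, 8, 3.5, 6]

def frequency_table(values):
    """Return (value, frequency) pairs, highest value first."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)

def grouped_frequency_table(values, width):
    """Count scores in intervals [lower, lower + width).

    Intervals with zero frequency are kept, as principle (a) asks.
    """
    lower = math.floor(min(values) / width) * width
    top = max(values)
    table = []
    while lower <= top:
        upper = lower + width
        freq = sum(1 for v in values if lower <= v < upper)
        table.append((lower, upper, freq))
        lower = upper
    return table

# Full range as on the card: top value - bottom value, then + 1.
full_range = max(scores) - min(scores) + 1  # 8 - 3.5 + 1 = 5.5

for value, freq in frequency_table(scores):
    print(value, freq)

for lower, upper, freq in grouped_frequency_table(scores, width=1):
    print(f"{lower} to <{upper}: {freq}")
```

With a width of 1 these scores fall into 6 intervals, which sits inside the 5-10 target, and the empty 4 to <5 interval still appears with a frequency of 0.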
[Picture cards, images not reproduced: Frequency Distribution Table; Grouped Frequency Data; Grouped Frequency Table (the data initially); Grouped Frequency Table for continuous data; Histogram for continuous data]
What is a PIE CHART?
A chart used when you want to show proportions of a whole.
What is a BAR GRAPH?
A visual depiction of data when the independent variable is nominal and the dependent variable is interval (specifically, scale).
TWO WAYS:
- 1st way: present frequency or proportion data. EX: a graph showing the % of girls and boys getting over 9 hours of sleep per night.
- 2nd way: present mean (average) values. EX: the pictured graph shows the mean scores for neutral and emotional. The black stick bars on top are ‘standard error bars’.
EX: develop a chart showing the cost of tuition (dep. variable) for 3 types of schools: public, semi-public, & private (indep. variable)
What is a SCATTERPLOT?
Used to depict the relationship between 2 scale variables
ex: amount of abdominal fat & dementia symptoms
What is a HISTOGRAM?
A histogram is a bar graph of data that shows the frequency of each value of a variable. Same info as a frequency table, but visualised differently.
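To see how a histogram carries the same information as a frequency table, here is a rough text-only sketch in Python. The function name `text_histogram` and the sample scores are hypothetical, and it assumes whole-number scores.

```python
from collections import Counter

def text_histogram(values):
    """Draw one row per whole-number value, with one '#' per score.

    Values between the minimum and maximum that never occurred
    still get a row, so gaps in the data stay visible.
    """
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>4} | {'#' * counts[value]}")
    return "\n".join(rows)

# Hypothetical hours-of-sleep scores, invented for illustration only.
print(text_histogram([5, 7, 7, 8, 7, 5, 8]))
```

Each row is one line of the frequency table; the length of the bar is the frequency.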
What is the Biased Scale Lie?
- When the choices are biased towards an outcome, such as when a scale has ‘Not Satisfactory, Good, Excellent, Truly Superior’ and there are no negative ratings on there! Another example is ‘Rate Toronto as 1st, 2nd, 3rd, or 4th’, after which the person reports ‘Toronto is in the top 4 cities in Canada!’. It is set up to have a biased outcome.
What is the Sneaky Sample Lie?
- When people self-select to participate, so it’s not randomized sampling! Sometimes there is a dichotomy among the data because people had either very good or very bad experiences (Travel Advisor, Rate my Professor, Yelp).
What is an Interpolation Lie?
- When a line is drawn between data points that have been selectively placed on the graph, implying that the values in between fall on that line.
What is an Extrapolation Lie?
- When a line is extended beyond the data points, so the graph assumes the trend will keep going down, up, or across.
What is an Inaccurate Value Lie?
- Uses scaling to distort the graph data. In the picture below, the Tim Hortons and Starbucks graphs use different scales, so the whole thing is hard to read at a glance! (Scales should start at 0 and be labelled.)
All of these need to have representative sampling.
What is a normal distribution?
A distribution whose graph is the typical symmetric bell curve, meaning most of the participants’ scores fall in the middle of the graph.
How do positively skewed distributions and negatively skewed distributions deviate from a normal distribution?
Instead of being a ‘normal’ graph with the bell curve in the middle, a skewed distribution has a tail to one side. It is non-normal and non-symmetrical.
POSITIVE: the tail extends to the right; generally has ‘floor’ effects
NEGATIVE: the tail extends to the left; generally has ‘ceiling’ effects
What is the benefit of creating a visual distribution of data rather than simply looking at a list of the data?
It lets you see the shape of the distribution.
What is a floor effect and how does it affect a distribution?
A situation in which a constraint prevents a variable from taking values below a certain point. It makes the scores pile up on the LEFT side of the graph, with the tail to the right (positive skew).
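A quick way to check which way a distribution leans is to compare the mean and the median. The sketch below is only a rule of thumb, not a formal skewness test, and the function name and both data sets are invented for illustration.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule of thumb: the mean gets pulled toward the long tail."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive"
    if m < md:
        return "negative"
    return "symmetric"

# Hypothetical counts that cannot go below 0 (a floor).
floor_data = [0, 0, 0, 1, 1, 2, 3, 9]
# Hypothetical test scores that cannot go above 100 (a ceiling).
ceiling_data = [100, 100, 99, 98, 95, 60]

print(skew_direction(floor_data))    # positive
print(skew_direction(ceiling_data))  # negative
```

In the floor example the scores bunch up at 0 and the single 9 drags the mean above the median, which is the positive skew described on the card.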
CALCULATING STATS:
What is 63 out of 1264 as a %?
What is 2 out of 88 as a %?
What is 7 out of 39 as a %?
What is 122 out of 300 as a %?
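The percentages can be worked out as part / whole x 100. A small Python sketch, where the helper name `percent` is just for illustration and the answers are rounded to 2 decimal places:

```python
def percent(part, whole):
    """Return part out of whole as a percentage, rounded to 2 places."""
    return round(100 * part / whole, 2)

pairs = [(63, 1264), (2, 88), (7, 39), (122, 300)]
for part, whole in pairs:
    print(f"{part} out of {whole} = {percent(part, whole):.2f}%")
# 63 out of 1264 = 4.98%
# 2 out of 88 = 2.27%
# 7 out of 39 = 17.95%
# 122 out of 300 = 40.67%
```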
What type of variable (nominal, ordinal, scale) are these data as counts?
What kind of variable are they as percentages?
Report these to only 2 decimal places:
1888.999
2.6454
0.0833
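One way to check the rounding is with Python's `decimal` module. This sketch assumes the usual classroom "round half up" rule, and the helper name `two_places` is made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

def two_places(text):
    """Round a number given as text to exactly 2 decimal places."""
    rounded = Decimal(text).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return str(rounded)

for raw in ["1888.999", "2.6454", "0.0833"]:
    print(raw, "->", two_places(raw))
# 1888.999 -> 1889.00
# 2.6454 -> 2.65
# 0.0833 -> 0.08
```

Passing the numbers as text keeps the trailing zeros, so 1888.999 is reported as 1889.00 rather than 1889.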