Visual Representations of Data Flashcards
Define tabulation of scores
has a set of categories and frequency within each category (how many people got that score)
- representation of how scores are distributed over the scale
- all values are organised from highest to lowest
N = total number of observations
X = scale categories
f = frequency of each score
ΣXf = sum of all the scores (each score multiplied by its frequency, then added up)
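A minimal sketch of tabulating scores, assuming a small made-up list of quiz scores (the data and variable names are illustrative only). It builds the X and f columns from highest to lowest and checks that N and the sum of Xf match the raw data.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration
scores = [5, 4, 4, 3, 3, 3, 2, 2, 1, 5]

counts = Counter(scores)

# One row per score category, ordered from highest to lowest: (X, f)
table = [(x, counts[x]) for x in sorted(counts, reverse=True)]

N = sum(f for _, f in table)            # total number of observations
sum_Xf = sum(x * f for x, f in table)   # sum of all the scores

for x, f in table:
    print(f"X={x}  f={f}  Xf={x * f}")
print("N =", N, " sum of Xf =", sum_Xf)
```

Adding up the f column gives N, and adding up the Xf column gives the same total as summing the raw scores directly.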
define histogram
- visual representation of the distribution of scores
- rectangles represent frequencies of observation at each score/interval
- can only show 1 set of data
- quantitative data
- has real limits
- NOT a bar graph
p (proportion) = frequency / total # of observations; all proportions sum to 1.0
N = total # of observations
% = p x 100
- using frequency or proportion as the y value has no effect on the shape of the histogram
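As a rough illustration of the proportion and percentage formulas, the sketch below uses a hypothetical frequency table (`freqs` is made up). It also checks that every bar keeps the same height relative to the tallest bar, which is why the histogram's shape does not change.

```python
# Hypothetical frequency table: score X -> frequency f
freqs = {1: 1, 2: 2, 3: 3, 4: 2, 5: 2}

N = sum(freqs.values())                                   # total # of observations
proportions = {x: f / N for x, f in freqs.items()}        # p = f / N
percents = {x: 100 * f / N for x, f in freqs.items()}     # % = p x 100

total_p = sum(proportions.values())                       # should be 1.0

# Bar heights relative to the tallest bar, on each y-axis scale
tallest_f = max(freqs.values())
tallest_p = max(proportions.values())
shape_from_f = {x: f / tallest_f for x, f in freqs.items()}
shape_from_p = {x: p / tallest_p for x, p in proportions.items()}

print("sum of proportions:", total_p)
print("percents:", percents)
```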
how to find a grouped frequency distribution table for a histogram?
- find range (high - low)
- determine the number of intervals needed
- find interval width (want round, easy number): range/ # intervals
- identify the upper and lower bounds of each interval (the lower bound must be a multiple of the interval width)
- tabulate the data
- interval limits: apparent (same precision as the scores) and real (extend half a unit beyond the apparent limits, e.g. apparent 10-19 has real limits 9.5-19.5)
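The steps above can be sketched as follows. This assumes whole-number scores, a made-up data set, and an interval width of 10 chosen by hand as the "round, easy number"; with whole numbers, the real limits are taken as half a unit beyond the apparent limits.

```python
# Hypothetical whole-number scores, invented for illustration
scores = [12, 15, 21, 23, 27, 30, 34, 38, 41, 45, 47, 52]

low, high = min(scores), max(scores)
score_range = high - low              # step 1: range (high - low)
width = 10                            # steps 2-3: round width chosen by hand (assumption)

# Step 4: the lowest lower bound must be a multiple of the interval width
start = (low // width) * width

# Step 5: tabulate, highest interval first
rows = []
lower = start
while lower <= high:
    upper = lower + width - 1         # apparent upper limit for whole-number scores
    f = sum(lower <= s <= upper for s in scores)
    rows.append({
        "apparent": (lower, upper),
        "real": (lower - 0.5, upper + 0.5),
        "f": f,
    })
    lower += width
rows.reverse()

for row in rows:
    print(row["apparent"], row["real"], row["f"])
```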
define polygons
- combining multiple sets of data together
- can be used for just 1 set of data too
- better to use % when sample sizes are not the same
- can use frequency or percentage
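A small sketch of why percentages help when sample sizes differ. The two groups below are invented: they have the same distribution but very different N, so their frequency polygons would sit at different heights while their percentage polygons would overlap.

```python
# Hypothetical frequency tables: score X -> frequency f
group_a = {1: 2, 2: 6, 3: 2}       # N = 10
group_b = {1: 10, 2: 30, 3: 10}    # N = 50

def to_percent(freqs):
    """Convert a frequency table to percentages of its own N."""
    n = sum(freqs.values())
    return {x: 100 * f / n for x, f in freqs.items()}

pct_a = to_percent(group_a)
pct_b = to_percent(group_b)

print("A:", pct_a)
print("B:", pct_b)
```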
define bar graphs
- nominal data: no limits (ex. gender)
- ordinal data: when intervals are not the same
how can a graph be misleading?
- the y-axis should always begin at the minimum score and extend a little past the highest observed score
- height of the blocks should be 3/4 of the width
define population distributions
smooth curve: used to depict continuous data in a population, indicating that frequencies are relative rather than exact
- described by central tendency, variability, and shape
how are central tendency, variability, and shape related to a distribution?
central tendency: the values that capture the center of the distribution
variability: the scores' tendency to be close to the center or to spread out
shape: scores are either symmetrically distributed (usually populations; normal or bimodal) or skewed (in the negative or positive direction; usually samples)
define percentiles
- points in a distribution below which a given percent of the scores lie
- P88 = 88% of the scores lie below
- a higher percentile means the score sits higher in the distribution
define quartiles
divide the distribution into 4 equal parts
- Q1 = P25
- Q2 = P50 (the median)
- Q3 = P75
- the 3 quartiles are the cut points between the four 25% sections
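A minimal sketch of percentiles and quartiles on a made-up data set. `percent_below` is a hypothetical helper that counts the percent of scores strictly below a value. The quartiles come from Python's `statistics.quantiles`; textbooks use several slightly different conventions, so hand-calculated Q1 and Q3 may differ a little from its output.

```python
import statistics

# Hypothetical scores, invented for illustration
scores = [2, 4, 4, 5, 6, 7, 8, 9]

def percent_below(data, value):
    """Percent of scores that lie strictly below `value`."""
    below = sum(s < value for s in data)
    return 100 * below / len(data)

# Three cut points dividing the distribution into 4 equal parts
q1, q2, q3 = statistics.quantiles(scores, n=4)

print("percent of scores below 8:", percent_below(scores, 8))
print("Q1, Q2, Q3:", q1, q2, q3)
print("median:", statistics.median(scores))
```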
define box-and-whiskers plot
- visual representation of the 3 quartiles
- box represents interquartile range (Q3 - Q1)
- whiskers descend from the box down to Xmin and go up to Xmax
- each whisker can extend up to 1.5 times the box length (the interquartile range)
- outliers are scores that lie beyond the whiskers
- can see variability, central tendency, and symmetry
- independent and dependent variables
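The numbers behind a box-and-whiskers plot can be sketched as below, assuming a made-up data set with one deliberately extreme score. The quartile convention is whatever `statistics.quantiles` uses by default, which may differ slightly from a given textbook; the 1.5 x box length limit follows the rule above.

```python
import statistics

# Hypothetical scores; 30 is a deliberate extreme value
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]

q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                          # box length (interquartile range)

# Whiskers may reach at most 1.5 box lengths beyond the box
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

inside = [s for s in scores if lower_limit <= s <= upper_limit]
outliers = [s for s in scores if s < lower_limit or s > upper_limit]

# Each whisker ends at the most extreme score still inside the limits
whisker_low, whisker_high = min(inside), max(inside)

print("box:", q1, q2, q3)
print("whiskers:", whisker_low, whisker_high)
print("outliers:", outliers)
```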
define cumulative frequency/percentage
(cf) is the total number of observations equal to or below a score or interval in the frequency distribution
(c%) is the percentile rank of a score or interval in the frequency distribution
% = p(100) = (f/N)(100)
c% = (cf/N)(100)
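A short sketch of the cf and c% columns, assuming a hypothetical frequency table listed from the lowest score to the highest so that the running total counts observations at or below each score.

```python
from itertools import accumulate

# Hypothetical frequency table, lowest score first: (X, f)
table = [(1, 1), (2, 2), (3, 3), (4, 2), (5, 2)]

N = sum(f for _, f in table)

# cf: running total of frequencies at or below each score
cfs = list(accumulate(f for _, f in table))

# c% = (cf / N)(100)
c_pcts = [100 * cf / N for cf in cfs]

for (x, f), cf, c_pct in zip(table, cfs, c_pcts):
    print(f"X={x}  f={f}  cf={cf}  c%={c_pct}")
```

The last row always has cf = N and c% = 100, which is a quick check on the arithmetic.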
define interpolation
used to estimate values within intervals using 4 steps
what are the 4 steps for interpolation?
- identify the interval that contains the value, with its limits on both scales (X and c%)
- calculate fraction on known scale: distance from top of interval / interval width
- calculate distance on unknown scale: fraction x interval width of the unknown scale
- calculate position on unknown scale: X = upper limit - distance or percentile = upper limit - distance
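The four steps can be sketched as one small function. The interval used here is hypothetical: real limits of 9.5 to 19.5 on the X scale, matched with cumulative percentages of 20 to 60. The function name `interpolate` and its parameters are illustrative, not standard notation.

```python
def interpolate(known_value, known_low, known_high, unknown_low, unknown_high):
    """Estimate the position on the unknown scale that matches known_value.

    Step 1 is done by the caller: pick the interval and its limits on both scales.
    """
    # Step 2: fraction on the known scale, measured down from the top
    fraction = (known_high - known_value) / (known_high - known_low)
    # Step 3: the same fraction as a distance on the unknown scale
    distance = fraction * (unknown_high - unknown_low)
    # Step 4: position = upper limit - distance
    return unknown_high - distance

# Which score is the 50th percentile? Known scale: c%. Unknown scale: X.
x_at_p50 = interpolate(50, 20, 60, 9.5, 19.5)

# What is the percentile rank of X = 17? Known scale: X. Unknown scale: c%.
rank_of_17 = interpolate(17.0, 9.5, 19.5, 20, 60)

print("score at P50:", x_at_p50)
print("percentile rank of 17:", rank_of_17)
```

The same function works in both directions because only the roles of the known and unknown scales are swapped.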