Things To Remember Flashcards
Broken line graph
Time dependent numerical data
Indicating trends
Bar graph
Discrete data
Bars equally set apart
Typically nominal categories
Circle graph
Displays data as percentage
Only discrete data
Stem and leaf plot
For large series of numbers
Shows specific numbers within groups
See which is the biggest category
Box and whiskers plot
Indicates spread of data
Shows medians
Shows interquartile range
Histogram
Frequency diagram
Continuous data
No separation between bars
Equal intervals of measurement
Relative frequency
Shows the frequency of a data group as a fraction or percent of the whole data set. All relative frequencies together should add up to 100% (or 1 when written as fractions)
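As a quick illustration of the relative frequency card, here is a minimal Python sketch. The function name `relative_frequencies` and the survey answers are made up for the example.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {category: share of the whole data set} for a list of values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

# Hypothetical survey answers, invented for illustration.
answers = ["bus", "car", "bus", "walk", "bus", "car", "bus", "walk"]
shares = relative_frequencies(answers)

for category, share in shares.items():
    print(f"{category}: {share:.1%}")

# The shares of all categories together make up the whole data set.
print("total:", sum(shares.values()))
```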
Coefficient of correlation
r
A number from -1 to +1 that gives the relative strength and direction of the relationship between two variables.
It is the average of the products of the paired z-scores of the 2 variables
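One way to compute r is to convert both variables to z-scores, multiply them pair by pair, and average the products. The sketch below assumes population standard deviations (dividing by the number of points); the helper names and data are invented for illustration.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert each value to its distance from the mean in standard deviations."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (assumption of this sketch)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """r as the average of the products of paired z-scores."""
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return mean(products)

# Hypothetical data: ys rises in step with xs, ys_down falls in step with xs.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
ys_down = [10, 8, 6, 4, 2]

r_up = correlation(xs, ys)
r_down = correlation(xs, ys_down)
print(round(r_up, 6), round(r_down, 6))
```

A perfectly increasing linear relationship gives r close to +1 and a perfectly decreasing one gives r close to -1.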
Coefficient of determination
r squared
A number between 0 and 1 that gives the relative strength of the relationship between two variables. It tells you what percent of the variation in the dependent variable is explained by the variation in the independent variable.
Regression
A process of fitting a line or curve to a set of data
Residual
R
The vertical distance between a data point and a line of best fit.
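The regression, residual, and coefficient of determination cards can be tied together in one small sketch. It assumes a least-squares straight line as the line of best fit, and the data points are made up. Residuals are kept signed here (observed minus predicted); the vertical distance is their absolute value.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)

# Residual: observed y minus the y predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Coefficient of determination: share of the variation in y explained by the line.
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - mean(ys)) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"y = {slope:.2f}x + {intercept:.2f}, r squared = {r_squared:.2f}")
```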
Cross-sectional study
A study which samples different groups of a population at the same time
Longitudinal study
A study that looks at the same individuals over time
Information question
Circle the correct response
Checklist question
Check all of the following that apply
Ranking questions
Rank the following in order of importance
Rating questions
How would you rate on a scale of….
How do you avoid bias in survey questions?
Simple, relevant, specific, readable
Avoid jargon and abbreviations
Doesn't lead respondents
Not open to interpretation
Brief as possible
Simple random sampling
All selections equally likely
Ex. Pulling names out of a hat
Systematic random sampling
Sample a fixed percent of the population: pick a random starting point, then select every nth individual
n = population size / sample size
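A minimal sketch of systematic sampling, assuming the population is already in a list and its size divides evenly by the sample size. The function name and the numbered population are hypothetical.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Pick a random start, then take every nth individual."""
    interval = len(population) // sample_size  # n = population size / sample size
    start = rng.randrange(interval)            # random starting point in the first interval
    return [population[start + i * interval] for i in range(sample_size)]

# Hypothetical population of 100 numbered individuals, sampling 10 of them.
population = list(range(1, 101))
sample = systematic_sample(population, 10, random.Random(42))
print(sample)
```

Every selected individual is exactly n positions after the previous one, so only the starting point is random.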
Stratified random sampling
Population is divided into groups called strata. A simple random sample is taken from each stratum, with the size of each sample determined by the size of that stratum
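The sketch below shows one reading of this card: each stratum contributes the same fraction of its members, so larger strata supply larger samples. The function name, the strata, and the 10% fraction are assumptions for illustration.

```python
import random

def stratified_sample(strata, fraction, rng=random):
    """Take a simple random sample from each stratum, sized in proportion to it."""
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)    # sample size follows stratum size
        sample[name] = rng.sample(members, k)  # simple random sample within the stratum
    return sample

# Hypothetical school with two grades of different sizes.
strata = {
    "grade 9": [f"g9-{i}" for i in range(200)],
    "grade 10": [f"g10-{i}" for i in range(100)],
}
chosen = stratified_sample(strata, 0.10, random.Random(1))
print({name: len(members) for name, members in chosen.items()})
```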
Cluster random sampling
Population is organized into groups. Groups are randomly chosen for sampling and then all members of the chosen groups are surveyed.
Multi-stage random sampling
Groups randomly chosen from a population and the individuals in these groups are then randomly chosen to be surveyed.
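The difference between cluster and multi-stage sampling is easiest to see side by side. In this sketch both methods choose groups at random; cluster sampling then surveys everyone in the chosen groups, while multi-stage sampling randomly picks individuals within them. Function names and the classroom data are hypothetical.

```python
import random

def cluster_sample(groups, n_groups, rng=random):
    """Randomly choose groups, then survey every member of each chosen group."""
    chosen = rng.sample(sorted(groups), n_groups)
    return {g: list(groups[g]) for g in chosen}

def multi_stage_sample(groups, n_groups, per_group, rng=random):
    """Randomly choose groups, then randomly choose individuals within each one."""
    chosen = rng.sample(sorted(groups), n_groups)
    return {g: rng.sample(groups[g], per_group) for g in chosen}

# Hypothetical school: 6 classrooms of 30 students each.
groups = {f"room {r}": [f"r{r}-s{s}" for s in range(30)] for r in range(1, 7)}

clusters = cluster_sample(groups, 2, random.Random(3))
stages = multi_stage_sample(groups, 2, 5, random.Random(3))

print({g: len(m) for g, m in clusters.items()})
print({g: len(m) for g, m in stages.items()})
```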
What are the 4 types of bias?
Sampling bias
Non-response bias
Household bias
Response bias
Sampling bias
Chosen sample does not accurately represent the population
Non-response bias
Only a few of the people who receive the questionnaire are likely to return it, so those who respond may not represent the whole sample
Household bias
Overrepresentation of a particular group
Response bias
Factors in the sampling method that influence the result
Symmetrical distribution
Mode=median=mean
Bimodal distribution
Two modes
Left skewed distribution
(Tail is on the left)
Mode>median>mean
Right skewed distribution
(Tail is on the right)
Mode<median<mean
Uniform distribution
Looks like a flat, horizontal line (all values occur about equally often)
Bin width
Histogram interval size
Range of data/number of bins
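A short worked example of the bin width formula, using made-up heights and an assumed choice of 5 bins. The bin edges follow by stepping from the smallest value in equal intervals.

```python
def bin_width(values, number_of_bins):
    """Bin width = range of the data / number of bins."""
    return (max(values) - min(values)) / number_of_bins

# Hypothetical heights in cm.
heights = [150, 155, 160, 162, 171, 180]
bins = 5

width = bin_width(heights, bins)
edges = [min(heights) + i * width for i in range(bins + 1)]

print("bin width:", width)
print("bin edges:", edges)
```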
Deviation
The distance of a data point from the mean
Spread
How widely data is dispersed
Less spread= greater confidence that values will fall within a particular range
Range
Difference between the largest and the smallest value
Quartile
One of three numerical values that divide a group of numbers into 4 equal parts
Interquartile range
Range between 1st and 3rd quartiles
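The range, quartile, and interquartile range cards can be checked with Python's standard library. Note that textbooks and software use slightly different rules for locating quartiles; this sketch relies on the default method of `statistics.quantiles`, and the data set is invented.

```python
from statistics import quantiles

# Hypothetical data set.
data = [1, 2, 3, 4, 5, 6, 7]

data_range = max(data) - min(data)  # largest value minus smallest value
q1, q2, q3 = quantiles(data, n=4)   # three values that split the data into 4 parts
iqr = q3 - q1                       # range between the 1st and 3rd quartiles

print("range:", data_range)
print("quartiles:", q1, q2, q3)
print("interquartile range:", iqr)
```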
Deviation
x-mean
Variance
Measure of spread found by averaging the squares of the deviation calculated for each piece of data
Standard deviation
Square root of variance
Useful measure of spread
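The deviation, variance, and standard deviation cards chain together as shown in this sketch. It averages over all data points (the population form, matching the card's "averaging the squares"); a sample variance would divide by one less. The data set is made up.

```python
import math
from statistics import mean, pvariance

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(data)
deviations = [x - m for x in data]         # distance of each point from the mean
squared = [d ** 2 for d in deviations]     # squaring removes the sign
variance = sum(squared) / len(data)        # average of the squared deviations
std_dev = math.sqrt(variance)              # square root of the variance

print("mean:", m)
print("variance:", variance)
print("standard deviation:", std_dev)

# Cross-check against the standard library's population variance.
print("matches pvariance:", variance == pvariance(data))
```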
What percentage of data will be within 1 standard deviation of the mean?
(In a normal distribution)
68%
What percentage of data will be within 2 standard deviations of the mean?
(In a normal distribution)
95%
What percentage of data will be within 3 standard deviations of the mean?
(In a normal distribution)
99.7%
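The 68%, 95%, and 99.7% figures are rounded values for a normal distribution. A small sketch using the standard library's `NormalDist` shows where they come from; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
```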
Z-score
The number of standard deviations the piece of data is below or above the mean.
Can be used to find percentile
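As a sketch of how a z-score leads to a percentile, the example below assumes the data follow a normal distribution. The test score, mean, and standard deviation are invented numbers.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical test: mean 100, standard deviation 15, one score of 130.
z = z_score(130, 100, 15)

# Percentile: percent of a normal distribution that falls below this z-score.
percentile = NormalDist().cdf(z) * 100

print(f"z = {z:.1f}, percentile = {percentile:.1f}")
```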