Quantitative Analysis Flashcards
Coding
reorganizing numerical data into a format that is easy to analyze using a computer
Code sheet
- Record raw data on a grid sheet, then transfer the data to a computer file
Direct-entry method
- As info is collected it is directly entered into a software data package
Optical Scan
- Construct a questionnaire on which respondents or researchers fill in the correct dots
Bar code
- Convert data into bar codes and use a bar code reader to transfer info into a computer
How to clean data
Code Cleaning
- checking for coding errors
- Looking for “impossible codes”
Contingency cleaning
- Check that codes that should correspond across different variables actually correspond
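The two cleaning checks above can be sketched in a few lines of Python. This is a minimal illustration, not a standard procedure: the records, the variable names (`sex`, `ever_pregnant`), and the codebook (1 = male, 2 = female; 0 = no, 1 = yes) are all made up for the example.

```python
# Hypothetical survey records; variable names and codes are assumptions.
records = [
    {"id": 1, "sex": 1, "ever_pregnant": 0},
    {"id": 2, "sex": 2, "ever_pregnant": 1},
    {"id": 3, "sex": 4, "ever_pregnant": 0},  # impossible code: sex must be 1 or 2
    {"id": 4, "sex": 1, "ever_pregnant": 1},  # codes do not correspond across variables
]

# Assumed codebook: the only values each variable may take.
VALID_CODES = {"sex": {1, 2}, "ever_pregnant": {0, 1}}


def impossible_codes(rows):
    """Code cleaning: list (id, variable, value) for values outside the codebook."""
    return [
        (row["id"], variable, row[variable])
        for row in rows
        for variable, valid in VALID_CODES.items()
        if row[variable] not in valid
    ]


def contingency_errors(rows):
    """Contingency cleaning: list ids whose codes contradict each other."""
    return [
        row["id"]
        for row in rows
        if row["sex"] == 1 and row["ever_pregnant"] == 1
    ]


print(impossible_codes(records))
print(contingency_errors(records))
```

Flagged cases would normally be checked against the original questionnaires rather than corrected automatically.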
Frequency Distributions
Descriptive Statistics - Describe numerical data
Univariate Statistics - Describe one variable
Frequency Distribution - A table that shows the distribution of cases into the categories of one variable
Measures of central tendency
Mode - Can be used with nominal, ordinal, interval or ratio data
- distribution can have more than one mode
Bimodal - A distribution with two modes
Multimodal - Distribution with more than one mode
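As a rough sketch, a frequency distribution and its mode(s) can be produced with `collections.Counter` from Python's standard library. The transport responses below are invented nominal data, and `modes` is a helper written for this example.

```python
from collections import Counter

# Made-up nominal data: each respondent's usual way of getting to work.
responses = ["bus", "car", "bike", "car", "bus", "walk", "bus", "car"]

# Frequency distribution: number of cases in each category of one variable.
freq = Counter(responses)


def modes(values):
    """Return every category that shares the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


for category, count in freq.most_common():
    percent = 100 * count / len(responses)
    print(f"{category:<6}{count:>3}{percent:>7.1f}%")

print(modes(responses))  # two categories tie, so this distribution is bimodal
```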
Measures of central tendency - Median
Measure of central tendency for one variable indicating the point or score at which half the cases are higher and half are lower
The easiest way to identify the median is to organize the scores from highest to lowest and count to the middle
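The "sort and count to the middle" procedure can be sketched as follows. Averaging the two middle scores when the number of cases is even is a common convention assumed here; the card itself does not cover that case.

```python
def median(scores):
    """Sort the scores and take the middle one (order direction does not matter)."""
    if not scores:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(scores)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    # Even number of cases: assumed convention is to average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 3, 9, 5, 1]))  # odd number of cases
print(median([7, 3, 9, 5]))     # even number of cases
```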
Measures of central tendency - Mean
The mean can only be used with interval or ratio level data
Compute the mean by adding up all the scores, then dividing by the number of scores
Frequency distributions can form a “normal” or bell-shaped curve (normal distribution)
Skewed distribution - more cases are in the upper or lower scores
Problems with the mean
Because the mean uses all values in a sample, including extremely low and high values, it is vulnerable to being pulled up or down and misrepresenting the values in the sample
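A small illustration of this problem, using invented income figures (in thousands) and the standard library's `statistics` module: adding one extreme value moves the mean a long way while the median barely changes.

```python
from statistics import mean, median

# Made-up incomes in thousands of dollars.
incomes = [30, 32, 35, 38, 40]
with_outlier = incomes + [500]  # one extremely high value added

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"without outlier: mean={mean_before:.1f} median={median_before:.1f}")
print(f"with outlier:    mean={mean_after:.1f} median={median_after:.1f}")
```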
Measures of Variation
Why care about measures of dispersion
- Reveal a great deal of information about the differences between distributions
Range - The distance between lowest and highest scores
Range has limitations - It depends only on the two most extreme scores, so it may exaggerate the dispersion of most scores
Percentiles - Tell the score at a specific place within the distribution
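The sketch below computes the range and a percentile for a made-up set of scores with one extreme value. Percentiles can be calculated in several ways; the nearest-rank method used here is an assumption chosen for simplicity, and `score_range` and `percentile` are names invented for this example.

```python
import math

# Made-up scores; 95 is a single extreme case.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 95]


def score_range(values):
    """Distance between the lowest and highest scores."""
    return max(values) - min(values)


def percentile(values, p):
    """Nearest-rank percentile: the score at or below which p percent of cases fall."""
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


print(score_range(scores))     # dominated by the single extreme score
print(percentile(scores, 50))
print(percentile(scores, 90))  # most scores sit far below the maximum
```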
Greater clustering
Greater clustering of scores around the mean in the distribution for service A indicates less dispersion
Flatter Curve
A flatter curve of the distribution for service B indicates more variety or dispersion
Standard deviation
A measure of dispersion for one variable that indicates an average distance between the scores and the mean
Requires an interval or ratio level of measurement
It increases in value as the variability of the distribution increases
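A minimal sketch of the calculation, using invented ratings for two services. It uses the population formula (dividing by the number of scores); the sample version divides by one less than the number of scores, and the cards do not say which is intended.

```python
import math


def standard_deviation(scores):
    """Population standard deviation: root of the mean squared distance from the mean."""
    mean = sum(scores) / len(scores)
    variance = sum((score - mean) ** 2 for score in scores) / len(scores)
    return math.sqrt(variance)


# Made-up satisfaction ratings; both services have a mean of 5.
service_a = [4, 5, 5, 5, 6]  # scores cluster tightly around the mean
service_b = [1, 3, 5, 7, 9]  # scores are spread out

print(round(standard_deviation(service_a), 3))
print(round(standard_deviation(service_b), 3))
```

Both lists share the same mean, so the difference in standard deviation reflects only how spread out the scores are.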
Z scores
Standard deviation and the mean are used to calculate Z-scores
Because they represent standardized scores, Z-scores let a researcher compare two or more distributions or groups
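As an illustration, a Z-score is the distance between a score and the mean, expressed in standard deviation units. The test scores below are made up, `z_score` is a helper written for this example, and the population standard deviation (`statistics.pstdev`) is assumed.

```python
from statistics import mean, pstdev


def z_score(score, distribution):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean(distribution)) / pstdev(distribution)


# Made-up results for two tests with different means and spreads.
test_1 = [50, 60, 70, 80, 90]  # mean 70
test_2 = [70, 75, 80, 85, 90]  # mean 80

z_on_test_1 = z_score(80, test_1)
z_on_test_2 = z_score(80, test_2)

# The same raw score of 80 is above average on test 1 but exactly average on test 2.
print(round(z_on_test_1, 2), round(z_on_test_2, 2))
```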
Bivariate Relationship
Bivariate Statistics - Statistical measures that involve two variables
Let a researcher consider two variables together and describe the relationship between variables
Correlation - Things vary together or are associated
Independence - There is no association or no relationship between variables. If two variables are independent, cases with certain values on one variable do not have any particular value on the other variable
Scattergram
Graph on which a researcher plots each case or observation, where each axis represents the value of one variable
Used for variables at the interval or ratio level, rarely for ordinal variables, and never for nominal variables
Scattergram Forms
- Independence - no relationship - random pattern
- Linear relationship - A straight line can be visualized in the middle of a maze of cases
- Curvilinear relationship - The centre of a maze of cases would form a U curve, right side up or upside down, or an S curve
Direction - Linear relationships can have a positive or negative direction
Positive - line from lower left to upper right
Negative - Upper left to lower right
Precision - Bivariate relationships differ in their degree of precision
Precision is the amount of spread in the points on the graph
Bivariate Tables
They present the same information as a scattergram but in table form
Cross Tabulation - Cases are organized in the table on the basis of two variables at the same time
Contingency table - Formed by cross-tabulating two or more variables
Reading a percentaged table
If there is no relationship in a table, the cell percentages look approximately equal across rows or columns
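A percentaged table can be sketched by cross-tabulating two variables and converting each cell to a percentage of its column total. The cases below are invented, and percentaging within categories of the first variable (treated here as the independent variable) is an assumption made for the example.

```python
from collections import Counter

# Made-up cases as (age_group, attitude) pairs.
cases = (
    [("young", "agree")] * 30
    + [("young", "disagree")] * 10
    + [("old", "agree")] * 10
    + [("old", "disagree")] * 30
)


def column_percentages(pairs):
    """Cross-tabulate pairs and percentage each cell within its first-variable category."""
    cell_counts = Counter(pairs)
    column_totals = Counter(column for column, _ in pairs)
    return {
        (column, row): 100 * count / column_totals[column]
        for (column, row), count in cell_counts.items()
    }


table = column_percentages(cases)
for (age_group, attitude), percent in sorted(table.items()):
    print(f"{age_group:<6}{attitude:<10}{percent:5.1f}%")
```

Here the percentages differ sharply between the two age groups, which is what a relationship looks like; roughly equal percentages across the groups would suggest no relationship.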
Measure of association
A single number that expresses the strength, and often the direction, of a relationship. It condenses information about a bivariate relationship into a single number
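One widely used measure of association for interval or ratio level variables is Pearson's correlation coefficient, r. The cards do not name a specific measure, so it is used here only as an example, with made-up data. It ranges from -1 to +1: the sign gives the direction and the absolute value gives the strength.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariation of x and y scaled by the spread of each variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Made-up data for five students.
hours_studied = [1, 2, 3, 4, 5]
grades = [52, 58, 65, 71, 80]   # rises with hours: positive direction
absences = [9, 7, 6, 3, 1]      # falls with hours: negative direction

r_grades = pearson_r(hours_studied, grades)
r_absences = pearson_r(hours_studied, absences)
print(round(r_grades, 2), round(r_absences, 2))
```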
Statistical Control
Showing a relationship between two variables is not sufficient to say that an independent variable causes a dependent variable
To assert that a causal relationship exists, a researcher must establish
1. Temporal order
2. Association
3. Eliminate other explanations
Elaboration Model
Trivariate Tables - Consist of multiple bivariate tables: one bivariate table of the independent and dependent variables for each category of the control variable. These are called partials
Partials - Tables for three variables that show the association between the independent and dependent variables for each category of a control variable
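Building partials can be sketched as grouping the cases by the control variable and cross-tabulating the other two variables within each group. The data and variable roles below (residence as control, age as independent, attitude as dependent) are invented for illustration.

```python
from collections import Counter, defaultdict

# Made-up cases as (control, independent, dependent) triples.
cases = (
    [("urban", "young", "agree")] * 8
    + [("urban", "young", "disagree")] * 2
    + [("urban", "old", "agree")] * 5
    + [("urban", "old", "disagree")] * 5
    + [("rural", "young", "agree")] * 4
    + [("rural", "young", "disagree")] * 6
    + [("rural", "old", "agree")] * 4
    + [("rural", "old", "disagree")] * 6
)


def partial_tables(rows):
    """One bivariate table of counts per category of the control variable."""
    tables = defaultdict(Counter)
    for control, independent, dependent in rows:
        tables[control][(independent, dependent)] += 1
    return tables


def percent_within_independent(table):
    """Percentage each cell within its category of the independent variable."""
    totals = Counter()
    for (independent, _), count in table.items():
        totals[independent] += count
    return {cell: 100 * count / totals[cell[0]] for cell, count in table.items()}


partials = partial_tables(cases)
for control, table in partials.items():
    print(control, percent_within_independent(table))
```

In this invented data, age is related to attitude among urban cases but not among rural cases, which is the kind of pattern partials are meant to reveal.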
Multiple Regression
Is a multivariate statistical technique that allows us to break down the separate effects of the independent variables on the dependent variable
Results tell us
1. How well a set of variables explains a dependent variable
2. The direction and size of the effect of each variable on a dependent variable
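The two results listed above correspond to R-squared (how well the set of variables explains the dependent variable) and the regression coefficients (direction and size of each effect). The sketch below fits ordinary least squares with only the standard library by solving the normal equations. It is a teaching illustration, not how statistical packages do it, and the data were constructed so that the outcome equals exactly 1 + 2·x1 + 3·x2, which is why the fit is perfect.

```python
def solve(matrix, rhs):
    """Solve matrix · x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiple_regression(predictors, outcome):
    """Return ([intercept, b1, b2, ...], R-squared) from ordinary least squares."""
    design = [[1.0] + list(row) for row in predictors]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(k)]
    coefficients = solve(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coefficients, r)) for r in design]
    mean_y = sum(outcome) / len(outcome)
    ss_residual = sum((y - p) ** 2 for y, p in zip(outcome, predicted))
    ss_total = sum((y - mean_y) ** 2 for y in outcome)
    return coefficients, 1 - ss_residual / ss_total


# Made-up data: outcome = 1 + 2*x1 + 3*x2 exactly, so R-squared is 1.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
outcome = [9, 8, 19, 18, 26]

coefficients, r_squared = multiple_regression(predictors, outcome)
print([round(c, 3) for c in coefficients], round(r_squared, 3))
```

With real data the fit would not be perfect, and the predictors must not be perfectly collinear or the elimination step will divide by zero.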
Inferential Statistics
Researchers need to know that the relationships they see in samples apply to populations so they can use inferential stats
Inferential statistics rely on probability theory
Statistical significance
Levels of significance - a way of talking about the likelihood that results are due to chance factors, that is, that a relationship appears in the sample when there is none in the population
Type I
Occurs when the researcher says that a relationship exists when in fact none exists
Type II
Occurs when the researcher says there is no relationship when in fact there is