Lecture 10 - Data Analysis & Interpretation Flashcards
Qualitative Data
- Non-numerical
- Inductive
- Analysis = thematic analysis
Quantitative Data
- Numerical
- Deductive
- Analysis = Statistics
Inductive vs. Deductive Reasoning
Inductive reasoning:
* Starts with observations and then builds towards a theory
Deductive reasoning:
* Starts with a theory or hypothesis, then tests it through data
Qualitative Analysis
- Conceptualization and analysis begin during data collection
Multiple methods - dependent on data and research goals
* Descriptive: Summarize
* Inferential: Focus on underlying meaning
Inductive:
* Grounded theory: Focuses on allowing patterns, themes, and common categories to emerge from data
Coding Qualitative Data
Coding assigns units of meaning to the data:
* Process of organizing raw data into categories
(See worksheet from class)
Interpreting Qualitative Data
Go beyond coding the data and interpret the underlying meaning of the codes
* Develop themes based on the organization from coding
Ex.
Qualitative Response: “I always feel stressed during exams”
Coding: Stress
Theme: Emotional and Physical Impact of Academic Pressure
Quantitative Analysis
Descriptive: Summarize/describe data in a sample
Inferential: Draw conclusions about a population
Types of Statistics & Analysis
- Univariate - 1 variable
- Bivariate - 2 variables
- Multivariate - 3+ variables
Univariate
- Focuses on 1 variable at a time
- Simplest form of data description and analysis
- Key elements include distribution, central tendency, and dispersion
Frequency distribution
Frequency distribution:
* Summary of the frequency of individual values for a variable
- Displays number and percentage of cases that fall within variable categories
Ex. frequencies of each attribute of a variable
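A frequency distribution can be sketched in a few lines of Python. The sample responses and the variable names below are hypothetical, chosen only to show the count and percentage per category.

```python
from collections import Counter

# Hypothetical sample: marital status of 10 respondents
responses = ["single", "married", "single", "divorced", "married",
             "single", "married", "married", "single", "single"]

counts = Counter(responses)
total = len(responses)

# Frequency distribution: (number of cases, percentage of cases) per category
freq_table = {cat: (n, 100 * n / total) for cat, n in counts.items()}

for cat, (n, pct) in sorted(freq_table.items()):
    print(f"{cat:<10} {n:>3} {pct:>6.1f}%")
```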
Frequency Distribution Histogram
A visual representation of the variable distribution
Histogram: a type of graph used to show the distribution of numerical data.
Measures of Central Tendency
- Mean - Average score
- Median - Middle score (If even number, take average of 2 middle scores)
- Mode - Most frequent score
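The three measures above can be computed with Python's standard `statistics` module. The scores are hypothetical; this is a minimal sketch, not a required method.

```python
import statistics

scores = [60, 70, 75, 85, 85, 90]  # hypothetical exam scores

mean_score = statistics.mean(scores)      # sum of scores / number of scores
median_score = statistics.median(scores)  # even count: average of the 2 middle scores
mode_score = statistics.mode(scores)      # most frequent score

print(mean_score, median_score, mode_score)
```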
Fact: Measures of the normal distribution
In a normal distribution:
* The mean, median, and mode are all the same value
- When they differ = a skewed distribution
Issues with relying on Measures of Central Tendency
- Mean is sensitive to extreme scores
- Mode may not be representative of the distribution
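The sensitivity of the mean to extreme scores can be illustrated with a small sketch. The income figures are hypothetical; one extreme case is added to show how the mean shifts while the median barely moves.

```python
import statistics

incomes = [30, 32, 35, 38, 40]   # hypothetical incomes, in thousands
with_outlier = incomes + [500]   # same sample plus one extreme score

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)

mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # mean and median agree
print(mean_after, median_after)    # mean is pulled toward the extreme score
```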
Measures of Dispersion
- Range - Distance between highest and lowest score
- Standard deviation - How widely the scores are spread around the mean
- Percentiles: Percentages of cases that fall at or below a certain value
Range
Difference between the highest and lowest scores
(Max - Min = Range)
Standard Deviation
An estimate of how widely the scores are spread around the mean.
* The larger the standard deviation (SD), the larger the dispersion
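A short sketch comparing two hypothetical samples with the same mean but different spread. It assumes the sample standard deviation (`statistics.stdev`); the lecture does not say which version is intended.

```python
import statistics

tight = [48, 49, 50, 51, 52]  # hypothetical scores clustered near the mean
wide = [30, 40, 50, 60, 70]   # hypothetical scores spread far from the mean

# Range: Max - Min
range_tight = max(tight) - min(tight)
range_wide = max(wide) - min(wide)

# Sample standard deviation: spread of scores around the mean
sd_tight = statistics.stdev(tight)
sd_wide = statistics.stdev(wide)

print(range_tight, round(sd_tight, 2))
print(range_wide, round(sd_wide, 2))
```

Both samples have a mean of 50, so the mean alone would hide the difference that the range and SD reveal.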
Percentiles
Indicates the percentage of cases that fall at or below a certain value.
Ex. LSAT raw vs. percentile ranking:
Grouped into quartiles
* 0-25%
* 26-50%
* 51-75%
* 76-100%
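A percentile ranking can be sketched as the percentage of cases at or below a value. The raw scores and the helper names `percentile_rank` and `quartile` are hypothetical, for illustration only.

```python
def percentile_rank(scores, value):
    """Percentage of cases that fall at or below `value`."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)


def quartile(pct):
    """Map a percentile to its quartile group (1 = 0-25%, ..., 4 = 76-100%)."""
    if pct <= 25:
        return 1
    if pct <= 50:
        return 2
    if pct <= 75:
        return 3
    return 4


raw_scores = [140, 145, 150, 152, 155, 158, 160, 165, 170, 175]  # hypothetical

rank_160 = percentile_rank(raw_scores, 160)
print(rank_160, quartile(rank_160))
```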
When to use Measures of Central Tendency & Dispersion
Not appropriate for all variable types:
* Discrete: nominal & ordinal measures
* Continuous: interval & ratio measures
Discrete vs. Continuous variables
Discrete: one that can take specific, separate values—usually countable.
Ex. # of students in class (can't have 22.5)
* Raw numbers & percentages
Continuous: can take any value within a range—including fractions and decimals.
Ex. Weight - 65.3 kg
* Use median, mean, dispersion measures
Rates
Rates are a standardized measure that allows for comparison between groups
* Ratio - typically time or per-unit measure
Ex. Graduation rate
1000 students, 850 graduated; 85% graduation rate
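A rate standardizes a count by the size of the group, which is what makes groups of different sizes comparable. The sketch below uses hypothetical schools and a hypothetical helper named `rate`.

```python
def rate(events, population, per=100):
    """Events per `per` members of the population (per=100 gives a percentage)."""
    return events * per / population


# Hypothetical schools of very different sizes
school_a = rate(850, 1000)  # 850 of 1000 students graduated
school_b = rate(180, 200)   # 180 of 200 students graduated

print(school_a, school_b)
```

School A has more graduates in raw numbers, but School B has the higher graduation rate.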
Bivariate Description & Analysis
- Focuses on 2 variables
- Goal is to describe relationship between the 2 variables
- Includes description and measures of association
Bivariate contingency tables
Summarizes and compares two variables together:
* Values of one variable are contingent on another
Contingent: dependent on something else happening
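A contingency table can be sketched by counting each combination of two variables. The cases below (gender and opinion) are hypothetical, and percentages are computed within each gender so the groups can be compared.

```python
from collections import Counter

# Hypothetical cases: (gender, opinion)
cases = [("F", "agree"), ("F", "agree"), ("F", "agree"), ("F", "disagree"),
         ("M", "agree"), ("M", "disagree"), ("M", "disagree"), ("M", "disagree")]

table = Counter(cases)                     # cell counts
group_totals = Counter(g for g, _ in cases)  # cases per gender

# Percentage of each opinion within each gender
pct_table = {(g, o): 100 * n / group_totals[g] for (g, o), n in table.items()}

for (g, o), pct in sorted(pct_table.items()):
    print(g, o, table[(g, o)], f"{pct:.0f}%")
```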
Measures of Association
Describe associations that connect one variable to another.
* Based on Proportionate Reduction of Error (PRE):
* How much variation in y can be predicted by x; how much can you reduce your error in predicting y by knowing x
Calculation used is dependent upon different levels of measurement
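The PRE logic above can be sketched for two nominal variables: guess the most common value of y with and without knowing x, then compare the errors. This version is commonly called Goodman and Kruskal's lambda. The function name `pre_lambda` and the data are hypothetical.

```python
from collections import Counter


def pre_lambda(pairs):
    """Proportionate reduction in error when predicting y from x.

    `pairs` is a list of (x, y) cases for two nominal variables.
    """
    # Errors when always guessing the most common y, ignoring x
    y_counts = Counter(y for _, y in pairs)
    errors_without_x = len(pairs) - max(y_counts.values())

    # Errors when guessing the most common y within each category of x
    errors_with_x = 0
    for x_value in {x for x, _ in pairs}:
        y_within = Counter(y for x, y in pairs if x == x_value)
        errors_with_x += sum(y_within.values()) - max(y_within.values())

    if errors_without_x == 0:
        return 0.0  # no errors to reduce
    return (errors_without_x - errors_with_x) / errors_without_x


# Hypothetical cases: (gender, opinion)
pairs = [("F", "agree"), ("F", "agree"), ("F", "agree"), ("F", "disagree"),
         ("M", "agree"), ("M", "disagree"), ("M", "disagree"), ("M", "disagree")]

print(pre_lambda(pairs))
```

Here guessing without x gives 4 errors and guessing within each gender gives 2, so knowing x cuts the prediction errors in half.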
Nominal Variables:
* Calculated by lambda (based on ability to guess values on one of the variables)
Ex. Gender, marital status, race
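Lambda: the proportionate reduction in prediction errors gained by knowing the other variable's category.
Lambda = (errors without x - errors with x) / errors without x
Ranges from 0 (x does not help predict y) to 1 (x predicts y perfectly)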
Ordinal Variables:
* Calculated by gamma: same PRE logic as lambda, but accounts for ordinal nature of the values
Ex. Education level, level of agreement, SES
Interval/Ratio Variables:
* Calculated by Pearson’s product-moment correlation (r)
Ex. Age, number of arrests
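Pearson's r can be sketched directly from its definition: the co-variation of x and y divided by the square root of the product of their sums of squares. The data and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of cross-products of deviations from each mean
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    return cross / math.sqrt(ss_x * ss_y)


ages = [18, 22, 25, 30, 35]  # hypothetical interval/ratio data
arrests = [0, 1, 1, 3, 4]

r = pearson_r(ages, arrests)
print(round(r, 2))
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.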