Correlation and Regression Analyses Flashcards
What is a model?
- Representation of some phenomenon
- Non-Math/Stats Models
What is a Math/Stats Model?
- Often describes the relationship between variables
- Types: Deterministic (no randomness), Probabilistic (with randomness)
Deterministic Models
- Hypothesizes exact relationships
- Suitable when prediction error is negligible
- Example: Body Mass Index (BMI) is a measure of body fat based on height and weight
Metric BMI Formula
Weight in Kilograms/(Height in Meters)^2
Non-Metric BMI Formula
(Weight in Pounds * 703)/(Height in Inches)^2
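The two BMI formulas above can be sketched in code as a deterministic model: the same inputs always give the same output, with no error term. The function names and the sample person below are hypothetical, chosen only for illustration.

```python
def bmi_metric(weight_kg: float, height_m: float) -> float:
    """BMI from weight in kilograms and height in meters."""
    return weight_kg / height_m ** 2


def bmi_non_metric(weight_lb: float, height_in: float) -> float:
    """BMI from weight in pounds and height in inches (factor 703)."""
    return weight_lb * 703 / height_in ** 2


# Hypothetical person: 70 kg and 1.75 m, which is roughly 154.3 lb and 68.9 in.
print(round(bmi_metric(70, 1.75), 1))
print(round(bmi_non_metric(154.3, 68.9), 1))
```

The two results agree only approximately, because the pound and inch values are rounded conversions and 703 is itself a rounded conversion factor.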
Probabilistic Models
- Hypothesize 2 Components
• Deterministic
• Random Error
- Example: Systolic blood pressure of newborns is 6 times the age in days + random error
- Random error may be due to factors other than age in days (e.g., birthweight)
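A minimal sketch of the newborn blood pressure example, split into its deterministic part and its random error. The card does not say how the error is distributed, so the normal distribution with a standard deviation of 2 used here is purely an assumption, as are the function names and the fixed seed.

```python
import random


def deterministic_part(age_days: float) -> float:
    """Deterministic component from the example: 6 times the age in days."""
    return 6 * age_days


def simulate_sbp(age_days: float, rng: random.Random, error_sd: float = 2.0) -> float:
    """Deterministic component plus a random error term.

    Assumption: the error is drawn from a normal distribution with mean 0
    and standard deviation error_sd; the source does not specify this.
    """
    return deterministic_part(age_days) + rng.gauss(0, error_sd)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
ages = [1, 2, 3, 4, 5]
simulated = [simulate_sbp(a, rng) for a in ages]

# The random error is whatever the deterministic part fails to explain.
errors = [y - deterministic_part(a) for a, y in zip(ages, simulated)]
for a, y, e in zip(ages, simulated, errors):
    print(f"age={a} days  simulated={y:.2f}  error={e:+.2f}")
```

Setting `error_sd` to 0 turns this into a deterministic model again, which is one way to see how the two model types relate.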
Types of Probabilistic Models
- Regression Models
- Correlation Models
- Other Models
Rationale
It is often desirable to determine whether the scores of one distribution are related to the scores of another distribution.
Purposes of the Rationale
- to assess linear relationship between variables
- to provide an initial step for prediction
- to provide an initial assessment of possible causal relationship
- to assess test-retest reliability (of instruments)
Correlational Design
two (uncontrolled) variables at either an interval or ratio measurement scale
Scatterplot
a plot of the pairs of values of two (quantitative) variables on a rectangular coordinate plane
an effective tool for presenting possible relationships between two (quantitative) variables
Possible relationships in a scatterplot
- None
- Linear (numerically assessed by a correlation)
- Non-linear
Correlation coefficient
- expresses quantitatively the magnitude and direction of the relationship between two variables using a normalized scale (i.e., ranging from -1 to 1)
- a measure of the strength of the linear association between two variables
- any correlation coefficient has two components: (1) the sign indicates either a positive or a negative linear relationship (i.e. direction); (2) the absolute value indicates the strength of the relationship (i.e., magnitude).
Types of Linear Relationship
- Direction: +: direct linear relationship, -: inverse linear relationship
- Magnitude: closer to 1: strong to almost perfect (linear) relationship; closer to 0: weak to almost no (linear) relationship
Pearson’s r
- also known as the Pearson product-moment correlation coefficient
- describes the linear relationship between interval and/or ratio variables
- a measure of the extent to which paired scores occupy the same or opposite positions within their own distributions
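The "same or opposite positions within their own distributions" idea can be sketched by converting each score to a z-score and averaging the products of the paired z-scores. This is one standard way to compute Pearson's r; the function name and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r as the mean product of paired z-scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n) so the mean of the
    # z-score products equals r exactly.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ]
    return sum(z_products) / n


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(pearson_r(x, y), 3))
```

Pairs that sit on the same side of their means contribute positive products and pull r toward +1; pairs on opposite sides contribute negative products and pull it toward -1.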
Regression Models
- Relationship between dependent variable and explanatory variable(s)
- Use equation to set up relationship
- Numerical dependent (response) variable
- 1 or More Numerical or Categorical Independent (Explanatory) Variables
- Use mainly for prediction and estimation
Simple regression
- Simple regression analysis is a statistical tool that gives us the ability to estimate the mathematical relationship between a dependent variable (usually called y) and an independent variable (usually called x)
- The dependent variable is the variable for which we want to make a prediction
- While various non-linear forms may be used, simple linear regression models are the most common
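A minimal sketch of estimating a simple linear regression, y = intercept + slope * x. The cards do not name a fitting method; ordinary least squares is assumed here because it is the usual choice, and the function names and data are hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Least-squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must vary to estimate a slope")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted value of the dependent variable for a new x."""
    return intercept + slope * x_new


# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_simple_linear(x, y)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
print(f"prediction at x=6: {predict(intercept, slope, 6):.2f}")
```

The fitted line always passes through the point (mean of x, mean of y), which is why the intercept is computed from the two means and the slope.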
Historical Origin of Regression
- Regression analysis was first developed by Sir Francis Galton, who studied the relation between heights of sons and fathers
- Heights of sons of both tall and short fathers appeared to “revert” or “regress” to the mean of the group
Types of Regression Models
- Simple
- Multiple
How many explanatory variables does a simple regression model have?
1 explanatory variable
How many explanatory variables does a multiple regression model have?
2+ explanatory variables