Statistics Lecture - Dr Wofford Flashcards
Linear Regression, Assumptions
Assumptions:
- Predictor variable(s) (IV) must be quantitative (and continuous?)
- Outcome variable (DV) must be quantitative, continuous
- No perfect multicollinearity: predictor variables should not correlate highly with one another (no predictor should duplicate another)
- Outliers: a big problem with regression; the reasons behind each outlier must be examined
There are more assumptions, but these are outside the scope of this class
Predictor variable(s) is the same as
IV
What test can you use for multivariate data (more than one DV)?
MANOVA (or MANCOVA if there are covariates)
Draw a comparison of parametric and nonparametric statistics chart.

What is a really common reason for an outlier?
Mistake in data entry
What is a coefficient table?
Used in linear regression for prediction statistics
The coefficient table shows how good each predictor variable was. Usually there is a relationship between the strength of the individual predictor variables and the strength of all of them taken together (the model).
Do you want r² to be high or low?
What value is acceptable?
High
0.5 is clinically OK
A great statistician will say 0.8 and above is good
Dr. Lake will probably say 0.3 and above
Three types of Regression
- Linear regression: 1 continuous DV and 1 continuous IV
- Multiple linear regression: 1 continuous DV and 2+ continuous IVs
- Logistic regression: 1 categorical DV and 2+ IVs on any scale
Correlation coefficients:
Correlation coefficients quantitatively describe the relationship between two variables in terms of strength and direction
- Range from -1.00 through 0.00 to +1.00
- Are sensitive to sample size
- With a large enough sample size, even a weak correlation between two variables will be statistically significant, though it may not be meaningful
- (r) denotes a correlation coefficient
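The sample-size point can be illustrated with a small sketch. It assumes the usual t test for a Pearson r, where t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The function name and the value r = 0.10 are hypothetical, chosen only for illustration.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic for testing a correlation r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

weak_r = 0.10  # hypothetical weak correlation; r squared is only 0.01

# Same r, growing sample size: the test statistic keeps growing
for n in (20, 100, 1000):
    print(n, round(t_statistic_for_r(weak_r, n), 2))
```

With r held at 0.10, t rises from roughly 0.4 at n = 20 to roughly 3.2 at n = 1000, which is past the large-sample cutoff of about 1.96 for alpha = 0.05, even though only about 1% of the variance is explained.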
Correlations
BIGGEST THING IS THEY MUST BE BIVARIATE: can only look at the correlation between two variables, ONLY TWO!!
- The research question is “What is the relationship between A and B?”: the magnitude and strength of the relationship.
- How do (x) and (y) relate to each other? That is bivariate correlation.
- Can be graphically displayed with a scatter plot
- Can be applied to paired observations on two different variables or to one variable measured on two different occasions
- Is not causal in nature: can state that a relationship exists, but not that it is cause and effect
- How strong is the relationship?
Linear Regression: Basics except assumptions
- Examination of two variables, X and Y, that are linearly related
- X= IV; Y= DV
- Uses a scatter plot to assess the linear regression line
- The regression line is the line which best describes the orientation of all data points in the scatter plot
- If the data were perfectly correlated, all data points would fall along a straight line
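As a rough illustration of the "line which best describes" the points, here is a minimal sketch that assumes ordinary least squares is the fitting method (the notes do not name one). The function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for one IV (x) and one DV (y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-products and squared deviations around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical, perfectly correlated data: every point lies on y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)
predicted = slope * 6 + intercept  # predict Y for a new X of 6
print(slope, intercept, predicted)
```

Because these points are perfectly correlated, the fitted line passes through all of them; with real data the points scatter around the line.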
What is a moderate correlation level?
A correlation test gives an r and a p-value: the correlation and its significance (the likelihood it could have occurred by chance)
An r of 0.5 is a moderate correlation
A p-value of 0.01 means the result is statistically significant: the correlation is stronger than would be expected by chance.
What is the Coefficient of Determination (r²)?
Used in linear regression for correlations
Percentage of the total variance in the Y scores which can be explained by the X scores
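One way to see "variance explained" is to compute r² as 1 minus the ratio of residual to total sum of squares. This is a sketch assuming a least-squares line with a single IV; the data and variable names are hypothetical.

```python
# Hypothetical data: one IV (x) and one DV (y)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares line
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Total variation in Y, and the part the line leaves unexplained
ss_total = sum((b - mean_y) ** 2 for b in y)
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

r_squared = 1 - ss_residual / ss_total
print(round(r_squared, 2))  # 0.6
```

Here 60% of the variance in the Y scores is explained by the X scores.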
Parametric correlation coefficient
Pearson’s r
How does the number of predictor variables affect the required sample size?
The more predictor variables, the more subjects you need
Usually need about 30 subjects per predictor variable
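The arithmetic behind this rule of thumb is simple enough to sketch. The figure of 30 subjects per predictor comes from these notes and is not a universal standard; the function name is hypothetical.

```python
SUBJECTS_PER_PREDICTOR = 30  # rule of thumb from these notes, not a universal standard

def minimum_sample_size(num_predictors):
    """Rough minimum number of subjects for a regression with this many predictors."""
    return SUBJECTS_PER_PREDICTOR * num_predictors

for k in (1, 3, 5):
    print(k, "predictors ->", minimum_sample_size(k), "subjects")
```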
MANOVA (and MANCOVA)
MANOVA: Comparison of means between more than 2 groups when there is more than 1 DV
- Works best when DVs are moderately correlated (2 DVs on the same measurement scale)
- e.g. 180 and 360 speeds for isokinetic ER: 2 DVs which are similar because they are on the same measure (isokinetic ER)
Types:
- One-way MANOVA: 1 IV, 2+ DVs
- Factorial MANOVA: 2+ IVs, 2+ DVs
- MANCOVA: 1 IV, 2+ DVs, covariate
- Factorial MANCOVA: 2+ IVs, 2+ DVs, covariate
- Doubly multivariate: repeated measures MANOVA
Prediction statistics use
Regression tests
What will correlation tests give us?
An r and a p-value: the correlation and its significance (the likelihood it could have occurred by chance)
For example:
- An r of 0.5 is a moderate correlation
- A p-value of 0.01 means the result is statistically significant: the correlation is stronger than would be expected by chance.
Outcome variable is the same as
DV
Two most common types of correlations
Most common types of correlations (correlation coefficients):
Pearson product-moment coefficient of correlation (r):
- Used when both variables (x and y) are continuous variables with underlying normal distributions on the interval or ratio scales
- Can be subject to a test of significance to determine if the observed value could have occurred by chance
- Will yield a p-value which will be compared to the alpha level
Spearman rank correlation coefficient (rs), or Spearman’s rho:
- Nonparametric version of the Pearson product-moment coefficient
- Used with ordinal data
- Also is subject to a test of significance and yields a p-value
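The relationship between the two coefficients can be sketched in a few lines: Spearman's rho is Pearson's r computed on the ranks of the data rather than the raw values. The function names and data here are hypothetical, and ties are handled by averaging ranks.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 2), round(rho, 2))
```

Because the ranks of x and y match exactly, rho is 1.0, while Pearson's r is lower (about 0.8) because the raw values do not fall on a straight line.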
Regression
Used for prediction statistics
- Prediction of outcomes and characteristics
- Explains and predicts quantifiable clinical outcomes
Types:
- Linear regression: 1 continuous DV and 1 continuous IV
- Multiple linear regression: 1 continuous DV and 2+ continuous IVs
- Logistic regression: 1 categorical DV and 2+ IVs on any scale
Linear Regression, what you get out of it:
- Coefficient of Determination (r²): percentage of the total variance in the Y scores which can be explained by the X scores
- Measure of proportion (ranges from 0 to 1.00)
- Regression output: (page 558)
- Model Summary table: provides r²
- ANOVA table: assesses the model as a whole to show how good the regression model is
- Coefficient table: individual predictors have beta weights (B), which show how important each predictor variable is
- The t statistic in the coefficient table assesses the significance of each predictor variable
- The t statistic and F statistic will agree (both significant or both nonsignificant) if there is only one IV
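The last point can be checked numerically: with a single IV, the F statistic for the model equals the square of the t statistic for the slope, so the two tests always agree. This is a sketch assuming a least-squares fit; the data and variable names are hypothetical.

```python
import math

# Hypothetical data: one IV (x) and one DV (y)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Partition the variation in Y, as the ANOVA table does
ss_total = sum((b - mean_y) ** 2 for b in y)
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
ss_regression = ss_total - ss_residual

ms_residual = ss_residual / (n - 2)   # residual mean square, n - 2 degrees of freedom
f_stat = ss_regression / ms_residual  # regression has 1 degree of freedom with one IV

# t statistic for the slope, as the coefficient table reports
se_slope = math.sqrt(ms_residual / sxx)
t_stat = slope / se_slope

print(round(f_stat, 2), round(t_stat ** 2, 2))  # both 4.5
```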
Nonparametric correlation coefficient
Spearman’s rho
