Statistics Lecture - Dr Wofford Flashcards
Linear Regression, Assumptions
Assumptions:
- Predictor variable(s) (IV) must be quantitative (and ideally continuous)
- Outcome variable (DV) must be quantitative and continuous
- No perfect multicollinearity: predictor variables should not correlate highly with one another (two predictors should not measure the same thing)
- Outliers: a big problem in regression; the cause of each outlier must be examined
There are more assumptions, but they are outside the scope of this class
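One rough way to screen for the multicollinearity problem listed above is to correlate every pair of predictors and flag the pairs that are nearly identical. The sketch below is illustrative only: the function names, the sample data, and the 0.9 cutoff are assumptions for this example, not values from the lecture.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag_collinear_pairs(predictors, threshold=0.9):
    """Return predictor pairs whose |r| meets the (assumed) threshold."""
    names = list(predictors)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_r(predictors[names[i]], predictors[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], round(r, 3)))
    return flagged

# Made-up data: height is recorded twice in different units,
# so those two predictors carry the same information.
predictors = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [30, 25, 41, 22, 35],
}
flagged = flag_collinear_pairs(predictors)
print(flagged)
```

Here only the two height columns should be flagged; one of them would be dropped before running the regression.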
Predictor variable(s) is the same as
IV
What test can you do for multivariate? (more than one DV)
MANOVA (or MANCOVA if there are covariates)
Draw a comparison of parametric and nonparametric statistics chart.
What is a really common reason for an outlier?
A mistake in data entry
What is a coefficient table?
Used in linear regression for prediction statistics
The coefficient table shows how good each predictor variable was. Usually there is a relationship between the strength of the individual predictor variables and the strength of all of them as a whole (the model).
Do you want r² to be high or low?
What value is acceptable?
High
0.5 is clinically OK
A great statistician will say 0.8 and above is good
Dr. Lake will probably say 0.3 and above
Three types of Regression
Types:
- Linear regression: 1 continuous DV and 1 continuous IV
- Multiple linear regression: 1 continuous DV and 2+ continuous IVs
- Logistic regression: 1 categorical DV and 2+ IVs on any scale
Correlation coefficients:
Correlation coefficients are used to quantitatively describe the relationship between two variables in terms of strength and direction
- Range from -1.00 through 0.00 to +1.00
- Their statistical significance is sensitive to sample size
- With a large enough sample size, even a weak correlation between two variables will be statistically significant, but it may not be meaningful
- (r) denotes a correlation coefficient
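The sample-size point above can be illustrated with the standard t statistic for testing whether r differs from zero, t = r * sqrt((n - 2) / (1 - r²)). This is a sketch, not part of the lecture: the weak correlation of 0.10, the two sample sizes, and the 1.96 cutoff (a large-sample approximation for a two-tailed 0.05 test) are all assumptions chosen for the example.

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

weak_r = 0.10    # same weak correlation in both scenarios (assumed value)
CUTOFF = 1.96    # approximate two-tailed 0.05 cutoff for large samples

t_small = t_statistic(weak_r, n=30)
t_large = t_statistic(weak_r, n=1000)

print(f"n=30:   t = {t_small:.2f}, significant: {t_small > CUTOFF}")
print(f"n=1000: t = {t_large:.2f}, significant: {t_large > CUTOFF}")
```

The same r of 0.10 falls short of the cutoff with 30 subjects and passes it with 1000, even though the relationship is equally weak in both cases.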
Correlations
THE BIGGEST THING IS THAT THEY MUST BE BIVARIATE: you can only look at the correlation between two variables, ONLY TWO!!
- “What is the relationship between A and B?” is the research question. It concerns the magnitude (strength) of the relationship.
- How do (x) and (y) relate to each other? This is a bivariate correlation.
- Can be graphically displayed with a scatter plot
- Can be applied to paired observations on two different variables or to one variable measured on two different occasions
- Is not causal in nature: can state that a relationship exists, but not that it is cause and effect
- How strong is the relationship?
Linear Regression: Basics except assumptions
- Examination of two variables, X and Y, that are linearly related
- X= IV; Y= DV
- Uses a scatter plot to assess the linear regression line
- The line that best describes the orientation of all data points in the scatter plot
- If the data were perfectly correlated, all data points would fall along a straight line
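The best-fitting line described above is usually found by ordinary least squares. The sketch below assumes that method and uses made-up, perfectly linear data (y = 2x + 1), so every point should land exactly on the fitted line; the function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for one IV (x) and one DV (y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up, perfectly correlated data: y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)

# Residual = observed Y minus the Y predicted by the line
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
print(slope, intercept, residuals)
```

With real data the residuals would not all be zero; their size shows how far the points scatter around the line.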
What is a moderate correlation level?
The test will give us an r and a p-value: correlation and significance (the likelihood the result could have occurred by chance)
0.5 is a moderate correlation
A p-value of 0.01 means the correlation is statistically significant: a result this strong would be unlikely to occur by chance alone.
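To make "could have occurred by chance" concrete, one option is a permutation test: shuffle the Y values many times and count how often the shuffled data produce a correlation at least as strong as the observed one. This is a swapped-in illustration, not the method statistical software normally uses for Pearson's r (that relies on a t distribution). The data, the number of shuffles, and the seed are assumptions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_shuffles=5000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero
    return (hits + 1) / (n_shuffles + 1)

# Made-up data with a strong positive relationship
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2]

r = pearson_r(x, y)
p = permutation_p_value(x, y)
print(f"r = {r:.3f}, p = {p:.4f}")
```

A small p here says that random pairings of the same numbers almost never produce a correlation this strong.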
What is the Coefficient of Determination (r²)?
Used in linear regression for correlations
The percentage of the total variance in the Y scores that can be explained by the X scores
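The "variance explained" reading can be checked numerically. With one predictor, squaring r should give the same number as 1 minus the ratio of unexplained variation (around the regression line) to total variation (around the mean of Y). The sketch below uses made-up data and assumes an ordinary least-squares line.

```python
import math

# Made-up data for one IV (x) and one DV (y)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)  # total variation in Y

# Route 1: square the correlation coefficient
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Route 2: compare unexplained variation with total variation
slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
r_squared_from_fit = 1 - ss_residual / syy

print(round(r_squared, 3), round(r_squared_from_fit, 3))
```

Both routes give 0.6 for this data, meaning X explains 60% of the variance in Y.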
Parametric correlation coefficient
Pearson’s r
How does the number of predictor variables affect the required sample size?
The more predictor variables, the more people you need
Usually need about 30 subjects per predictor variable
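The rule of thumb above is simple arithmetic. As a sketch, with a hypothetical function name and the lecture's figure of 30 subjects per predictor as the default:

```python
def min_sample_size(n_predictors, subjects_per_predictor=30):
    """Rough minimum sample size under the 30-per-predictor rule of thumb."""
    if n_predictors < 1:
        raise ValueError("need at least one predictor variable")
    return n_predictors * subjects_per_predictor

for k in (1, 3, 5):
    print(k, "predictor(s):", min_sample_size(k), "subjects")
```

So a model with 3 predictors would call for about 90 subjects under this rule.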