Terms and Linear Regression Flashcards
Response / dependent variable
The variable of interest, dependent on other explanatory variables.
Explanatory / independent variables.
Other variables that are used to explain the behaviour of the response variable.
Factor
A discrete (categorical) explanatory variable.
What are covariates?
Continuous explanatory variables.
Continuous variable
A variable that can take any value within a range, so it has infinitely many possible values; examples include age, length and temperature (although age becomes discrete if it is recorded in whole years).
Discrete variable
A variable that can take only a countable number of distinct values, for example the number of coins in your pocket.
A researcher manipulates the explanatory variables (treatments) while holding other variables constant and observes the consequences for the response variable.
Designed experiments
Cause and effect relationships can be concluded
Researchers observe the differences in explanatory variables and see if these are related to differences in the response variable.
Observational studies
Cause-and-effect relationships cannot be concluded with certainty.
Confounding factors
Factors that affect both the dependent and independent variables.
Correlation coefficient
r =
A score from -1 to 1 measuring the strength and direction of a linear relationship
0 = no linear relationship
1 = perfect positive linear relationship
-1 = perfect negative linear relationship
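The correlation card can be checked by hand on a tiny data set. The sketch below is a minimal illustration only: the data values and the function name `pearson_r` are made up for this example and do not come from the flashcards.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only so the arithmetic is easy to follow
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

A value of about 0.77 indicates a fairly strong positive linear relationship; points lying exactly on an upward-sloping line would give exactly 1.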
Beta 0 =
The intercept
Beta 1 =
The slope (change in Y when x is increased by 1 unit).
Residual
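The vertical distance between a data point and the regression line (observed value minus fitted value). Residuals are often called errors, although strictly they are estimates of the unobserved error terms.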
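To make the intercept, slope and residual cards concrete, here is a minimal sketch of a least-squares fit using only the Python standard library. The data and the helper name `fit_line` are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1): least-squares estimates of intercept and slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope: change in y per 1-unit increase in x
    b0 = mean_y - b1 * mean_x    # intercept: fitted value of y when x = 0
    return b0, b1

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
# Residual = observed value minus fitted value (vertical distance to the line)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(round(b0, 4), round(b1, 4))  # 2.2 0.6
print([round(e, 4) for e in residuals])
```

With an intercept in the model, the residuals from a least-squares fit always sum to zero (up to rounding error).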
Response Variable/s:
The response variable is also referred to as a dependent variable.
Explanatory Variable/s:
These are variables that are used to explain the effect that we see in the response variable. They are also referred to as independent variables. They include physical elements that reflect the design, for example plots, animals, pens, test tubes or plants, as well as treatment elements such as diets, varieties or pasture types.
Predictor Variable (x)
Independent or explanatory variables which seek to predict or ‘explain’ the response variable.
Dependent Variable (Y):
Response variable of interest to investigators that is (hopefully) dependent on the explanatory variable/s. This is always a random variable.
dependent variable goes on the __ axis
Y
independent variable always sits on the __ axis
X
Calculate rXY (Rcmdr)
Statistics / Summaries / Correlation Matrix, then select the variables whose correlation coefficient(s) you want to find, then press “OK”.
Yi = β0 + β1xi + εi
Simple linear regression model.
The random error term is assumed to:
follow a normal distribution,
have a mean of zero: E(εi) = 0,
have a constant variance: Var(εi) = σ²,
be uncorrelated with the other error terms.
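One way to see what these assumptions mean is to simulate data from the model. The sketch below is illustrative only: the parameter values (`beta0`, `beta1`, `sigma`), the sample size and the random seed are all arbitrary assumptions, not values from the flashcards.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical "true" parameter values
beta0, beta1, sigma = 2.0, 0.5, 1.0

xs = [i / 10 for i in range(1000)]

# Independent draws from a normal distribution with mean 0 and variance sigma^2
errors = [random.gauss(0.0, sigma) for _ in xs]

# Y_i = beta0 + beta1 * x_i + error_i
ys = [beta0 + beta1 * x + e for x, e in zip(xs, errors)]

mean_error = sum(errors) / len(errors)
var_error = sum((e - mean_error) ** 2 for e in errors) / (len(errors) - 1)

print(round(mean_error, 3))  # should be close to 0
print(round(var_error, 3))   # should be close to sigma ** 2
```

Because each error is drawn separately with the same `sigma`, the simulated errors are normal, centred on zero, have constant variance and are uncorrelated with one another.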
Fit a simple linear regression model (Rcmdr)
Statistics / Fit Models / Linear Regression
The correlation coefficient is between:
-1 and +1
β1 = slope, the change in:
Y when x is increased by 1 unit.
A smaller residual sum of squares indicates:
A better-fitting line.
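As a rough illustration of this card, the sketch below compares the sum of squared residuals for two candidate lines on the same made-up data. The function name and both sets of coefficients are assumptions for the example; the first pair happens to be the least-squares line for this data.

```python
def sum_of_squared_residuals(b0, b1, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sse_least_squares = sum_of_squared_residuals(2.2, 0.6, x, y)  # least-squares line
sse_other_line = sum_of_squared_residuals(2.0, 0.5, x, y)     # an arbitrary alternative

print(round(sse_least_squares, 4))  # 2.4
print(round(sse_other_line, 4))     # 3.75
```

The least-squares line has the smaller sum of squares, which is exactly the sense in which it is the better-fitting line.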
The coefficient of determination
The proportion of the variability in the response variable that is explained by the predictor variable, written r².
Hypothesis testing for β1 =
H0: β1 = 0
H1: β1 ≠ 0
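A common way to test these hypotheses is a t-test on the estimated slope. The sketch below computes only the test statistic, on made-up data, and assumes the usual model assumptions hold; the variable names are illustrative.

```python
import math

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b1 = sxy / sxx                # estimated slope
b0 = mean_y - b1 * mean_x     # estimated intercept

# Residual variance uses n - 2 degrees of freedom (two estimated parameters)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
residual_variance = sse / (n - 2)

se_b1 = math.sqrt(residual_variance / sxx)  # standard error of the slope
t_statistic = b1 / se_b1                    # tests H0: slope = 0

print(round(t_statistic, 4))  # 2.1213
```

The statistic is compared with a t distribution on n - 2 degrees of freedom. Here n - 2 = 3, and the two-sided 5% critical value is 3.182, so for this toy data H0 would not be rejected.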
The coefficient of determination (R²)
A measure of the proportion of variability in the response variable that is explained by the predictor variable x.
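As a minimal sketch, R² can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The data are made up, and the intercept and slope are taken as given (they are the least-squares estimates for this data).

```python
# Hypothetical data and its least-squares line
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

mean_y = sum(y) / len(y)

# Total variability in the response about its mean
sst = sum((yi - mean_y) ** 2 for yi in y)
# Variability left unexplained by the fitted line
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / sst
print(round(r_squared, 4))  # 0.6
```

Here 60% of the variability in y is explained by the linear relationship with x. In simple linear regression this value equals the square of the correlation coefficient r.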
curvilinear
contained by or consisting of a curved line or lines.
When to use a polynomial regression model?
You would use a polynomial regression model if the aim was to define the relationship between two continuous variables, as you would do for simple linear regression.
The difference is that the relationship between the response variable and the predictor variable is suspected to be curved, rather than being linear.
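A curved relationship can still be fitted by least squares, because the model remains linear in its coefficients. The sketch below fits a quadratic by solving the normal equations with a small hand-written solver. The data are noise-free and made up (generated from y = 1 + 2x + 3x²), and the function names are assumptions for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [b0, b1, ..., b_degree] via the normal equations."""
    size = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(xtx, xty)

# Hypothetical data lying exactly on y = 1 + 2x + 3x^2
x = [-2, -1, 0, 1, 2]
y = [9, 2, 1, 6, 17]

coefficients = fit_polynomial(x, y, degree=2)
print([round(c, 4) for c in coefficients])  # [1.0, 2.0, 3.0]
```

Solving the normal equations directly is fine for a small, well-scaled example like this one; statistical software generally uses more numerically stable methods.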
Process of fitting a linear regression model (5 steps):
- Graph the data to determine the type of relationship
- Fit an appropriate model
- Test the assumptions
- Check the fit of the model
- Draw conclusions
Polynomial function: parabola
A polynomial function whose highest power is a squared term (x²).
Polynomial function: cubic
A polynomial function whose highest power is a cubed term (x³).
B0 (beta zero) =
Value of y when x is equal to 0.
A residual is
The vertical distance between an observed point and the line of best fit (observed value minus fitted value).
A smaller sum of squares indicates what?
A better fitting line.
Best graph to use for 2 continuous variables?
Scatterplot
Simple linear regression model analysis has 3 major purposes.
- To describe the linear relationship between X and Y
- To determine how much of the variation in Y can be explained by the linear relationship.
- To predict new values of Y from given values of X.
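The three purposes listed above can be sketched in one short script: describe the relationship (intercept and slope), measure how much variation is explained (R²), and predict a new value. The data, the helper names and the new x value of 3.5 are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1): least-squares intercept and slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def predict(b0, b1, new_x):
    """Fitted value of Y at new_x."""
    return b0 + b1 * new_x

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# 1. Describe the linear relationship
b0, b1 = fit_line(x, y)

# 2. How much of the variation in Y is explained
mean_y = sum(y) / len(y)
sst = sum((yi - mean_y) ** 2 for yi in y)
sse = sum((yi - predict(b0, b1, xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / sst

# 3. Predict a new value of Y (3.5 lies inside the observed range of x)
predicted_y = predict(b0, b1, 3.5)

print(round(b0, 4), round(b1, 4))  # 2.2 0.6
print(round(r_squared, 4))         # 0.6
print(round(predicted_y, 4))       # 4.3
```

The new x value is deliberately kept inside the observed range, since predictions far outside the data rely on the linear relationship continuing to hold there.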
Coefficient of determination
A statistical measurement that examines how differences in one variable can be explained by differences in another when predicting the outcome of a given event.