Chapter 1 Flashcards
What is a functional relation?
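If X is the independent variable and Y is the dependent variable, a functional relation is expressed by a formula of the form Y = f(X): each value of X determines exactly one value of Y.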
What is Regression Analysis?
A statistical methodology that uses the relation between two or more quantitative variables so that a response (outcome) variable can be predicted from the other variable(s).
What is a Statistical Relation?
Unlike a functional relation, a statistical relation is not exact. For example, the book uses a scatter plot that shows a linear trend, but the points vary around the line instead of falling exactly on it. The relation can also be curvilinear (not linear).
What is a regression model?
A formal means of expressing the two essential ingredients of a statistical relation:
- A tendency of the response variable Y to vary with the predictor variable X in a systematic fashion
- A scattering of points around the curve of statistical relationship
A regression model embodies the two characteristics of a statistical relation by postulating that:
- There is a probability distribution of Y for each level of X
- The means of these probability distributions vary in some systematic fashion with X.
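As a rough illustration of these two postulates, the sketch below simulates a distribution of Y at each of three X levels and compares each sample mean with a straight-line mean function. The parameter values, the normal error distribution, and the helper name draw_y are assumptions made only for this example; the cards do not specify them.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 10.0, 2.0, 1.0


def draw_y(x, n):
    """Draw n responses from the assumed distribution of Y at level x."""
    return [BETA0 + BETA1 * x + random.gauss(0.0, SIGMA) for _ in range(n)]


levels = [1, 2, 3]
# One distribution of Y per level of X; record each sample mean.
sample_means = {x: statistics.mean(draw_y(x, 5000)) for x in levels}

for x in levels:
    # The sample mean should sit close to the mean function B_0 + B_1 * x.
    print(f"X = {x}: sample mean {sample_means[x]:.2f}, "
          f"mean function {BETA0 + BETA1 * x:.2f}")
```

The sample means move with X in a systematic (here linear) way, while individual draws scatter around them.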
What is a major consideration in the selection of Predictor Variables during the construction of regression models?
The extent to which a chosen variable contributes to reducing the remaining variation in Y after allowance is made for the contributions of other predictor variables that have tentatively been included in the regression model.
List some other considerations in selecting Predictor Variables.
- The importance of the variable as a causal agent in the process under analysis
- The degree to which observations on the variable can be obtained more accurately, quickly, or economically than on competing variables
- The degree to which the variable can be controlled
How is the scope of a model determined?
Either by the design of the investigation or by the range of data at hand.
What are the 3 major purposes of Regression Analysis?
- Description
- Control
- Prediction
T/F: No matter how strong the statistical relation between X and Y, a cause-and-effect pattern is implied by the regression model.
F: No matter how strong the statistical relation between X and Y, no cause-and-effect pattern is necessarily implied by the regression model.
Define the basic regression model:
Y_i = B_0 + B_1X_i + e_i
Note: Y_i is the value of the response variable in the i-th trial; B_0 and B_1 are parameters (beta); X_i is a known constant, namely the value of the predictor variable in the i-th trial; e_i is a random error term with mean E[e_i] = 0 and variance V[e_i] = (sigma)^2, and e_i and e_j are uncorrelated for i not equal to j.
Why is the model Y_i = B_0 + B_1X_i + e_i called a first-order model?
Because it is linear in the parameters and linear in the predictor variable. It is linear in the parameters because no parameter appears as an exponent or is multiplied or divided by another parameter, and linear in the predictor variable because X_i appears only to the first power. The model is also called simple because there is only one predictor variable.
What is E[Y_i] in the simple regression model?
E[Y_i] = E[B_0+B_1X_i+e_i] = B_0 + B_1X_i + E[e_i] = B_0 + B_1X_i
because E[e_i]=0
V[Y_i] =
(sigma)^2
because B_0 + B_1X_i is a constant, so V[Y_i] = V[e_i], and the error term e_i has constant variance (sigma)^2
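A small numerical check of the constant-variance result, offered as a sketch: the parameter values and the helper simulate_responses are hypothetical. The errors are deliberately drawn from a uniform distribution, since the variance result depends only on the mean and variance of e_i and not on normality.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 5.0, 3.0, 2.0
# Uniform(-a, a) has mean 0 and variance a^2 / 3, so this a gives variance SIGMA^2.
HALF_WIDTH = math.sqrt(3.0) * SIGMA


def simulate_responses(x, n):
    """Simulate n responses Y at a fixed level x with mean-zero uniform errors."""
    return [BETA0 + BETA1 * x + random.uniform(-HALF_WIDTH, HALF_WIDTH)
            for _ in range(n)]


# The variance of Y should be about SIGMA^2 at every level of X.
sample_variances = {x: statistics.variance(simulate_responses(x, 20000))
                    for x in (1, 10)}

for x, v in sample_variances.items():
    print(f"X = {x}: sample variance {v:.2f}, sigma^2 = {SIGMA ** 2:.2f}")
```

The two sample variances should be close to each other and to (sigma)^2 even though the means at X = 1 and X = 10 differ.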
T/F: Since the error terms, say e_i and e_j, in a regression model are assumed to be uncorrelated then Y_i and Y_j (any two responses) are also uncorrelated.
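T: Each Y_i differs from e_i only by the constant B_0 + B_1X_i, so the covariance of Y_i and Y_j equals the covariance of e_i and e_j, which is 0.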
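The claim can be checked roughly by simulation. The sketch below repeats a two-trial experiment many times and computes the sample correlation between the two responses. The parameter values and the helper pearson are hypothetical, and the errors are drawn independently, which is a stronger assumption than the uncorrelatedness the model requires.

```python
import random

random.seed(2)  # fixed seed so the illustration is repeatable

# Hypothetical parameter values and X levels for two trials i and j.
BETA0, BETA1, SIGMA = 5.0, 3.0, 2.0
X_I, X_J = 2.0, 7.0
N_REPS = 20000


def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


# Each replication draws a fresh, independent error for each trial.
y_i = [BETA0 + BETA1 * X_I + random.gauss(0.0, SIGMA) for _ in range(N_REPS)]
y_j = [BETA0 + BETA1 * X_J + random.gauss(0.0, SIGMA) for _ in range(N_REPS)]

r = pearson(y_i, y_j)
print(f"sample correlation between Y_i and Y_j: {r:.4f}")
```

The sample correlation should be near zero, even though the two responses have different means.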
What are the regression coefficients in a simple regression model?
B_0, B_1 are the parameters
What is B_1 in the simple regression model?
Slope of the regression line. It is the change in the mean of the probability distribution of Y per unit increase in X.
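To make the slope interpretation concrete, the sketch below evaluates a mean function at several X values and at X + 1. The parameter values and the function name mean_response are hypothetical.

```python
# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1 = 9.5, 2.1


def mean_response(x):
    """Mean of the distribution of Y at level x: E[Y] = B_0 + B_1 * x."""
    return BETA0 + BETA1 * x


# Change in the mean response for a one-unit increase in X, at several starting points.
changes = [mean_response(x + 1) - mean_response(x) for x in (0, 10, 45)]
print([round(c, 6) for c in changes])
```

Every entry equals B_1 (up to floating-point rounding), whatever the starting value of X.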
What is the parameter B_0 in the simple regression model?
Y intercept of the regression line.
What happens when the scope of the simple regression model includes X = 0?
B_0 gives the mean of the probability distribution of Y at X = 0.
What does B_0 mean when the scope of the model does not cover X = 0?
B_0 does not have any particular meaning as a separate term in the regression model.
How are observational data obtained?
Observational data are obtained from nonexperimental studies. These studies do not control the explanatory or predictor variable(s) of interest.
What is one major limitation of observational data?
They often do not provide adequate information about cause and effect relationships.
When control over the explanatory variable(s) is exercised through random assignments the resulting experimental data provide much stronger information about cause and effect relationships than observational data. Why?
Randomization tends to balance out the effects of any other variables that might affect the response variable.
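The balancing effect can be sketched with a simulation. Below, each unit carries a lurking variable (a hypothetical covariate with made-up mean and spread), and the gap in its average between treated and control groups is compared under random assignment and under a made-up self-selection rule. All names and numbers are assumptions for illustration only.

```python
import random
import statistics

random.seed(3)  # fixed seed so the illustration is repeatable

N_UNITS = 2000
# Hypothetical lurking variable, one value per experimental unit.
lurking = [random.gauss(50.0, 10.0) for _ in range(N_UNITS)]


def mean_gap(treated_flags):
    """Difference in the lurking variable's mean, treated minus control."""
    treated = [v for v, t in zip(lurking, treated_flags) if t]
    control = [v for v, t in zip(lurking, treated_flags) if not t]
    return statistics.mean(treated) - statistics.mean(control)


# Experimental data: half the units are assigned to treatment at random.
flags = [True] * (N_UNITS // 2) + [False] * (N_UNITS // 2)
random.shuffle(flags)
randomized_gap = mean_gap(flags)

# Observational-style data: units with a high lurking value select the treatment.
self_selected_gap = mean_gap([v > 50.0 for v in lurking])

print(f"gap under randomization:  {randomized_gap:.2f}")
print(f"gap under self-selection: {self_selected_gap:.2f}")
```

Under random assignment the two groups look alike on the lurking variable, so it cannot easily masquerade as a treatment effect; under self-selection the groups differ sharply on it.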
T/F: Control over the explanatory variable(s) consists of assigning a treatment to each of the experimental units by means of randomization.
T