Hypothesis testing 3 Flashcards

1
Q

How do you fit a regression line to data by the least squares method?

A

Equation of line: Y = a + bX, where:
a = Y-axis intercept
b = slope of line
Conditions of the ‘least squares method’:

- The line passes through the centre of the cluster of points.
- The sum of the distances, d, of the points from the fitted line must be zero (i.e. ∑d = 0), where d = y1 - yL:
  - y1 = actual Y value of any datum point
  - yL = corresponding value of Y on the fitted line
- The sum of the squares of the distances (∑d^2) must be as small as possible.

Slope of line (regression coefficient):
b = {∑xy - (∑x)(∑y)/n} / {∑x^2 - (∑x)^2/n}
Intercept:
a = (∑y)/n - b(∑x)/n
where:
n = number of pairs of observations
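
The slope formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `least_squares_fit` and the sample data are made up for the example, and the intercept is taken as mean(Y) - b * mean(X), which forces the line through the centre of the points.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the line Y = a + bX using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # b = {sum(xy) - sum(x)sum(y)/n} / {sum(x^2) - (sum(x))^2/n}
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # The line passes through (mean X, mean Y), which fixes the intercept.
    a = sum_y / n - b * sum_x / n
    return a, b


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_fit(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print("intercept a =", round(a, 4))
print("slope b =", round(b, 4))
print("sum of d =", round(sum(residuals), 10))
```

The printed sum of the distances d comes out at zero (up to rounding), matching the ∑d = 0 condition.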

2
Q

What are the assumptions of linear regression?

A

Three assumptions:

For each X there is a normal distribution of Y from which the sample
values of Y are drawn at random.

The normal distribution of Y corresponding to a specific X value has a
mean that also lies on a straight line termed the population regression
line.

Deviations of the points from the fitted line are normally distributed
with zero mean and constant variance.

3
Q

How do you test the goodness-of-fit to the line (coefficient of determination)?

A

r = correlation coefficient

r^2 = coefficient of determination
- proportion of the variance of Y attributable to the linear regression on X
- provides an estimate of the strength of the relationship

The regression line commonly accounts for only a small amount of the variation in Y, leaving much of the variation to be explained by other variables.
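
A rough sketch of how r and r^2 can be computed from the same sums used for the regression line. The function name `coefficient_of_determination` and the sample data are invented for illustration; the formula used is the usual r = Sxy / √(Sxx · Syy).

```python
import math


def coefficient_of_determination(xs, ys):
    """Return (r, r_squared) for paired observations xs, ys."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # Corrected sums of products and squares.
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n

    r = sxy / math.sqrt(sxx * syy)
    return r, r * r


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, r_squared = coefficient_of_determination(xs, ys)
print("r =", round(r, 4))
print("r^2 =", round(r_squared, 4))
```

For this data r^2 is 0.6, i.e. 60% of the variance of Y is attributable to the linear regression on X.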

4
Q

How do you check the goodness-of-fit to the line (analysis of variance)?

A

Determines the statistical significance of the line rather than the strength of the relationship between the two variables. It is therefore a test of the null hypothesis that any observed relationship occurred by chance.

The total variation (total sum of squares):
Yss = ∑y^2 - (∑y)^2/n where:
n = number of pairs of observations

Linear effect = {∑xy - (∑x)(∑y)/n}^2 / {∑x^2 - (∑x)^2/n}
- portion of the variation accounted for by the line (1 degree of freedom)

Error variation = Yss - Linear effect
- deviation from the line (n - 2 degrees of freedom)

The linear effect and error mean squares (each sum of squares divided by its degrees of freedom) are compared using the variance ratio (F) test.
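
The analysis-of-variance arithmetic can be sketched as below. The function name `regression_anova` and the sample data are hypothetical, and the degrees of freedom (1 for the line, n - 2 for error) are the standard ones for a simple linear regression. The P value would still be read from F tables; this sketch stops at the variance ratio.

```python
def regression_anova(xs, ys):
    """Return a dict with the sums of squares, mean squares and F ratio."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n

    total_ss = sum(y * y for y in ys) - sum_y ** 2 / n   # Yss
    linear_ss = sxy ** 2 / sxx                           # linear effect
    error_ss = total_ss - linear_ss                      # deviation from line

    linear_ms = linear_ss / 1          # 1 degree of freedom for the line
    error_ms = error_ss / (n - 2)      # n - 2 degrees of freedom for error
    return {
        "total_ss": total_ss,
        "linear_ss": linear_ss,
        "error_ss": error_ss,
        "error_ms": error_ms,
        "F": linear_ms / error_ms,
    }


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

table = regression_anova(xs, ys)
for name, value in table.items():
    print(name, "=", round(value, 4))
```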

5
Q

How do you check the goodness-of-fit to the line (t-test of the slope of the line)?

A

Ratio of slope to its standard error (t):
t = b / SE(b)
SE(b) = √[mean square error / {∑x^2 - (∑x)^2/n}]

where:
b = slope of the fitted line
mean square error is taken from the analysis of variance table
n = number of pairs of observations

P value obtained from t tables with n - 2 degrees of freedom
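
A minimal sketch of the slope t-test, under two stated assumptions: the standard error of the slope is taken in its usual textbook form, √(mean square error / {∑x^2 - (∑x)^2/n}), and the error mean square uses n - 2 degrees of freedom, as is standard for simple linear regression. The function name `slope_t_statistic` and the data are invented for the example.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, se_b, t, df) for the slope of the least-squares line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n

    b = sxy / sxx                                  # slope
    df = n - 2                                     # error degrees of freedom
    mean_square_error = (syy - sxy ** 2 / sxx) / df
    se_b = math.sqrt(mean_square_error / sxx)      # standard error of slope
    return b, se_b, b / se_b, df


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, se_b, t, df = slope_t_statistic(xs, ys)
print("b =", round(b, 4), " SE(b) =", round(se_b, 4))
print("t =", round(t, 4), "with", df, "degrees of freedom")
```

For a single predictor, t^2 equals the F ratio from the analysis of variance on the same data, so the two tests lead to the same conclusion.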

6
Q

What is chi square distribution?

A

A variable has a chi-square distribution if its distribution has the shape of a special type of right-skewed curve, called a 𝜒^2-curve.

Properties of 𝜒^2-curves:
- The total area under a 𝜒^2-curve equals 1.
- A 𝜒^2-curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis.
- A 𝜒^2-curve is right skewed.

As the number of degrees of freedom becomes larger, 𝜒^2-curves look increasingly like normal curves.
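
One way to see the curves becoming more normal-looking is through the skewness, which for a chi-square distribution with k degrees of freedom is √(8/k) (a normal curve has skewness 0). The snippet below is only an illustration of that trend; the function name `chi2_skewness` is made up for the example.

```python
import math


def chi2_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom."""
    return math.sqrt(8 / df)


degrees_of_freedom = [1, 2, 8, 32, 128]
skews = [chi2_skewness(df) for df in degrees_of_freedom]

# Skewness shrinks towards 0 as the degrees of freedom grow.
for df, skew in zip(degrees_of_freedom, skews):
    print("df =", df, " skewness =", round(skew, 4))
```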

7
Q

What is the equation for the chi-square test statistic?

A

𝜒^2 = ∑ (fo - fe)^2 / fe
where:
fo = observed frequencies
fe = expected frequencies
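
A short sketch of the calculation, using invented frequencies. The function name `chi_square_statistic` is hypothetical, and the degrees of freedom shown (number of categories minus 1) assume a goodness-of-fit test in which no parameters are estimated from the data.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical frequencies: 60 observations expected to split evenly
# across three categories.
observed = [10, 20, 30]
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # assumption: no parameters estimated from the data

print("chi-square =", chi2, "with", df, "degrees of freedom")
```

The statistic is then compared with the 𝜒^2 tables for the given degrees of freedom; a large value means the observed frequencies depart strongly from the expected ones.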

8
Q

Chi-square goodness-of-fit test: interpretation (look at card)

A

See slides 11-13 of lecture 9.
