skills test 3 Flashcards

1
Q

what is the p value

A

the probability of getting a result at least as extreme as the observed data, assuming the null hypothesis is true

2
Q

with what p values do we reject or not reject the null hypothesis

A

if p is larger than alpha - do not reject

if p is less than alpha - reject

if p is low, the null must go
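
The decision rule above can be sketched in a few lines of Python. The function name `decide` and the default alpha of 0.05 are assumptions made for illustration, not something the course specifies.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # "If p is low, the null must go."
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.03))  # p below alpha
print(decide(0.20))  # p above alpha
```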

3
Q

how do we work out p values

A

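for a sample mean, T.DIST (lower tail), T.DIST.RT (upper tail) or T.DIST.2T (two tails)

for a sample proportion, NORM.S.DIST gives the area to the left of z; use the complement rule (1 - area) for an upper tail, and multiply the tail area by 2 for a two-tailed test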
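
As a rough cross-check outside Excel, the NORM.S.DIST manipulation for a proportion test might look like this in Python. The helper name `z_p_value` and its `tail` argument are hypothetical; `statistics.NormalDist().cdf(z)` returns the same left-tail area as NORM.S.DIST(z, TRUE).

```python
from statistics import NormalDist


def z_p_value(z: float, tail: str = "two") -> float:
    """p-value for a z test statistic; tail is 'lower', 'upper' or 'two'."""
    lower_area = NormalDist().cdf(z)  # area to the left of z
    if tail == "lower":
        return lower_area
    if tail == "upper":
        return 1.0 - lower_area  # complement rule
    if tail == "two":
        # double the smaller of the two tail areas
        return 2.0 * min(lower_area, 1.0 - lower_area)
    raise ValueError("tail must be 'lower', 'upper' or 'two'")


print(z_p_value(1.96, "two"))
```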

4
Q

what is a type 1 error

A

when a true null hypothesis is rejected

5
Q

what is a type 2 error

A

when a false null hypothesis is not rejected

6
Q

why are there errors in hypothesis testing

A

we never have the whole population, so we never know the true population parameter; we can never rule out having drawn a particularly unusual sample

7
Q

what is an independent sample

A

This is where the selection of individuals who make up one sample is independent of the selection of individuals in the other sample.

Eg: an economist wishes to determine whether there is a difference in mean family income for households in 2 socio-economic groups.

Eg: a university wants to compare the mean NCEA results of applicants educated in rural high schools and urban high schools.

8
Q

what is a paired sample

A

Each observation in one sample can be matched/paired in a meaningful way with a particular observation in the other sample. These are things such as “repeated measurements” or before and after samples.

Eg: Nike wants to see if there is a difference in durability of 2 shoe sole materials. One type is placed on one shoe, the other type on the other shoe of the same pair. In this scenario the same person wears one shoe with each material.

Eg: An analyst for Educational Testing Service wants to compare the mean test scores of students before and after taking a review course. Each student in the sample is measured twice.

9
Q

The sampling distribution of the difference in sample means can be treated as normal if:

A
  • The original populations are both normal, or
  • If either of the populations is not normally distributed, the sample from it is large enough (n greater than or equal to 30) for the Central Limit Theorem to apply
10
Q

for stat 101, D0 is always?

A

0

11
Q

we also use the large sample hypothesis test for?

A

small sample tests where the variances of those two populations are unequal

12
Q

which degrees of freedom do we use when there are two samples

A

the smaller of (n1-1) and (n2-1)

13
Q

what test do we use to compare variability

A

F test

14
Q

the F distribution is always __ skewed?

A

right

15
Q

the test statistic for an F distribution is given by?

A

larger sample variance/smaller sample variance

16
Q

how do we get df1 and df2 for an F distribution

A

df1 = sample size of the numerator (larger variance) sample - 1

df2 = sample size of the denominator (smaller variance) sample - 1
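
A minimal sketch of the F test statistic and its two degrees of freedom, assuming the raw data for both samples is available. The function name `f_statistic` and the sample values are made up for illustration; the p-value or critical value would still come from F.DIST.RT or F.INV.RT.

```python
from statistics import variance


def f_statistic(sample_a, sample_b):
    """Return (F, df1, df2) with the larger sample variance in the numerator."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        top_var, top_n, bottom_var, bottom_n = var_a, len(sample_a), var_b, len(sample_b)
    else:
        top_var, top_n, bottom_var, bottom_n = var_b, len(sample_b), var_a, len(sample_a)
    # df1 belongs to the numerator sample, df2 to the denominator sample
    return top_var / bottom_var, top_n - 1, bottom_n - 1


a = [10, 12, 14, 16, 18]  # illustrative data only
b = [11, 12, 13, 14]
print(f_statistic(a, b))
```

Because the larger variance always goes on top, F is at least 1 and the same result comes back whichever order the samples are passed in.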

17
Q

what excel function do we use for the f test using the critical value method

A

F.INV.RT, divide alpha by 2 for a 2 tailed test

18
Q

what excel function do we use for the f test using the p value method

A

F.DIST.RT, multiply this by 2 for a 2 tailed test

19
Q

what are the requirements for the F test

A

both populations MUST be normally distributed (no CLT here)

20
Q

for paired samples we do a ___ test?

A

t

21
Q

for a paired t test we use excel to find the differences (which we need to define) and their ___________ to do the calculations

A

average and standard deviation

22
Q

what is degrees of freedom for a paired t test

A

number of differences - 1
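
The paired t calculation can be sketched as follows, assuming the difference is defined as after minus before and the hypothesised mean difference is 0. The function name `paired_t` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for H0: mean difference = 0, difference = after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = mean(diffs)   # average of the differences
    s_d = stdev(diffs)    # sample standard deviation of the differences
    t = d_bar / (s_d / sqrt(n))
    return t, n - 1       # df = number of differences - 1


before = [70, 65, 80, 75]  # illustrative scores only
after = [74, 66, 85, 77]
print(paired_t(before, after))
```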

23
Q

what are the requirements of the paired t test

A
  • the population of differences is normally distributed, or
  • the number of differences is large enough for the CLT to apply

24
Q

for differences in population proportions we do a ___ test

A

independent samples z test

25
Q

what are the requirements for an independent samples z test for proportions

A
  • random and independent samples
  • conditions for binomial satisfied
  • normal approximation to the binomial can be used, ie. n1*phat1, n1*(1 - phat1), n2*phat2 and n2*(1 - phat2) are all greater than or equal to 5
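
A hedged sketch of the normal-approximation check and the z statistic for two proportions. The names `normal_approx_ok` and `two_proportion_z` are hypothetical, and the pooled proportion in the standard error is an assumption (a common choice when the hypothesised difference is 0) that the cards do not state.

```python
from math import sqrt


def normal_approx_ok(n1, p_hat1, n2, p_hat2, minimum=5):
    """Check the four expected counts against the minimum of 5."""
    counts = [n1 * p_hat1, n1 * (1 - p_hat1), n2 * p_hat2, n2 * (1 - p_hat2)]
    return all(c >= minimum for c in counts)


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, using a pooled proportion (assumption)."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / se


print(normal_approx_ok(100, 0.6, 100, 0.5))
print(two_proportion_z(60, 100, 50, 100))
```
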
26
Q

what are the degrees of freedom for a small samples hypothesis test for the difference in independent population means assuming unequal population variances

A

smaller of (n1-1) and (n2-1)
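
For illustration, the unequal-variances test statistic with this conservative degrees-of-freedom rule might be computed like so. The function name `unequal_variances_t` and the data are made up; `d0` defaults to 0 to match the cards.

```python
from math import sqrt
from statistics import mean, variance


def unequal_variances_t(sample1, sample2, d0=0.0):
    """Return (t, df) without pooling the two sample variances."""
    n1, n2 = len(sample1), len(sample2)
    se = sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    t = (mean(sample1) - mean(sample2) - d0) / se
    df = min(n1 - 1, n2 - 1)  # smaller of (n1-1) and (n2-1)
    return t, df


s1 = [10, 12, 14, 16, 18]  # illustrative data only
s2 = [11, 12, 13, 14]
print(unequal_variances_t(s1, s2))
```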

27
Q

what are the requirements for the difference in independent population means assuming unequal population variances t test

A
  • random and independent samples
  • if a sample is small then the population from which that sample is drawn is normally distributed
  • the unknown population variances are not assumed to be equal
28
Q

if there is any doubt about whether the variances are equal, which test should you use?

A

the unequal variances formulas/test

29
Q

we can run tests through which feature in excel

A

data analysis

30
Q

what does k represent for a chi squared test

A

the number of categories or outcomes

31
Q

what are the null and alternative hypotheses for the one way chi squared test (goodness of fit test)

A

null - the underlying population proportions follow the model

alternative - they do not follow the specified model

32
Q

the chi squared distribution is what shape

A

right skewed

33
Q

the degrees of freedom for the chi squared goodness of fit test is?

A

k - 1

34
Q

what are the requirements for the chi squared goodness of fit (one way) test

A
  • The data is from a random sample of independent observations
  • For each category the expected count is at least 5
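
A small sketch of the goodness of fit calculation, assuming the usual statistic (the sum of (O - E)^2 / E over the k categories). The function name `goodness_of_fit` and the counts are hypothetical.

```python
def goodness_of_fit(observed, model_proportions):
    """Return (chi_squared, df, expected) for a one-way chi-squared test."""
    if len(observed) != len(model_proportions):
        raise ValueError("need one model proportion per category")
    total = sum(observed)
    expected = [total * p for p in model_proportions]
    if any(e < 5 for e in expected):
        raise ValueError("every expected count should be at least 5")
    chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_squared, len(observed) - 1, expected  # df = k - 1


observed = [18, 22, 20, 40]       # illustrative counts only
model = [0.25, 0.25, 0.25, 0.25]  # model says all categories equally likely
print(goodness_of_fit(observed, model))
```
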
35
Q

what are the null and alternative hypotheses for the chi squared test for independence (two way)

A

null - the two variables are independent

alternative - the two variables are not independent

36
Q

what is the degrees of freedom for the chi squared independence test

A

(r - 1) x (c - 1) where r is the number of rows and c is the number of columns

37
Q

the expected value for a cell for the chi squared test for independence is given by?

A

(row total x column total) / overall total - use this as E in the equation
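
The expected-count rule can be sketched for a whole table at once. The function names and the 2 x 2 table are made up for illustration, and the statistic is assumed to be the usual sum of (O - E)^2 / E over all cells.

```python
def expected_counts(table):
    """Expected count per cell: row total x column total / overall total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]


def independence_statistic(table):
    """Return (chi_squared, df) for a two-way table of observed counts."""
    expected = expected_counts(table)
    chi_squared = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1) x (c - 1)
    return chi_squared, df


table = [[30, 20], [20, 30]]  # illustrative observed counts only
print(expected_counts(table))
print(independence_statistic(table))
```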

38
Q

what simple formula can we use in excel to get a P value for a chi squared test

A

CHISQ.TEST

39
Q

r squared is given by

A

explained sample variation/total sample variation
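
To make the ratio concrete, here is a rough sketch that fits a least squares line and divides explained variation by total variation. The function name `r_squared` and the data are hypothetical; in the course these values would be read from the regression output.

```python
from statistics import mean


def r_squared(x, y):
    """Explained variation / total variation for a simple least squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # sample slope
    b0 = y_bar - b1 * x_bar   # sample intercept
    fitted = [b0 + b1 * xi for xi in x]
    explained = sum((f - y_bar) ** 2 for f in fitted)
    total = sum((yi - y_bar) ** 2 for yi in y)
    return explained / total


x = [1, 2, 3, 4, 5]  # illustrative data only
y = [2, 4, 5, 4, 5]
print(r_squared(x, y))
```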

40
Q

zero slope means zero?

A

relationship

41
Q

zero slope means?

A

measuring the x variable will not help in predicting the y value

42
Q

the null hypothesis testing for the slope parameter beta 1 is?

A

that beta 1 = 0, aka the model is not useful for prediction because there is zero slope/relationship

43
Q

the alternative hypothesis testing for the slope parameter beta 1 is?

A

that beta 1 is not equal to 0, aka the model is useful for prediction because there is a relationship

44
Q

where do we get values for the tests for regression (test stats, p values, confidence intervals, degrees of freedom etc.)

A

the regression output

45
Q

what do we use the F test for

A

to assess the overall model - in simple linear regression (one x variable) this is the same as testing if the slope is 0

46
Q

what are the 4 linear regression assumptions for the residuals

A
  • mean of the distribution of the residuals = 0
  • distribution of the residuals is normal
  • distribution of the residuals has constant variance
  • residuals are independent