Inferences Flashcards

1
Q

What are the quantile values of the standard normal distribution that bound the central 95%?

A

+/- 1.96
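
A quick way to check this value in R, as a minimal sketch: `qnorm()` returns quantiles of the normal distribution, and the probabilities 0.025 and 0.975 are the tail cut-offs assumed here for a central 95% region.

```r
# Quantiles of the standard normal that leave 2.5% in each tail
z <- qnorm(c(0.025, 0.975))
round(z, 2)
```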

2
Q

What is the purpose of the statistic?

A

To estimate the unknown population parameter μ using the sample mean x̄

3
Q

What does the standard error do?

A
  • tells us the typical size of the estimation error (x̄ - μ)
4
Q

How do we know if our estimation is accurate?

A
  • no bias (estimates are centred on the true value)
  • precision (estimates vary little from sample to sample)
5
Q

Why is s (sample SD) used?

A

σ (the population SD) is unknown and can't be computed
only one sample is available, so s is used to estimate σ

6
Q

What is a confidence interval?

A
  • Range of plausible values for the parameter
  • more likely to capture the true value than a single point estimate
7
Q

How is a confidence interval created?

A

By deciding on a confidence level, then taking estimate ± (critical value × SE)

8
Q

What is a confidence level?

A

A value between 0 and 1 (e.g. 0.95), specified by the researcher
- larger confidence level = wider confidence interval

9
Q

Why do we not use ±1.96 when estimating?

A
  • σ's true value is unknown and estimated by s -> the reference distribution is no longer normal
  • meaning more probability in the tails because of the extra uncertainty
10
Q

Why is a T-distribution used?

A
  • higher probability in the tails due to extra uncertainty of not knowing sigma.
  • similar in shape to the standard normal distribution
11
Q

For a t-distribution the confidence interval depends on…

A

±t*, the critical value from the t(n-1) distribution
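
As a rough sketch of how t* enters a confidence interval, the interval for a mean can be built by hand as x̄ ± t* × SE. The sample `x` below is hypothetical (made-up values), and the names `tstar`, `se` and `ci` are chosen only for illustration.

```r
# Hypothetical sample, values made up for illustration
x <- c(21.3, 19.8, 22.1, 20.5, 18.9, 23.0, 20.2, 21.7)
n <- length(x)
xbar <- mean(x)
se <- sd(x) / sqrt(n)            # estimated standard error of the mean
tstar <- qt(0.975, df = n - 1)   # critical value for 95% confidence
ci <- xbar + c(-1, 1) * tstar * se
ci
```

The result should agree with the interval reported by `t.test(x)$conf.int`.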

12
Q

What does a confidence level of 95% mean?

A

Over repeated sampling, 95% of the intervals built this way will contain the true parameter value
5% won't contain the true parameter value

13
Q

What are research hypotheses used for?

A
  • testing a claim about a population parameter
14
Q

What is the null hypothesis?

A
  • Skeptical claim that nothing has changed
  • Assumed to be true before experiment
  • No change, no effect or relationship
15
Q

What is the alternative hypothesis?

A
  • The claim we need to find evidence for
  • to accept H1, we need convincing evidence against H0
16
Q

What does it mean if H0 is true?

A
  • it would be very unlikely for a random sample to give a value of the statistic that supports H1
17
Q

When is the result a fluke rather than H1 being true?

A

If a value of the statistic like the one observed is very likely to occur when H0 is true

18
Q

What is a null distribution?

A

Distribution of the t-statistic assuming H0 to be true.
The values expected to be seen when H0 is true

19
Q

What forms the sampling distribution?

A
  • the possible values of sample means and their probabilities as they vary sample to sample
20
Q

What is the t-statistic?

A

The sample mean standardised using the SE: t = (x̄ - μ0) / SE

21
Q

What does the t-statistic measure?

A
  • How many standard errors away x̄ is from μ0
  • compares difference between sample and hypothesised mean to expected variation in means due to random sampling.
22
Q

What are you likely to see if μ is truly μ0?

A

A t-score roughly between -2 and +2

23
Q

What is the difference between tobs and tstar?

A

tobs = the observed t-value, t-statistic
tstar = the critical value t* from the t(n-1) distribution, used to build the confidence interval

24
Q

How does the presumption of innocence relate to hypothesis testing?

A
  • We always assume null hypothesis to be true before experiment
  • to conclude that H1 (the alternative) is true, we must collect clear evidence against the null, enough to reject it beyond reasonable doubt
25
What is the test of significance?
- procedure for testing claim about population parameter
26
What is the process of the test of significance?
- weigh the evidence against the null hypothesis
- evidence in statistics corresponds to a numerical summary of the sample data
- the evidence must be beyond reasonable doubt
27
What is a p-value?
- measures the strength of evidence against the null hypothesis
- uses probability to say how strong the evidence is
28
What does the P-value show?
- the probability, assuming H0 is true, of obtaining a value of the t-statistic at least as extreme as that observed
29
How is the direction of the extreme for p-value determined?
- by the direction specified by H1:
- H1 with >: find the probability of t-scores larger than observed
- H1 with <: find the probability of t-scores smaller than observed
- H1 with ≠: use both tails
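
The three directions can be sketched with `pt()`, the t-distribution's cumulative probability function. The observed statistic `tobs` and sample size `n` below are hypothetical values chosen only for illustration.

```r
# Hypothetical values for illustration only
tobs <- 2.1   # observed t-statistic
n <- 25       # sample size

p_greater <- pt(tobs, df = n - 1, lower.tail = FALSE)          # H1: mu > mu0
p_less    <- pt(tobs, df = n - 1)                              # H1: mu < mu0
p_two     <- 2 * pt(abs(tobs), df = n - 1, lower.tail = FALSE) # H1: mu != mu0
c(greater = p_greater, less = p_less, two_sided = p_two)
```

The upper-tail and lower-tail probabilities sum to 1, and for a positive `tobs` the two-sided p-value is twice the upper-tail one.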
30
When does the p-value provide evidence against H0? Why?
- the smaller the p-value, the stronger the evidence against H0
- the observed result would be unlikely to occur if H0 were true
31
When does the p-value fail to provide evidence against H0? Why?
- large p-values fail to provide evidence against H0
- the observed result is very likely to occur when H0 is true
32
What is the significance level for?
It sets the standard of evidence required against H0
33
What does it mean when 'p-value > 0.1'?
- little to no evidence against H0
34
What does it mean when 'p-value < 0.01'?
very strong evidence against H0
35
What does it mean when the p-value < α?
data statistically significant at level α -> reject H0 in favour of H1
36
What does it mean when p-value > α?
data not statistically significant at level α -> don't reject H0
37
What are the 2 goals of collecting sample data?
- estimation: estimate the population parameter
- hypothesis testing: test whether a hypothesised parameter value is plausible
38
What is used in estimation?
- average of observations -> x̄ estimate
- SE measures the precision of the estimate
- confidence interval -> range of plausible values for the population mean
39
What happens when the confidence level is increased?
The confidence interval is wider
40
How is the goal of hypothesis testing achieved?
- computing the test statistic, measuring the distance between sample data and hypothesised null
41
What is the null distribution of the t-statistic?
The t(n-1) distribution
42
When H0 is true, the t-statistic follows... Why?
- the t(n-1) distribution
- the sample mean will fluctuate around the hypothesised mean if H0 is true, so the t-statistic fluctuates around 0
- the null distribution shows the possible distances between x̄ and μ0 when H0 is true
43
When is H0 doubted?
when the observed sample gives a t-statistic that would be unlikely if H0 were true
44
How is the t-statistic computed in R?
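xbar <- mean(data_sample$x)   # sample mean
n <- nrow(data_sample)        # sample size
s <- sd(data_sample$x)        # sample standard deviation
se <- s / sqrt(n)             # standard error of the mean
mu0 <- 20                     # hypothesised mean under H0
tobs <- (xbar - mu0) / se
tobs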
45
How is the p-value for both sides computed?
pvalue <- 2*pt(abs(tobs), df = n-1, lower.tail = FALSE)
46
What does twice the area to the right of the observed t-statistic mean?
- the probability of observing a t-statistic at least as far from 0 as the observed statistic when H0 is true
47
How do we make a decision for the P-value?
If p-
48
What are the critical values?
The values (-t* and +t*) that cut off the rejection region, with area α, in the left tail, the right tail, or both.
49
When do we reject H0 using the critical value approach?
when t ≤ -t* or t ≥ +t*
50
What is the relationship of the Confidence interval and μ?
The confidence interval gives the range of values that we hope includes μ
51
What is the R code for two-tailed p-value?
2 * pt(abs(tobs), df = n-1, lower.tail = FALSE)
52
When do we reject H0 in p-value?
Reject H0 if p ≤ α
53
When do we fail to reject H0?
Do not reject H0 if p > α
54
What is the code for the upper and lower critical value?
qt(c(0.025, 0.975), df = n-1)
55
When do we fail to reject H0 using the critical value approach?
- when t is greater than -t* but smaller than +t*: -t* < t < +t*
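
The critical value rule and the p-value rule lead to the same decision in a two-sided test. The sketch below checks this for hypothetical values of `tobs`, `n` and `alpha`, none of which come from the deck.

```r
# Hypothetical values for illustration only
tobs <- 2.1
n <- 25
alpha <- 0.05

tcrit <- qt(c(alpha / 2, 1 - alpha / 2), df = n - 1)  # -t* and +t*
reject_by_crit <- tobs <= tcrit[1] || tobs >= tcrit[2]

pvalue <- 2 * pt(abs(tobs), df = n - 1, lower.tail = FALSE)
reject_by_p <- pvalue <= alpha

c(critical_value = reject_by_crit, p_value = reject_by_p)
```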
56
What does tobs mean?
value of t-statistic for observed sample
57
What does the 95% confidence interval tell us?
- range of values for parameter NOT rejected at 5% significance level in two-sided hypothesis test
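
This link between the interval and the test can be illustrated with `t.test()`. The sample `x` is hypothetical, and the two hypothesised means are picked only to fall inside and outside the interval.

```r
# Hypothetical sample, values made up for illustration
x <- c(21.3, 19.8, 22.1, 20.5, 18.9, 23.0, 20.2, 21.7)
ci <- t.test(x, conf.level = 0.95)$conf.int

# A hypothesised mean inside the interval is not rejected at the 5% level
p_inside <- t.test(x, mu = mean(ci))$p.value
# A hypothesised mean outside the interval is rejected at the 5% level
p_outside <- t.test(x, mu = ci[2] + 1)$p.value

c(inside = p_inside, outside = p_outside)
```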
58
What is the difference between statistical significance and importance?
- statistical significance = sample results unlikely to be observed from random sampling variation alone when H0 is true in the population
- importance = practical importance
59
Why do we need both the confidence interval and the hypothesis test?
- p-value provides the strength of evidence against H0 -> smaller p-value = stronger evidence
- confidence interval = magnitude of the population parameter
60
What is a type I error?
Rejecting a true null hypothesis
61
What is a type II error?
not rejecting a false null hypothesis
62
What is the worst error in hypothesis testing and why?
- Type I error
- the null hypothesis is true but is being rejected (only reject when the evidence is beyond reasonable doubt)
63
What does the type I error correspond to in significance testing?
Alpha (α): the probability of rejecting H0 when H0 is in fact true
64
In the type I error in significance testing what do the shaded areas mean?
- probability of rejecting H0 in the distribution where H0 is true
65
What does the type II error correspond to in significance testing?
- Beta
66
67
What is the β calculation?
β = P(NOT rejecting H0 | H0 is false)
68
What does the β calculation mean?
- the probability of not rejecting H0, computed under the alternative distribution (H0 is false), i.e. the probability that the t-statistic falls between the two critical values
69
What does the power calculation mean?
The probability of correctly rejecting a false null hypothesis, i.e. rejecting H0 when the null is false
70
What is the power calculation?
Power = 1 - β = P(rejecting H0|H0 is false)
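
A minimal sketch of this calculation, assuming the sampling distribution is approximately normal. The numbers (null mean 0, alternative mean 3, standard error 1.291, α = 0.05) are illustrative assumptions, and the names `crit`, `beta` and `power` are chosen for this example.

```r
# Assumed setup: null mean 0, alternative mean 3, standard error 1.291, alpha 0.05
se <- 1.291
mu_alt <- 3
crit <- qnorm(c(0.025, 0.975), mean = 0, sd = se)  # lower and upper critical values

# beta: probability of landing between the critical values under the alternative
beta <- pnorm(crit[2], mean = mu_alt, sd = se) - pnorm(crit[1], mean = mu_alt, sd = se)
power <- 1 - beta
c(beta = beta, power = power)
```

Because the alternative mean lies above the upper critical value here, more than half of the alternative distribution falls in the rejection region, so the power exceeds 0.5.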
71
What is the effect of increasing the sample size in significance testing?
- as n increases, the standard error σ/√n decreases -> the sampling distribution becomes narrower
- power increases and β decreases -> more probability to the right of the critical value
72
What is the effect of changing the alternative hypothesis in significance testing?
Increasing the distance between the alternative and the null hypothesis -> power increases, as more probability lies to the right of the critical value
73
What is the effect of increasing the σ in significance testing?
- larger spread in the sampling distribution -> less probability beyond the critical values and more in between
- power decreases and β increases -> the wider distribution puts more probability between the two critical values
74
What is the effect of decreasing σ in significance testing?
- probability beyond critical values increases -> power increased
75
What is the effect of increasing α from 0.05 to 0.1?
- critical values shift inwards; more probability in the tails
- power increases and β decreases
76
What else affects power?
- power increases as the sample size increases
- power increases when α is made larger
- power increases when the true value is further away from μ0 in H0
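
These three effects can be checked with `power.t.test()` from R's built-in stats package. The baseline settings (n = 20, a true difference `delta` of 0.5, sd = 1) are hypothetical and chosen only so the comparison is visible.

```r
# Baseline settings are hypothetical, chosen only for illustration
base <- power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = 0.05,
                     type = "one.sample")$power
more_n <- power.t.test(n = 40, delta = 0.5, sd = 1, sig.level = 0.05,
                       type = "one.sample")$power      # larger sample
more_alpha <- power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = 0.10,
                           type = "one.sample")$power  # larger alpha
more_delta <- power.t.test(n = 20, delta = 1.0, sd = 1, sig.level = 0.05,
                           type = "one.sample")$power  # true value further from mu0
round(c(base = base, more_n = more_n, more_alpha = more_alpha,
        more_delta = more_delta), 3)
```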
77
How is a Type I error minimised?
- by decreasing α, which makes it harder to reject H0 but increases the probability of making a type II error
78
How is a type II error decreased?
- choosing a larger α -> easier to reject H0
- this increases the chance of a type I error, as a true H0 may be rejected
79
What is effect size?
- magnitude of the difference between the true μ (estimated by x̄) and the hypothesised μ0
- shows how large the difference between x̄ and μ0 is, rather than only whether it is statistically significant
80
What does a bigger effect size mean?
The bigger the distance between μ and μ0, the bigger the power
81
How is the effect size measured?
Cohen's D
82
What does Cohen's D do?
- reports how many standard deviations lie between the two means
83
What is a small, medium and large effect size/Cohen's D?
- Small: about 0.20
- Medium: about 0.50
- Large: about 0.80 or more
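
For a single sample compared against a hypothesised mean, Cohen's d can be sketched as the difference in means divided by the sample standard deviation. The sample `x` and the value of `mu0` are hypothetical, and this one-sample form is an assumption about which version of d is intended.

```r
# Hypothetical sample and hypothesised mean, made up for illustration
x <- c(21.3, 19.8, 22.1, 20.5, 18.9, 23.0, 20.2, 21.7)
mu0 <- 20

# One-sample Cohen's d: difference in means measured in standard deviations
d <- (mean(x) - mu0) / sd(x)
d
```

Multiplying d by √n gives the one-sample t-statistic, which is why a fixed effect size becomes easier to detect as the sample grows.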
84
What is the R code for the critical values under a normal distribution?
qnorm(c(0.025, 0.975), mean = 0, sd = 1.291)
85
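What does this code mean? 1 - pnorm(2.53, mean = 3, sd = 1.291)
Power: the probability of exceeding the upper critical value 2.53 when the true mean is 3
86
What does this code mean? pnorm(2.53, mean = 3, sd = 1.291)
Probability of a type II error (β): the probability of falling below the upper critical value 2.53 when the true mean is 3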
87