Inferences Flashcards

1
Q

What are the quantile values for a normal distribution? (95%)

A

±1.96 (the 2.5% and 97.5% quantiles of the standard normal distribution)
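
As a quick check, these quantiles can be reproduced in R with qnorm(). This is a minimal sketch that assumes the standard normal (mean 0, SD 1).

```r
# 2.5% and 97.5% quantiles of the standard normal distribution
z <- qnorm(c(0.025, 0.975), mean = 0, sd = 1)
round(z, 2)  # -1.96 and 1.96
```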

2
Q

What is the purpose of the statistic?

A

Estimating the unknown population parameter μ using the sample mean x̄

3
Q

What does the standard error do?

A
  • tells us the typical size of the estimation error (x̄ - μ)
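
A minimal sketch of the standard error of the mean, s/√n, in R. The vector x is a hypothetical sample made up for illustration.

```r
# Hypothetical sample (illustrative values only)
x <- c(19, 22, 21, 24, 18, 23, 20, 25)

n <- length(x)     # sample size
s <- sd(x)         # sample standard deviation
se <- s / sqrt(n)  # standard error of the mean
se
```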
4
Q

How do we know if our estimation is accurate?

A
  • no bias
  • precision
5
Q

Why is s (sample SD) used?

A

σ is unknown and can't be computed
Only one sample is available

6
Q

What is a confidence interval?

A
  • Range of plausible values for the parameter
  • more likely to capture the true value than a single point estimate
7
Q

How is a confidence interval created?

A

Deciding on a confidence level

8
Q

What is a confidence level?

A

A value between 0 and 1 specified by the researcher
- larger confidence level = wider confidence interval

9
Q

Why do we not use ±1.96 when estimating?

A
  • σ's true value is unknown and estimated by s -> the reference distribution is no longer normal
  • meaning more probability in the tails because of the extra uncertainty
10
Q

Why is a T-distribution used?

A
  • higher probability in the tails due to the extra uncertainty of not knowing σ
  • similar in shape to the standard normal distribution
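
The heavier tails show up as larger quantiles. The sketch below compares the 97.5% quantile of the t-distribution with that of the standard normal. The degrees of freedom are arbitrary illustrative choices.

```r
# 97.5% quantiles of t-distributions with increasing degrees of freedom
df <- c(5, 10, 30, 100)
t_star <- qt(0.975, df = df)

# 97.5% quantile of the standard normal, for comparison
z_star <- qnorm(0.975)

# t* is always larger than z* and shrinks towards it as df grows
round(rbind(df, t_star), 3)
round(z_star, 3)
```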
11
Q

For a t-distribution the confidence interval depends on…

A

±t*, the quantile of the t(n-1) distribution
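
A minimal sketch of a 95% confidence interval built as x̄ ± t* × SE. The sample x is hypothetical and the 95% level is an assumption for the example.

```r
# Hypothetical sample (illustrative values only)
x <- c(19, 22, 21, 24, 18, 23, 20, 25)

n <- length(x)
xbar <- mean(x)
se <- sd(x) / sqrt(n)

# t* is the 97.5% quantile of the t(n-1) distribution for a 95% interval
tstar <- qt(0.975, df = n - 1)

# Lower and upper limits of the interval
ci <- xbar + c(-1, 1) * tstar * se
ci
```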

12
Q

What happens if you have a confidence interval of 95%?

A

Over repeated samples, 95% of the intervals will contain the true parameter value
5% won't contain the true parameter value
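
This repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a normal population with made-up values (μ = 20, σ = 4) and samples of size 25. The exact proportion depends on the random seed.

```r
set.seed(1)

# Assumed population and design (illustrative values only)
mu <- 20
sigma <- 4
n <- 25
reps <- 2000

# For each simulated sample, build a 95% CI and record whether it contains mu
covered <- replicate(reps, {
  x <- rnorm(n, mean = mu, sd = sigma)
  ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)
  ci[1] <= mu && mu <= ci[2]
})

# Proportion of intervals that captured mu; should be close to 0.95
mean(covered)
```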

13
Q

What are research hypotheses used for?

A
  • testing a claim about population parameter
14
Q

What is the null hypothesis?

A
  • Skeptical claim that nothing has changed
  • Assumed to be true before experiment
  • No change, no effect or relationship
15
Q

What is the Alternative hypothesis?

A
  • The claim we need to find evidence for
  • to accept H1, we need convincing evidence against H0
16
Q

What does it mean if H0 is true?

A
  • it is very unlikely for a random sample to give a value of the statistic that supports H1
17
Q

When is the result a fluke rather than H1 being true?

A

If the observed value of the sample statistic is quite likely to occur when H0 is true

18
Q

What is a null distribution?

A

Distribution of the t-statistic assuming H0 to be true.
The values expected to be seen when H0 is true

19
Q

What forms the sampling distribution?

A
  • the possible values of sample means and their probabilities as they vary sample to sample
20
Q

What is the t-statistic?

A

The sample mean standardised using the SE: t = (x̄ - μ0) / SE

21
Q

What does the t-statistic measure?

A
  • How many standard errors away x̄ is from μ0
  • compares difference between sample and hypothesised mean to expected variation in means due to random sampling.
22
Q

What are you likely to see if μ is truly μ0?

A

t-score between -2 and +2

23
Q

What is the difference between tobs and tstar?

A

tobs = the observed value of the t-statistic
tstar = the quantile of the t(n-1) distribution used in the confidence interval

24
Q

How does the presumption of innocence relate to hypothesis testing?

A
  • We always assume the null hypothesis to be true before the experiment
  • to consider H1 (alternative) to be true, we must collect clear evidence against the null to reasonably reject it
25
Q

What is the test of significance?

A
  • procedure for testing claim about population parameter
26
Q

What is the process of the test of significance?

A
  • weigh the evidence against the null hypothesis
  • evidence in statistics corresponds to a numerical summary of the sample data
  • evidence must be beyond reasonable doubt
27
Q

What is a p-value?

A
  • measures the strength of evidence against null hypothesis
  • uses probability to say how strong the evidence is
28
Q

What does the P-value show?

A
  • the probability of obtaining a value of the t-statistic at least as
    extreme as that observed, assuming H0 is true
29
Q

How is the direction of the extreme for p-value determined?

A
  • by the direction specified in H1
    > : find the probability of t-scores larger than observed
    < : find the probability of t-scores smaller than observed
    ≠ : use both tails
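
The three cases map onto pt() calls as sketched below. The values of tobs and n are hypothetical.

```r
# Hypothetical observed t-statistic and sample size
tobs <- 2.1
n <- 15

# H1: mu > mu0 -> area to the right of tobs
p_greater <- pt(tobs, df = n - 1, lower.tail = FALSE)

# H1: mu < mu0 -> area to the left of tobs
p_less <- pt(tobs, df = n - 1, lower.tail = TRUE)

# H1: mu != mu0 -> both tails, twice the area beyond |tobs|
p_two <- 2 * pt(abs(tobs), df = n - 1, lower.tail = FALSE)

c(greater = p_greater, less = p_less, two_sided = p_two)
```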
30
Q

When is the p-value against H0?

Why?

A
  • the smaller the p-value, the stronger the evidence against H0
  • the observed result would be unlikely to occur if H0 were true
31
Q

When does the p-value fail to provide evidence against H0?

Why?

A
  • Large p-values fail to provide evidence against H0
  • the observed result is quite likely to occur if H0 is true
32
Q

What is the significance level for?

A

It sets the standard of evidence required against H0

33
Q

What does it mean when ‘p-value > 0.1’?

A
  • little to no evidence against H0
34
Q

What does it mean when ‘p-value < 0.01’

A

very strong evidence against H0

35
Q

What does it mean when the p-value < α?

A

data statistically significant at level α -> reject H0 & accept H1

36
Q

What does it mean when p-value > α?

A

data not statistically significant at level α -> don't reject H0

37
Q

What are the 2 goals of collecting sample data?

A
  • estimation: estimate population parameter
  • hypothesis testing: testing whether hypothesised parameter is plausible
38
Q

What is used in estimation?

A
  • average of observations -> x̄ estimate
  • SE measures precision of estimate
  • confidence interval -> range of plausible values for population mean
39
Q

What happens when the confidence level is increased?

A

The confidence interval is wider

40
Q

How is the goal of hypothesis testing achieved?

A
  • computing the test statistic, measuring the distance between sample data and hypothesised null
41
Q

The t-distribution is the null distribution of what?

A

The t-statistic

42
Q

when H0 true t-statistic follows…

Why?

A

t(n-1) distribution

if H0 is true, the sample mean fluctuates around the hypothesised mean, so the t-statistic fluctuates around 0

the null distribution shows possible distances between x̄ and μ0 when H0 true

43
Q

When is H0 doubted?

A

when observed sample gives t-statistic that’s unlikely

44
Q

How is the t-statistic computed in R?

A

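# Sample size, mean and standard deviation of the observed data
n <- nrow(data_sample)
xbar <- mean(data_sample$x)
s <- sd(data_sample$x)
# Standard error of the mean
se <- s/sqrt(n)
# Hypothesised mean under H0
mu0 <- 20
# Observed t-statistic
tobs <- (xbar - mu0) / se
tobs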

45
Q

How is the p-value for both sides computed?

A

pvalue <- 2*pt(abs(tobs), df = n-1, lower.tail = FALSE)
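
The manual calculation can be cross-checked against R's built-in t.test(). This is a sketch on a hypothetical sample x with an assumed hypothesised mean of 20.

```r
# Hypothetical sample and hypothesised mean (illustrative values only)
x <- c(19, 22, 21, 24, 18, 23, 20, 25)
mu0 <- 20

# Manual t-statistic and two-sided p-value
n <- length(x)
tobs <- (mean(x) - mu0) / (sd(x) / sqrt(n))
pvalue <- 2 * pt(abs(tobs), df = n - 1, lower.tail = FALSE)

# Built-in one-sample t-test (two-sided by default)
res <- t.test(x, mu = mu0)

# The two approaches should agree
c(manual_t = tobs, builtin_t = unname(res$statistic))
c(manual_p = pvalue, builtin_p = res$p.value)
```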

46
Q

What does twice the area to the right of the observed t-statistic represent?

A
  • Probability of observing a t-statistic at least the same distance from 0 as the observed statistic when H0 is true
47
Q

How do we make a decision for the P-value?

A

If p-

48
Q

What are the critical values?

A

Values that cut off the region of significance to the left, to the right, or both (-t* & +t*)

49
Q

When do we reject H0 using the critical values?

A

when t ≤ -t* or t ≥ +t*

50
Q

What is the relationship of the Confidence interval and μ?

A

The confidence interval gives a range of values that we hope includes μ

51
Q

What is the R code for two-tailed p-value?

A

2 * pt(abs(tobs), df = n-1, lower.tail = FALSE)

52
Q

When do we reject H0 using the p-value?

A

Reject H0 if p ≤ α

53
Q

When do we fail to reject H0?

A

Do not reject H0 if p > α

54
Q

What is the code for the upper and lower critical value?

A

qt(c(0.025, 0.975), df = n-1)

55
Q

When do we fail to reject H0 using the critical values?

A
  • When t is greater than -t* but smaller than +t*
    -t* < t < +t*
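
A minimal sketch of the critical-value decision rule for a two-sided test. The values of alpha, n and tobs are hypothetical.

```r
# Hypothetical significance level, sample size and observed t-statistic
alpha <- 0.05
n <- 8
tobs <- 1.732

# Upper critical value +t*; the lower one is -t* by symmetry
tstar <- qt(1 - alpha / 2, df = n - 1)

# Reject H0 when tobs falls at or beyond either critical value
reject <- abs(tobs) >= tstar
reject  # FALSE here: tobs lies between -t* and +t*
```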
56
Q

What does tobs mean?

A

value of t-statistic for observed sample

57
Q

What does the 95% confidence interval tell us?

A
  • range of values for parameter NOT rejected at 5% significance level in two-sided hypothesis test
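
This link between the interval and the test can be checked with t.test(). The sample x and the two hypothesised means are made up for illustration.

```r
# Hypothetical sample (illustrative values only)
x <- c(19, 22, 21, 24, 18, 23, 20, 25)
alpha <- 0.05

# 95% confidence interval for the mean
ci <- t.test(x, conf.level = 1 - alpha)$conf.int

# A hypothesised mean inside the interval is not rejected at the 5% level
p_inside <- t.test(x, mu = 20)$p.value

# A hypothesised mean outside the interval is rejected at the 5% level
p_outside <- t.test(x, mu = 25)$p.value

ci
c(p_inside = p_inside, p_outside = p_outside)
```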
58
Q

What is the difference between statistical significance and importance?

A
  • statistical significance = sample results unlikely to be observed from random sampling variation alone when H0 is true in the population
  • importance = practical importance
59
Q

Why do we need both the confidence interval and the hypothesis test?

A
  • p-value provides strength of evidence against H0 -> smaller p-value = stronger evidence
  • Confidence interval = magnitude of population parameter
60
Q

What is a type I error?

A

Rejecting a true null hypothesis

61
Q

What is a type II error?

A

not rejecting a false null hypothesis

62
Q

What is the worst error in hypothesis testing and why?

A
  • Type I error
  • Null hypothesis is true but is being rejected
    (Only reject when evidence = beyond reasonable doubt)
63
Q

What does the type I error correspond to in significance testing?

A

Alpha α
the probability of rejecting H0 when H0 is in fact true

64
Q

In the type I error in significance testing what do the shaded areas mean?

A
  • probability of rejecting H0 in the distribution where H0 is true
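
The type I error rate can be illustrated by simulating samples for which H0 really is true. The sketch assumes a normal population with made-up values (μ0 = 20, σ = 4, n = 20). The exact proportion depends on the random seed.

```r
set.seed(2)

# Assumed design (illustrative values only); H0 is true by construction
alpha <- 0.05
mu0 <- 20
n <- 20
reps <- 2000

# Record whether each simulated test rejects H0 at level alpha
rejected <- replicate(reps, {
  x <- rnorm(n, mean = mu0, sd = 4)
  t.test(x, mu = mu0)$p.value <= alpha
})

# Proportion of false rejections; should be close to alpha
mean(rejected)
```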
65
Q

What does the type II error correspond to in significance testing?

A
  • Beta
66
Q
A
67
Q

What is the β calculation?

A

β = probability of NOT rejecting H0 when H0 is false

68
Q

What does the β calculation mean?

A
  • Probability of not rejecting H0 under the alternative distribution, when H0 is false,
    i.e. the t-statistic falls between the two critical values
69
Q

What does the power calculation mean?

A

Probability of correctly rejecting false null hypothesis
Rejecting H0 when null is false

70
Q

What is the power calculation?

A

Power = 1 - β = P(rejecting H0|H0 is false)

71
Q

What is the effect of increasing the sample size in significance testing?

A

As n increases, the standard error σ/√n decreases -> distribution becomes narrower
Power increases and β decreases -> more probability to the right of the critical value

72
Q

What is the effect of changing the alternative hypothesis in significance testing?

A

Increasing the distance between the alternative and the null hypothesis -> increase in power, as more probability lies to the right of the critical value

73
Q

What is the effect of increasing the σ in significance testing?

A
  • Larger spread in the sampling distribution -> less probability beyond the critical values and more in between
  • power decreases and β increases -> wider distribution means more probability in between the two critical values
74
Q

What is the effect of decreasing σ in significance testing?

A
  • probability beyond critical values increases -> power increased
75
Q

What is the effect of increasing α from 0.05 to 0.1?

A

Critical values shift inwards; more probability in the tails
- power increases and β decreases

76
Q

What else affects power?

A
  • power increases as sample size increases
  • power increases when α is made larger
  • power increases when the true value is further away from μ0 in H0
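
These effects can be compared numerically. The sketch below uses a normal approximation for a two-sided test. power_z() is a hypothetical helper written for this example, and its default values (σ = 5, n = 15, true mean 3, μ0 = 0) are illustrative assumptions.

```r
# Hypothetical helper: two-sided power under a normal approximation
power_z <- function(n = 15, sigma = 5, mu1 = 3, mu0 = 0, alpha = 0.05) {
  se <- sigma / sqrt(n)
  # Critical values under H0
  crit <- qnorm(c(alpha / 2, 1 - alpha / 2), mean = mu0, sd = se)
  # Probability beyond either critical value when the true mean is mu1
  pnorm(crit[1], mean = mu1, sd = se) +
    pnorm(crit[2], mean = mu1, sd = se, lower.tail = FALSE)
}

base <- power_z()
bigger_n <- power_z(n = 60)            # larger sample size
bigger_alpha <- power_z(alpha = 0.10)  # larger significance level
bigger_effect <- power_z(mu1 = 5)      # true mean further from mu0
bigger_sigma <- power_z(sigma = 8)     # more spread in the population

round(c(base = base, n = bigger_n, alpha = bigger_alpha,
        effect = bigger_effect, sigma = bigger_sigma), 3)
```

Only the last change lowers power relative to the baseline.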
77
Q

How is a Type I error minimised?

A
  • decreasing α making it harder to reject H0 but increases probability of making type II error
78
Q

How is a type II error decreased?

A
  • choosing a larger α -> easier to reject H0
  • increases the chance of a type I error, as a true H0 may be rejected
79
Q

What is effect size?

A
  • Magnitude of the difference between the true μ (estimated by x̄) and the hypothesised μ0
  • describes the size of the difference between x̄ and μ0, as distinct from its statistical significance
80
Q

What does a bigger effect size mean?

A

the bigger the distance = bigger the power

81
Q

How is the effect size measured?

A

Cohen’s D

82
Q

What does Cohen’s D do?

A

- reports how many standard deviations lie between the two means
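
A minimal sketch of Cohen's d for a single sample, assuming the one-sample form d = (x̄ - μ0) / s. The sample x and μ0 are hypothetical.

```r
# Hypothetical sample and hypothesised mean (illustrative values only)
x <- c(19, 22, 21, 24, 18, 23, 20, 25)
mu0 <- 20

# Distance between sample mean and hypothesised mean, in SD units
d <- (mean(x) - mu0) / sd(x)
d
```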

83
Q

What is a small, medium and large effect size/Cohen’s D?

A

Small: around 0.20
Medium: around 0.50
Large: 0.80 or more

84
Q

What is the R code for the critical values under a normal null distribution?

A

qnorm(c(0.025, 0.975), mean = 0, sd = 1.291)

85
Q

What does this equation mean?
1 - pnorm(2.53, mean = 3, sd = 1.291)

A

Power: the probability, under the alternative (mean = 3), of falling above the upper critical value 2.53

86
Q

What does this code mean?
pnorm(2.53, mean = 3, sd = 1.291)

A

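Probability of a type II error (β): the probability, under the alternative (mean = 3), of falling below the upper critical value 2.53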
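
The numbers in the last three cards can be run together as a sketch. It assumes a null mean of 0, an alternative mean of 3 and a standard error of 1.291. The area under the alternative to the right of the upper critical value is the power. The area to its left is β, ignoring the negligible part below the lower critical value.

```r
# Values taken from the cards: standard error, null mean, alternative mean
se <- 1.291
mu0 <- 0
mu1 <- 3

# Critical values under the null distribution (about -2.53 and 2.53)
crit <- qnorm(c(0.025, 0.975), mean = mu0, sd = se)

# Area to the right of the upper critical value under the alternative
power_upper <- 1 - pnorm(crit[2], mean = mu1, sd = se)

# Area to the left of the upper critical value under the alternative
beta_approx <- pnorm(crit[2], mean = mu1, sd = se)

round(crit, 2)
c(power = power_upper, beta = beta_approx)
```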

87
Q
A