Week 5 Flashcards

1
Q

Point estimate

A
  • The sample mean is a point estimate
  • It represents a very precise statement
  • We are not sure how accurate it is, due to sampling error
2
Q

Confidence intervals

A
  • because sample means vary in a predictable way, we can estimate the likelihood of the population mean being within a certain range
  • to work this out, we need to have an idea of
  • > the centre of the distribution (the population mean)
  • > the spread of the distribution (which for confidence intervals is the standard error)
3
Q

Standard error

A

the sample standard deviation divided by the square root of the number of observations: σ(x̄) = s/√n
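A minimal sketch of this calculation in stdlib Python, using an invented sample of scores:

```python
import math
import statistics

# Hypothetical sample of n = 9 scores (illustrative values only)
scores = [4, 5, 6, 7, 8, 9, 10, 11, 12]

n = len(scores)
s = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(n)          # standard error of the mean
```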

4
Q

Representative samples

A
  • In a normally distributed population, 95% of scores are within 2 SD of the mean
  • In a sampling distribution of the mean, 95% of sample means are within 2 standard errors of the population mean
  • 68 / 95 / 99.7 rule
  • the sample must be representative
  • the sample must have at least 30 data points
5
Q

Single sample logic

A
  • what is the range of likely values of the population mean?
  • 95% of sample means are within 2 standard errors of the population mean
  • therefore there is a 95% chance that my sample mean is within 2 standard errors of the actual population mean
  • This means there is a 95% probability that the population mean is between two points:
  • > x̄ − 2σ(x̄) & x̄ + 2σ(x̄)
  • Of all the potential sample means we could get from a population, 95% will be in this range, and we call this the 95% confidence interval.
6
Q

95% CI

A
  • The 95% Confidence Interval (CI95) is a range of scores, centred on a sample mean, within which the population mean occurs 95 times out of 100.
  • On average the 95% CI does not include the population mean 5 times out of 100
7
Q

CI formulas

A

95% Confidence Interval = CI95 = x̄ ± 2σ(x̄)
• By the same logic:
• 68% Confidence Interval = CI68 = x̄ ± 1σ(x̄)
• 99.7% Confidence Interval = CI99.7 = x̄ ± 3σ(x̄)
• In general: CI(p) = x̄ ± z*σ(x̄)
• Where:
• p = the probability that the interval includes the population mean
• z* = “z critical” = the z score that borders the middle p% of scores in the standard normal distribution
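The general formula can be sketched in stdlib Python, using `NormalDist` to find z* for any confidence level; the sample scores are invented for illustration:

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample (illustrative values only)
scores = [4, 5, 6, 7, 8, 9, 10, 11, 12]
xbar = statistics.mean(scores)
se = statistics.stdev(scores) / math.sqrt(len(scores))

def ci(p):
    """CI(p) = xbar +/- z* . se, where z* borders the middle p of the z-distribution."""
    z_crit = NormalDist().inv_cdf(0.5 + p / 2)   # e.g. p = 0.95 gives z* of about 1.96
    return xbar - z_crit * se, xbar + z_crit * se

lo, hi = ci(0.95)
```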

8
Q

Margin of error

A

the z-critical score multiplied by the standard error: z*σ(x̄)

9
Q

Upper bound

A
  • mean plus the margin of error

- the highest value of the range

10
Q

Lower bound

A
  • mean minus the margin of error

- the lowest value of the range

11
Q

Reporting CIs

A
  • report as the mean ± the margin of error

- or as the lower boundary and upper boundary of the range

12
Q

CI influence on z scores

A
  • As the confidence level (C) increases, the precision of the estimate decreases (the interval gets wider)
  • As C increases, the value of z* increases
  • > the probability of capturing the population mean increases as the range widens
13
Q

CI influence on standard deviation

A
  • As variation in the population goes down, precision increases (the interval gets narrower)
  • the more similar people are, the better we can predict the range of the mean
14
Q

CI influence on n

A
  • As sample size increases, precision increases (the interval gets narrower)
  • a larger sample allows more accurate intervals
15
Q

T scores

A
  • With an infinite number of scores, the t-distribution is the z-distribution
  • as sample size decreases (fewer degrees of freedom), the distribution gets flatter and wider; to calculate t, you need to know your degrees of freedom
  • how extreme a t-value of 2.5 is depends on the degrees of freedom: with an n of 60 it is quite extreme (only a small area under the curve lies to the right of the value), but with an n of 5 we can be less confident of capturing the mean, so 2.5 is a less extreme estimate because a larger area is left over under the curve
16
Q

When SD is unknown for CI calc

A

CI95 = x̄ ± t*(s/√n)

- where s is the sample standard deviation and t* is the t-critical score for our CI
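A hedged sketch of the t-based interval in stdlib Python; the scores are invented, and t* = 2.776 is the two-tailed .05 critical value for df = 4 as read from a t-table:

```python
import math
import statistics

# Hypothetical small sample, n = 5, population SD unknown (illustrative values only)
scores = [10, 12, 13, 15, 20]
n = len(scores)
xbar = statistics.mean(scores)
s = statistics.stdev(scores)     # sample standard deviation

# t-critical from a t-table: two-tailed, alpha = .05, df = n - 1 = 4
t_crit = 2.776

margin = t_crit * s / math.sqrt(n)
ci95 = (xbar - margin, xbar + margin)
```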

17
Q

Value of CI

A
  • Sample mean is a precise point estimate of uncertain accuracy
  • Confidence intervals are a less precise interval estimate of definable accuracy
  • Width of CIs indicates variability & generalisability
  • Confidence intervals allow comparisons across studies
  • If confidence intervals substantially overlap, this supports the conclusion that both studies sampled the same population
  • If CIs don’t overlap, it is unlikely they sampled the same population.
18
Q

Distinct populations

A

• Distinguishing populations
-> Defined by some characteristic
-> Can measure behaviour and show they are different
• Any group that behaves differently along any dimension could be called a distinct population
• Example:
- Extroverts vs Introverts
- Males vs Females

19
Q

Hypotheses

A

• The null hypothesis (H0)
-> that there is no relationship between the two variables that we are investigating
• The alternate hypothesis (HA)
-> that there is a relationship between the two variables

20
Q

Hypothesis testing

A
  • Step 1: State Hypotheses (H0 & HA)
  • Step 2: Calculate an appropriate test statistic
  • Step 3: Determine the probability of H0
  • Step 4: Evaluate the probability of H0 & state your conclusion
21
Q

Step 1, state hypothesis

A

• State both H0 & HA
• HA can be one-tailed or two-tailed
-> One tailed predicts direction of effect
e.g., The climate is getting warmer
-> Two tailed just predicts an effect, doesn’t predict the direction of the effect
e.g., The climate is changing

22
Q

Step 2, calculate the test statistics

A
  • to calculate the test statistic, we need to work in z-scores
  • Z-scores indicate how far a score is from the mean in terms of standard deviation units
  • The z-test assesses how far any individual sample mean is from the population mean in terms of standard errors.
  • So you can get an obtained z-score by dividing the difference between the sample mean and the population mean by the standard error of the mean: z = (x̄ − μ) / σ(x̄)
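For illustration, the obtained z-score with assumed numbers (an IQ-style test where the population SD is known, so a z-test applies):

```python
import math

# Hypothetical numbers (assumed, not from the course)
mu = 100      # population mean
sigma = 15    # population standard deviation (known)
xbar = 106    # sample mean
n = 25        # sample size

se = sigma / math.sqrt(n)        # standard error of the mean
z_obtained = (xbar - mu) / se    # standard errors between sample mean and mu
```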
23
Q

Step 3, determine probability of null hypothesis (two tailed)

A
  • if we set our alpha value at 0.05 (95% confidence) in a two-tailed test, our z-critical value is going to be plus or minus 1.96 (the alpha is divided between the two tails rather than placed entirely in one)
  • anything beyond that value in either direction counts as significant
24
Q

Step 3, determine probability of null hypothesis (one tailed)

A
  • If we have a one-tailed test, our z-critical value is 1.64
  • but we have to predict the direction, so the obtained z-score must be either greater than 1.64 or less than minus 1.64 (depending on the predicted direction) to count as significant
  • If the obtained z-score for your distribution is more extreme than the z-critical score (in the predicted direction), this is reported as p (the probability) less than 0.05
25
Q

Step 4, evaluating probability of null hypothesis and concluding

A
  • if p is less than alpha, we can reject the null hypothesis; this means we can accept our alternative hypothesis
  • > so if the p value were .048 and the alpha 0.05, we reject the null
  • if p is more than alpha, we retain the null hypothesis
  • > for example, p = .058 (i.e. more than .05)
  • We must then state our conclusion, a statement that reflects the hypothesis, whether or not it was significant, and report our p-values.
  • We interpret and report the parameter and direction of the effect using descriptive statistics, such as the mean.
  • This type of hypothesis testing is referred to as the Z-test, and we can use it when we know the standard deviation of the population
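The decision step can be sketched by converting an obtained z-score (an assumed value of 2.0 here) to a two-tailed p and comparing it to alpha:

```python
from statistics import NormalDist

z_obtained = 2.0   # assumed obtained z-score for illustration
alpha = 0.05       # two-tailed significance criterion

# Two-tailed p: probability of a z at least this extreme in either direction
p = 2 * (1 - NormalDist().cdf(abs(z_obtained)))

if p < alpha:
    conclusion = "reject the null hypothesis"
else:
    conclusion = "retain the null hypothesis"
```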
26
Q

Z and t distributions (William Gosset)

A

-> both distributions (z & t):
• have a mean of zero
• are symmetrical
• are unimodal

-> the t distribution is:
• A flatter distribution
• has a larger SD
• depends on df, so the t-distribution is a family of distributions
- more participants in sample -> closer to z distribution

27
Q

Using a t-chart to test hypothesis without SD

A
  • decide whether our hypothesis is 1 or 2 tailed (with or without direction).
  • Set our alpha criteria, the probability we will accept as significant
  • calculate our degrees of freedom (n minus 1)
  • work out whether the t statistic that we calculate from our sample is greater than the value that is on the chart
  • > So in many ways, whether or not we know the population standard deviation, whether we are calculating a z or t statistic, the process is quite similar
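The steps above can be sketched for an invented sample; t* = 2.262 is the two-tailed .05 critical value for df = 9 as read from a t-chart:

```python
import math
import statistics

# Hypothetical one-sample test, population SD unknown (illustrative values only)
scores = [52, 55, 48, 60, 58, 54, 57, 49, 56, 51]
mu = 50                  # population mean under H0
n = len(scores)
df = n - 1               # degrees of freedom

xbar = statistics.mean(scores)
s = statistics.stdev(scores)
t_obtained = (xbar - mu) / (s / math.sqrt(n))

# t-critical from a t-chart: two-tailed, alpha = .05, df = 9
t_crit = 2.262

significant = abs(t_obtained) > t_crit
```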
28
Q

Determining which test to use

A
  • is the population SD known?
  • > yes: use the z-test
  • > no: use the t-test
29
Q

Errors in hypothesis outcomes

A
  • The problem starts if we reject the null when we actually shouldn’t.
  • Our sample tells us that the null is unlikely, but in fact we have drawn one of the small proportion of extreme samples, and our sample really is just the same as the population.
  • > Type I error: we say there is an effect where there is none - we think listening to Mozart makes babies smarter, but we have just accidentally sampled a smart group of babies.
  • > Type II error: we retain the null when we should not - we think listening to Mozart does not make babies smarter, when it actually does.
30
Q

Type 1 error

A
  • Saying the Null hypothesis is false when it is true
  • Seeing an effect when it is only sampling error
  • Probability of Type I error set by α
  • Like a jury finding an innocent person guilty
  • “I” for Illusion: seeing something that isn’t there
31
Q

Type 2 error

A
  • Saying the Null hypothesis is true when it is really false
  • Failing to find an effect when one exists
  • Happens if evidence is not enough, or is tainted
  • Like a jury concluding not guilty when the accused did commit the crime
  • Type II -> think B for blind: it’s there but you can’t see it
32
Q

Balance of type 1 and type 2 errors

A

• Which is worse? -> Depends what you are investigating:
- Does this treatment bring relief?
• Type I -> false hope
• Type II -> missed opportunity for relief
- Does this treatment have serious side effects?
• Type I -> needlessly cautious
• Type II -> exposing people to serious side effects
• Better measurement lowers both error types
• As reliability & validity increase, error reduces