Week 5 Flashcards by Melanie Powell

Point estimate

Sample mean is a point estimate
Represents a very precise statement
Not sure how accurate it is – due to sampling error

Confidence intervals

Because sample means vary in a predictable way, we can estimate the likelihood of the population mean being within a certain range
To work this out, we need to have an idea of
> the centre of the distribution
> the population mean
> the spread of the distribution (which for confidence intervals is the standard error)

Standard error

the standard deviation divided by the square root of the number of observations.
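
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original cards: the function name `standard_error` and the scores are hypothetical, and it assumes the sample standard deviation (n - 1 denominator) is the one being used.

```python
import math
import statistics

def standard_error(scores):
    """Standard error of the mean: SD divided by the square root of n."""
    n = len(scores)
    sd = statistics.stdev(scores)  # sample SD (assumption: n - 1 denominator)
    return sd / math.sqrt(n)

scores = [4, 8, 6, 5, 7]  # hypothetical data
se = standard_error(scores)
print(round(se, 3))  # 0.707
```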

Representative samples

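In a normally distributed population, about 95% of scores are within 2 SD of the mean
In a sampling distribution of the mean, about 95% of sample means are within 2 standard errors of the population mean
68 / 95 / 99.7 rule: about 68%, 95% and 99.7% of values fall within 1, 2 and 3 SD of the mean
The sample must be representative
The sample must contain at least 30 observations (a common rule of thumb)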

Single sample logic

What is the range of likely values of the population mean?
95% of sample means are within 2 standard errors of the population mean
Therefore there is a 95% chance that my sample mean is within 2 standard errors of the actual population mean
This means we can be 95% confident that the population mean is between two points
> x̄ − 2σ(x̄) & x̄ + 2σ(x̄), where σ(x̄) is the standard error
Because, of all the potential sample means we could get from a population, 95% will be in this range, we call this the 95% confidence interval.

95% CI

The 95% Confidence Interval (CI95) is a range of scores, centred on a sample mean, that contains the population mean 95 times out of 100 over repeated sampling.
On average, the 95% CI does not include the population mean 5 times out of 100

CI formulas

• 95% Confidence Interval = CI95 = x̄ ± 2σ(x̄) (2 is a rounded value; the exact z critical is 1.96)
• By the same logic:
• 68% Confidence Interval = CI68 = x̄ ± 1σ(x̄)
• 99.7% Confidence Interval = CI99.7 = x̄ ± 3σ(x̄)
• In general: CI(p) = x̄ ± z × σ(x̄)
• Where:
• p = probability you will include the population mean
• z = “z critical” = z score that borders the middle p % of scores in the standard normal distribution
• σ(x̄) = standard error of the mean
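
As a rough sketch of the general formula, the z critical value can be obtained from the standard normal distribution in Python's standard library. The function names `z_critical` and `confidence_interval` and the example numbers (mean 100, standard error 1.5) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def z_critical(p):
    """z score that borders the middle proportion p of the standard normal."""
    return NormalDist().inv_cdf(0.5 + p / 2)

def confidence_interval(mean, standard_error, p=0.95):
    """Return (lower, upper) = mean -/+ z critical * standard error."""
    margin = z_critical(p) * standard_error
    return mean - margin, mean + margin

# Hypothetical sample mean of 100 with a standard error of 1.5
lower, upper = confidence_interval(100.0, 1.5, p=0.95)
print(round(z_critical(0.95), 2))        # 1.96
print(round(lower, 2), round(upper, 2))  # 97.06 102.94
```

Calling `z_critical(0.68)` and `z_critical(0.997)` gives values close to 1 and 3, which is where the rounded 68 / 95 / 99.7 multipliers come from.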

Margin of error

critical z score (z*) multiplied by the standard error

Upper bound

mean plus the margin of error

- the highest value of the range

Lower bound

mean minus the margin of error

- the lowest value of the range

Reporting CIs

mean +/- the margin of error

- lower boundary and upper boundary

CI influence on z scores

As the confidence level (C) increases, precision of the estimate decreases (interval gets wider)
As C increases, the value of z* increases
> the wider the range, the more likely it is to contain the population mean

CI influence on standard deviation

As variation in the population goes down precision increases (interval gets narrower)
the more similar people are, the better we can predict the range of the mean

CI influence on n

As sample size increases, precision increases (interval gets narrower)
more people allow more precise intervals
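
To see the effect of n on the interval, here is a small sketch that computes the margin of error for a few sample sizes. The helper `margin_of_error`, the SD of 15 and the sample sizes are hypothetical; z is fixed at 1.96 for a 95% interval.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Margin of error = z critical * (SD / sqrt(n))."""
    return z * sd / math.sqrt(n)

# Hypothetical SD of 15; each fourfold increase in n halves the margin
for n in (25, 100, 400):
    print(n, round(margin_of_error(15, n), 2))
# 25 5.88
# 100 2.94
# 400 1.47
```

The same function shows the effect of variation: lowering `sd` while holding `n` fixed also narrows the interval.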

T scores

If you have an infinite number of scores, the t-distribution is the z-distribution
As sample size decreases, the degrees of freedom decrease and the distribution gets flatter and wider
To find a critical t value, you need to know what your degrees of freedom are
How extreme a t-value of 2.5 is in the t-distribution depends on the degrees of freedom. A value of 2.5 is quite extreme with an n of 60: there is only a small area under the curve to the right of the value. But with an n of 5, we can be less confident of capturing the mean, so our value of 2.5 is a less extreme estimate, as there’s a larger area left over under the curve.

When SD is unknown for CI calc

CI95 = x̄ ± t* × (s/√n)

- where s is the sample standard deviation and t* is the critical t score for our CI, with n − 1 degrees of freedom
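
A minimal sketch of this calculation follows. Python's standard library has no t-distribution, so the code assumes a small excerpt of two-tailed 95% critical values copied from a standard t table; the dictionary `T_CRIT_95`, the function name and the scores are all hypothetical.

```python
import math
import statistics

# Excerpt of two-tailed critical t values for 95% confidence, keyed by
# degrees of freedom (assumption: taken from a standard t table).
T_CRIT_95 = {4: 2.776, 9: 2.262, 29: 2.045, 59: 2.001}

def t_confidence_interval(scores):
    """95% CI for the mean when the population SD is unknown."""
    n = len(scores)
    df = n - 1
    t_star = T_CRIT_95[df]  # raises KeyError if df is not in the excerpt
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / math.sqrt(n)  # s / sqrt(n)
    margin = t_star * se
    return mean - margin, mean + margin

lower, upper = t_confidence_interval([4, 8, 6, 5, 7])  # hypothetical data
print(round(lower, 3), round(upper, 3))  # 4.037 7.963
```

With only five scores the interval is wider than the z-based version would give, because t* (2.776) is larger than 1.96.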

Value of CI

Sample mean is a precise point estimate of uncertain accuracy
Confidence intervals are a less precise interval estimate of definable accuracy
Width of a CI indicates variability & generalisability
Confidence intervals allow comparisons across studies
If confidence intervals substantially overlap, this supports the claim that both studies sampled the same population
If CIs don’t overlap, it is unlikely that they sampled the same population.

Distinct populations

• Distinguishing populations
-> Defined by some characteristic
-> Can measure behaviour and show they are different
• Any group that behaves differently along any dimension could be called a distinct population
• Example:
- Extroverts vs Introverts
- Males vs Females

Hypotheses

• The null hypothesis (H0)
-> that there is no relationship between the two variables that we are investigating
• The alternative hypothesis (HA)
-> that there is a relationship between the two variables

Hypothesis testing

Step 1: State Hypotheses (H0 & HA)
Step 2: Calculate an appropriate test statistic
Step 3: Determine the probability of the result if H0 is true
Step 4: Evaluate that probability & state your conclusion

Step 1, state hypothesis

• State both H0 & HA
• HA can be one-tailed or two-tailed
-> One tailed predicts direction of effect
e.g., The climate is getting warmer
-> Two tailed just predicts an effect, doesn’t predict the direction of the effect
e.g., The climate is changing

Step 2, calculate the test statistics

To calculate the test statistic, we need to work in z-scores
Z-scores indicate how far a score is from the mean in terms of standard deviation units
The z-test assesses how far any individual sample mean is from the population mean in terms of standard errors.
So you get an obtained z-score by dividing the difference between the sample mean and the population mean by the standard error of the mean.
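
Written out, the obtained z-score is (sample mean − population mean) / standard error. The sketch below is one way to compute it; the function name `z_statistic` and the example values (sample mean 103, population mean 100, population SD 15, n of 36) are hypothetical.

```python
import math

def z_statistic(sample_mean, population_mean, population_sd, n):
    """Distance of the sample mean from the population mean, in standard errors."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

# Hypothetical values: standard error = 15 / 6 = 2.5, so z = 3 / 2.5
z = z_statistic(103, 100, 15, 36)
print(round(z, 2))  # 1.2
```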

Step 3, determine probability of null hypothesis (two tailed)

If we set our alpha value at 0.05 (95% confidence) in a two-tailed test, our z-critical value is plus or minus 1.96 (the 5% area is divided between the two tails rather than all placed in one)
Any obtained z-score above +1.96 or below −1.96 counts as significant

Step 3, determine probability of null hypothesis (one tailed)

If we have a one-tailed test, our z-critical value is 1.645 (often rounded to 1.64)
but we have to predict the direction, so the obtained z-score must be either greater than 1.645 or less than −1.645, depending on the predicted direction, to count as significant
If the obtained z-score for your sample is beyond the z-critical score (in the positive or negative direction), it is reported as p (the probability) is less than 0.05
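
The one-tailed and two-tailed critical values can be checked with a short sketch. All function names here are hypothetical, and the obtained z of 1.8 is made up to show a case where the two kinds of test disagree.

```python
from statistics import NormalDist

def z_critical_two_tailed(alpha=0.05):
    """Alpha is split between both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_critical_one_tailed(alpha=0.05):
    """All of alpha sits in the predicted tail."""
    return NormalDist().inv_cdf(1 - alpha)

def significant_two_tailed(z, alpha=0.05):
    return abs(z) > z_critical_two_tailed(alpha)

def significant_one_tailed(z, alpha=0.05, predicted_positive=True):
    crit = z_critical_one_tailed(alpha)
    return z > crit if predicted_positive else z < -crit

print(round(z_critical_two_tailed(), 2))  # 1.96
print(round(z_critical_one_tailed(), 3))  # 1.645
print(significant_two_tailed(1.8))        # False
print(significant_one_tailed(1.8))        # True
```

An obtained z of 1.8 is significant only when the direction was predicted in advance, which is why the choice of tails has to be made in Step 1.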

Step 4, evaluating probability of null hypothesis and concluding

- if p is less than alpha, we can reject the null hypothesis. This means we can accept our alternative hypothesis
> So if the p value was .048, and the alpha is 0.05, we reject the null
- if p is more than alpha we retain the null hypothesis.
> for example 0.058 (i.e. more than .05)
- We must then state our conclusion, a statement that reflects the hypothesis, whether or not it was significant, and report our p-values.
- We interpret and report the parameter and direction of the effect using descriptive statistics, such as the mean.
- This type of hypothesis testing is referred to as the Z-test, and we can use it when we know the standard deviation of the population
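
The decision rule in this step can be sketched as follows. The names `p_value` and `decide` are hypothetical, and the one-tailed branch assumes the obtained z falls in the predicted direction.

```python
from statistics import NormalDist

def p_value(z, tails=2):
    """Probability of a z at least this extreme if the null is true."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return upper_tail * tails  # one tail assumes z is in the predicted direction

def decide(p, alpha=0.05):
    """Reject the null when p is less than alpha, otherwise retain it."""
    return "reject H0" if p < alpha else "retain H0"

print(decide(0.048))  # reject H0
print(decide(0.058))  # retain H0
print(round(p_value(1.2), 3), decide(p_value(1.2)))  # 0.23 retain H0
```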

Z and t distributions (William Gosset)

-> both distributions (z & t):
• have a mean of zero
• are symmetrical
• are unimodal
-> the t distribution:
• is a flatter distribution
• has a larger SD
• depends on df, so the t-distribution is a family of distributions
- more participants in sample -> closer to z distribution

Using a t-chart to test hypothesis without SD

- decide whether our hypothesis is 1 or 2 tailed (with or without direction).
- Set our alpha criteria, the probability we will accept as significant
- calculate our degrees of freedom (n minus 1)
- work out whether the t statistic that we calculate from our sample is greater than the value that is on the chart
> So in many ways, whether or not we know the population standard deviation, whether we are calculating a z or t statistic, the process is quite similar
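
The chart lookup above can be sketched for a two-tailed test at alpha 0.05. The cards do not spell out the t statistic, so this assumes the usual one-sample form, (sample mean − population mean) / (s/√n), by analogy with the z-test. The table excerpt, function name and data are hypothetical.

```python
import math
import statistics

# Excerpt of a t chart: two-tailed critical values at alpha = 0.05,
# keyed by degrees of freedom (assumption: from a standard t table).
T_CRIT_TWO_TAILED_05 = {4: 2.776, 9: 2.262, 29: 2.045, 59: 2.001}

def one_sample_t(scores, population_mean):
    """Return the obtained t statistic and its degrees of freedom."""
    n = len(scores)
    se = statistics.stdev(scores) / math.sqrt(n)  # s / sqrt(n)
    t = (statistics.mean(scores) - population_mean) / se
    return t, n - 1

# Hypothetical sample tested against a population mean of 4
t, df = one_sample_t([4, 8, 6, 5, 7], 4)
significant = abs(t) > T_CRIT_TWO_TAILED_05[df]
print(round(t, 3), df, significant)  # 2.828 4 True
```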

Determining which test to use

- Is the population SD known?
> yes: z test
> no: t test

Errors in hypothesis outcomes

- The problem starts if we reject the null, when we actually shouldn’t.
- Our sample tells us that the null is unlikely, but in fact we have sampled that small proportion of extreme space and our sample is really just the same as the population.
> Type I error, we say there is an effect, where there is none
- We think listening to Mozart makes babies smarter, but we have just accidentally sampled a smart group of babies.
> Type II error, we retain the null, when we should not
- We think listening to Mozart does not make babies smarter, when it actually does.
* look up image

Type 1 error

* Saying the Null hypothesis is false when it is true
* Seeing an effect when it is only sampling error
* Probability of Type I error set by α
* Like a jury finding an innocent person guilty
* “I” for Illusion: seeing something that isn’t there
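
One way to see that α sets the Type I error probability is a small simulation in which the null hypothesis is true by construction. This is only a sketch: the function name, the population (mean 100, SD 15), the sample size and the seed are all hypothetical choices.

```python
import math
import random
import statistics

def type_one_error_rate(trials=2000, n=30, seed=1):
    """Share of true-null samples that a two-tailed z-test wrongly rejects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    standard_error = 15 / math.sqrt(n)
    false_alarms = 0
    for _ in range(trials):
        # Every sample really does come from the null population
        sample = [rng.gauss(100, 15) for _ in range(n)]
        z = (statistics.fmean(sample) - 100) / standard_error
        if abs(z) > 1.96:  # alpha = 0.05, two-tailed
            false_alarms += 1
    return false_alarms / trials

rate = type_one_error_rate()
print(rate)  # should land close to 0.05
```

Every rejection counted here is a Type I error, since no effect exists; the rate hovers near 0.05 because that is the alpha built into the 1.96 cut-off.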

Type 2 error

* Saying the Null hypothesis is true when it is really false
* Failing to find an effect when one exists
* Happens if evidence is not enough, or is tainted
* Like a jury concluding not guilty when the accused did commit the crime
* Type II -> think B for blind: it’s there but you can’t see it

Balance of type 1 and type 2 errors

• Which is worse?
-> Depends what you are investigating:
- Does this treatment bring relief?
• Type I -> false hope
• Type II -> missed opportunity for relief
- Does this treatment have serious side effects?
• Type I -> Needlessly cautious
• Type II -> Exposing people to serious side effects
* Better measurement lowers both error types
* As reliability & validity increase error reduces

Week 5 Flashcards

(32 cards)