block 5-8 Flashcards

1
Q

fill in the blanks:

If the sample is large enough, the distribution of sample means ____ (will/won’t) be approximately normal, even if the distribution of the data in the population _____ (is/isn’t) normal.

A

If the sample is large enough, the distribution of sample means will be approximately normal, even if the distribution of the data in the population is NOT normal.
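
A small simulation can make this concrete. The sketch below assumes a hypothetical skewed population (exponential with mean 1), a sample size of 50, and 5000 repeated samples; none of these numbers come from the cards.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential with mean 1 (clearly not normal).
def draw_sample(n):
    return [random.expovariate(1.0) for _ in range(n)]

# Build the sampling distribution of the mean: many samples, one mean each.
sample_means = [statistics.mean(draw_sample(50)) for _ in range(5000)]

centre = statistics.mean(sample_means)   # should sit close to the population mean, 1.0
spread = statistics.stdev(sample_means)  # should sit close to 1 / sqrt(50)

# Share of sample means lying within 1.96 * spread of the centre.
within = sum(abs(m - centre) <= 1.96 * spread for m in sample_means) / len(sample_means)
print(round(centre, 2), round(spread, 3), round(within, 3))
```

If the sample means are roughly normal, the last number printed should be close to 0.95 even though the population itself is skewed.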

2
Q

The mean of what equals the true population mean?

A

The mean of the sampling distribution of means is the true population mean

3
Q

fill in the blanks:

Since the sampling distribution is normal: ___% of sample means fall within ____ standard errors of the population mean.

A

Since the sampling distribution is normal: 95% of sample means fall within 1.96 standard errors of the population mean.

4
Q

What is the sampling distribution of a mean?

A

The sampling distribution of a mean = the distribution of sample means.

5
Q

What is a confidence interval?

A

An interval around the estimated mean that, with a stated level of confidence, contains the true mean.

6
Q

Does the confidence interval tell us the probability that an interval contains the true mean?

A

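No. Once an interval has been calculated, the true mean is a fixed value that either is or is not inside it, so no probability attaches to that single interval. The confidence level describes how often intervals built this way capture the true mean over repeated sampling.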

7
Q

Fill in the blank:

The confidence interval extends to _____ side of the mean by a multiple of _______.

It is most commonly calculated as a ___% CI, which extends ___ SE either side of the mean.

A

The confidence interval extends to either side of the mean by a multiple of the standard error.

Most commonly calculated as a 95% CI, which extends 1.96 SE either side of the mean.

8
Q

Fill in the blank:

If we took thousands of samples, and for each sample calculated the mean and associated 95% confidence interval, we would expect ___% of these confidence intervals to include the population mean.

A

If we took thousands of samples, and for each sample calculated the mean and associated 95% confidence interval, we would expect 95% of these confidence intervals to include the population mean.
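
The repeated-sampling idea can be checked with a simulation. This is a sketch under assumed values: a hypothetical normal population (mean 170, SD 10), samples of 100, and 2000 repeats.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 170.0, 10.0  # hypothetical population, not from the cards
N, REPEATS = 100, 2000

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    lower, upper = mean - 1.96 * se, mean + 1.96 * se
    if lower <= POP_MEAN <= upper:  # does this interval contain the true mean?
        covered += 1

coverage = covered / REPEATS
print(coverage)
```

The printed proportion should land near 0.95, with some run-to-run variation if the seed is changed.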

9
Q

What is the formula for calculating the confidence interval?

What information is necessary?

A

95% confidence interval = x̄ ± 1.96 × SE(x̄)

x̄ = sample mean (e.g. mean height)
SE(x̄) = standard error of the mean
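
As a worked example of the formula, here is a minimal sketch. The function name `ci95` and the input numbers (mean height 170.0 cm, SE 0.5 cm) are made up for illustration.

```python
def ci95(mean, se):
    """Return (lower, upper) for a large-sample 95% CI: mean ± 1.96 × SE."""
    margin = 1.96 * se
    return mean - margin, mean + margin

# Hypothetical inputs: mean height 170.0 cm, standard error 0.5 cm.
lower, upper = ci95(170.0, 0.5)
print(round(lower, 2), round(upper, 2))
```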

10
Q

What multiple is used when calculating a:
90% CI
95% CI
99% CI

A

90% CI – 1.645
95% CI – 1.96
99% CI – 2.576
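
These multipliers come from the standard normal distribution and can be recomputed rather than memorised. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_multiplier` is made up for illustration.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided multiplier: the z with `confidence` of the area between -z and +z."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

multipliers = {level: round(z_multiplier(level), 3) for level in (0.90, 0.95, 0.99)}
print(multipliers)
```

Rounded to three decimals this gives 1.645, 1.96 and 2.576 for the 90%, 95% and 99% levels.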

11
Q

What is the z value, and what is its formula?

A

z value = test statistic

Formula:
z = (estimated mean − hypothesized mean) / SE
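
A minimal sketch of the calculation follows. It assumes a large sample, so the normal distribution applies, and a two-sided test; the function name `z_test` and the input numbers are hypothetical.

```python
from statistics import NormalDist

def z_test(estimated_mean, hypothesized_mean, se):
    """Return (z, two-sided p-value) under a normal approximation."""
    z = (estimated_mean - hypothesized_mean) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails beyond ±|z|
    return z, p

# Hypothetical inputs: estimated mean 172, hypothesized mean 170, SE 1.
z, p = z_test(172.0, 170.0, 1.0)
print(round(z, 2), round(p, 4))
```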

12
Q

(pg 29, BS05)

If the “Distribution of x if Null Hypothesis were true”, was plotted as a normal distribution, which part of the distribution curve corresponds with the p value?

Again, what is the z value?

A

The z value is the test statistic: the difference between the estimated mean and the hypothesized mean, measured in standard errors.

The p-value is the area under the curve in the tails beyond the z value (both tails, beyond −z and +z, for a two-sided test).

13
Q

What does a large p-value mean?

A

If the p-value is large, a value at least as extreme as the one observed would be likely if the Null Hypothesis were true.

-or-
The larger the p-value, the weaker the evidence against the Null Hypothesis (no difference).

14
Q

What does a small p-value mean?

A

The chance of observing a value at least this extreme, if the Null Hypothesis were true, is low. This is stronger evidence against the Null Hypothesis: the result is less likely to be due to sampling variation and more likely to reflect a real difference.

15
Q

If the hypothesized mean is NOT included in the 95% CI, is this evidence FOR or AGAINST the null hypothesis?

A

If the hypothesized mean is NOT included in the 95% CI, this is evidence against the Null Hypothesis, meaning there is likely a real difference.

16
Q

fill in the blanks

When the sample size is small, the ______ is not a good approximation; the ______ is used instead.

A

When the sample size is small, the normal distribution is not a good approximation; the t-distribution is used instead.

17
Q

What determines the shape of a t-distribution?

A

Degrees of freedom.

18
Q

What are degrees of freedom?

A

Degrees of freedom reflect the sample size: the smaller the sample, the fewer the degrees of freedom. They determine the shape of the t-distribution curve.

For the t-distribution of a single sample mean: degrees of freedom = sample size minus 1 (n − 1)

19
Q

What do smaller degrees of freedom mean for the t-distribution?

A

Smaller degrees of freedom mean lower probabilities around the mean and higher probabilities in the tails. The curve is shorter and wider.

20
Q

How does increasing degrees of freedom affect the t-distribution curve in relation to the normal distribution curve?

A

Bigger degrees of freedom mean larger sample sizes, which means the t-distribution resembles the normal distribution curve more and more.

21
Q

What is the formula for the 95% CI of mean for small sample size?

A

estimated mean +/- multiplier x standard error of estimate

multiplier = value of t corresponding to a two sided p = 0.05

22
Q

In what situations are two tailed and one tailed used for p-values?

A

Most commonly, a two-tailed test is used, because the alternative hypothesis simply states that the true mean is not equal to the hypothesized mean.

H1: μ ≠ μ0

Less commonly, a one-tailed test is used, when the alternative hypothesis is:

H1: μ > μ0 or H1: μ < μ0

23
Q

Historically, what p-value is deemed “enough” evidence?

A

p < 0.05

24
Q

Fill in the blank:

When the result of a test has a P-value smaller than 0.05, the 95% confidence interval _____ (will/ won’t) include the hypothesized value.

A

When the result of a test has a P-value smaller than 0.05, the 95% confidence interval will not include the hypothesized value.
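
The link between the p-value and the 95% CI can be illustrated with two made-up cases. This sketch assumes a large sample (normal approximation, multiplier 1.96); the function name `summarize` and all input numbers are hypothetical.

```python
from statistics import NormalDist

def summarize(estimated_mean, hypothesized_mean, se):
    """Return (two-sided p-value, whether the 95% CI contains the hypothesized mean)."""
    z = (estimated_mean - hypothesized_mean) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    lower, upper = estimated_mean - 1.96 * se, estimated_mean + 1.96 * se
    inside = lower <= hypothesized_mean <= upper
    return p, inside

p_far, inside_far = summarize(172.0, 170.0, 1.0)    # estimate 2 SE from the hypothesized mean
p_near, inside_near = summarize(171.0, 170.0, 1.0)  # estimate 1 SE from the hypothesized mean
print(round(p_far, 3), inside_far, round(p_near, 3), inside_near)
```

In the first case the p-value is below 0.05 and the interval excludes 170; in the second the p-value is above 0.05 and the interval includes it.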

25
Q

What is paired data?

What is a common example of this for quantitative data?

A

Paired samples occur when each observation in the first sample is matched with an observation in the second sample.

A common example with quantitative data is repeated observations on the same person.

  • When measuring a reaction to a TB test, two field workers will measure the skin reaction on the same person.
  • Pulse was checked for each runner before and after the run.
26
Q

What is unpaired data?

What are some examples?

A

Unpaired data are when individual observations in one sample are completely independent of observations made in another sample.

Ex:

  • A group of patients was divided into 2 groups. Group A received a novel drug treatment, whereas Group B received conventional treatment. The effect of the treatment was measured and compared.
  • Students were divided into 2 groups. Group A received a new computer teaching method, and Group B received a conventional teaching method. Test scores were compared.
27
Q

When analyzing paired data, what’s the first step?

A

Calculate the difference between the two individual observations in each pair.

28
Q

How can we assess whether there’s a difference between measurements from 2 samples? What can we calculate to answer this question?

A

Can calculate:

  • mean difference between the two samples
  • confidence interval for that difference
  • then run a test to see whether the mean difference is significantly different from zero
29
Q

What information is needed to calculate the CI and run a test to see if it’s significantly different than zero?

A
  • mean difference
  • standard deviation of differences
  • standard error of mean difference
  • SD needed to calculate SE
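
The paired-data steps can be sketched end to end. The pulse readings below are invented for illustration; the multiplier 2.262 is the standard t-table value for a two-sided p = 0.05 with 9 degrees of freedom.

```python
import statistics

# Hypothetical pulse readings (beats per minute) for 10 runners.
before = [68, 72, 75, 70, 66, 74, 71, 69, 73, 72]
after = [80, 85, 83, 79, 78, 88, 82, 80, 86, 84]

# First step: one difference per pair.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # standard deviation of the differences
se_diff = sd_diff / n ** 0.5       # standard error of the mean difference

T_MULTIPLIER = 2.262  # t table: two-sided p = 0.05, df = n - 1 = 9
lower = mean_diff - T_MULTIPLIER * se_diff
upper = mean_diff + T_MULTIPLIER * se_diff
t_stat = mean_diff / se_diff       # tests whether the mean difference is zero

print(mean_diff, round(se_diff, 3), round(lower, 2), round(upper, 2), round(t_stat, 2))
```

With these made-up readings the interval lies entirely above zero, which counts as evidence against a mean difference of zero.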
30
Q

What’s the formula for calculating the standard error (SE)?

A

Standard deviation divided by the square root of the number of individuals/observations: SE = SD / √n
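
As a quick check of the formula, a minimal sketch with made-up numbers (SD of 10 from 100 observations) and a hypothetical helper name:

```python
import math

def standard_error(sd, n):
    """SE of the mean: standard deviation divided by the square root of n."""
    return sd / math.sqrt(n)

# Hypothetical inputs: SD of 10 from 100 observations.
se = standard_error(10.0, 100)
print(se)
```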

31
Q

When calculating the 95% CI, what multiplier is used? How does this change if the sample size is small?

A

If sample size is large, the multiplier is 1.96.

If sample size is small, need to use a multiplier found on a t-distribution table.

32
Q

What does the z value tell us?

A

The z value is the test statistic: the difference between the estimated mean and the hypothesized mean, measured in standard errors.

Calculating a z value in a hypothesis test tells us how many standard errors the observed mean is from the centre of the distribution defined by the null hypothesis.

When comparing two means, it is the number of SE between the observed difference in means and the centre of the distribution defined by the null hypothesis.

33
Q

State the difference when analyzing the mean of paired and unpaired data.

A

Unpaired:
the difference between two independent means

Paired:
the mean difference of two paired observations

34
Q

What’s the z value?

A

The z value is the test statistic: the difference between the estimated mean and the hypothesized mean, measured in standard errors.

Calculating a z value in a hypothesis test tells us how many standard errors the observed mean is from the centre of the distribution defined by the null hypothesis.

When comparing two means, it is the number of SE between the observed difference in means and the centre of the distribution defined by the null hypothesis.

35
Q

How do you find the “multiplier” when calculating the 95% CI for a small sample?

A

The formula for the 95% CI is
estimated mean ± multiplier × SE

Look up the multiplier in a two-tailed t table, using DF (degrees of freedom) = n − 1 and p = 0.05.

36
Q

What is different when calculating the CI and testing a hypothesis for small-sample unpaired/independent data?

A

The difference is in the calculation of SE.

Formula for degrees of freedom is:
n1 + n2 - 2
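
The card does not give the SE formula itself. One common choice, shown here as an assumption rather than as the course's method, is the pooled-variance standard error of the difference between two independent means. The function name and inputs are hypothetical.

```python
import math

def pooled_se(sd1, n1, sd2, n2):
    """SE of the difference between two independent means (pooled-variance form)."""
    df = n1 + n2 - 2  # degrees of freedom for the unpaired comparison
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2)), df

# Hypothetical inputs: groups of 10 and 12 with SDs of 4.0 and 5.0.
se, df = pooled_se(4.0, 10, 5.0, 12)
print(round(se, 3), df)
```

The multiplier for the CI would then be read from a t table at df = n1 + n2 - 2.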

37
Q

What is the difference between SD (standard deviation) and SE (standard error)?

A

SD: quantifies the variation around the mean within a set of measurements (the spread of the data points).

SE: quantifies the variation of the MEANS of several sets of data (how much sample means vary from sample to sample).

38
Q

What is the term for data which can take one of two values?

A

binary

39
Q

What is a proportion?

A

A fraction.

The proportion of mothers with hypertension was 329 / 1310 = 0.251

Expressed as a percentage this is

0.251 x 100 = 25.1%
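
The same arithmetic as a short sketch, using the counts from the card:

```python
cases, total = 329, 1310  # mothers with hypertension, all mothers

proportion = cases / total
percentage = proportion * 100

print(round(proportion, 3), round(percentage, 1))
```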