Statistics Theory L9 = Inferences Using t-Distributions Flashcards
If we have a large sample & random sampling, what can we conclude about the sampling distribution? (3)
- Sampling distribution is centered on µ; where µ = mean of the population = mean of the sampling distribution.
- Spread of the sampling distribution = SD(Ŷ) = σ/√n .
- Shape of the sampling distribution is closer to normal than the population distribution.
Standard error (SE) of any statistic?
= the estimate of the SD of its sampling distribution.
Degrees of freedom (df) of the SE?
= the equivalent number of independent observations.
Df equation?
df = n -1
SE for sample average equation?
SE(Ŷ) = s/√n
What two ratios do we use based on the sample average?
- z-ratio (ƶ).
- t-ratio (t).
Equation for z-ratio (ƶ)?
ƶ = (estimate-hypothesised value) / (SD (Estimate))
*SD (Estimate) = σ/√n
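The z-ratio above can be sketched in Python (a minimal sketch; the sample values are made up for illustration, and the standard normal CDF is built from math.erf):

```python
import math

def z_ratio(estimate, hypothesised, sigma, n):
    """z = (estimate - hypothesised value) / (sigma / sqrt(n))."""
    return (estimate - hypothesised) / (sigma / math.sqrt(n))

def normal_cdf(z):
    """Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative values: sample mean 10.5, hypothesised mean 10, known sigma = 2, n = 36.
z = z_ratio(10.5, 10, 2, 36)        # 0.5 / (2/6) = 1.5
p_one_sided = 1 - normal_cdf(z)     # upper-tail probability from N(0, 1)
```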
Conditions to use the ƶ ratio? (2)
- We need to know the SD (Estimate) = σ/√n , the σ.
- We need to have a large sample (n ≥ 30).
NB about ƶ ratio? (2)
- If the sampling distribution of the estimate is normal, the sampling distribution of ƶ is normal (µ = 0, σ² = 1), & this is a “standard normal”.
- Percentiles of N (0, 1) help us to judge the certainty about parameter estimates & differences.
Equation for t-ratio (t)?
t = (estimate-hypothesised value) / (SE (Estimate))
*SE (Estimate) = s/√n , where s = sample SD.
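The SE and t-ratio cards can be sketched together in Python (a minimal sketch with made-up illustrative numbers; the p-value itself still needs the t-distribution, e.g. R's pt()):

```python
import math

def se_mean(s, n):
    """SE of the sample average: s / sqrt(n), where s is the sample SD."""
    return s / math.sqrt(n)

def t_ratio(estimate, hypothesised, s, n):
    """t = (estimate - hypothesised value) / SE(estimate); df = n - 1."""
    return (estimate - hypothesised) / se_mean(s, n), n - 1

# Illustrative values: estimate 0.2, hypothesised value 0, s = 0.24, n = 16.
t, df = t_ratio(0.2, 0, 0.24, 16)   # t = 0.2 / 0.06, df = 15
```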
When can we use the t ratio?
When σ is unknown and estimated by the sample SD (s), especially for small samples (n < 30).
t ratio attributes? (2)
- Is wider than the standard normal (ƶ), because of the extra variability introduced by estimating σ with s.
- More degrees of freedom mean a better estimate of s and reduced variability, so the t-distribution approaches the standard normal.
NB about t ratio?
If Ŷ is an average from a random sample of size n from a normally-distributed population, the sampling distribution of the t-ratio is the Student’s t-distribution with n-1 degrees of freedom.
When is a one-sided/one-tailed test used? (2)
- When you expect one direction of change (increase only).
- When a difference in the opposite direction is of no practical interest.
Why is a one-sided/one-tailed test used? (2)
- It reduces the critical value, making it easier to reject the null hypothesis in the direction of interest.
- It increases your ability to detect an effect if it exists.
When is a two-sided/two-tailed test used? (3)
- When you just want to see if there is any significant difference, positive or negative.
- When you’re testing for inequality.
- When both extreme deviations are important in your analysis.
Why is a two-sided/two-tailed test used? (2)
- It ensures unbiased testing, as it considers both extremes.
- It avoids missing an important result in the opposite direction.
Scenarios where we use t-ratios to make an inference? (2)
- t-ratios for 1-sample inference & paired t-tests.
- t-ratios for 2-sample inference.
t-ratios for 1-sample inference & paired t-tests attributes? (2)
- Compares one sample average/an average difference in paired data to a hypothesized value.
- Response variable (y) is continuous & normally distributed.
t-ratios for 2-sample inference attributes? (3)
- Compares means from 2 independent samples.
- Response variable (y) is continuous & normally distributed.
- Predictor variable (x) is binary (categorical), e.g. group or population membership.
Eg for t-ratios for 1-sample inference & paired t-tests?
Schizophrenia example.
Schizophrenia example
Scientists identified a sample of identical twins where one twin had been diagnosed with schizophrenia & the other had not. The scientists used an imaging device to measure the volume (cm³) of the left hippocampus.
Given an output, focus on:
- n (the 1st tibble).
- Ŷ (AvDiff).
- s (SDDiff).
Is there evidence of an effect on volume of the hippocampus? What is the scope of inference? (8)
(i) Hypotheses
Ho: μ = 0, and random sampling, purely by chance, gave us the observed difference (which could happen).
Ha: μ ≠ 0 (2-tailed test).
(ii) From data/output:
t = (Ŷ - 0) / (s/√n) = (0.199 - 0) / (0.238/√15) = 3.24.
(iii) df = n-1 = 15-1 = 14.
(iv) After getting t-statistic, illustrate it on graph to see if it arose by chance.
(v) Calculate p-value in R using:
1 - pt(3.24, 14) ≈ 0.003. Since it’s 2-tailed, p x 2 ≈ 0.006.
- In test, Prof. Jason will give us the p-value.
(vi) We are still uncertain about this estimate (need to do CI afterwards).
(vii) Therefore, there is convincing evidence of a difference in hippocampal volume between the affected and unaffected twins (t = 3.24; df = 14; p = 0.006).
(viii) We cannot infer cause and effect as this was an observational study, and we cannot infer to the population beyond the sample as there was no random sampling, therefore the results only apply to the twins in this study.
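The arithmetic in steps (ii)-(iii) can be checked with a short Python sketch (numbers taken from the card; only t and df are computed here, the p-value still comes from R's pt()):

```python
import math

# Schizophrenia twin example: average difference in hippocampal volume,
# SD of the differences, and number of twin pairs.
avdiff, sddiff, n = 0.199, 0.238, 15

se = sddiff / math.sqrt(n)   # SE of the average difference
t = (avdiff - 0) / se        # hypothesised value under Ho is 0
df = n - 1                   # 14
# t comes out near 3.24; 2 * (1 - pt(t, df)) in R gives p of about 0.006.
```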
Eg of t-ratios for 2-sample inference?
Finch example.
Finch example
These data are measurements of the beak depth of finches, the year before a drought (1976) and the year after a drought (1978), on the island of Daphne Major in the Galapagos.
Given output, focus on:
- n1; n2 (Count).
- Ŷ1; Ŷ2 (AvDepth).
- s1; s2 (SDDepth).
- sp (Pooled SD).
- SE (Ŷ2-Ŷ1) [SE of the difference between the averages].
Is there a difference in the beak depth of finches between the years? Scope of inference? (10)
(i) Assume σ1 = σ2 = σ (common SD between the two groups).
(ii) Calculate pooled SD (sp):
sp = √{[(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2)}
= √{[(89 - 1)(1.04)² + (89 - 1)(0.906)²] / (89 + 89 - 2)}
= 0.973.
(iii) df = n1 + n2 - 2 = 89+89-2 = 176.
(iv) SE (Ŷ2-Ŷ1) = sp √(1/n1 + 1/n2)
= 0.973 √(1/89 + 1/89)
= 0.1459.
(v) t = [(Ŷ2-Ŷ1) - (hypothesised value)] / SE (Ŷ2-Ŷ1)
= [(10.14 - 9.47) - 0] / 0.1459
= 4.58.
(vi) Use t-statistics to illustrate on the graph (draw graph) & where p-values lie.
(vii) From R: 1 - pt(4.58, 176) ≈ 0; even doubled for the 2-tailed test, p < 0.001.
(viii) Calculate CI’s (95% CI: 0.3807, 0.9564).
(ix) There is strong evidence of a difference between the beak depths of finches before and after the drought (t = 4.58; df = 176; p < 0.001). The estimated difference in beak depth between years was 0.6685 (95% CI: 0.3807, 0.9564).
(x) We cannot infer cause & effect as there was no random assignment; however, we can infer to the larger population because they sampled from the whole population.
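The pooled-SD arithmetic in steps (ii)-(v) can be checked in Python (a sketch using the rounded values shown on the card, so the third decimals differ slightly from the unrounded R output):

```python
import math

# Finch example: count, average beak depth, and SD for each year.
n1, m1, s1 = 89, 9.47, 1.04     # 1976 (before drought)
n2, m2, s2 = 89, 10.14, 0.906   # 1978 (after drought)

# Pooled SD, degrees of freedom, SE of the difference, and the t-statistic.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
df = n1 + n2 - 2
se = sp * math.sqrt(1 / n1 + 1 / n2)
t = ((m2 - m1) - 0) / se        # hypothesised difference under Ho is 0
# sp is about 0.975, df = 176, se about 0.146, t about 4.58;
# R's 1 - pt(t, df) is tiny (p < 0.001).
```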
Testing a hypothesis about the difference between means attributes? (5)
- To do this we need to calculate the t-statistic.
- As for the t-ratio previously, we assume a true null hypothesis.
- What are the likely values of t-ratios if the null hypothesis is true?
- Based on the data, is that hypothesised value reasonable?
- The value we calculate tells us how many SEs the estimate is from the hypothesised parameter.
p-value?
= measures how consistent the data are with a statistical hypothesis (its credibility).
p-value for a t-test?
= probability, assuming H0 is true, of obtaining a t-ratio as or more extreme than the t-statistic observed from our data; used as evidence against H0.
Small p-value?
= the data would be unlikely if H0 were true; the estimate is unlikely to have arisen by chance alone.
Large p-value?
= the data are consistent with H0; the estimate could plausibly have arisen by chance.
When/Why does a small p-value occur? (2)
Occurs if:
- Ho is correct but sample is not representative.
- H0 is incorrect.
When/Why does a large p-value occur? (3)
Occurs if:
- You’re unable to exclude/reject Ho.
- Conclusion: “the data are consistent with Ho”.
- Not: “Ho is true” (don’t say this).
1-sided/2-sided p-values attributes? (3)
- Do we want to conclude larger/smaller, or is it enough to conclude “not equal”?
- This question must be established before analysis.
- When reporting the finding, always report whether the p-value is one/two-sided.
Interpretation of p-values (“strength of evidence” zones)? (4)
(i) < 0.01 (down to < 0.001) = “convincing evidence”.
(ii) 0.01 - 0.05 = “moderate evidence”.
(iii) 0.05 - 0.10 = “suggestive but inconclusive”.
(iv) > 0.10 = “no evidence”.
p value:
< 0.01?
“convincing evidence”
p value:
0.01 - 0.05?
“moderate evidence”
p value:
0.05 - 0.10?
“suggestive but inconclusive”
p value:
> 0.10?
“no evidence”
NB for p-values? (4)
- Smaller the p-value, the stronger the evidence that the null hypothesis (Ho) is incorrect.
- Preferred wording: small p = evidence of an effect/difference; large p = no evidence of an effect/difference.
- Never use the term “significant” when making a statistical conclusion, rather talk about the strength of evidence.
- For a fixed nonzero difference between Ŷ and μ, as n increases the t-statistic increases and the p-value decreases.
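The last point can be illustrated with a quick sketch (made-up numbers): for a fixed difference and SD, t grows with √n, so quadrupling n doubles t.

```python
import math

def t_stat(diff, s, n):
    """t for a fixed observed difference: diff / (s / sqrt(n))."""
    return diff / (s / math.sqrt(n))

# Fixed difference 0.5 and SD 2: only the sample size changes.
t_small = t_stat(0.5, 2, 25)    # 0.5 / (2/5)  = 1.25
t_large = t_stat(0.5, 2, 100)   # 0.5 / (2/10) = 2.5 (twice as large)
```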
Confidence Interval (CI) attributes? (5)
- Help us quantify any uncertainty we have about an estimate (purpose).
- Used when we’re uncertain about our estimate (when to use).
- Wide CI = poor estimate.
- Narrow CI = good/precise estimate.
- More data decreases the size of the interval.
How to calculate the CI? (3)
(i) Start from the t-distribution and multiply by SE(Ŷ) to change from the t-scale to the scale of the data.
(ii) Shift the distribution so that its centre is our estimate of μ.
(iii) Take the values that contain the middle 95% of the distribution.
CI formula? (2)
[Ŷ2-Ŷ1] ± [qt (0.975, df)][SE (Ŷ2-Ŷ1)], where qt(0.975, df) is the 97.5th percentile of the t-distribution (the R function qt).
- If using logged values, don’t forget to take exp() of your CI.
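The CI formula can be checked against the finch numbers (a sketch; the critical value is taken as approximately 1.9735, consistent with R's qt(0.975, 176), rather than computed here):

```python
# Finch example: estimated difference in average beak depth and its SE.
estimate = 0.6685
se = 0.1459
tcrit = 1.9735   # assumed from R: qt(0.975, 176)

lower = estimate - tcrit * se
upper = estimate + tcrit * se
# Gives roughly (0.38, 0.96), matching the reported 95% CI.
```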
Jason’s take on inferential statistics? (2)
He sees inferential statistics as falling under 2 paradigms/mentalities:
(1) Estimate the size of effect & the accuracy of the estimate (focus on parameter estimates & 95% CIs).
(2) Hypothesis testing & p-values.
Jason’s suggested paradigm 1?
= works for both randomised experiments & observational studies.
Jason’s suggested paradigm 2?
= works for laboratory experiments under highly controlled conditions.