Section B.2: Hypothesis Testing (1) Flashcards

1
Q

What is a hypothesis?

A

A hypothesis is a proposed explanation for a phenomenon, based on existing knowledge or initial observations, that can be tested and then supported or refuted by evidence.

2
Q

What makes a good hypothesis? (5)

A

A good hypothesis is testable, specific, falsifiable, relevant, and clear.

3
Q

Null hypothesis

A

The null hypothesis, denoted H0, states that there is no relationship between the variables being tested.

4
Q

Alternative hypothesis

A

The alternative hypothesis, denoted H1, states that there is a relationship between the variables being tested. It contradicts H0.

5
Q

Directional hypothesis

A

A directional hypothesis is a statement that predicts the direction of the relationship between the variables being tested. It is also known as a one-tailed hypothesis.

e.g. positive or negative relationship…

6
Q

What is hypothesis testing?

A

Hypothesis testing is a statistical method used to test a hypothesis about a population parameter using sample data. Tests fall into two categories: parametric tests and non-parametric tests.

7
Q

Steps in hypothesis testing (4)

A
  1. Formulate a null and alternative hypothesis
  2. Choose a statistical test based on the type of data, variables, and parameters
  3. Collect data and analyse it using the chosen test
  4. Determine whether the results lead you to reject or fail to reject the null hypothesis
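
The four steps above can be sketched end to end with a one-sample z-test. This is a minimal illustration only: the sample values, the hypothesised mean of 100, and the "known" population standard deviation of 15 are all made-up assumptions, and the z-test is just one possible choice at step 2.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step 1: formulate the hypotheses. H0: mu = 100, H1: mu != 100 (two-tailed).
mu0 = 100.0
sigma = 15.0   # assumption: population standard deviation is known
alpha = 0.05

# Step 2: choose a test. Sigma is treated as known, so a one-sample z-test is used.

# Step 3: collect data (hypothetical values) and compute the test statistic.
sample = [112, 98, 105, 120, 101, 95, 118, 109, 103, 115,
          99, 107, 111, 104, 96, 113, 108, 102, 117, 106]
n = len(sample)
z = (mean(sample) - mu0) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value

# Step 4: reject or fail to reject H0.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.3f}, p = {p_value:.4f} -> {decision}")
```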
8
Q

Type 1 error

A

A Type I error occurs when we reject a true null hypothesis. The probability of a Type I error is the significance level, denoted alpha (α), and is usually set at 0.05 or 0.01.
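
A small simulation can make this concrete. The sketch below assumes normally distributed data with a known standard deviation and uses a two-tailed z-test; the seed, sample size, and number of trials are arbitrary choices. Because H0 is true in every trial, each rejection is a Type I error, and the observed rate should land close to alpha.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable
alpha = 0.05
mu0, sigma, n = 0.0, 1.0, 25
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

trials = 4000
false_rejections = 0
for _ in range(trials):
    # H0 is true by construction: the data really do have mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        false_rejections += 1  # rejecting a true H0 is a Type I error

type_i_rate = false_rejections / trials
print(f"observed Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```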

9
Q

Type 2 error

A

A Type II error occurs when we fail to reject a false null hypothesis. The probability of a Type II error is denoted beta (β) and depends on factors such as sample size, effect size, and alpha level.
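
To see how beta depends on sample size, effect size, and alpha, here is a rough sketch for a two-tailed one-sample z-test. The function name `type_ii_error` and the example numbers are illustrative assumptions; the formula applies only when the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(effect, sigma, n, alpha=0.05):
    """Beta for a two-tailed one-sample z-test when the true mean is mu0 + effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma  # how far the z-statistic is shifted under H1
    # Beta is the probability that z still falls inside the non-rejection region.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

for n in (10, 30, 100):
    beta = type_ii_error(effect=0.5, sigma=1.0, n=n)
    print(f"n = {n:3d}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Larger samples and larger effects reduce beta, while a stricter alpha (for example 0.01 instead of 0.05) increases it.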

10
Q

What are parametric tests?

A

A parametric test is one that assumes the sample data come from a population that follows a particular distribution, with a fixed set of parameters.

Examples: t-test, z-test, ANOVA, regression analysis

11
Q

What key assumptions are made about the data in parametric tests? (3)

A

Normality: The data follows a normal distribution
Homogeneity of variance: The variance of the data is equal across groups or samples
Independence: The observations are independent from each other

12
Q

What are advantages of parametric tests? (3)

A

High statistical power: Able to detect small differences between groups

Precise: Provide highly accurate estimates of population parameters

Wide applicability: Can be applied to many study designs

13
Q

What are the disadvantages of parametric tests? (3)

A

Strict assumptions, sensitivity to outliers, sample size requirements*.

*May require larger sample sizes to produce reliable results

14
Q

What are one sample parametric tests?

A

One sample parametric tests are used to compare the sample mean to a known population mean.

Examples: One-sample t-test, one-sample z-test, one-sample mean confidence interval

15
Q

What is a one-sample t-test?

A

A one-sample t-test is a parametric test used to test whether the mean of the population a sample was drawn from is equal to a known or hypothesised population mean.
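
As a rough sketch, the t-statistic can be computed by hand as (sample mean - hypothesised mean) / (s / sqrt(n)) with n - 1 degrees of freedom. The data and the hypothesised mean of 5.0 below are made up, and the critical value 2.262 is the standard t-table entry for a two-tailed test at alpha = 0.05 with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t-statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error, n - 1

sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7, 5.4, 5.0]  # hypothetical data
t_stat, df = one_sample_t(sample, mu0=5.0)

T_CRIT_DF9 = 2.262  # t-table value: two-tailed, alpha = 0.05, df = 9
decision = "reject H0" if abs(t_stat) > T_CRIT_DF9 else "fail to reject H0"
print(f"t = {t_stat:.3f}, df = {df} -> {decision}")
```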

16
Q

When should a one-sample z-test be used instead of t-test?

A

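When the population standard deviation is known and the sample size is large (commonly n ≥ 30). If the population standard deviation is unknown or the sample is small, use the t-test.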

17
Q

One-sample mean confidence interval

A

A one-sample parametric procedure used to estimate the range of values within which the true population mean likely falls.

18
Q

What are two sample parametric tests?

A

Two sample parametric tests are used to compare the means of two groups, which may be independent or paired (related).

Examples: Independent samples t-test, Paired samples t-test, ANOVA (which also extends to more than two groups)

19
Q

What are non-parametric tests?

A

Non-parametric tests are statistical tests that do not assume a specific distribution for the population.

20
Q

When should a non-parametric test be used? (3)

A
  1. When the median more accurately represents the centre of the distribution (i.e. the data are not normally distributed)
  2. Unequal variances
  3. Small sample size
21
Q

Advantages of non-parametric tests (3)

A
  1. Robust - They are less sensitive to outliers and non-normality in data
  2. Wide applicability - They can be used with many different study designs and data types
  3. Ease of interpretation - They often have simple and intuitive interpretation
22
Q

Disadvantages of non-parametric tests (3)

A
  1. Lower statistical power - less sensitive, therefore require larger sample sizes to detect significant differences
  2. Limited scope - some tests are limited by type of study design or data types
  3. Less precise - they provide less precise estimates of population parameters
23
Q

How do you decide whether to use a parametric or a non-parametric test? (2)

A
  1. If the mean more accurately represents the centre of distribution of the data, and the sample size is large, use a parametric test.
  2. If the median more accurately represents the centre of distribution, use a non-parametric test
24
Q

Significance level

A

The significance level, represented by alpha, is the probability of rejecting a null hypothesis when it is true.
The most common values are 0.05 and 0.01.

Example: A significance level of 0.05 means there is a 5% chance of rejecting a null hypothesis that is actually true

25
Q

P-value

A

The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis (no effect or difference) is true.

p-value > α = Fail to reject H0
p-value < α = Reject H0
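
For a z-statistic, the p-value can be read off the standard normal distribution. The helper below is a minimal sketch; the function name `p_value_from_z` and its `tail` argument are illustrative assumptions. The one-tailed options correspond to a directional hypothesis.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """P(result at least as extreme as z) under H0, for a standard normal statistic."""
    std_normal = NormalDist()
    if tail == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "greater":   # directional H1: parameter is larger
        return 1 - std_normal.cdf(z)
    if tail == "less":      # directional H1: parameter is smaller
        return std_normal.cdf(z)
    raise ValueError(f"unknown tail: {tail!r}")

alpha = 0.05
for z in (1.2, 2.5):
    p = p_value_from_z(z)
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z = {z}: p = {p:.4f} -> {verdict}")
```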

26
Q

p-value > α

A

Fail to reject H0

27
Q

p-value < α

A

Reject H0

28
Q

Z-value

A

z-value, also known as z-score, is a measure of how many standard deviations a data point is away from the mean. It is used to compare a sample mean to a population mean.

z-value > critical value at α = Reject H0
z-value < critical value at α = Fail to reject H0
(For a two-tailed test, compare the absolute value of z with the critical value.)
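
The critical value itself comes from the inverse of the standard normal distribution. The following sketch uses hypothetical helper names (`z_critical`, `z_decision`); the one-tailed branch assumes an upper-tailed alternative.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Critical z-value that leaves alpha (or alpha / 2 per tail) in the tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def z_decision(z, alpha=0.05, two_tailed=True):
    crit = z_critical(alpha, two_tailed)
    # Two-tailed: compare |z|. One-tailed here means an upper-tailed H1.
    exceeds = abs(z) > crit if two_tailed else z > crit
    return "reject H0" if exceeds else "fail to reject H0"

print(round(z_critical(0.05), 3))  # 1.96
print(z_decision(2.3))             # reject H0
print(z_decision(-1.1))            # fail to reject H0
```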

29
Q

z-value > critical value

A

= Reject H0

30
Q

z-value < critical value

A

= Fail to reject H0

31
Q

A z-value of 1.5 means what?

A

A z-value of 1.5 means that the observed value for the sample is 1.5 standard deviations above the mean.
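
As a quick illustration with made-up numbers (a mean of 100 and a standard deviation of 10), the z-value is (x - mean) / sd:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(z_score(115, mean=100, sd=10))  # 1.5, i.e. 1.5 SDs above the mean
print(z_score(85, mean=100, sd=10))   # -1.5, i.e. 1.5 SDs below the mean
```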

32
Q

Confidence interval

A

A confidence interval is a range of values around a sample estimate that is likely to contain the true population parameter, with a certain degree of confidence. Common intervals are 90%, 95%, and 99%.
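
A minimal sketch of a confidence interval for a mean, assuming the population standard deviation is known so that a z-based interval applies. The function name and the example numbers (sample mean 50, sigma 8, n = 64) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """z-based interval: sample_mean +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(50.0, sigma=8.0, n=64, confidence=level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f})")
```

A higher confidence level gives a wider interval for the same data.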

33
Q

Degree of Freedom

A

Degrees of freedom (df) is the number of independent values in a sample that are free to vary when a statistic is estimated (for example, n - 1 for a one-sample t-test). It is important for calculating critical values for tests such as t-tests and F-tests. As the df increases, the critical value decreases, so a given test statistic is more likely to lead to rejection of the null hypothesis.

34
Q

What is the degree of freedom (df parameter) important for?

A
  1. It is important for calculating critical values for t-tests and F-tests
  2. It affects the precision of confidence intervals, with a larger df leading to a narrower confidence interval and a more precise estimate
35
Q

Types of Distribution

A

Normal, log-normal, skewed, and kurtotic (heavy- or light-tailed) distributions

36
Q

Deciding statistical tests: Categorical data (2)

A

Categorical nominal - Chi-squared
Categorical ordinal - Non-parametric
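
For nominal data, the chi-squared statistic compares observed counts with the counts expected if the two variables were independent. The sketch below uses a made-up 2x2 table and no continuity correction; 3.841 is the standard chi-squared table value for alpha = 0.05 with 1 degree of freedom.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and df for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

observed = [[30, 20],   # hypothetical counts: group A, outcome yes / no
            [15, 35]]   # hypothetical counts: group B, outcome yes / no
chi2, df = chi_squared_statistic(observed)

CHI2_CRIT_DF1 = 3.841  # chi-squared table value: alpha = 0.05, df = 1
decision = "reject H0" if chi2 > CHI2_CRIT_DF1 else "fail to reject H0"
print(f"chi2 = {chi2:.3f}, df = {df} -> {decision}")
```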