Descriptive and Inferential Statistics Flashcards

1
Q

Difference between descriptive and inferential statistics

A

Descriptive statistics - Methods for organising and summarising a set of data that help to describe the attributes of a group or population.

Inferential statistics - Statistical methods used to draw conclusions from a sample and make inferences to the entire population.

2
Q

Differentiate between the three types of variables

A
  1. Nominal (or Categorical)
    - with categories that are not ordered
    - e.g. gender, race, smoking status, blood group
  2. Ordinal
    - with categories that are ordered
    - e.g. cancer stages, pain rating, Likert scale data
  3. Continuous (or interval)
    - with real values that reflect order and relative magnitude
    - e.g. age, height, weight
3
Q

Appropriate way to describe nominal data numerically and graphically

A

Numerically summarised as frequency (n) and proportion (%).

Graphically can be presented as pie chart, bar chart.
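As a minimal sketch of the numerical summary, the snippet below counts frequency (n) and proportion (%) for a nominal variable. The blood-group values are made up for illustration and are not from the source.

```python
from collections import Counter

# Hypothetical blood-group observations (illustrative data only)
blood_groups = ["A", "O", "B", "O", "AB", "O", "A", "O"]

counts = Counter(blood_groups)      # frequency (n) per category
total = sum(counts.values())        # total number of observations

# Map each category to (frequency, proportion in %), rounded to 1 decimal place
summary = {group: (n, round(100 * n / total, 1)) for group, n in counts.items()}

for group, (n, pct) in sorted(summary.items()):
    print(f"{group}: n={n} ({pct}%)")
```

The same counts are what a bar chart or pie chart of this variable would display.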

4
Q

Appropriate way to describe ordinal data numerically and graphically

A

a) For most ordinal data:
- Numerically summarised as frequency (n) and proportion (%).
- Graphically, can be presented as pie chart, bar chart.

b) For Likert scale data:
- Numerically summarised as frequency (n) and proportion (%).
- Graphically, can be presented as pie chart, bar chart.
OR
- Numerically summarised as median and interquartile range (IQR)
- Graphically presented as box plot (box-and-whiskers plot).

5
Q

Appropriate way to describe continuous data numerically and graphically

A
  • Numerically, summarised as a measure of central tendency (e.g. mean, median) with measure of variability (standard deviation, IQR).
  • Graphically, can be presented as a histogram, box plot.
  • Normally distributed continuous data are numerically summarised as mean and SD, while non-normally distributed continuous data are numerically summarised as median and IQR.
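The two numerical summaries can be sketched with Python's standard `statistics` module. The ages below are hypothetical and include one extreme value, to show why the median and IQR are preferred for skewed data; the quartiles use the module's default ("exclusive") method, and other software may give slightly different cut points.

```python
import statistics

# Hypothetical ages in years (illustrative data only); 79 is an extreme value
ages = [23, 25, 27, 29, 31, 33, 35, 37, 41, 79]

mean = statistics.mean(ages)
sd = statistics.stdev(ages)                  # sample SD (n - 1 denominator)

median = statistics.median(ages)
q1, _, q3 = statistics.quantiles(ages, n=4)  # quartile cut points
iqr = q3 - q1

print(f"mean = {mean:.1f}, SD = {sd:.1f}")
print(f"median = {median:.1f}, IQR = {iqr:.1f} ({q1:.1f} to {q3:.1f})")
```

Here the single extreme value pulls the mean above the median, while the median and IQR are barely affected by it.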
6
Q

Difference between parameter estimation and hypothesis testing

A

Parameter Estimation

  • Seeks to estimate the value of a population parameter from sample data.
  • E.g. By how much does this new drug reduce blood pressure?
  • Methods: Point estimate, interval estimate

Hypothesis Testing

  • Seeks to assess whether the evidence from a sample supports a supposition about the population.
  • E.g. Does this new drug reduce blood pressure?
  • Methods: Null hypothesis, alternative hypothesis
7
Q

What are the different components of parameter estimation?

A
  1. Sampling distribution of the mean
  2. Central Limit Theorem
  3. Point Estimate
  4. Interval estimate (confidence interval, CI)
8
Q

Sampling distribution of the mean

A
  • Repeated random samples of size n are taken
  • The mean is computed for each sample
  • The means of all random samples are used as the data
  • The mean of the sample means is equal to the population mean (μ)
  • The standard deviation of the sample means is equal to the population standard deviation (σ) divided by the square root of the sample size (σ/√n). This is known as the standard error of the mean (SEM).
    » Quantifies the variability of the sample mean values
    » Used to estimate the precision or reliability of a sample mean, and is used in the calculation of confidence intervals
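The steps above can be sketched as a small simulation. The population (100,000 values from a right-skewed exponential distribution with mean about 50), the sample size and the number of repeats are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical right-skewed population (illustrative only)
population = [random.expovariate(1 / 50) for _ in range(100_000)]
mu = statistics.fmean(population)        # population mean
sigma = statistics.pstdev(population)    # population standard deviation

n = 40               # size of each random sample
num_samples = 2_000  # number of repeated samples

# Take repeated random samples and compute the mean of each one
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to mu
sd_of_means = statistics.stdev(sample_means)    # should be close to the SEM
sem = sigma / n ** 0.5                          # sigma / sqrt(n)

print(f"mu = {mu:.2f}, mean of sample means = {mean_of_means:.2f}")
print(f"SEM = {sem:.2f}, SD of sample means = {sd_of_means:.2f}")
```

With a finite number of repeats the two pairs of values agree only approximately; the agreement tightens as the number of repeated samples grows.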
9
Q

Central limit theorem

A

For sufficiently large sample sizes, the sampling distribution of the mean is approximately normally distributed, even if the underlying distribution of the individual observations is not.
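A rough way to see this is to compare the skewness of individual observations with the skewness of sample means. The sketch below assumes an exponential distribution for the individual observations and a sample size of 50; the `skewness` helper is written here for illustration and is not a library function.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Skewness: about 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def mean_of_sample(n):
    """Mean of one random sample of size n from an exponential distribution."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

# Individual observations: strongly right-skewed
individual = [random.expovariate(1.0) for _ in range(5_000)]

# Means of 5,000 samples of size 50: much closer to symmetric
means_n50 = [mean_of_sample(50) for _ in range(5_000)]

skew_individual = skewness(individual)
skew_means = skewness(means_n50)

print(f"skewness of individual observations: {skew_individual:.2f}")
print(f"skewness of sample means (n = 50):   {skew_means:.2f}")
```

The sample means are far less skewed than the raw observations, which is the practical content of the theorem.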

10
Q

Point estimate

A
  • Involves the use of sample data to calculate a single number
  • E.g. sample mean (x̄) to estimate population mean (μ)
11
Q

Interval estimate (confidence interval, CI)

A
  • Provides a range of reasonable values that is intended to contain the parameter of interest (e.g. population mean (μ))
  • 95% CI: if data collection and analysis were repeated many times, about 95% of the resulting CIs would contain the true value of the parameter.
  • Provides information on the precision of the point estimate.
  • Width of CI is influenced by 3 factors:
    » Confidence level (e.g. 90%, 95%, 99%)
    » Sample size (n)
    » Standard deviation (σ)
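The effect of the three factors can be sketched with a normal-approximation CI for a mean, mean ± z × SD / √n. This assumes a large sample (or known σ); for small samples a t-distribution is normally used instead. The `mean_ci` and `width` helpers and all numbers are hypothetical.

```python
from statistics import NormalDist

def mean_ci(sample_mean, sd, n, confidence=0.95):
    """Normal-approximation CI for a mean: mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sd / n ** 0.5
    return sample_mean - margin, sample_mean + margin

def width(ci):
    """Width of a CI given as (lower, upper)."""
    return ci[1] - ci[0]

base = mean_ci(120.0, sd=15.0, n=100)                      # reference case
higher_conf = mean_ci(120.0, sd=15.0, n=100, confidence=0.99)
larger_n = mean_ci(120.0, sd=15.0, n=400)
larger_sd = mean_ci(120.0, sd=30.0, n=100)

print(f"95% CI, n=100, SD=15: width {width(base):.2f}")
print(f"99% CI, n=100, SD=15: width {width(higher_conf):.2f}")  # wider
print(f"95% CI, n=400, SD=15: width {width(larger_n):.2f}")     # narrower
print(f"95% CI, n=100, SD=30: width {width(larger_sd):.2f}")    # wider
```

A higher confidence level or a larger SD widens the interval, while a larger sample size narrows it.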
12
Q

The narrower the 95% confidence interval, the ____ precise the point estimate.

A

more

13
Q

Principles of hypothesis testing

A
  • Null hypothesis (H0): no difference or no relationship or no effect
  • Alternative hypothesis (H1): there is a difference or relationship or effect
  • Statistical decisions based on p-value:
    » p < 0.05 leads to rejection of the null hypothesis and acceptance of the alternative hypothesis. Result is statistically significant at significance level of 0.05.
    » p ≥ 0.05 leads to retention of the null hypothesis. Result is not statistically significant at significance level of 0.05.
  • The smaller the p-value, the stronger the evidence against H0.
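The decision rule can be sketched with a two-sided one-sample z-test, which is only one of many possible tests and is used here as an assumption for simplicity. The scenario (a mean blood-pressure change of -4 mmHg in 50 patients, SD 12) and the `two_sided_z_test` helper are hypothetical.

```python
from statistics import NormalDist

def two_sided_z_test(sample_mean, null_mean, sd, n):
    """Return (z, p) for H0: population mean == null_mean (normal approximation)."""
    sem = sd / n ** 0.5                     # standard error of the mean
    z = (sample_mean - null_mean) / sem     # test statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# H0: mean change in blood pressure is 0; H1: mean change is not 0
z, p = two_sided_z_test(sample_mean=-4.0, null_mean=0.0, sd=12.0, n=50)

alpha = 0.05  # significance level
decision = "reject H0" if p < alpha else "retain H0"

print(f"z = {z:.2f}, p = {p:.4f}: {decision}")
```

In this made-up example p falls below 0.05, so H0 is rejected at the 0.05 significance level.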
14
Q

What is:

  1. Type I error
  2. Type II error
  3. Statistical power
A
  1. Type I error: an error that occurs during the hypothesis testing process when a null hypothesis is rejected, even though it is true and should not be rejected.
  2. Type II error: an error that occurs when one fails to reject a null hypothesis that is actually false.
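  3. Statistical power: the probability of correctly rejecting a null hypothesis that is actually false (1 − probability of a Type II error).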
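As a rough illustration, using the standard definitions (α is the Type I error rate, β is the Type II error rate, power = 1 − β), the sketch below approximates the power of a two-sided one-sample z-test. The `z_test_power` helper, the effect size and the sample sizes are assumptions for illustration.

```python
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test for a true mean shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96
    shift = abs(effect) / (sd / n ** 0.5)       # true effect in SEM units
    # Probability that the test statistic lands in either rejection region
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

alpha = 0.05  # Type I error rate chosen in advance

power_small = z_test_power(effect=5.0, sd=15.0, n=20, alpha=alpha)
power_large = z_test_power(effect=5.0, sd=15.0, n=100, alpha=alpha)
beta_large = 1 - power_large  # Type II error rate when n = 100

print(f"n = 20:  power = {power_small:.2f}")
print(f"n = 100: power = {power_large:.2f}, beta = {beta_large:.2f}")
```

For the same true effect, the larger sample has higher power and therefore a lower Type II error rate; when the true effect is zero, the rejection probability equals α.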
15
Q

Why is confidence interval more informative than p-value?

A
  • CI is more informative than p-value, as CI provides information on:
    » Precision of the point estimate (e.g. mean difference, odds ratio)
    » Statistical significance
16
Q

What is the importance of differentiating between statistical significance and clinical significance?

A
  • Statistical significance is heavily dependent on the study’s sample size:
    » With large sample sizes, even small treatment effects can appear statistically significant.
    » With small sample sizes, even large treatment effects can appear not statistically significant.
    » Hence, do not just simply look at whether the result is statistically significant or not. Look at the point estimate and 95% CI to interpret the clinical significance of the result.
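The dependence on sample size can be sketched with two made-up studies, using a normal approximation for both the p-value and the 95% CI. The `summarise` helper, the mean differences (in mmHg), the SD and the sample sizes are all hypothetical.

```python
from statistics import NormalDist

def summarise(mean_diff, sd, n, confidence=0.95):
    """Return (two-sided p-value, CI) for a mean difference, normal approximation."""
    std_normal = NormalDist()
    sem = sd / n ** 0.5
    p = 2 * (1 - std_normal.cdf(abs(mean_diff) / sem))
    margin = std_normal.inv_cdf(0.5 + confidence / 2) * sem
    return p, (mean_diff - margin, mean_diff + margin)

# Very large study, tiny effect: statistically significant, clinically trivial
p_big_n, ci_big_n = summarise(mean_diff=0.5, sd=10.0, n=10_000)

# Very small study, large effect: not statistically significant, CI very wide
p_small_n, ci_small_n = summarise(mean_diff=8.0, sd=10.0, n=5)

print(f"n = 10000: p = {p_big_n:.2g}, 95% CI {ci_big_n[0]:.2f} to {ci_big_n[1]:.2f}")
print(f"n = 5:     p = {p_small_n:.2g}, 95% CI {ci_small_n[0]:.2f} to {ci_small_n[1]:.2f}")
```

The first result is statistically significant although the whole CI lies below 1 mmHg; the second is not statistically significant, yet its CI is compatible with a large reduction as well as no effect.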
17
Q

What do we have to consider when comparing data between/among groups?

A
  • The number of groups being compared
  • Whether the groups are independent or paired/related
  • Whether the data are continuous, ordinal or nominal
    » For continuous data: whether the data are normally distributed or not.