Collecting Data about People and Comparing the Health of Groups - Ovarian Cancer Flashcards

1
Q

Why compare the health of groups?

  • Research questions such as:
    • Is this disease increasing in …?
    • Does it occur with … frequency in my local community?
    • Is incidence associated with some suspected … …?
    • Has the outcome changed since … measures were instituted?
  • Differences between groups at a … in time / Differences between groups … time
A
  • Research questions such as:
    • Is this disease increasing in prevalence?
    • Does it occur with undue frequency in my local community?
    • Is incidence associated with some suspected risk factor?
    • Has the outcome changed since control measures were instituted?
  • Differences between groups at a point in time / Differences between groups over time
2
Q

How do we compare the health of groups?

  • Cross-sectional study -> one group surveyed to test associations between … and …/s
  • Ecological study -> community/population observed to test associations between … and …/s
    • Both studies focus on simultaneous observation of … and …, but difference is unit of observation
    • Cross-sectional study – focus on … level data
    • Ecological study – focus on …-level data
A
  • Cross-sectional study -> one group surveyed to test associations between exposures and outcome/s
  • Ecological study -> community/population observed to test associations between exposures and outcome/s
    • Both studies focus on simultaneous observation of exposure and outcome, but difference is unit of observation
    • Cross-sectional study – focus on individual level data
    • Ecological study – focus on population-level data
3
Q

How do we compare the health of groups? (2)

  • Cohort study -> …-… participants followed up to see if they develop a … (condition/outcome) of interest
    • Usually with groups who differ at outset on some …/s of interest
  • Case–control study -> groups who differ at outset on … (condition/outcome) …
    • Look back at …/s of interest
  • Randomised controlled trial (RCT) -> groups who are randomly allocated to receive …/s versus …/s
    • Test safety and efficacy/effectiveness of interventions
A
  • Cohort study -> disease-free participants followed up to see if they develop a disease (condition/outcome) of interest
    • Usually with groups who differ at outset on some exposure/s of interest
  • Case–control study -> groups who differ at outset on disease (condition/outcome) status
    • Look back at exposure/s of interest
  • Randomised controlled trial (RCT) -> groups who are randomly allocated to receive intervention/s versus comparator/s
    • Test safety and efficacy/effectiveness of interventions
4
Q

Case-control studies

  • Case–control study -> groups who … at … on disease (condition/outcome) …
    • Two groups of participants are selected – one … (cases) and one … (controls)
    • Controls selected to be as … as possible to the cases (e.g. age, gender, occupation, stage of illness, etc.)
      • Variables not of interest are matched (i.e. potential …) at selection
      • Are Exposures of interest measured or matched at selection?
  • Always … -> Past exposure/s in both groups E.g. interview/survey, historical records
A
  • Case–control study -> groups who differ at outset on disease (condition/outcome) status
    • Two groups of participants are selected – one with condition (cases) and one without (controls)
    • Controls selected to be as similar as possible to the cases (e.g. age, gender, occupation, stage of illness, etc.)
      • Variables not of interest are matched (i.e. potential confounders) at selection
      • Exposures of interest are not measured or matched at selection
  • Always retrospective -> Past exposure/s in both groups E.g. interview/survey, historical records
5
Q

What are case-control studies?

A
  • groups who differ at outset on disease (condition/outcome) status
    • Look back at exposure/s of interest
6
Q

What type of study?

A

Case Control Studies

7
Q

Case-control studies

  • We cannot calculate … using case-control data
    • Because … = probability of … the outcome of interest
  • In case-control studies:
    • we have … the outcome
    • i.e. we have selected participants into the case group or control group
    • we have decided the … of the groups
  • Therefore, we cannot calculate … …
  • Instead we calculate the odds of cases and controls in terms of their … …
    • We calculate an … ratio (OR) which is very similar to the … … (RR)
A
  • We cannot calculate risk using case-control data
    • Because risk = probability of developing the outcome of interest
  • In case-control studies:
    • we have determined the outcome
    • i.e. we have selected participants into the case group or control group
    • we have decided the size of the groups
  • Therefore, we cannot calculate relative risk
  • Instead we calculate the odds of cases and controls in terms of their past exposures
    • We calculate an odds ratio (OR) which is very similar to the relative risk (RR)
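As a minimal sketch of the odds ratio described above: the odds of past exposure among cases are divided by the odds of past exposure among controls. The function name and the counts are hypothetical, invented for illustration only; they do not come from any study.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


# Hypothetical counts (assumption): 100 cases and 100 controls.
example_or = odds_ratio(
    exposed_cases=60, unexposed_cases=40,
    exposed_controls=40, unexposed_controls=60,
)
print(round(example_or, 2))  # 2.25
```

Because the researcher fixes how many cases and controls are recruited, only the exposure split within each group carries information, which is why the calculation works on odds of exposure rather than risk of disease.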
8
Q

Case-control studies - Example

  • Case group - developed ovarian cancer
  • Control group - … to case group - no ovarian cancer
  • This example study shows that for African-American women, development of ovarian cancer was associated with:
    • … odds of having a BMI of 25 or over 1 year prior to diagnosis
    • … odds of having completed post-high school education
A
  • Case group - developed ovarian cancer
  • Control group - similar to case group - no ovarian cancer
  • This example study shows that for African-American women, development of ovarian cancer was associated with:
    • Greater odds of having a BMI of 25 or over 1 year prior to diagnosis
    • Reduced odds of having completed post-high school education
9
Q

Strengths of the Case Control Studies

  • Can offer some evidence of … – … relationship i.e. association between … and …
  • Can identify multiple exposures (both … and … associations)
  • Good when disease/outcome is …
  • Minimises selection and information …
  • R.. - cheaper and typically shorter in duration
A
  • Can offer some evidence of cause–effect relationship i.e. association between exposure and outcome
  • Can identify multiple exposures (both positive and negative associations)
  • Good when disease/outcome is rare
  • Minimises selection and information bias
  • Retrospective - cheaper and typically shorter in duration
10
Q

Weaknesses of the Case Control Studies

  • Cannot calculate … or …
  • Less suitable for … exposures
  • Can be hard to ensure exposure occurred … onset
  • Retrospective data availability and quality may be …
  • Suitable … group may be difficult to find
  • Vulnerable to …
A
  • Cannot calculate prevalence or incidence
  • Less suitable for rare exposures
  • Can be hard to ensure exposure occurred before onset
  • Retrospective data availability and quality may be poor
  • Suitable control group may be difficult to find
  • Vulnerable to confounding
11
Q

The Randomised Controlled Trial

  • a study in which participants are allocated randomly between an … (e.g. treatment) and a … … (e.g. no treatment or standard treatment)
A
  • a study in which participants are allocated randomly between an intervention (e.g. treatment) and a control group (e.g. no treatment or standard treatment)
12
Q

Why are RCTs conducted?

  • Safety:
    • Ascertain the safe … of a new drug.
    • Demonstrate safety and t… of a new …
    • Monitor … events profile of a new drug (against an existing drug or placebo)
  • Efficacy/Effectiveness
    • Demonstrate efficacy of new drug – does it …?
    • Show that treatment T is … or … to treatment X
    • Demonstrate effectiveness, and …-effectiveness, of A vs. B
A
  • Safety:
    • Ascertain the safe dose of a new drug.
    • Demonstrate safety and tolerability of a new compound
    • Monitor adverse events profile of a new drug (against an existing drug or placebo)
  • Efficacy/Effectiveness
    • Demonstrate efficacy of new drug – does it work?
    • Show that treatment T is superior or equivalent to treatment X
    • Demonstrate effectiveness, and cost-effectiveness, of A vs. B
13
Q

RCTs as an experiment

  • RCTs are also a special type of experiment in which randomisation is used
  • Randomisation means that potential … variables should be … distributed between groups
  • This creates two situations which are identical but:
    • One situation in which the supposed cause (intervention of interest) is …
    • One situation in which the supposed cause is …
  • RCTs can reduce … and allow identification of exposures which are … related to disease of interest
    • i.e. identification of interventions which cause reduction in disease likelihood or severity
A
  • RCTs are also a special type of experiment in which randomisation is used
  • Randomisation means that potential confounding variables should be equally distributed between groups
  • This creates two situations which are identical but:
    • One situation in which the supposed cause (intervention of interest) is present
    • One situation in which the supposed cause is absent
  • RCTs can reduce confounding and allow identification of exposures which are causally related to disease of interest
    • i.e. identification of interventions which cause reduction in disease likelihood or severity
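The balancing effect of randomisation can be sketched with a small simulation. This assumes simple (unrestricted) randomisation into two equal arms; the participants, their ages and the function name are all hypothetical, with age standing in for a potential confounder.

```python
import random
from statistics import mean


def randomise(participants, seed=None):
    """Simple randomisation: shuffle, then split into two equal-sized arms."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participants (assumption): (id, age) pairs, ages 40 to 80.
age_rng = random.Random(0)
participants = [(i, age_rng.randint(40, 80)) for i in range(1000)]

intervention, control = randomise(participants, seed=1)
mean_age_intervention = mean(age for _, age in intervention)
mean_age_control = mean(age for _, age in control)

# With a large sample, the two arms should have very similar mean ages.
print(round(mean_age_intervention, 1), round(mean_age_control, 1))
```

Nothing in the allocation step looks at age, yet the arms end up with similar age profiles; the same holds on average for confounders that were never measured.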
14
Q

Strengths of the RCT

  • Establish the s… and e…/ e… of new interventions
  • Minimise selection and information …
  • Best single-study evidence for … association between exposure (intervention) and outcome
A
  • Establish the safety and efficacy/ effectiveness of new interventions
  • Minimise selection and information bias
  • Best single-study evidence for causal association between exposure (intervention) and outcome
15
Q

Weaknesses of the RCT

  • …-consuming, difficult and e…
  • Not immune to …
  • Issues with participant …-…
  • Can lack g…
A
  • Time-consuming, difficult and expensive
  • Not immune to bias
  • Issues with participant drop-out
  • Can lack generalisability
16
Q

What data can we collect?

  • Data properties to consider:
  • Categorical (D…)
    • B.. variables- e.g. Are you a parent: yes/no; diagnosis of endometriosis (yes/no)
    • … categories but no order (unordered categorical) – e.g. Sexual orientation; Smoking (never, former, current);
    • Several ordered categories (ordinal) – e.g. Socio-economic status; categorised age; responses on Likert scale
  • Continuous variables (S…)
    • E.g. age; no. of symptoms; score on a questionnaire such as quality of life or satisfaction
A
  • Data properties to consider:
  • Categorical (discrete)
    • Binary variables- e.g. Are you a parent: yes/no; diagnosis of endometriosis (yes/no)
    • Several categories but no order (unordered categorical) – e.g. Sexual orientation; Smoking (never, former, current);
    • Several ordered categories (ordinal) – e.g. Socio-economic status; categorised age; responses on Likert scale
  • Continuous variables (scale)
    • E.g. age; no. of symptoms; score on a questionnaire such as quality of life or satisfaction
17
Q

Using data to compare health outcomes of groups

  • Does one group report higher scores (on a … scale) than another?
    • E.g. is condition or intervention X associated with reduced symptoms compared to condition/intervention Y?
    • …-test or ANOVA
  • Does one group have a higher proportion of an outcome than another (c…)?
    • E.g. is condition/intervention X associated with increased frequency of diagnosis P?
    • … … test
A
  • Does one group report higher scores (on a continuous scale) than another?
    • E.g. is condition or intervention X associated with reduced symptoms compared to condition/intervention Y?
    • T-test or ANOVA
  • Does one group have a higher proportion of an outcome than another (categories)?
    • E.g. is condition/intervention X associated with increased frequency of diagnosis P?
    • Chi squared test
18
Q

Error and Power

  • Protection against Type I error = threshold for determining when effects are …
    • Significance level for p value is typically accepted at …% (or …) i.e. …% chance of type I error
  • Protection against Type II error = power of the study to … when … effects are present
    • Statistical power is typically accepted at … – …% (or ..-..) i.e. …% or …% chance of type II error
    • The larger the sample, the … the statistical power
A
  • Protection against Type I error = threshold for determining when effects are significant
    • Significance level for p value is typically accepted at 5% (or 0.05) i.e. 5% chance of type I error
  • Protection against Type II error = power of the study to detect when significant effects are present
    • Statistical power is typically accepted at 80 – 90% (or 0.8 – 0.9) i.e. 20% or 10% chance of type II error
    • The larger the sample, the larger the statistical power
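The link between sample size and power can be sketched with a normal approximation for comparing two group means. This is a rough illustration, not a full power analysis: the function name is hypothetical, `effect_size` is assumed to be a standardised mean difference, and the negligible opposite tail of the two-sided test is ignored.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # 1.96 when alpha = 0.05
    return z.cdf(effect_size * sqrt(n_per_group / 2) - z_crit)


# Hypothetical medium effect (0.5 standard deviations) at several sample sizes.
for n in (20, 64, 128):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, about 64 participants per group gives roughly 80% power for a difference of half a standard deviation, and power keeps rising as the groups grow.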
19
Q

Two aspects of assessing certainty of estimates - P values and Confidence Intervals

  • P values = p…
    • When you compare groups using a statistical test (t test, χ2), the result has a p value.
    • The p value is the … that the difference observed (or one more extreme) could have occurred by … if the groups compared were really …
      • E.g. P 0.05 = 1/20, P 0.01 = 1/100
  • P value = type … … protection
  • Larger sample = … p value
A
  • P values = Probability
    • When you compare groups using a statistical test (t test, χ2), the result has a p value.
    • The p value is the probability that the difference observed (or one more extreme) could have occurred by chance if the groups compared were really alike.
      • E.g. P 0.05 = 1/20, P 0.01 = 1/100
  • P value = type I error protection
  • Larger sample = smaller p value
20
Q

Two aspects of assessing certainty of estimates - P values and Confidence Intervals

  • Confidence Intervals = P…
    • The confidence interval describes the range of values with a given … (e.g. 95%) that the true value of a variable is … within that …
A
  • Confidence Intervals = Precision
    • The confidence interval describes the range of values with a given probability (e.g. 95%) that the true value of a variable is contained within that range.
21
Q

T-tests – what’s being tested?

  • The (…) hypothesis: There is no difference in quality of life between women who have epithelial ovarian cancer (EOC) versus an ovarian germ cell tumour (OGCT)
    • The mean quality of life in the EOC group is the same as the mean quality of life in the OGCT group
  • The (…) hypothesis: There is a difference in mean quality of life between the groups (greater in EOC group)
  • The between-group difference needs to be … if there is more within-group variability
    • e.g. if women vary greatly in quality of life within each group, we would need to see a bigger between-group difference to be sure it is a between-group difference
  • Test output: …-statistic and accompanying …-value
A
  • The (null) hypothesis: There is no difference in quality of life between women who have epithelial ovarian cancer (EOC) versus an ovarian germ cell tumour (OGCT)
    • The mean quality of life in the EOC group is the same as the mean quality of life in the OGCT group
  • The (alternative) hypothesis: There is a difference in mean quality of life between the groups (greater in EOC group)
  • The between-group difference needs to be greater if there is more within-group variability
    • e.g. if women vary greatly in quality of life within each group, we would need to see a bigger between-group difference to be sure it is a between-group difference
  • Test output: t-statistic and accompanying p-value
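The effect of within-group variability on the t-statistic can be sketched as follows. This uses Welch's unequal-variance form of the t-statistic, which is a choice made here and not something the card specifies; the quality-of-life scores are hypothetical. The p-value is not computed, since that needs the t distribution (available in a statistics package).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Difference in means divided by its standard error (Welch form)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical scores (assumption). Both pairs differ by 5 points on average.
tight_a, tight_b = [70, 71, 72, 73, 74], [65, 66, 67, 68, 69]
spread_a, spread_b = [52, 62, 72, 82, 92], [47, 57, 67, 77, 87]

print(round(welch_t(tight_a, tight_b), 2))    # 5.0
print(round(welch_t(spread_a, spread_b), 2))  # 0.5
```

The between-group difference is identical in both comparisons, but the t-statistic is ten times smaller when scores vary widely within each group.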
22
Q

T-test – Hypothesis: Quality of life is higher for women with epithelial ovarian cancer (EOC) versus an ovarian germ cell tumour (OGCT)

  • Conclusion?
  • (look at P value)
A
  • Conclusion: There is not enough evidence to suggest quality of life is higher for women with epithelial ovarian cancer (EOC) versus an ovarian germ cell tumour (OGCT) (P=0.10), so we cannot reject the null hypothesis or accept the alternative
23
Q

Chi-square tests – what’s being tested?

  • The (null) hypothesis: There is … … between endometriosis and ovarian cancer.
    • The proportion of women with endometriosis who have ovarian cancer is the … as the proportion of women who do not have endometriosis and have ovarian cancer.
  • The (alternative) hypothesis: There is … … between endometriosis and ovarian cancer.
    • The proportion of women with endometriosis who have ovarian cancer is … … … as the proportion of women who do not have endometriosis and have ovarian cancer.
  • Test output: …-… (χ2) statistic and accompanying ….-value
A
  • The (null) hypothesis: There is no association between endometriosis and ovarian cancer.
    • The proportion of women with endometriosis who have ovarian cancer is the same as the proportion of women who do not have endometriosis and have ovarian cancer.
  • The (alternative) hypothesis: There is an association between endometriosis and ovarian cancer.
    • The proportion of women with endometriosis who have ovarian cancer is not the same as the proportion of women who do not have endometriosis and have ovarian cancer.
  • Test output: chi-squared (χ2) statistic and accompanying p-value
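A minimal sketch of the chi-squared calculation for a 2x2 table, assuming no continuity correction. The function name and counts are hypothetical and are not real endometriosis data. With one degree of freedom the p-value can be obtained from the normal distribution, which is what the `erfc` line does.

```python
from math import erfc, sqrt


def chi_squared_2x2(table):
    """Return (chi-squared statistic, p-value) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if there were no association.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(chi2 / 2))  # valid for 1 degree of freedom only
    return chi2, p_value


# Hypothetical counts (assumption). Rows: endometriosis yes / no.
# Columns: ovarian cancer yes / no.
table = [[30, 70], [10, 90]]
chi2, p_value = chi_squared_2x2(table)
print(round(chi2, 2), p_value < 0.001)  # 12.5 True
```

Here 30% versus 10% of each row has the outcome, so the observed counts sit far from the counts expected under the null hypothesis and the statistic is large.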
24
Q

Chi-Square- two groups with categorical outcome - Hypothesis - Endometriosis increases likelihood of having Ovarian cancer

  • P value indicates what?
  • Reject or accept null hypothesis? (There is no association between endometriosis and ovarian cancer.)
A
  • Strong evidence to reject the null hypothesis of no association between endometriosis and ovarian cancer
    • P value is well below the significance threshold, so it appears that the data we have did not come about by chance
25
Q

P-Value and Statistical Significance

  • A p-value less than … (typically ≤ …) is statistically significant.
  • It indicates strong evidence against the … hypothesis, as there is less than a …% probability of seeing results this extreme if the null were correct (i.e. if the results were …)
  • Therefore, we reject the … hypothesis, and accept the … hypothesis
A
  • A p-value less than 0.05 (typically ≤ 0.05) is statistically significant.
  • It indicates strong evidence against the null hypothesis, as there is less than a 5% probability of seeing results this extreme if the null were correct (i.e. if the results were random).
  • Therefore, we reject the null hypothesis, and accept the alternative hypothesis
26
Q

Confidence intervals

  • Confidence intervals (CIs) = range of values that should contain the … population parameter
  • Use current sample to estimate shape of actual distribution of variable in population – and imagine how much observed sample values would vary if kept running the study
  • Create an … and … bound a certain distance above and below the mean, e.g. 95% - these upper and lower bounds converted back to real scores
  • 95% CIs would contain the true population value …% of the time and fail to contain the true value …% of the time
  • CIs become … as the sample size increases – the larger the sample, the more likely it is that scores will cluster narrowly around the true population mean
A
  • Confidence intervals (CIs) = range of values that should contain the true population parameter
  • Use current sample to estimate shape of actual distribution of variable in population – and imagine how much observed sample values would vary if kept running the study
  • Create an upper and lower bound a certain distance above and below the mean, e.g. 95% - these upper and lower bounds converted back to real scores
  • 95% CIs would contain the true population value 95% of the time and fail to contain the true value 5% of the time
  • CIs become narrower as the sample size increases – the larger the sample, the more likely it is that scores will cluster narrowly around the true population mean
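The narrowing of the interval with sample size can be sketched for a sample mean. This assumes a normal approximation (mean plus or minus about 1.96 standard errors for 95%); the function name and the summary figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(sample_mean, sample_sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical quality-of-life summary (assumption): mean 70, SD 10.
for n in (25, 100, 400):
    lower, upper = mean_ci(70, 10, n)
    print(n, round(lower, 2), round(upper, 2))
```

Each fourfold increase in sample size halves the width of the interval, because the standard error shrinks with the square root of n.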
27
Q

Confidence intervals example

  • E.g. Odds of exposures for African-American women with ovarian cancer
    • OR < 1 = … association
    • OR > 1 = … factor
    • OR = 1, … association
  • Precision of example estimates fairly good, i.e. the 95% CIs are not very wide
  • But, for most exposures, do contain 1 (i.e. true population could be OR =1, … association)
  • So can’t be very sure or confident about making clinical recommendations/ predictions in light of these data
A
  • E.g. Odds of exposures for African-American women with ovarian cancer
    • OR < 1 = protective association
    • OR > 1 = risk factor
    • OR = 1, no association
  • Precision of example estimates fairly good, i.e. the 95% CIs are not very wide
  • But, for most exposures, do contain 1 (i.e. true population could be OR =1, no association)
  • So can’t be very sure or confident about making clinical recommendations/ predictions in light of these data
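Whether a 95% CI for an odds ratio contains 1 can be checked with a short sketch. This assumes the common log-based (Woolf) approximation for the interval; the function name and counts are hypothetical and are not taken from the example study.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and approximate 95% CI from a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_estimate = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(or_estimate) - z * se_log_or)
    upper = exp(log(or_estimate) + z * se_log_or)
    return or_estimate, lower, upper


# Same hypothetical OR of 2.25 at two sample sizes (assumption).
large = odds_ratio_ci(60, 40, 40, 60)
small = odds_ratio_ci(6, 4, 4, 6)
for or_estimate, lower, upper in (large, small):
    print(round(or_estimate, 2), round(lower, 2), round(upper, 2), lower <= 1 <= upper)
```

Both tables give the same point estimate, but only the larger sample yields an interval that excludes 1; in the smaller sample the data remain compatible with no association.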
28
Q

Summary - Collecting Data about People and Comparing Health of Groups

  • To measure existing health needs of populations at one time = … and over time = …
  • To compare the health needs/outcomes of different groups = measures of risk, differences in frequency and/or means
  • Comparing groups can allow us to identify causal risk factors (…) relevant to the disease or condition of interest
    • … studies (e.g. case control) – describe population, identify potential causal exposure factors
    • … studies (e.g. RCT) – test safety and efficacy/effectiveness (and best evidence of cause-effect)
  • Selecting appropriate statistical test = consider nature of data
    • Continuous data e.g. quality of life = …-test or ANOVA
    • Categorical data e.g. disease presence or absence = …-… test
  • In collecting data, we need to consider certainty
    • How to design a study that is adequately powered
    • Provide estimates of probability (… value) and estimates of precision (… intervals)
A
  • To measure existing health needs of populations at one time = prevalence and over time = incidence
  • To compare the health needs/outcomes of different groups = measures of risk, differences in frequency and/or means
  • Comparing groups can allow us to identify causal risk factors (exposures) relevant to the disease or condition of interest
    • Observational studies (e.g. case control) – describe population, identify potential causal exposure factors
    • Experimental studies (e.g. RCT) – test safety and efficacy/effectiveness (and best evidence of cause-effect)
  • Selecting appropriate statistical test = consider nature of data
    • Continuous data e.g. quality of life = t-test or ANOVA
    • Categorical data e.g. disease presence or absence = chi-squared test
  • In collecting data, we need to consider certainty
    • How to design a study that is adequately powered
    • Provide estimates of probability (p value) and estimates of precision (confidence intervals)