Data science in medicine: the data Flashcards
What is a research question that could be researched in the field of genomics?
(How) can we identify the disease-causing mutation(s)?
What is the difference between a monogenic and complex disease?
- Monogenic disease → (usually rare) disease caused by variation in a single gene
- Complex disease → (usually common) disease that results from the contributions of multiple genomic variants and genes in conjunction with significant influences of physical and social environments.
How can the genetic causes of monogenic and complex diseases be identified?
- Monogenic → whole-exome sequencing (WES)
- Complex → Genome-wide association studies (GWAS)
How does whole-exome sequencing (WES) work?
Whole exome sequencing is a technique where all of the protein-coding regions of genes in a genome are sequenced.
- First the subset of DNA is selected that only encodes proteins (i.e. exons).
- After this, Illumina sequencing can be applied to determine the sequence of the patient’s exome.
- The patient’s exome can then be compared with reference DNA sequences.
- By identifying differences between the patient’s DNA and reference DNA, candidate mutations can be identified.
- By validating these candidate mutations, the disease-causing mutation can be found.
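The comparison step above can be sketched in a few lines. This is a toy illustration only: it assumes the patient and reference sequences are already aligned and of equal length, whereas real pipelines align short reads and call variants with dedicated tools. The function name and the example sequences are hypothetical.

```python
def find_candidate_variants(reference: str, patient: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, patient base) for every mismatch."""
    if len(reference) != len(patient):
        # Assumption of this sketch: sequences are pre-aligned, so no insertions/deletions.
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (pos, ref_base, pat_base)
        for pos, (ref_base, pat_base) in enumerate(zip(reference, patient), start=1)
        if ref_base != pat_base
    ]


# Hypothetical exon fragment; the patient differs at one position.
reference_exon = "ATGGCCTTGACC"
patient_exon = "ATGGCCTAGACC"

candidates = find_candidate_variants(reference_exon, patient_exon)
print(candidates)  # [(8, 'T', 'A')]
```

Each reported difference is only a candidate; most differences from the reference are harmless variation, which is why the filtering and validation steps are needed.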
How can we reduce the number of candidate genes when filtering exome sequencing results?
By including more filters:
- list of differences: patients compared to reference
- list of mutations: shared between patients
- list of synonymous mutations: excluded, since they do not change the protein sequence
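A minimal sketch of how such filters stack, assuming each patient's differences from the reference are already available as a set and each variant carries an effect annotation. The `Variant` type, the gene names and the effect labels are hypothetical, and sharing is checked at the level of the exact variant here (in practice it is often checked per gene).

```python
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str
    position: int
    alt: str
    effect: str  # hypothetical annotation, e.g. "synonymous" or "missense"


def filter_candidates(per_patient: list[set[Variant]]) -> set[Variant]:
    """Keep variants shared by all patients, then drop synonymous ones."""
    if not per_patient:
        return set()
    shared = set.intersection(*per_patient)  # filter: shared between patients
    return {v for v in shared if v.effect != "synonymous"}  # filter: protein-altering only


# Hypothetical variant lists (differences to the reference) for two patients.
v1 = Variant("GENE_A", 101, "T", "missense")
v2 = Variant("GENE_B", 202, "G", "synonymous")
v3 = Variant("GENE_C", 303, "A", "missense")

patient_1 = {v1, v2, v3}
patient_2 = {v1, v2}

remaining = filter_candidates([patient_1, patient_2])
print(remaining)  # only the GENE_A variant is left
```

Every added filter shrinks the candidate list, but it also raises the risk of discarding the causal variant.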
How can a candidate gene be identified as the causal gene (i.e. validated as the causal gene) with the use of exome sequencing?
- If the mutation is present in new patient samples.
- If it is a de novo mutation, i.e. absent in the DNA of both parents.
- If the mutation is absent in control samples.
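The three checks above can be written down as simple set lookups. This is a sketch under the assumption that every sample has been reduced to a set of variant identifiers; the identifier format and all sample data are hypothetical.

```python
def validation_checks(
    variant: str,
    new_patients: list[set[str]],
    mother: set[str],
    father: set[str],
    controls: list[set[str]],
) -> dict[str, bool]:
    """Evaluate the three lines of evidence for one candidate variant.

    The variant is assumed to be present in the index patient already.
    """
    return {
        "seen_in_new_patients": any(variant in sample for sample in new_patients),
        "de_novo": variant not in mother and variant not in father,
        "absent_in_controls": all(variant not in sample for sample in controls),
    }


# Hypothetical data: the candidate is found in one new patient,
# in neither parent, and in none of the controls.
candidate = "chr1:100:T>A"
result = validation_checks(
    candidate,
    new_patients=[{"chr1:100:T>A"}, {"chr5:50:C>T"}],
    mother={"chr2:200:G>C"},
    father=set(),
    controls=[{"chr3:5:A>G"}, set()],
)
print(result)
```

The more checks a candidate passes, the stronger the case for causality; a single check on its own is weak evidence.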
What are limitations of exome sequencing?
- Technical failure: no capture probe, lack of sequence coverage, misalignment
- Filtering may accidentally remove the causal variant
- Genetic heterogeneity makes finding the causal variant more difficult
What is whole-genome sequencing?
Instead of sequencing only the exome (about 1% of the human genome), the whole genome is sequenced.
What are advantages and disadvantages of whole-genome sequencing?
Advantages:
- More sensitive: non-coding mutations, which can also be causal, are detected as well
- Can also be used to detect other chromosomal aberrations (translocations and large deletions/insertions)
Disadvantages:
- Higher costs
- Less power → more samples needed
What three steps are needed before tests such as whole genome sequencing can be implemented inside the clinic?
- Test development and optimization
- Test validation
- Quality management
What is a genetic association study?
A study that is used to find candidate genes or genome regions that contribute to a specific disease by testing for a correlation between disease status and genetic variation.
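One common way to test for such a correlation is a chi-square test on a 2x2 table of allele counts in cases versus controls. The sketch below uses that test as an example; it is one possible choice, not the only one, and the counts are made up for illustration.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       control_alt: int, control_ref: int) -> tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table."""
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: the alternative allele is more frequent in cases than in controls.
chi2, p_value = allelic_chi_square(case_alt=60, case_ref=40, control_alt=40, control_ref=60)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")  # chi2 = 8.00, p = 0.0047
```

A small p-value indicates that allele frequency and disease status are correlated; it does not by itself show that the variant is causal.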
What is a single nucleotide polymorphism (SNP)?
A point mutation that has persisted in the population.
There are two hypotheses regarding diseases and variants of complex diseases:
1. Many genetic variants underlying complex diseases are common.
2. Genetic contributions to complex diseases arise from many variants, all of which are rare.
Under which hypothesis is an association study more likely to detect the genetic variants?
Many genetic variants underlying complex diseases are common.
What is a genome-wide association study (GWAS)?
A type of observational study of a genome-wide set of genetic variants in different individuals to see if any variant is associated with a trait. GWAS typically focus on associations between SNPs and traits like major human diseases.
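Because a GWAS tests very many variants at once, the significance threshold has to be corrected for multiple testing. The sketch below assumes a p-value per SNP has already been computed and applies a Bonferroni correction (alpha divided by the number of tests); the SNP names and p-values are hypothetical.

```python
def genome_wide_hits(p_values: dict[str, float], alpha: float = 0.05) -> list[str]:
    """Return the SNPs whose p-value passes a Bonferroni-corrected threshold."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)  # one test per SNP
    return sorted(snp for snp, p in p_values.items() if p < threshold)


# Hypothetical p-values for four SNPs; the corrected threshold is 0.05 / 4 = 0.0125.
snp_p_values = {
    "snp_1": 0.20,
    "snp_2": 0.001,
    "snp_3": 0.03,
    "snp_4": 0.012,
}

hits = genome_wide_hits(snp_p_values)
print(hits)  # ['snp_2', 'snp_4']
```

Note that `snp_3` would pass an uncorrected threshold of 0.05 but not the corrected one. With around a million tests, the same logic gives the conventional genome-wide threshold of 5e-8.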
What applications does GWAS have?
- Gaining insight into a phenotype’s underlying biology
- Estimating its heritability
- Making clinical risk predictions → polygenic risk score (PRS)
- Informing drug development programmes
- Inferring potential causal relationships between risk factors and health outcomes
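The polygenic risk score mentioned above is commonly computed as a weighted sum: for each SNP, the number of risk alleles a person carries (0, 1 or 2) multiplied by the effect size estimated in a GWAS. A minimal sketch, with hypothetical SNP names and weights, and with the assumption that a SNP missing from the genotype counts as 0 risk alleles:

```python
def polygenic_risk_score(genotype: dict[str, int], weights: dict[str, float]) -> float:
    """Sum of effect size times risk-allele count over all weighted SNPs."""
    # Assumption of this sketch: SNPs missing from the genotype contribute 0.
    return sum(weight * genotype.get(snp, 0) for snp, weight in weights.items())


# Hypothetical per-SNP effect sizes (e.g. log odds ratios from a GWAS).
effect_sizes = {"snp_a": 0.2, "snp_b": -0.1, "snp_c": 0.5}

# Hypothetical individual: number of risk alleles carried at each SNP.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(person, effect_sizes)
print(round(score, 3))  # 0.3
```

The score is only meaningful relative to the distribution of scores in a reference population.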