Investigating the Genome Flashcards
Definition of genome annotation
Identifying the locations of genes and all coding regions and determining what they do
Definition of actionable genes
Genes in which small variants have reported therapeutic, prognostic or clinical trial associations
Why do we use whole genome sequencing
It is the only sequencing technique that can detect all of
- large scale structural changes
- balanced translocations
- distant consanguinity
- uniparental disomy
- novel/known coding/non-coding variants
What are the 4 benefits of whole genome sequencing
Whole genome is complete
The individual’s genome doesn’t change
Potential to collect once, store, refer to again for clinical use
Only need to analyse each time for a specific question
Describe the key properties of Next Gen Sequencing
- cost
- how many fragments are sequenced and how
- fragment length?
- how many times is each position sequenced on average
Allows for cheap whole genome sequencing
Billions of random fragments sequenced in parallel
Fragment length = 150 bases x2
Each position is sequenced 30x on average
2 copies of genome in each cell, each sequenced 15x on average
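The coverage figures above are simple arithmetic. A minimal sketch in Python, assuming a genome size of roughly 3.1 billion bases (that figure and all function names are illustrative assumptions, not taken from these cards):

```python
# Rough coverage arithmetic for paired-end short-read sequencing.
GENOME_SIZE = 3_100_000_000  # assumed human genome size in bases (not from the cards)
READ_LENGTH = 150            # bases per read
READS_PER_FRAGMENT = 2       # each fragment is read from both ends ("x2")


def mean_coverage(n_fragments, genome_size=GENOME_SIZE):
    """Average number of times each position is sequenced."""
    total_bases = n_fragments * READ_LENGTH * READS_PER_FRAGMENT
    return total_bases / genome_size


def per_copy_coverage(coverage, copies=2):
    """Average coverage of each of the two genome copies in a cell."""
    return coverage / copies


print(mean_coverage(310_000_000))
print(per_copy_coverage(30))
```

Under these assumptions, 30x total coverage splits evenly into 15x for each of the two genome copies.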
Why is it important to sequence each position several times
A real variant shows up in many independent reads, so it is less likely to be a sequencing mistake
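One way to see why repeated sequencing helps is to ask how likely it is that errors alone produce several mismatching reads at one position. A minimal sketch, assuming a per-base error rate of 1% and independent errors (both are illustrative assumptions, and `prob_at_least` is a hypothetical helper name):

```python
from math import comb


def prob_at_least(k, n, error_rate):
    """Binomial probability that at least k of n reads carry an error at one position."""
    return sum(
        comb(n, i) * error_rate**i * (1 - error_rate) ** (n - i)
        for i in range(k, n + 1)
    )


# Chance that errors alone explain k or more mismatching reads out of 30.
for k in (1, 3, 5, 10):
    print(k, prob_at_least(k, 30, 0.01))
```

The probability falls steeply as k grows, which is why a variant supported by many reads is unlikely to be an artefact.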
What are the 2 main limitations of current technology
Short reads of NGS make characterisation of large variants hard
-most genomes sequenced with NGS => knowledge of normal structural variants is limited
Accuracy lower than older more costly sequencing tech
- variants detected by NGS, verified with Sanger
- Involves use of primers to target variant
What are the 2 sources of info about variants
Functional annotation of the reference genome
Occurrence between affected and unaffected individuals
What is genome annotation
Identifying the locations of genes and coding regions and determining what they do
How would you identify causal variants in rare disease
Filter out commonly observed variants
Look for variants identified as pathogenic
Look for variants in genes linked to condition
Look for variants that affect functional elements
Look for variants that are normally conserved
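The steps above can be sketched as a small filter-then-rank routine. This is only an illustration: the `Variant` fields, the 1% frequency cut-off default, and the idea of ranking by the number of supporting criteria are assumptions made for the example, not a described clinical pipeline.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All fields are hypothetical annotations for illustration.
    name: str
    gene: str
    population_af: float              # allele frequency in a reference population
    reported_pathogenic: bool         # previously identified as pathogenic
    affects_functional_element: bool  # coding sequence, splicing, regulatory element
    conserved_position: bool          # position is normally conserved


def evidence_score(v, condition_genes):
    """Count how many of the 'look for' criteria the variant meets."""
    return sum([
        v.reported_pathogenic,
        v.gene in condition_genes,
        v.affects_functional_element,
        v.conserved_position,
    ])


def prioritise(variants, condition_genes, max_af=0.01):
    """Drop common variants, then rank the rest by supporting evidence."""
    rare = [v for v in variants if v.population_af < max_af]
    supported = [v for v in rare if evidence_score(v, condition_genes) > 0]
    return sorted(
        supported,
        key=lambda v: evidence_score(v, condition_genes),
        reverse=True,
    )
```

Calling `prioritise(variants, {"GENE1"})` with a list of `Variant` records returns the rare variants that meet at least one criterion, strongest candidates first.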
How would you filter out commonly observed variants
MAF (minor allele frequency) >= 1%
Use data from gnomAD (has data on exomes and whole genomes)
What do you need to look out for when looking for variants identified as pathogenic
Frequency of variant occurrence only recently surveyed
=> many false +ve pathogenic variants
Where can you look to find variants that affect functional elements
Protein coding sequence
Splicing
Regulatory element
What are the 2 uses of WGS in
- research
- clinical diagnostics
Research
rare disease discovery
-sequences of groups of affected individuals
-look for genes sharing variants
Clinical diagnostics
rare disease diagnostics
-sequences of affected + other family members (affected/unaffected)
What are the current limited uses of WGS in clinical care
Limited to
- monogenic disease
- patients with a clear phenotype
- patients who are ill
Reporting limited to
-variants in protein coding sequences, easier to predict mutation effect
What can’t WGS be applied to and why
Diagnosis of complex disease
Prediction of risk
Patients with an unexplained condition
Due to limits in understanding
What is the function of the 100000 genomes project
A treatment project
All clinical whole genome sequencing
-rare disease (proband and parents or affected sibling)
-cancer (normal/tumour)
What are the eligibility criteria for taking part in the 100000 genomes project
Based on unmet clinical need
Phenotype is clear and listed
Likelihood that WGS will aid diagnosis
Patient has cancer/rare disease
What are the 4 benefits of the 100000 genomes project
Improve health of NHS patients
Stimulate wealth generation
Create legacy of infrastructure, human capacity, capability
Enable large scale genomics research
Describe the structure of Genomics England and the TGP
Genomics England contracts ext company to sequence DNA
NHS Genomic Medicine Center
- provides consent based identification and clinical care of patients
- sends DNA samples to ext sequencing company
Data center
- receives sequences and variants from ext company
- also receives phenotypes and presentations of disease inputted by clinicians
Data interpretation of sequences
-Ext companies must annotate sequences and send a clinical report back to NHS Genomic Medicine Center
How do external researchers get involved in the genomes project
Partnership between researchers and NHS => accelerate development of diagnostics and therapies
All data generated contributes to Genomics England Dataset
What are the 3 types of finding from the GP
Main findings
-participants receive results on main condition
Additional findings
-Can opt in to get feedback on known genetic alterations of high clinical significance
Carrier status
-Can opt in to find out carrier status for some genetic diseases
What are the possible outcomes from the GP
Identify
- spontaneous mutation not found in parents
- underlying mutations => many health problems
- genetic variants within families
- newly recognised disease genes
- promoter mutations => affects disease gene
- potential drug reactions with found gene variants
Compare the diagnosis success rates of rare diseases
- current single gene tests
- WGS
Current single gene tests => 15-20% diagnosed
100000GP => + 30% diagnosed
Compare the diagnosis success rates of cancer
- current single gene tests
- WGS
Current single gene tests => no existing cancer genetics service
100000GP => 60% of repeats identify variants in actionable genes
What happens to those who didn’t get a diagnosis with 100000GP
Can have their data looked at by researchers to pick up variants not detected by the computers