Genomics and Health Flashcards
What are the traditional routes to gene discovery vs state of the art routes?
Traditional:
= determine mode of inheritance
= recombination mapping using markers
= haplotype analysis of recombinants
= rapid screening of candidate genes for mutations
= confirm by Sanger sequencing
State of the Art:
= whole exome next generation sequencing
= lots of polymorphisms
= filter polymorphisms to give list of candidate genes
= confirm by Sanger sequencing
What is next generation sequencing?
EXTRA READING
= includes variety of methods
= use massively parallel processing to sequence millions of DNA fragments in single run (simultaneous)
= Library preparation
= Sequencing
= Base calling
= Data processing
= Variant calling
= Data Analysis
advantages
= speed, cost-effectiveness, high-throughput, accuracy
What is exon capture?
EXTRA READING
= sequencing specific regions of genome such as exons
= selectively isolates and enriches DNA fragments of interest from a sample (= better depth and accuracy)
= DNA fragmentation
= Adapter ligation
= Hybridisation
= Capture
= Amplification
= Sequencing
What are the filtering criteria for disease?
= mutation likely to cause a change in gene expression or protein structure
(e.g. nonsense, strong missense, splice site changes, frameshifts)
= mutation not commonly found in SNP databases or control genome sequences
= alleles in affected individuals correspond to mode of inheritance
(e.g. two mutant alleles for recessive condition)
= same gene mutated in affected, unrelated individuals
= no unaffected individuals with putative disease-causing genotype
e.g. Direct identification of the UVSS-A causal gene by exome sequencing of subjects Kps3 and XP24KO
= UV sensitivity = skin discolouration
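EXTRA READING
A minimal sketch of how the filtering criteria above might be applied to exome data. Everything here is an illustrative assumption, not a real pipeline: the Variant fields, the gene names and the 1% frequency cut-off are hypothetical. The last criterion (no unaffected individuals with the genotype) is not covered.

```python
from dataclasses import dataclass

# Consequence types treated as likely damaging (from the card above).
DAMAGING = {"nonsense", "missense", "splice_site", "frameshift"}

@dataclass
class Variant:
    gene: str               # hypothetical gene label
    consequence: str        # e.g. "nonsense", "synonymous"
    population_freq: float  # frequency in SNP databases / control genomes
    mutant_alleles: int     # mutant alleles carried at this site (1 or 2)

def is_rare_and_damaging(v, max_freq=0.01):
    """Criteria 1 and 2: likely damaging and not commonly found in controls."""
    return v.consequence in DAMAGING and v.population_freq < max_freq

def candidate_genes(patients, mode="recessive", max_freq=0.01):
    """Criteria 3 and 4: genotype fits the mode of inheritance, and the
    same gene is hit in every affected, unrelated individual."""
    required = 2 if mode == "recessive" else 1
    per_patient = []
    for variants in patients:
        alleles = {}
        for v in variants:
            if is_rare_and_damaging(v, max_freq):
                # Summing per gene also counts compound heterozygotes.
                alleles[v.gene] = alleles.get(v.gene, 0) + v.mutant_alleles
        per_patient.append({g for g, n in alleles.items() if n >= required})
    return set.intersection(*per_patient) if per_patient else set()

# Made-up example data for two unrelated affected individuals.
patient_1 = [Variant("GENE_A", "nonsense", 0.0, 2),
             Variant("GENE_B", "frameshift", 0.0, 1)]
patient_2 = [Variant("GENE_A", "missense", 0.001, 1),
             Variant("GENE_A", "splice_site", 0.0, 1),
             Variant("GENE_C", "missense", 0.2, 2)]
shortlist = candidate_genes([patient_1, patient_2])
print(shortlist)
```

The shortlist would then go forward for confirmation by Sanger sequencing.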
What is an example of identification of de novo mutations?
= Cantú Syndrome
= dominant (lethal) inheritance
= congenital hypertrichosis (excess hair)
= distinctive facial features
= osteochondrodysplasia
= cardiac defects
= Harakalova et al = whole exome sequencing of an affected child + unaffected parents
= 5 candidate genes Sanger sequenced
= mutation confirmed in ABCC9 gene (ATP-dependent K+ transport)
= Arg to Gln = at position 1154
= mutations in ABCC9 confirmed in 13 out of 15 Cantú patients (absent from 5000 control exomes)
What are de novo mutations?
‘New’ mutations
= sequence affected child and unaffected parents
= average of 74 de novo point mutations (in whole genome)
= apply the filtering criteria
= 80% paternal, 2 additional de novo mutations for each year of paternal age
(sperm more susceptible to mutation)
= often discovered through whole exome sequencing
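EXTRA READING
A rough sketch of the trio comparison described above: an allele in the affected child that is absent from both unaffected parents is flagged as a candidate de novo mutation. The site names and genotypes are made up, and real callers also have to deal with sequencing errors and coverage, which this ignores.

```python
def find_de_novo(child, mother, father):
    """Return (site, new_alleles) for child alleles seen in neither parent.

    Each argument maps a site name to a two-letter genotype string.
    """
    calls = []
    for site, genotype in child.items():
        parental_alleles = set(mother[site]) | set(father[site])
        new_alleles = sorted(set(genotype) - parental_alleles)
        if new_alleles:
            calls.append((site, new_alleles))
    return calls

# Hypothetical genotypes at three sites.
child = {"site_1": "GA", "site_2": "CC", "site_3": "TT"}
mother = {"site_1": "GG", "site_2": "CT", "site_3": "TT"}
father = {"site_1": "GG", "site_2": "CC", "site_3": "TT"}

de_novo = find_de_novo(child, mother, father)
print(de_novo)
```

Candidates found this way would still go through the filtering criteria before being treated as disease-causing.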
What do sequencing panels of genes do?
e.g. exon capture or exon amplification for subset of genes
= all possible alleles can be identified in one test
= e.g. 450 gene panel for eye disorders - 8 tests
What is whole exome sequencing used for?
Not routine in diagnosis - when all else fails
Often simultaneous gene identification (research) + diagnosis
All mutations can be identified
(ethical questions surrounding incidental findings)
Only 85-95% coverage of protein-coding regions
(technical limitations)
Produces huge amounts of data
What is an example of a clinical whole exome sequencing (WES) trial study?
WES of 250 people with undiagnosed disorders
(@ Baylor College of Medicine)
= mostly neurological disorders
= positive diagnosis in 25% of cases
= 33 autosomal dominant
= 16 autosomal recessive
= 9 X-linked
BUT exome only
= mutations in regulatory regions may be important
30 patients had unrelated genetic variants
How is DNA sequencing used in Medicine?
Department of health
= 100,000 genomes project
= completed in 2018
= actionable findings for 20-25% rare disease patients
= approx. 50% of cancer cases contain potential for therapy / clinical trial
Uses
= cancer (cancer vs normal genome)
= rare inherited conditions (identify mutations)
= infectious diseases (pathogen genomes and patient genomes)
= new routes for diagnosis
= personalised medicine (targeted treatments)
BUT homozygous loss of function often appears to have no effect
What is SNP analysis?
SNP
= single nucleotide polymorphism
= single base pair that commonly (>1%) varies in human population
(as opposed to rare mutation)
= generally refer to one strand
= variants are alleles
= genotype may be homozygous or heterozygous
= about 40 million SNPs
Analysis
= focuses on approx. 1% single nucleotide differences between individuals
= can be very high throughput
= useful in association studies, genetic ancestry and diagnosis of chromosomal disorders
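EXTRA READING
A small sketch of the SNP definitions above: allele frequencies from a set of genotypes, homozygous vs heterozygous calls, and the >1% rule for calling a site a SNP rather than a rare mutation. The genotypes are invented and the function names are illustrative only.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Frequency of each allele across two-letter genotypes such as "AG"."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def zygosity(genotype):
    """Classify one genotype as homozygous or heterozygous."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

def is_common_snp(genotypes, threshold=0.01):
    """True if the second most frequent allele exceeds the threshold (>1%)."""
    freqs = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return len(freqs) > 1 and freqs[1] > threshold

# Hypothetical genotypes for four individuals at one site.
sample = ["AA", "AG", "GG", "AA"]
freqs = allele_frequencies(sample)
print(freqs, [zygosity(g) for g in sample], is_common_snp(sample))
```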
What are Haplotypes?
= arrangement of SNPs on a chromosome
SNPs within a block can stay associated for many generations
4 - 6 alternative haplotypes for each block
approx. 20 SNPs per block
humans are haplotype mosaics
EXTRA READING
= sets of closely linked genetic variants that tend to be inherited together
= blocks of DNA
= haplotype = combination of SNPs found in particular block of DNA
= can be used to track inheritance of specific genetic variants + identify regions of genome associated with certain traits
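A brief sketch of counting the alternative haplotypes in one block. It assumes phased data, i.e. each string is the run of SNP alleles along a single chromosome; the sequences and block boundaries are invented.

```python
from collections import Counter

def haplotype_counts(chromosomes, start, end):
    """Count each distinct combination of SNP alleles in block [start, end)."""
    return Counter(chrom[start:end] for chrom in chromosomes)

# Hypothetical SNP alleles along four chromosomes (one letter per SNP).
chromosomes = ["ACGT", "ACGA", "TCGT", "ACGT"]
block = haplotype_counts(chromosomes, 0, 3)
print(block)
```

Here the first three SNPs form the block, and only two alternative haplotypes are observed out of the many combinations that are possible in principle.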
How is Genome-Wide Association used?
SNPs array used to genotype individuals with and without the trait
Higher incidence of SNP allele in individuals with the trait vs those without the trait
= ASSOCIATION
Significance depends on degree of association and number of individuals tested
= very low P value used for significance to avoid false positives
Nearby (linked) candidate genes identified and tested
EXTRA READING
= examines relationship between variations in genome and traits or diseases
= studies involve:
selection of participants
Genotyping (SNP array or DNA sequencing)
Statistical analysis
Follow-up studies
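A hedged sketch of the association test for a single SNP, using a standard 2x2 chi-square test on allele counts (1 degree of freedom). The counts are made up. The 5e-8 threshold is the conventional genome-wide cut-off (0.05 corrected for about one million tests); it is not stated on the card and is included here as an assumption.

```python
import math

def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic and p-value for a 2x2 table of allele counts.

    Assumes no row or column total is zero.
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after correcting for n_tests tests."""
    return alpha / n_tests

# Hypothetical allele counts: 30 of 100 case alleles vs 10 of 100 control alleles.
chi2, p_value = allelic_chi_square(30, 70, 10, 90)
threshold = bonferroni_threshold(0.05, 1_000_000)
print(chi2, p_value, p_value < threshold)
```

With these counts the association looks strong for a single test but does not pass the genome-wide threshold, which shows why significance depends on the number of individuals tested as well as the degree of association.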
What is pharmacogenomics?
Pharmacogenetics on a genome-wide scale
= attempts to identify SNPs associated with sensitivity / side effects
Difficult to get large sample sizes
= adverse reactions are rare
Validation by replication difficult
Obvious phenotypic effects of adverse reactions or overdose
EXTRA READING
= study how genetics affects response to drugs
e.g. warfarin response - CYP2C9 and VKORC1
What are some significant genome-wide associations for drug response?
What are some significant genome-wide associations for side effects?
Are Genome-Wide Association studies worth it?
Statistical association only
= low predictive value (odds ratio <1.5)
approx. 90% of associated SNPs are in non-coding regions of DNA
What combinations of gene (rather than SNP) variants are associated with high risk?
Are there environmental factors?
Why are there large differences between twin studies and GWAS? (Missing heritability)
may be due to:
= lots of variants of small effect
(maybe due to the strict p value)
= there may be rare variants, missed by GWAS
= SNP interactions (epistasis) rather than an additive model
= Genetic and environmental interactions missed
= heritability over-estimated in twin studies
(instead more about environment)
Polygenic risk scores are probably the way forward
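EXTRA READING
A minimal sketch of the two quantities mentioned above: the odds ratio for one SNP, and a simple additive polygenic risk score (sum of risk-allele counts weighted by log odds ratio). The SNP names, counts and odds ratios are invented, and the additive model is an assumption; it ignores epistasis and environmental interactions.

```python
import math

def odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of carrying the risk allele in cases divided by odds in controls."""
    return (case_risk / case_other) / (control_risk / control_other)

def polygenic_risk_score(dosages, odds_ratios):
    """Additive score: risk-allele count (0, 1 or 2) times log odds ratio."""
    return sum(dosages[snp] * math.log(odds_ratios[snp]) for snp in dosages)

# Hypothetical allele counts giving an odds ratio at the card's 1.5 boundary.
example_or = odds_ratio(60, 40, 50, 50)

# Hypothetical individual: risk-allele counts at three SNPs of small effect.
dosages = {"snp_1": 2, "snp_2": 0, "snp_3": 1}
effects = {"snp_1": 1.2, "snp_2": 1.4, "snp_3": 1.1}
score = polygenic_risk_score(dosages, effects)
print(example_or, score)
```

Each SNP alone has low predictive value, but the score combines many small effects into a single number per individual.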
What are predictive health genomics?
SNP test (or full genome sequence) all newborns
Identify genetic risk factors
Tailor lifestyle and medical interventions appropriately
= likely to be effective only for highly penetrant rare variants
= conventional health monitoring is easier, cheaper and more reliable
= health advice generally appropriate to everyone
EXTRA READING
e.g. BRCA1 and BRCA2 - breast cancer
BUT there are ethical concerns about genetic privacy, discrimination