Genomics and health Flashcards
Routes to gene discovery and diagnosis
Traditional:
- Determine mode of inheritance
- Recombination mapping using markers
- Haplotype analysis of recombinants
- Confirm by Sanger sequencing
State of the art:
- Whole exome next generation sequencing
- Lots of polymorphisms
- ‘Filter’ polymorphisms to give a list of candidate genes
- Confirm by Sanger sequencing
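The ‘filter’ step above can be sketched in a few lines. This is only an illustration under stated assumptions: the notes do not say which filters are applied, so the two criteria used here (rare in the population, and protein-changing) are assumed, and the `Variant` record, `filter_candidates` function and gene names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    population_frequency: float  # fraction of the population carrying the allele
    changes_protein: bool


def filter_candidates(variants, max_frequency=0.01):
    """Return sorted gene names with at least one rare, protein-changing variant.

    Both criteria are assumptions made for this sketch, not the notes' method.
    """
    genes = {
        v.gene
        for v in variants
        if v.population_frequency < max_frequency and v.changes_protein
    }
    return sorted(genes)


# Made-up example records, not real data
variants = [
    Variant("GENE_A", 0.25, True),     # common polymorphism: filtered out
    Variant("GENE_B", 0.0001, True),   # rare and protein-changing: kept
    Variant("GENE_C", 0.0002, False),  # rare but silent: filtered out
]
candidates = filter_candidates(variants)
print(candidates)
```

Genes left in `candidates` would then go forward for confirmation by Sanger sequencing.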
Human genome project 1990-2003
- Representative human genome sequence of ~3 billion bases
- Performed using Sanger sequencing
- Covered 92% of genome sequence
Cost of human genome
Decreased dramatically over last 20 years
Becoming more feasible to use sequencing for medical research and diagnosis
Reference genome
Reference genome forms the foundation of medical, functional, and diversity studies
Provides a common point of reference for genomic loci - gives genes ‘addresses’; variants are reported relative to the reference genome
Provides a template - guides assembly of new genomes, enables assay design and data analysis
Genetic variations can be characterised against reference genome
Single nucleotide polymorphisms (SNPs)
Structural variants - Deletion, Insertion, Duplication, Inversion, Translocation, Copy number variation
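A minimal sketch of characterising variation against a reference: compare a sample sequence with the reference and report each single nucleotide substitution by position. It assumes the two sequences are already aligned and of equal length, so it cannot detect structural variants. The function name and the toy sequences are made up for illustration.

```python
def find_substitutions(reference, sample):
    """List (position, ref_base, sample_base) for each mismatch; positions are 1-based."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Toy sequences, not real genomic data
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
substitutions = find_substitutions(reference, sample)
print(substitutions)  # [(5, 'A', 'T'), (10, 'C', 'A')]
```

Each reported position is an ‘address’ on the reference, which is why variants are described relative to it.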
Single nucleotide polymorphisms (SNPs) are the most investigated variant type
- Ease of analysis
- Single nucleotide substitution
- Present in >1% of the population
- Roughly 4-5 million SNPs in each individual (~once every 1000 base pairs)
- Over 600 million SNPs have been reported around the world
- Single nucleotide variations (SNVs) – similar to SNPs, but without the requirement to be present at >1% of the population
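The SNP/SNV distinction above comes down to a frequency threshold. A small sketch, with hypothetical function names and made-up counts:

```python
SNP_FREQUENCY_THRESHOLD = 0.01  # the ">1% of the population" rule from the notes


def population_frequency(carriers, population_size):
    """Fraction of the sampled population carrying the variant."""
    return carriers / population_size


def is_snp(frequency):
    """True if a single nucleotide variant is common enough to be called a SNP."""
    return frequency > SNP_FREQUENCY_THRESHOLD


# Made-up counts, for illustration only
common = population_frequency(carriers=150, population_size=1000)  # 15%
rare = population_frequency(carriers=2, population_size=1000)      # 0.2%

print(is_snp(common))  # True: counts as a SNP
print(is_snp(rare))    # False: still an SNV, but not a SNP
```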
What to consider when exploring genetic variation
Factors to consider when choosing a sequencing technology:
- Cost - (experimental, analysis, other)
- Time - (sample preparation, run time, analysis time, sample transport)
- Information capture - (accuracy, feature length, complex variant detection)
- Choose the appropriate tools for the intended purposes
BRCA1 and BRCA2
BRCA1/BRCA2:
110kb/85kb in length (intron + exon)
0.006% of genome
7.8kb/10.2kb in length (exon only)
0.0005% of genome
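The percentages above can be checked with simple arithmetic. This sketch assumes a genome size of 3 billion bases (the approximate figure quoted for the reference sequence) and assumes each percentage refers to both genes combined. The results are of the same order as the rounded figures in the notes, and the exact value depends on the genome size used.

```python
GENOME_SIZE_BP = 3_000_000_000  # assumed: ~3 billion bases


def percent_of_genome(*lengths_bp, genome_size_bp=GENOME_SIZE_BP):
    """Percentage of the genome taken up by regions of the given lengths."""
    return 100 * sum(lengths_bp) / genome_size_bp


gene_pct = percent_of_genome(110_000, 85_000)  # intron + exon
exon_pct = percent_of_genome(7_800, 10_200)    # exon only

print(f"Whole genes: {gene_pct:.4f}% of genome")
print(f"Exons only:  {exon_pct:.4f}% of genome")
```

Either way, the regions of interest are a tiny fraction of the genome, which is the motivation for targeted approaches.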
Do we really need all the information across the genome/transcriptome?
Microarray
Enrichment/amplicon -> Gene panel sequencing
Enrichment/amplicon -> Exome sequencing
Amplicon sequencing uses PCR to amplify the genome regions of interest
Design PCR primers flanking the genes of interest (e.g. genes with mutations known to be associated with cancer or other diseases)
Amplify the region using PCR (amplicon)
Sequence only these regions
Example: gene panel testing (Lecture 5)
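The amplicon steps above can be mimicked in silico. This is a simplified sketch that assumes exact primer matches and a single binding site per primer. Real primer design also considers melting temperature, specificity and mismatches. The function names, primers and template below are made up.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template, forward_primer, reverse_primer):
    """Return the region amplified by the primer pair, or None if a primer has no site."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we search for its reverse complement downstream of the forward primer.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Toy template: flank + forward site + target + reverse site + flank
target = "AAACCCGGGTTT"
template = "TTTT" + "GGATCC" + target + "GAATTG" + "AAAA"

product = amplicon(template, forward_primer="GGATCC", reverse_primer="CAATTC")
print(product)  # GGATCCAAACCCGGGTTTGAATTG
```

Only `product` would be sequenced, and the flanking sequence outside the primers is ignored.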
Select input + sequencing
DNA -> PCR/Hybridisation capture/Whole genome sequencing
PCR -> Amplicon sequencing
Hybridisation capture -> Target enrichment sequencing
Keywords for sequencing
Read - The sequence corresponding to a DNA fragment
Map - Determining where the reads originated from in a genome
Depth/coverage - The average number of reads covering each base (given as fold, e.g. 4X; shallow vs deep sequencing)
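The three keywords can be tied together in a toy example: map each read to a reference by exact string matching, then count how many reads cover each base. Real mappers use indexed, mismatch-tolerant alignment. The exact-match approach, the function names and the sequences here are assumptions for illustration only.

```python
def mean_depth(num_reads, read_length, genome_length):
    """Average fold coverage: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length


def map_read(read, reference):
    """Return the 0-based position of the first exact match, or -1 if unmapped."""
    return reference.find(read)


def per_base_depth(reads, reference):
    """Count how many mapped reads cover each reference position."""
    depth = [0] * len(reference)
    for read in reads:
        position = map_read(read, reference)
        if position == -1:
            continue  # unmapped reads contribute no coverage
        for i in range(position, position + len(read)):
            depth[i] += 1
    return depth


# Toy data, not real sequence
reference = "ACGTTGCAAGCT"
reads = ["ACGTTG", "GTTGCA", "TGCAAG", "GGGGGG"]  # the last read does not map

depth = per_base_depth(reads, reference)
print(depth)                        # [1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]
print(sum(depth) / len(reference))  # 1.5

# 80 million reads of 150 bases over a ~3 billion base genome gives 4X
print(mean_depth(80_000_000, 150, 3_000_000_000))  # 4.0
```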