The Genome's Content Flashcards
Name the four types of chromosome map from low to high resolution
Karyotypic
Linkage
Physical
Sequence
What is the scale of Linkage maps?
cM
What is another name given to “linkage maps” and how are they derived?
“genetic map”
via monitoring recombination frequencies between markers
What is the scale of physical chromosome maps?
bp or kb
How are karyotypic maps derived?
From microscopic observation of chromosomal spreads
What is the main end goal of all sequence projects?
To sequence all the bases along the chromosome, producing a sequence map
What is 1% recombination frequency equivalent to?
1 cM
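A worked sketch of the equivalence above. The function name `map_distance_cm` and the offspring counts are hypothetical, chosen only for illustration. Note that treating 1% recombination as 1 cM is an approximation that holds best for closely linked markers.

```python
def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Estimate map distance in cM, using 1% recombination frequency = 1 cM."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinants <= total_offspring:
        raise ValueError("recombinants must be between 0 and total_offspring")
    # Recombination frequency as a percentage equals the distance in cM.
    return 100 * recombinants / total_offspring


# Hypothetical cross: 12 recombinant offspring out of 400 scored.
print(map_distance_cm(12, 400))
```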
Why do we need lower resolution maps if higher resolution maps are the main end goal?
Sometimes lower resolution maps are needed to generate the higher resolution maps
What is exome sequencing? Does this type of sequencing follow the main goal seen in sequencing today?
Sequencing just coding DNA
No, the goal is to sequence entire genomes
Why were the first genomes sequenced from relatively simple model organisms? Name 3 of these model organisms
Relatively small genome size and genetic tractability (easy to control)
E. coli, S. cerevisiae, C. elegans
When was the Human Genome Project initiated and when was it completed?
1990
2003
What era of genomics are we currently living in? What types of genomics has this led to?
the ‘post-genomics’ era
Large scale approaches which analyse large data sets to investigate gene function (functional genomics) and genome evolution/structure (comparative genomics)
What is forward genetics?
Phenotype to genotype
Identifying the gene mutations that are the cause of a specified phenotype
What is reverse genetics?
Genotype to Phenotype
Identifying the phenotype caused by a specific mutation
What is functional genomics?
The analysis of a gene’s function (this includes the function of the protein that is formed from the gene)
What is comparative genomics?
Comparison of genome sequences and organisation between different organisms and how similarities/differences can show us their evolutionary relationship
What was the name of the privately funded genome sequence project and who established it?
Celera Genomics
Craig Venter and Perkin-Elmer
What was Celera Genomics’ aim?
To sequence genomes quickly, control access to genes and patent them
What is the name of the genome sequencing method used by Celera Genomics?
Whole genome shotgun
What is the name of the genome sequencing method used by the publicly funded project?
Hierarchical shotgun
What was the difference in cost of the public vs Celera genome sequencing projects?
public = $3 billion
Celera = $300 million
When did the publicly funded genome sequencing project announce its completion and when was it published in Nature?
2003 announced
2004 published (Nature)
What is another name given to Sanger sequencing?
(conventional) dideoxy DNA sequencing
Why is Sanger sequencing impractical for large projects that require the sequencing of large or multiple genomes?
It has a low throughput (it sequences only a small number of bases at once), which makes it expensive and impractical for large projects
What does NGS stand for?
Next Generation Sequencing
What is the benefit of using Next Generation Sequencing over Sanger sequencing?
NGS = high throughput low cost (by increasing speed to reduce cost)
Sanger = low throughput high cost
How do Next Generation Sequencing (NGS) technologies increase speed and reduce costs?
They carry out “parallel sequencing” of millions of different fragments of DNA at the same time
What is Moore’s Law?
A prediction, based on a historical trend, that the number of transistors on a microchip doubles every two years (offering performance benefits over time) while the cost of computers is halved
What is the difference seen when comparing Moore’s law with the change in cost per raw Megabase of DNA Sequence over time?
The decrease in cost of DNA sequencing is much faster than Moore’s law would predict
What is the importance of comparing Moore’s law to the change in cost per raw Megabase of DNA sequencing over time?
The cost of DNA sequencing is falling at an even faster rate than Moore’s law, which highlights the rate of improvement in DNA sequencing technology
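To make the comparison concrete, here is a minimal sketch of what a Moore's-law-style halving every two years would predict for a cost over time. The starting cost of 1000 is a hypothetical placeholder, not real sequencing data, and `moores_law_cost` is an illustrative name. The cards above state only that real sequencing costs fell faster than this curve.

```python
def moores_law_cost(start_cost: float, years: float,
                    halving_period: float = 2.0) -> float:
    """Cost after `years` if it halves once every `halving_period` years."""
    return start_cost * 0.5 ** (years / halving_period)


# Hypothetical starting cost per megabase (placeholder value, arbitrary units).
start = 1000.0
for years in (0, 2, 4, 6, 8, 10):
    print(years, moores_law_cost(start, years))
```

Plotting real cost-per-megabase figures against this curve is what shows sequencing costs dropping below the Moore's law prediction.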
What was the name of the company that unveiled the first Next Generation Sequencing machine and in what year?
454 Life Sciences
in 2005
What is the name of the DNA sequencing method used by the first NGS machine by 454 Life Sciences?
‘454’ pyrosequencing
What molecule, released upon nucleotide incorporation by DNA polymerase, is used in pyrosequencing?
Pyrophosphate (PPi)
How can pyrophosphate (released upon nucleotide incorporation by DNA polymerase) be utilised in pyrosequencing?
PPi is used as a fuel for a downstream set of reactions that ultimately produce light (by the action of luciferase on luciferin)
What are the 3 overall stages involved in the process of ‘454’ pyrosequencing?
library preparation
emulsion PCR
pyrosequencing
Roughly how many wells are there in pyrosequencing and how many beads are there per well?
1.6 million wells
1 bead per well
What’s added to pyrosequencing wells before dNTPs are added?
smaller enzyme beads
sequencing primer
DNA polymerase
the two substrates APS and Luciferin
What is important about how dNTPs are added to pyrosequencing wells?
they are added sequentially and in repeat cycles
Why is it important that dNTPs are added sequentially and in repeat cycles during pyrosequencing?
Only one type of dNTP is present at a time, so light emission shows that this particular nucleotide was incorporated; the intensity is proportionally greater if there are 2 or more consecutive bases of the same type
What is a ‘homopolymer error’?
An error in the stated number of bases when a single nucleotide is repeated consecutively in a sequence (i.e. because the sequence contains homopolymeric regions)
What type of error is pyrosequencing prone to and why is this?
Prone to homopolymer error
because it is difficult to measure the proportional increase in light intensity when there are 2 or more consecutive bases of the same type in the sequence
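The sequential dNTP flows and the homopolymer problem can be sketched as an idealised flowgram. This is a toy model, not the instrument's actual software: the function name `flowgram`, the flow order `"TACG"`, and the example read are all assumptions made for illustration, and the signal is treated as exactly equal to the number of bases incorporated.

```python
def flowgram(read: str, flow_order: str = "TACG", cycles: int = 4):
    """Return (base, signal) pairs for repeated cycles of sequential dNTP flows.

    `read` is the strand being synthesised. In this idealised model the
    signal equals the number of consecutive bases incorporated in one flow.
    """
    signals = []
    pos = 0  # next position in the read still to be synthesised
    for _ in range(cycles):
        for base in flow_order:
            run = 0
            # A single flow extends through the whole homopolymer run.
            while pos < len(read) and read[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))
    return signals


# Hypothetical read containing a TT and a GGG homopolymer.
for base, signal in flowgram("TTAGGGC", cycles=2):
    print(base, signal)
```

In a real run the light intensity is measured rather than counted, so telling a run of 7 from a run of 8 is much harder than telling 1 from 2; that is where homopolymer errors come from.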
What is the name of the sequencing technology that was developed in Cambridge and launched commercially in 2006 (it now has about a 70% market share)?
Illumina sequencing
Illumina sequencing is similar in principle to which other type of DNA sequencing method?
Sanger dideoxy sequencing
What is the key difference between Illumina and Sanger sequencing methods?
Illumina has reversible terminator sequencing which means the terminator can be removed
Name a type of third generation sequencing technology
PacBio single molecule real-time (SMRT)
What process required in next generation sequencing technology does third generation sequencing technology bypass?
bypasses the need for DNA amplification by PCR
What is the desired advantage of using third generation sequencing technologies over other sequencing methods?
Designed to achieve longer read lengths from single molecules of DNA
Third generation sequencing allows for longer read lengths. What are the two key implications of this (compared to short-read sequencing methods)?
Gets around problems with shorter reads such as assembly of repetitive DNA regions
and helps the analysis of a genome from low yield sources (e.g. if you wanted to genotype a single cell)
Name 3 databases that are publicly available to share and store information on genomes/genes.
National Center for Biotechnology Information (NCBI)
European Bioinformatics Institute (EBI)
Universal Protein Resource (UniProt)
Give an example of how it can be useful to sequence genes and have this information publicly available.
finding conserved genes in multiple organisms to provide clues on function
Describe an innovative application on the use of genome sequencing technologies.
Comparison of multiple genomes from different biopsies within a tumour to help identify shared mutations that are initial ‘driver’ mutations, which could hopefully one day inform personalised cancer treatment