Genomics Flashcards
What does the tree of life assume ?
A monophyletic view of life
What does the tree of life highlight ?
The first divergence events
What is a synapomorphy ?
A derived characteristic or trait that arose in an ancestral species and is shared only by its descendants, so that it distinguishes a clade from other organisms.
What changed the traditional classification of prokaryotes as a single kingdom ?
Studies on Methanobacterium and Methanosarcina
What allowed the classification of the new taxon archaebacteria ?
Methanogens that are equidistant from eukaryotes and bacteria
What are the differences between archaebacteria and eubacteria ?
Archaebacterial membranes are composed of glycerol-ether phospholipids.
They have an L-glycerol backbone.
Their phospholipids have isoprenoid side chains with multiple side branches.
What is a paralogous gene ?
A homologous gene that arose through a gene duplication event
What is a homolog ?
Homologs denote genes that derive from the same ancestral sequence
What is the difference between orthologs and paralogs ?
Orthologs are corresponding genes in different lineages and are a result of speciation, whereas paralogs result from a gene duplication.
What does reticulate evolution refer to ?
An evolutionary process in which some lineages merge to produce new lineages. Because lineages join as well as split, it cannot be modelled by a tree.
Who first described prokaryotes ?
Robert Hooke
What are the types of first generation sequencing ?
Maxam-Gilbert Method
Sanger Dideoxy Method
What are the types of second generation sequencing ?
Illumina sequencing
Ion Torrent
What are the types of third generation sequencing ?
Pacific Biosciences
Nanopore
What method can sequence DNA molecules of up to 500 bp and relies on toxic chemicals ?
Maxam-Gilbert Method
What are the key steps in the Maxam-Gilbert method ?
- Obtain single-stranded DNA
- Label the 5’ end of the DNA radioactively with 32P
- Separate the DNA into 4 tubes and add chemicals that cleave at specific nucleotides
- This gives DNA strands of different sizes
- Separate the strands by electrophoresis on a polyacrylamide gel
What is the key element of Sanger sequencing ?
2’,3’ dideoxynucleoside triphosphates, which lack the hydroxyl group at the 3’ position.
What do ddNTPs do ?
They terminate DNA synthesis
What are the key steps in the Sanger Dideoxy Method ?
- Denature the DNA and add a radioactively labelled primer
- Separate into 4 tubes and add DNA polymerase, dNTPs and one type of ddNTP per tube
- Run the products on a denaturing gel
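The chain-termination steps above can be sketched as a toy simulation. This is a simplified illustration, not a model of a real reaction: it assumes every possible terminated fragment is produced exactly once, it ignores the primer, and the function names and example template are hypothetical.

```python
def sanger_fragments(template):
    """Return, per ddNTP tube, the lengths of the terminated fragments.

    The new strand is the reverse complement of the template. A fragment
    of length i ends in the ddNTP that matches base i of the new strand.
    """
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    new_strand = "".join(complement[base] for base in reversed(template))
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        tubes[base].append(length)
    return tubes


def read_gel(tubes):
    """Read the four lanes from the shortest fragment to the longest."""
    bands = sorted(
        (length, base) for base, lengths in tubes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


tubes = sanger_fragments("ATGC")
print(tubes)
print(read_gel(tubes))
```

Reading the bands in order of fragment length recovers the synthesised strand, which is why the four reactions have to be run side by side on the same gel.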
What does a dNTP do ?
Extends DNA strand
What are the 4 limitations of the initial Sanger dideoxy method ?
Can’t run all four reactions in one lane
Unable to read sequences at the top of gel
Cost
Low throughput
How are the limitations of the Sanger dideoxy method overcome ?
Fluorescently labelled ddNTPs, which can be read by a laser
How many bp does a single Sanger sequencing cover ?
Up to 1000 bp
Why do we need to assemble sequences into Contigs ?
Genomes are too big
What is a contig ?
DNA sequence built up from a number of smaller overlapping sequences during a sequencing project.
What is scaffolding ?
A technique used to link together a non-contiguous series of genomic sequences into a scaffold, consisting of sequences separated by gaps of known length.
What is coverage ?
the number of reads that include a given nucleotide in the reconstructed sequence
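A minimal sketch of how coverage could be counted, assuming each read has already been placed on the reconstructed sequence and is described by a hypothetical (start, length) pair with a 0-based start:

```python
def per_base_coverage(sequence_length, reads):
    """Count how many reads include each position of the sequence."""
    coverage = [0] * sequence_length
    for start, length in reads:
        # Clip reads that run past the end of the sequence.
        for position in range(start, min(start + length, sequence_length)):
            coverage[position] += 1
    return coverage


coverage = per_base_coverage(6, [(0, 4), (2, 4)])
print(coverage)
print(sum(coverage) / len(coverage))  # mean coverage
```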
What is a read ?
each sequenced piece of a genome
What are the two types of Sanger sequencing ?
Hierarchical shotgun sequencing
Whole genome shotgun sequencing
What are the steps in whole genome sequencing ?
- DNA extraction
- Library construction
- Random sequencing phase
- Gap closure phase
- Annotation
What do the 2nd generation sequencing methods have in common ?
- The DNA has to be fragmented
- Clonal amplification (different PCR methods)
What is the problem with Illumina sequencing ?
Slow and produces short sequence lengths
What sequencing methods are synthesis based ?
Illumina
Ion Torrent
What sequencing methods use the ion semiconductor approach ?
Ion Torrent
What does Illumina sequencing rely on ?
Reversible terminators
What is a benefit of Illumina sequencing and what was a major advancement ?
Don’t have to move DNA around
Paired end sequencing
What does Illumina sequencing generate in the flow cell ?
millions of dense clusters of dsDNA
What are the key steps in Illumina sequencing ?
- Fragment the DNA and attach the fragments to the surface of the flow cell using adaptors that contain a primer sequence
- Initiate bridge PCR with unlabelled dNTPs
- When PCR is complete, millions of clusters have formed; add the 4 labelled reversible terminators, primers and DNA polymerase to begin sequencing
- Excite the clusters with a laser so that the fluorescence can be captured
- Remove the label and terminating group from the incorporated nucleotides and repeat the cycle by adding more reversible terminators
- Align the data and compare it to reference sequences
What is a key aspect of Ion Torrent sequencing ?
Exploits the fact that the addition of a dNTP to a DNA polymer releases an H+ ion
What are the steps in Ion Torrent sequencing ?
- The input DNA is fragmented, adaptors are added and one molecule is placed onto each bead.
- The molecules are amplified on the bead by emulsion PCR.
- Each bead is placed into a single well of a slide. The slide is flooded with one species of dNTP at a time, along with buffers and polymerase.
- The pH is measured in each well, as each H+ ion released decreases the pH.
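The flow-and-detect cycle above can be sketched as follows. This is a simplified illustration under stated assumptions: the signal in a well is taken to be exactly the number of bases incorporated during a flow, the flow order "TACG" is a hypothetical choice, and the function names are not from any real software.

```python
def simulate_flows(new_strand, flow_order="TACG"):
    """Return (nucleotide, signal) for each flow until the strand is complete.

    The signal is the number of bases incorporated in that flow, so a
    run of identical bases gives one larger signal.
    """
    if set(new_strand) - set(flow_order):
        raise ValueError("strand contains a base that is never flowed")
    signals = []
    position = 0
    while position < len(new_strand):
        for nucleotide in flow_order:
            incorporated = 0
            while position < len(new_strand) and new_strand[position] == nucleotide:
                incorporated += 1
                position += 1
            signals.append((nucleotide, incorporated))
            if position == len(new_strand):
                break
    return signals


def call_bases(signals):
    """Turn the per-flow signals back into a sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = simulate_flows("TTAG")
print(signals)
print(call_bases(signals))
```

A flow with a signal of 0 means the flooded dNTP did not match the next template base, so no H+ ion was released in that well.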
How is third generation sequencing characterised ?
The lack of DNA or RNA amplification in template library preparation.
What are the benefits of third generation sequencing ?
- Avoid polymerase chain reaction-introduced error and amplification bias.
- Real time measurements
- Much higher throughput than first and second generation techniques.
- Lower price per Mbp generated sequence
- Longer reads
What is a ZMW ?
Zero-mode waveguide: a well that creates an illuminated observation volume small enough to observe only a single nucleotide being incorporated by the DNA polymerase
What are the steps within Pacific Biosciences sequencing ?
- The platform uses a DNA polymerase anchored to the bottom surface of a ZMW.
- Differentially labelled nucleotides enter the ZMW via diffusion and occupy the ‘detection volume’.
- During an incorporation event, the labelled nucleotide is ‘held’ within the detection volume by the polymerase for tens of milliseconds.
- As each nucleotide is incorporated, the label located on the terminal phosphate is cleaved off and diffuses out of the ZMW
What are the benefits of Pacific Biosciences sequencing ?
Very efficient: fewer expensive chemicals have to be used
Very sensitive
Long reads (10-15 Kbp)
What are the positives and negatives of nanopore sequencing ?
Strengths: cheap, fast, simple, long reads, scalable, …
Weakness: rather high error rate
What is a Q score ?
A measure of the probability that a base is called incorrectly (a higher score indicates a smaller probability of error).
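The standard Phred definition is Q = -10 × log10(P), where P is the probability that the base call is wrong. A minimal sketch of the conversion in both directions (the function names are hypothetical):

```python
import math


def phred_quality(p_error):
    """Convert an error probability into a Phred quality score."""
    return -10 * math.log10(p_error)


def error_probability(quality):
    """Convert a Phred quality score back into an error probability."""
    return 10 ** (-quality / 10)


print(phred_quality(0.001))
print(error_probability(20))
```

So Q20 corresponds to a 1 in 100 chance of a wrong base and Q30 to a 1 in 1000 chance.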
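What is Phred+33 ?
A quality encoding used in FASTQ files, and therefore in the reads fed into genome assembly: each quality score Q is stored as the ASCII character with code Q + 33.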
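A minimal sketch of decoding a Phred+33 quality string, as found on the fourth line of a FASTQ record. The function name and the example string are hypothetical.

```python
def decode_phred33(quality_string):
    """Return the quality score encoded by each character (ASCII code - 33)."""
    return [ord(character) - 33 for character in quality_string]


print(decode_phred33("!5I"))
```

The offset of 33 is used because "!" (code 33) is the first printable, non-space ASCII character, so a score of 0 is still a visible symbol.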
What are the 2 methods of assembling contigs ?
De novo: no reference sequence to guide the assembly.
Read mapping: reads are aligned to a reference sequence.
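Read mapping can be illustrated with a deliberately naive sketch that only finds exact matches. Real mappers use indexed references and tolerate mismatches; this version, with hypothetical names, simply scans the reference with `str.find`.

```python
def map_reads(reference, reads):
    """Return every 0-based position at which each read matches exactly."""
    positions = {}
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            start = reference.find(read, start + 1)
        positions[read] = hits
    return positions


print(map_reads("ACGTACGT", ["CGT", "TTT"]))
```

A read that maps to more than one position, like "CGT" here, is the small-scale version of the problem that repetitive DNA causes for assembly.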
What are the 2 assembly algorithms ?
Greedy method: incremental build. Builds using the highest-scoring overlap.
Graph based: pre-processes the data to produce a graph structure of pairwise overlap info.
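The greedy method can be sketched as repeatedly merging the pair of sequences with the longest suffix-prefix overlap. This is a minimal illustration assuming error-free reads from one strand; overlap length stands in for the overlap score, and the function names and `min_length` parameter are hypothetical.

```python
def overlap(a, b, min_length=1):
    """Length of the longest suffix of a that equals a prefix of b."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def greedy_assemble(reads, min_length=1):
    """Merge the best-overlapping pair until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_length, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    length = overlap(a, b, min_length)
                    if length > best_length:
                        best_length, best_i, best_j = length, i, j
        if best_length == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_length:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]


print(greedy_assemble(["ATGCG", "GCGTA", "GTACC"], min_length=3))
```

Because the method always takes the locally best overlap, repeats can lead it to join the wrong reads, which is one reason graph-based approaches are used.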
What type of graph deals with repetitive DNA regions and how does it work ?
de Bruijn graph: splits short reads into shorter substrings of uniform length (k-mers) and links them by their overlaps.
What is the K value ?
The length of each string (k-mer) that the reads are split into
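A minimal sketch of how reads could be split into k-mers and turned into a de Bruijn graph. It assumes the common construction in which nodes are (k-1)-mers and every k-mer contributes one edge from its prefix to its suffix; the function names are hypothetical.

```python
from collections import defaultdict


def kmers(read, k):
    """Return every substring of length k, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the list of (k-1)-mer suffixes it leads to."""
    graph = defaultdict(list)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


print(kmers("ATGCG", 3))
print(de_bruijn_graph(["ATGCG"], 3))
```

A larger k makes repeats shorter than k unambiguous, but needs longer overlaps between reads, so the choice of k is a trade-off.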
What does Velvet do and what is it ?
A genome assembler that uses a de Bruijn graph-based algorithm. It takes short reads and removes errors to produce high-quality contigs.
What is the N50 statistic ?
The N50 value tells the researcher that 50% of the assembled sequence is contained in contigs of length N50 or longer.
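A minimal sketch of computing N50 from a list of contig lengths, assuming the usual definition based on the total assembly length (the function name is hypothetical):

```python
def n50(contig_lengths):
    """Length of the contig at which the running total reaches half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


print(n50([2, 3, 4, 5, 6]))
```

Here the total is 20; the two longest contigs (6 and 5) together reach 11, which is at least half, so the N50 is 5.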
What is genome annotating ?
Attaching biologically meaningful information to genome sequences.
What is an ORF ?
Open reading frame: a part of the reading frame that contains no stop codons (UAA, UGA, UAG).
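A minimal sketch that follows the definition above: it splits each of the three forward reading frames of an RNA sequence into stretches without a stop codon. Requiring a start codon (AUG) and scanning the reverse strand are common extra conditions that are not implemented here; the function names are hypothetical.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}


def stop_free_stretches(rna, frame=0):
    """Split one reading frame into stretches that contain no stop codon."""
    stretches, current = [], []
    for i in range(frame, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            if current:
                stretches.append("".join(current))
            current = []
        else:
            current.append(codon)
    if current:
        stretches.append("".join(current))
    return stretches


def longest_orf(rna):
    """Longest stop-free stretch over the three forward frames."""
    candidates = [
        stretch for frame in range(3) for stretch in stop_free_stretches(rna, frame)
    ]
    return max(candidates, key=len, default="")


print(stop_free_stretches("AUGGCCUAAGGG", frame=0))
print(longest_orf("AUGGCCUAAGGG"))
```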
What are the 3 methods for gene prediction ?
Ab initio method: uses info from the genomic sequence, such as GC content and ORFs
Database similarity: uses public databases to identify sequences that are similar
Combiners: use both techniques
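GC content, one of the sequence signals an ab initio method can use, is simple to compute. A minimal sketch with hypothetical function names, including a windowed version for scanning along a sequence:

```python
def gc_content(sequence):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def gc_windows(sequence, window, step):
    """GC content of each full window taken every `step` bases."""
    return [
        gc_content(sequence[i:i + window])
        for i in range(0, len(sequence) - window + 1, step)
    ]


print(gc_content("ATGC"))
print(gc_windows("GGCCAATT", window=4, step=4))
```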
What does hierarchical shotgun sequencing require ?
A genome map as preliminary data
What are the steps in hierarchical shotgun sequencing ?
- Fragmentation of genome
- Inserted into plasmid
- Plasmid cloned in E. coli
- Investigate the order of the clones in the genome using hybridisation, to give a full genome map.
- Take each clone and fragment its DNA
- Apply sequencing method
- Form contigs
- Merge contigs
What are the steps in bridge PCR ?
- Initiated by the addition of unlabelled dNTPs.
- DNA amplification occurs, and the DNA then folds over and forms a bridge through hybridisation of the adaptors.
- Add polymerase and primers to make a complementary strand.
- Increase the temperature to break the bridge, leaving two ssDNA molecules: the forward and reverse strands.
- This forms millions of clusters for Illumina sequencing