TB7 Flashcards
What did Gregor Mendel discover in 1866?
Transmission of characteristics (pea plants)
What did Friedrich Miescher identify in 1869?
Nuclein (DNA)
What did Albrecht Kossel show in 1881?
That nuclein is composed of nitrogenous bases, and named it DNA
What did Boveri and Sutton show in the 1900s?
Chromosomes bear the material of heredity (the chromosome theory of inheritance)
What did Oswald Avery show in 1944?
DNA is the ‘transforming principle’ - DNA from one bacterial strain could confer characteristics on another strain
What did Rosalind Franklin generate in 1951?
Diffraction patterns of DNA to show its helical form
What did Watson and Crick propose in 1953?
A model for DNA structure based on Rosalind Franklin’s data
What did Robert Holley do in 1964?
Used RNase to partially fragment RNA and coupled this to analytical techniques to determine the sequence of yeast Ala-tRNA
What did Fred Sanger develop in 1965?
2D fractionation methods that allowed larger and more complex fragments of RNAs to be analysed and sequenced
Compare bench-top sequencing platforms with facility-based
Bench-top (NextSeq500)
- ~400 million reads per run
- ChIP-seq and RNA-seq
- Reads up to 300bp
Bench-top (NextSeq2000)
- ~1.2 billion reads per run
- ChIP/RNA-seq and small genomes
- Reads up to 300bp
Facility (NovaSeq 6000)
- ~2 billion reads per run
- multiplexed samples
- large genomes
- Reads up to 500bp
Define a contig
A set of overlapping DNA segments that together represent a consensus region of DNA
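The idea of overlapping segments collapsing into one consensus can be sketched as below. This is a toy illustration only, assuming error-free fragments that are already in left-to-right order; the function names and example reads are hypothetical, and real assemblers are far more involved.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_in_order(fragments: list[str], min_len: int = 3) -> str:
    """Merge fragments (assumed ordered left to right) into one contig."""
    contig = fragments[0]
    for frag in fragments[1:]:
        n = overlap(contig, frag, min_len)
        if n == 0:
            raise ValueError("fragments do not overlap")
        contig += frag[n:]  # append only the non-overlapping tail
    return contig


# Made-up reads for illustration
reads = ["ATGGCGT", "GCGTACC", "ACCTTGA"]
contig = merge_in_order(reads)
print(contig)
```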
Describe a GTF file
Holds information about the structure of genes. The coordinates of mapped reads are projected onto a GTF file of the feature of interest
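A GTF record is a single tab-separated line with nine columns (seqname, source, feature, start, end, score, strand, frame, attributes). A minimal parsing sketch follows; the example line, gene names and helper names are made up for illustration, and the parser assumes simple `key "value";` attributes.

```python
def parse_gtf_line(line: str) -> dict:
    """Split one GTF record into its nine columns and parse the attributes."""
    cols = line.rstrip("\n").split("\t")
    seqname, source, feature, start, end, score, strand, frame, attrs = cols
    attributes = {}
    for field in attrs.split(";"):
        field = field.strip()
        if field:
            key, value = field.split(" ", 1)
            attributes[key] = value.strip('"')
    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based and inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def contains(record: dict, position: int) -> bool:
    """True if a 1-based mapped position falls inside the feature."""
    return record["start"] <= position <= record["end"]


# Hypothetical record, not taken from a real annotation
line = 'chr1\texample\texon\t1000\t1200\t.\t+\t.\tgene_id "GENE1"; transcript_id "GENE1.1";'
rec = parse_gtf_line(line)
print(rec["attributes"]["gene_id"], contains(rec, 1100))
```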
Describe a feature count
Counts the reads assigned to a gene in a stranded manner, producing 3 files: gene length, counts and summary files.
Describe a pseudoalignment
Measures compatibility with a transcript rather than matching each nucleotide to the target sequence
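One way to picture "compatibility" is to ask which transcripts contain every k-mer of a read, without ever placing the read base by base. The sketch below is a simplified illustration under that assumption; the transcript sequences, names and the value of k are made up, and real tools use more elaborate index structures.

```python
from collections import defaultdict


def build_index(transcripts: dict[str, str], k: int) -> dict[str, set[str]]:
    """Map every k-mer to the set of transcripts that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read: str, index: dict[str, set[str]], k: int) -> set[str]:
    """Return the transcripts compatible with all k-mers of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()  # no transcript explains every k-mer
    return compatible or set()


# Hypothetical transcripts sharing a common 5' end
transcripts = {"t1": "ATGGCGTACCTTGA", "t2": "ATGGCGTTTTTTGA"}
index = build_index(transcripts, k=5)
print(pseudoalign("ATGGCGT", index, k=5))
print(pseudoalign("CGTACCT", index, k=5))
```

A read from the shared region is compatible with both transcripts, while a read spanning the divergent region is compatible with only one.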
What is a volcano plot?
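A type of scatterplot that plots statistical significance (usually -log10 of the p-value) against fold change (usually log2), allowing identification of genes whose expression has changed with statistical significance.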
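Volcano plots conventionally place log2 fold change on the x-axis and -log10 of the p-value on the y-axis. The sketch below computes those coordinates and labels each gene; the cutoffs and the example values are assumptions chosen for illustration, not recommendations.

```python
import math


def volcano_coordinates(log2_fc: float, p_value: float) -> tuple[float, float]:
    """x = log2 fold change, y = -log10(p): small p-values plot high."""
    return log2_fc, -math.log10(p_value)


def classify(log2_fc: float, p_value: float,
             fc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Label a gene using assumed fold-change and significance cutoffs."""
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "not significant"


# Made-up results: gene -> (log2 fold change, p-value)
results = {"geneA": (2.0, 0.001), "geneB": (-1.5, 0.01), "geneC": (3.0, 0.2)}
labels = {gene: classify(fc, p) for gene, (fc, p) in results.items()}
print(labels)
```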
What is a heatmap?
Shows RNA-seq data in a grid where each row represents a gene and each column represents a sample. The colour and intensity of the boxes represent changes in gene expression.
Define peak calling
A computational method used to identify areas in a genome that have been enriched with aligned reads as a consequence of ChIP-Seq.
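The core idea, finding regions with more aligned reads than expected, can be sketched with fixed-size bins. This is a deliberately naive illustration: the bin size, fold threshold and read positions are made up, and dedicated peak callers use proper statistical models and control samples rather than a simple fold-over-mean rule.

```python
from collections import Counter


def call_peaks(read_starts: list[int], bin_size: int,
               min_fold: float) -> list[tuple[int, int]]:
    """Return bins whose read count is at least min_fold times the mean."""
    counts = Counter(pos // bin_size for pos in read_starts)
    n_bins = max(counts) + 1
    mean = len(read_starts) / n_bins  # average reads per bin
    return [
        (b * bin_size, (b + 1) * bin_size)
        for b in sorted(counts)
        if counts[b] >= min_fold * mean
    ]


# Hypothetical aligned read start positions on one chromosome
reads = [5, 12, 101, 105, 110, 115, 120, 130, 140, 150, 260, 399]
peaks = call_peaks(reads, bin_size=100, min_fold=2)
print(peaks)
```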
Why was the human genome published in 2001 incomplete? How did it get completed?
It was missing ~8% of the repetitive sequences as the BAC DNA libraries used don’t tend to represent repeat regions well. The T2T project changed this by using a combination of PacBio, Nanopore and Illumina.
What did the T2T project identify?
Centromeric regions; telomeres; rRNA repeats; entire short arms of 5 human chromosomes; 3604 new genes.
Describe sequencing-by-synthesis
Ray Wu and Dale Kaiser used DNA polymerase to add radiolabeled nucleotides to the recessed 3’ ends at the cohesive (5’-overhang) ends of the linear lambda phage genome and used analytical biochemistry to deduce the sequence. Ray Wu later used synthetic oligonucleotides as primers to direct where nucleotide incorporation began, allowing him to focus on sequencing specific regions of DNA.
Describe automated Sanger sequencing
- PCR with fluorescent, chain-terminating ddNTPs. All ddNTPs are mixed in a single reaction, and each of the four ddNTPs has a unique fluorescent label.
- Size separation by gel electrophoresis
- Laser excitation and detection by sequencing machine
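The read-out step above amounts to ordering terminated fragments by size and reading the dye on each one. A minimal sketch, assuming one detected fragment per length; the dye-to-base colour mapping and the fragment list are hypothetical.

```python
# Hypothetical dye colours; real instruments use their own dye sets
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Order (length, dye) fragments by size and read off the terminal base."""
    ordered = sorted(fragments, key=lambda frag: frag[0])
    return "".join(DYE_TO_BASE[dye] for _, dye in ordered)


# Fragments as detected, in arbitrary order: (length in nt, dye colour)
fragments = [(3, "yellow"), (1, "green"), (2, "red"), (4, "blue")]
sequence = read_sequence(fragments)
print(sequence)
```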
Describe 454 pyrosequencing
- Ds-DNA is broken up into fragments and adaptors are added.
- Tiny resin beads are added with DNA sequences on them complementary to sequences on the adaptors, allowing the DNA fragments to bind to the beads
- When the fragments attach to the beads, the strands separate and become ssDNA
- The beads are emulsified and PCR reagents are added to form water-in-oil microreactors. Clonal amplification occurs inside the microreactors which can be broken to enrich for DNA-positive beads.
- Remaining beads are put into wells on a sequencing plate (one bead per well) along with DNA pol and a primer
- The pol and primer attach to the DNA and dNTPs are added to the wells in waves of one base at a time
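- When a dNTP is incorporated, pyrophosphate (PPi) is released. ATP sulfurylase converts the PPi into ATP, which luciferase uses to convert luciferin to oxyluciferin, emitting photons. The light intensity is proportional to the number of bases incorporated in that wave.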
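Because one dNTP is flowed at a time, the light signal per flow reports how many copies of that base were added, and the sequence can be read back from the flow order. The sketch below simulates this under idealised assumptions (perfect signal, no noise); the flow order and example sequence are made up, and `synthesised` denotes the newly made strand, which is the complement of the template.

```python
def flowgram(synthesised: str, flow_order: str = "TACG",
             n_cycles: int = 1) -> list[tuple[str, int]]:
    """Simulate signals: each flow reports the homopolymer run it extends."""
    pos = 0
    signals = []
    for _ in range(n_cycles):
        for base in flow_order:
            run = 0
            # The polymerase keeps incorporating while the next base matches
            while pos < len(synthesised) and synthesised[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))  # run == 0 means no light in this flow
    return signals


def decode(signals: list[tuple[str, int]]) -> str:
    """Rebuild the synthesised strand from (base, signal) pairs."""
    return "".join(base * run for base, run in signals)


signals = flowgram("TTACGG")
print(signals)
print(decode(signals))
```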
Describe Illumina sequencing
A flow cell is coated with 2 types of oligos, complementary to the 2 adaptors on the fragment strand, respectively. Once the fragment strand is added to the flow cell, it hybridizes to one of the oligos on the cell surface. A polymerase then moves along the strand, creating its complementary DNA strand. The ds-DNA is denatured and the original strand is washed away. The remaining reverse strand folds over and its adaptor region hybridizes to the second oligo on the flow cell, forming a bridge. Pol attaches and forms a ds-bridge. This bridge is denatured, resulting in two ss-DNA copies, anchored to the flow cell. This process is then repeated, forming localized clusters on the flow cell.
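Sequencing is done by flowing fluorescently-labeled, reversibly terminated nts onto the flow cell so that one base is incorporated per cycle. After each cycle the clusters are imaged, the label and terminator are cleaved, and the process repeats.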
Describe PacBio SMRT sequencing
PacBio uses a SMRTbell library format in which DNA fragments are capped on both ends with ligated hairpin adaptors, where the sequencing primers attach. This creates a circular template for the polymerase.
The SMRT cell contains millions of tiny wells called zero-mode waveguides, and one SMRTbell goes into each of these. As the polymerase incorporates fluorescently labeled nucleotides, light is emitted and recorded in real time.