TB7 Flashcards
What did Gregor Mendel discover in 1866?
Transmission of characteristics (pea plants)
What did Friedrich Miescher identify in 1869?
Nuclein (DNA)
What did Albrecht Kossel show in 1881?
That nuclein is composed of nitrogen bases and named it DNA
What did Boveri and Sutton show in the 1900s?
Chromosomes bear the material of heredity (the chromosome theory of inheritance)
What did Oswald Avery show in 1944?
DNA is the ‘transforming principle’ - DNA from one bacterial strain could confer characteristics on another strain
What did Rosalind Franklin generate in 1951?
Diffraction patterns of DNA to show its helical form
What did Watson and Crick propose in 1953?
A model for DNA structure based on Rosalind Franklin’s data
What did Robert Holley do in 1964?
Used RNase to partially fragment RNA and coupled this to analytical techniques to determine the sequence of yeast Ala-tRNA
What did Fred Sanger develop in 1965?
2D fractionation methods that allowed larger and more complex fragments of RNAs to be analysed and sequenced
Compare bench-top sequencing platforms with facility-based
Bench-top (NextSeq500)
- ~400 million reads per run
- ChIP-seq and RNA-seq
- Reads up to 300bp
Bench-top (NextSeq2000)
- ~1.2 billion reads per run
- ChIP/RNA-seq and small genomes
- Reads up to 300bp
Facility (NovaSeq6000)
- ~2 billion reads per run
- multiplexed samples
- large genomes
- Reads up to 500bp
Define a contig
A set of overlapping DNA segments that together represent a consensus region of DNA
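A minimal sketch of how overlapping segments can be joined into one contig, assuming the fragments are already in the right order and overlap exactly (real assemblers handle sequencing errors, unknown order and both strands). The function names `overlap` and `merge_in_order` are illustrative only.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_in_order(fragments, min_len=3):
    """Join ordered fragments into one contig by collapsing each exact overlap."""
    contig = fragments[0]
    for frag in fragments[1:]:
        n = overlap(contig, frag, min_len)
        if n == 0:
            raise ValueError(f"no overlap of at least {min_len} bases with {frag!r}")
        contig += frag[n:]  # append only the part that is not already in the contig
    return contig


fragments = ["ATGGCGT", "GCGTACC", "TACCTTGA"]
print(merge_in_order(fragments))  # ATGGCGTACCTTGA
```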
Describe a GTF file
Holds information about the structure of genes. The coordinates of mapped reads are projected onto a GTF file of the feature of interest
Describe a feature count
Counts the reads assigned to a gene in a stranded manner, producing 3 files: gene length, counts and summary files.
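A toy illustration of stranded read counting, assuming genes and reads are given as simple (chromosome, start, end, strand) tuples. This is a sketch of the idea, not the implementation used by any particular counting tool; the `count_reads` name and the data layout are hypothetical.

```python
def count_reads(genes, reads):
    """Count reads per gene, requiring the same chromosome, same strand and an overlap.

    Reads that hit no gene, or more than one gene, are tallied as unassigned.
    """
    counts = {name: 0 for name in genes}
    unassigned = 0
    for chrom, start, end, strand in reads:
        hits = [
            name
            for name, (g_chrom, g_start, g_end, g_strand) in genes.items()
            if g_chrom == chrom and g_strand == strand and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        else:
            unassigned += 1
    return counts, unassigned


genes = {"geneA": ("chr1", 100, 200, "+"), "geneB": ("chr1", 150, 300, "-")}
reads = [
    ("chr1", 110, 160, "+"),  # geneA
    ("chr1", 160, 210, "-"),  # geneB (geneA overlaps, but on the other strand)
    ("chr1", 120, 170, "-"),  # geneB
    ("chr1", 400, 450, "+"),  # no gene
]
print(count_reads(genes, reads))  # ({'geneA': 1, 'geneB': 2}, 1)
```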
Describe a pseudoalignment
Measures compatibility with a transcript rather than matching each nucleotide to the target sequence
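A rough sketch of the compatibility idea: index every k-mer of each transcript, then intersect the transcript sets hit by a read's k-mers. It assumes exact k-mer matches and treats a k-mer that is missing from the index as incompatible, which is a simplification; `build_kmer_index` and `pseudoalign` are illustrative names.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts that contain every k-mer of the read (no base-level alignment)."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if not hits:
            return set()  # simplification: an unseen k-mer makes the read incompatible
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()


transcripts = {"tx1": "ACGTACGTGA", "tx2": "ACGTACCTGA"}
index = build_kmer_index(transcripts, k=4)
print(sorted(pseudoalign("ACGTAC", index, k=4)))  # ['tx1', 'tx2']
print(sorted(pseudoalign("TACGTG", index, k=4)))  # ['tx1']
```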
What is a volcano plot?
A type of scatterplot that allows identification of genes that have changed with statistical significance.
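As a sketch of what goes into a volcano plot, the coordinates can be computed as log2 fold change (x) against -log10 adjusted p-value (y). The cutoffs below (|log2FC| >= 1, adjusted p < 0.05) are common conventions assumed here rather than taken from the card, and the function assumes every adjusted p-value is greater than zero.

```python
import math


def volcano_points(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Turn (gene, log2FC, adjusted p) rows into (gene, x, y, significant) points."""
    points = []
    for gene, log2fc, padj in results:
        y = -math.log10(padj)  # small p-values end up high on the plot
        significant = abs(log2fc) >= lfc_cutoff and padj < p_cutoff
        points.append((gene, log2fc, y, significant))
    return points


results = [("g1", 2.5, 0.001), ("g2", 0.2, 0.0001), ("g3", -1.8, 0.2)]
for gene, x, y, significant in volcano_points(results):
    print(gene, x, round(y, 2), significant)
```

Only g1 passes both cutoffs: g2 is statistically significant but barely changed, and g3 changes strongly but without statistical support.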
What is a heatmap?
Shows RNAseq data in a grid where each row represents a gene and each column represents a sample. The colour and intensity of the boxes represent changes in gene expression.
Define peak calling
A computational method used to identify areas in a genome that have been enriched with aligned reads as a consequence of ChIP-Seq.
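A deliberately simple sketch of the idea, assuming read counts have already been binned along the genome for a ChIP sample and a matched control of equal length. Real peak callers use statistical models; the fold-enrichment threshold and pseudocount here are arbitrary illustrative choices.

```python
def call_peaks(chip, control, min_fold=4.0, pseudocount=1.0):
    """Return (first_bin, last_bin) runs where ChIP signal is >= min_fold over control."""
    peaks, start = [], None
    for i, (c, b) in enumerate(zip(chip, control)):
        enriched = (c + pseudocount) / (b + pseudocount) >= min_fold
        if enriched and start is None:
            start = i                      # a new peak opens at this bin
        elif not enriched and start is not None:
            peaks.append((start, i - 1))   # the peak closed at the previous bin
            start = None
    if start is not None:                  # a peak that runs to the last bin
        peaks.append((start, len(chip) - 1))
    return peaks


chip = [2, 3, 20, 25, 18, 2, 1, 30]
control = [2, 2, 3, 2, 3, 2, 2, 4]
print(call_peaks(chip, control))  # [(2, 4), (7, 7)]
```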
Why was the human genome published in 2001 incomplete? How did it get completed?
It was missing ~8% of the genome, mostly repetitive sequences, as the BAC DNA libraries used don’t tend to represent repeat regions well. The T2T project changed this by using a combination of PacBio, Nanopore and Illumina sequencing.
What did the T2T project identify?
Centromeric regions; telomeres; rRNA repeats; entire short arms of 5 human chromosomes; 3604 new genes.
Describe sequencing-by-synthesis
Ray Wu and Dale Kaiser used DNA polymerase to add radiolabeled bases onto 3’ overhangs in linear lambda phage genomes and used analytical biochemistry to deduce sequence. Ray Wu later used synthetic oligonucleotides to guide where he could prime incorporation of nucleotides to allow him to focus on sequencing specific regions of DNA.
Describe automated Sanger sequencing
- PCR with fluorescent, chain-terminating ddNTPs. All ddNTPs are mixed in a single reaction, and each of the four ddNTPs has a unique fluorescent label.
- Size separation by gel electrophoresis
- Laser excitation and detection by sequencing machine
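The read-out logic of the steps above can be sketched as follows, assuming each terminated fragment is summarised as a (length, terminal base) pair where the base is inferred from the dye colour. The `read_sanger` name and the data layout are hypothetical.

```python
def read_sanger(fragments):
    """Order fragments by length (as electrophoresis does) and read the terminal bases."""
    return "".join(base for _, base in sorted(fragments))


# Fragments arrive in no particular order; each ends in one labelled ddNTP.
fragments = [(3, "G"), (1, "A"), (4, "T"), (2, "C")]
print(read_sanger(fragments))  # ACGT
```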
Describe 454 pyrosequencing
- Ds-DNA is broken up into fragments and adaptors are added.
- Tiny resin beads are added with DNA sequences on them complementary to sequences on the adaptors, allowing the DNA fragments to bind to the beads
- When the fragments attach to the beads, the strands separate and become ssDNA
- The beads are emulsified and PCR reagents are added to form water-in-oil microreactors. Clonal amplification occurs inside the microreactors which can be broken to enrich for DNA-positive beads.
- Remaining beads are put into wells on a sequencing plate (one bead per well) along with DNA pol and a primer
- The pol and primer attach to the DNA and dNTPs are added to the wells in waves of one base at a time
- When a dNTP is incorporated, PPi is released and converted into ATP by ATP sulfurylase. Luciferase then uses the ATP to convert luciferin to oxyluciferin, which emits photons.
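Because one base type is flowed at a time, the light signal per flow is roughly proportional to how many copies of that base were incorporated in a row. A minimal decoding sketch, assuming a repeating T, A, C, G flow order and signals already normalised so that one incorporation is about 1.0 (both are assumptions for illustration):

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Convert per-flow light intensities into the synthesised sequence."""
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]  # which dNTP was flowed this round
        bases.append(base * round(signal))      # intensity ~ homopolymer length
    return "".join(bases)


signals = [1.02, 0.05, 2.1, 0.0, 0.97, 1.1, 0.03, 3.05]
print(decode_flowgram(signals))  # TCCTAGGG
```

Rounding becomes unreliable for long homopolymers, where neighbouring intensities are hard to tell apart.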
Describe Illumina sequencing
A flow cell is coated with 2 types of oligos, complementary to the 2 adaptors on the fragment strand, respectively. Once the fragment strand is added to the flow cell, it hybridizes to one of the oligos on the cell surface. A polymerase then moves along the strand, creating its complementary DNA strand. The ds-DNA is denatured and the original strand is washed away. The remaining reverse strand folds over and its adaptor region hybridizes to the second oligo on the flow cell, forming a bridge. Pol attaches and forms a ds-bridge. This bridge is denatured, resulting in two ss-DNA copies, anchored to the flow cell. This process is then repeated, forming localized clusters on the flow cell.
Sequencing is done by flowing fluorescently-labeled, reversibly terminated nts onto the flow cell so that one base is incorporated and imaged per cycle, in an iterative process.
Describe PacBio SMRT sequencing
PacBio uses a SMRTbell library format in which DNA fragments are capped on both ends with ligated hairpin adaptors, where the sequencing primers attach. This creates a circular template for the polymerase.
The SMRT cell contains millions of tiny wells called zero-mode waveguides and one SMRTbell will go into each of these. As the polymerase incorporates nucleotides, light is emitted.
Describe Oxford nanopore sequencing
The biological or solid-state membrane, where the nanopore is found, is surrounded by an electrolyte solution. The membrane splits the solution into two chambers. A bias voltage is applied across the membrane, inducing an electric field.
When DNA or a protein enters the nanopore, it occupies a volume that partially restricts the flow of ions, observed as a drop in current. Different bases can be identified by their characteristic current drop.
Describe polysome profiling
Separates RNA molecules by centrifugation into fractions which are compared by RNA seq. RNAs more highly represented in the polysome fractions are presumed to be more actively translated (more ribosomes bound).
Describe ribosome footprinting
RNase is used to digest exposed RNA while leaving ribosome-protected RNA undigested. Sequencing of the protected RNA reveals both the density and location of ribosomes. This can be modified to select and enrich for initiating ribosomes over elongating ribosomes e.g., QTI-seq and TCP-seq
Describe PARS-seq
Parallel samples are cleaved in vitro with either a ds-specific or ss-specific RNase. The RNA that remains after digestion is converted to cDNA and sequenced. Overlay and comparison of RNAseq data allows structure to be inferred.
Describe SHAPE-seq
SHAPE-Seq provides structural information about RNA. In this method, a unique barcode is first added to the 3’ end of RNA, and the RNA is allowed to fold under pre-established in vitro conditions. The barcoded and folded RNA is treated with a SHAPE reagent, 1-methyl-7-nitroisatoic anhydride (1M7), which blocks RT. The RNA is reverse-transcribed to cDNA. Deep sequencing of the cDNA provides single-nucleotide sequence information for the positions occupied by 1M7. The structural information of the RNA can then be deduced.
Describe SPLASH-seq
Interacting RNAs are cross-linked with biotinylated psoralen and enriched via streptavidin pull-down. Proximity ligation joins the free ends of the interacting RNAs. Further fragmentation is followed by RNA adaptor ligation and circularization to prepare an RNA-seq library. This shows intramolecular and intermolecular interactions.
Describe run-on assays (e.g., GRO-seq)
Label RNA by adding a time-limited pulse of modified nucleotides into cell media. e.g., GRO-seq uses BrU. After incorporation, nascent-RNA strands are enriched by IP with antibodies specific to BrU.
Describe mNET-seq
Pol-II associated RNAs are pulled down after chromatin digestion. During digestion, the nascent RNA is protected by its Pol-II footprint.
Describe SLAM-seq
Similar to a run-on but uses the nucleotide analogue 4sU. Alkylation of 4sU after RNA extraction prompts misincorporation of G nucleotides during RT, allowing 4sU incorporation sites to be directly determined by mutational analysis.
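Since G is placed opposite the alkylated 4sU during RT, a labelled position shows up in the sequenced read as a C where the reference has a T. A minimal sketch of that mutational analysis, assuming the read is already aligned to the reference with no gaps (the `tc_conversions` name is illustrative):

```python
def tc_conversions(reference, read):
    """Return positions where a reference T reads as C, and the fraction of T sites converted."""
    t_sites = [i for i, base in enumerate(reference) if base == "T"]
    converted = [i for i in t_sites if read[i] == "C"]
    rate = len(converted) / len(t_sites) if t_sites else 0.0
    return converted, rate


reference = "ATGTTCAT"
read = "ATGCTCAC"
print(tc_conversions(reference, read))  # ([3, 7], 0.5)
```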
Describe ChIP-seq
- Proteins are fixed to the DNA
- Fragment
- IP with antibodies
- Crosslink reversal
- Adaptor ligation for library prep
Describe MNase sequencing
This technique relies on the use of the non-specific endo-exonuclease micrococcal nuclease, derived from bacteria, to bind and cleave protein-unbound regions of DNA. If DNA is bound, it remains undigested so can be sequenced using NGS.
Describe ATAC-sequencing
The hyperactive mutant Tn5 transposase cleaves and tags dsDNA with sequencing adaptors. The tagged DNA fragments are then purified, PCR-amplified, and sequenced using NGS. The number of reads correlates with how open the chromatin is.
Describe DNase-sequencing
DNA is treated with DNase I followed by DNA extraction and sequencing. DNA bound by protein is protected from digestion. NGS provides accurate representation of the location of regulatory proteins in the genome.
Describe bisulfite sequencing
Treatment of DNA with bisulfite converts C to U, but leaves 5meC unaffected. NGS can then show methylation status and identify CpG islands.
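After PCR the converted U is read as T, so methylation can be called by comparing each reference C with the base seen in the read. A minimal sketch, assuming a gap-free alignment of one read to the same strand of the reference (the `call_methylation` name is illustrative):

```python
def call_methylation(reference, bisulfite_read):
    """Classify each reference C as methylated (still C) or unmethylated (now T)."""
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(reference, bisulfite_read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[pos] = "methylated"      # 5meC is protected from conversion
        elif read_base == "T":
            calls[pos] = "unmethylated"    # C -> U by bisulfite, read as T
    return calls


reference = "ACGTCCGA"
read = "ACGTTCGA"
print(call_methylation(reference, read))
# {1: 'methylated', 4: 'unmethylated', 5: 'methylated'}
```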
Describe Hi-C
DNA-protein complexes are crosslinked and fragmented. Loci that interact are re-ligated via proximity ligation whilst unligated fragments are removed. Pull-down and PCR preps for libraries.
The relative abundance of these products is correlated to the probability that the respective chromatin fragments interact in 3D space across the cell.
Hi-C enables ‘all-vs-all’ profiling by labeling fragments with biotin for more efficient purification.
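The counting step can be sketched as a symmetric contact matrix, assuming each ligation product has been reduced to a pair of genomic positions on one chromosome and the chromosome is split into fixed-size bins (the bin size and the `contact_matrix` name are illustrative):

```python
def contact_matrix(pairs, n_bins, bin_size):
    """Count ligation products between every pair of genomic bins."""
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # contacts are unordered, so keep the matrix symmetric
    return matrix


pairs = [(100, 2500), (150, 2700), (1200, 1300), (2900, 50)]
for row in contact_matrix(pairs, n_bins=3, bin_size=1000):
    print(row)
# [0, 0, 3]
# [0, 1, 0]
# [3, 0, 0]
```

Higher counts between two bins suggest those regions are more often close together in 3D space.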
Describe 4sU-seq
4sU is rapidly taken up by cells, phosphorylated, and incorporated into newly transcribed RNA. This RNA can be tracked and isolated, or used to study the half-life and degradation of RNA.
Describe TT-seq
A variant of 4sU-seq that combines a 4sU labeling pulse with RNA fragmentation. The labeled, newly-synthesized RNA fragments are purified and sequenced, resulting in less contaminating non-labeled RNA.
Describe Net-seq
Detects nascent RNA through capture of 3’RNA. The RNAPII elongation complex is IP’d, and RNA is extracted and RT’d to cDNA. NGS of cDNA allows for 3’end sequencing of nascent RNA to map transcripts.
Describe Pro-seq
A run-on reaction is carried out with biotin-NTPs. Incorporation of these halts elongation and the RNA can be extracted via streptavidin pull-down. Adaptors are added, followed by RT and amplification for NGS.
Describe GeoMx spatial profiler
- tissue is frozen and stained with immunofluorescent biomarkers and probes, or antibodies linked to photocleavable tags.
- region of interest is selected
- oligo collection via photocleavable tags
- hybridization to optical fluorescent barcodes and these are counted using the nCounter system
- data output and analysis
Describe nanostring counting
Automated platform based on fluorescently labelled reporter probes which can detect single molecules and count hundreds of targets in a single reaction. Without the biases of PCR amplification, it can be used with poor quality RNA samples.
Complexes are hybridized to probes, bound to a cartridge and aligned to form ‘barcodes’.
Describe a nuclear run-on
Stall RNAP and incorporate labeled nucleotides before letting RNAP continue. Isolate labeled RNA and hybridize to target antisense sequences that represent regions in loci of interest.
Describe microarrays
Have immobilized cDNA strands on a glass slide. Gene-specific cDNA strands are spotted on the array and each of these spots represents a specific gene.
For DGE, RNA isolated from 2 different conditions are both RT’d and labeled with a different fluorophore. The labeled cDNAs are then hybridized to the probes on the array.
Explain the concept of ‘DNA combing’
These approaches use sequential pulse-labeling and stretching of DNA on glass slides for imaging to quantify features of replication
Describe repli-seq
Cells are pulsed with BrdU to label ongoing replication. FACS is used to separate cells by replication stage, and libraries of DNA fragments are generated from each fraction. DNA that was replicated at defined times of the cell cycle can be located using IP and massively-parallel sequencing.
Describe OK-seq
Can determine where replication occurs:
1. EdU pulse
2. Centrifuge for Okazaki fragments
3. ‘Click’ biotin, isolate, ligate adaptors and sequence
Inflections in the Okazaki-fragment signal between the two strands indicate where initiation occurs
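One common way to summarise this strand bias is replication fork directionality, RFD = (C - W) / (C + W), where C and W are Okazaki-fragment read counts on the Crick and Watson strands in a genomic bin. The sketch below assumes that definition and the convention that an upward switch from negative to positive RFD marks an initiation site; the counts are made up for illustration.

```python
def rfd(crick, watson):
    """Replication fork directionality per bin; 0.0 where a bin has no reads."""
    return [(c - w) / (c + w) if (c + w) else 0.0 for c, w in zip(crick, watson)]


def upward_crossings(values):
    """Indices of bins where the signal switches from negative to non-negative."""
    return [i for i in range(1, len(values)) if values[i - 1] < 0 <= values[i]]


crick = [10, 20, 60, 90, 80]    # illustrative read counts per bin
watson = [90, 80, 40, 10, 20]
profile = rfd(crick, watson)
print([round(v, 2) for v in profile])  # [-0.8, -0.6, 0.2, 0.8, 0.6]
print(upward_crossings(profile))       # [2]
```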
Describe SCAR-seq
Similar to OK-seq but separates old and new nucleosomes based on methylation state instead of centrifuging. Can then look at the DNA in either the old or new categories to see if there’s bias towards the leading or lagging strand.
Describe chOR-seq
Compare a nascent and mature sample to study chromatin kinetics.
Describe single-cell RNA seq
Mechanical and/or enzymatic digestion of tissue sample into single cells. Enrich samples via FACS, isolating single cells via microfluidic approaches. Cells are encapsulated with microparticles in droplets, e.g., Drop-seq. Cells are rapidly lysed and mRNA is captured and barcoded before the droplets are broken and the captured mRNA is RT’d for library prep.
Describe MeRIP/m6A sequencing
RNA is fragmented before an antibody is used to IP m6A. A pull-down is used to enrich for m6A sequences for sequencing.