L1-4: Overview of Systems Biology and Bioinformatics Flashcards
Every individual harbours ____ genetic variant sites.
4-5 million
Transcriptome
The full set of RNA molecules in one cell or a population of cells (i.e. expressed genes).
The aim of transcriptomic experiments is to identify ______.
differentially expressed genes
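As a minimal sketch of what "differentially expressed" can mean in practice, the Python below computes a log2 fold change between two conditions. The gene names, read counts, pseudocount and the threshold of 1.0 are illustrative assumptions, not values from the course. A real analysis would also normalise for library size and test significance across replicates.

```python
import math

def log2_fold_change(control, treated, pseudocount=1.0):
    """Log2 ratio of mean treated to mean control counts.

    The pseudocount (an assumption of this sketch) avoids taking log of zero.
    """
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))

def call_differential(counts, threshold=1.0):
    """Keep genes whose absolute log2 fold change reaches the threshold."""
    changes = {gene: log2_fold_change(c, t) for gene, (c, t) in counts.items()}
    return {gene: lfc for gene, lfc in changes.items() if abs(lfc) >= threshold}

# Hypothetical read counts per gene: (control replicates, treated replicates)
counts = {
    "geneA": ([7, 7, 7], [31, 31, 31]),     # up-regulated
    "geneB": ([50, 52, 48], [49, 51, 50]),  # unchanged
    "geneC": ([63, 63, 63], [15, 15, 15]),  # down-regulated
}

differential = call_differential(counts)
print(differential)
```

With these made-up counts, geneA and geneC pass the threshold and geneB does not.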
Aim of the Human Cell Atlas project
Identify (based on the transcriptome) and locate every cell type in the human body
Proteomics can be divided into which four main aspects?
Sequence
Structural
Functional and interaction
Expression
Genomics
Study of the function and structure of a genome
Genome
The complete set of all genes, regulatory sequences and non-coding regions within an organism’s DNA
Contig
A set of overlapping DNA segments that together represent a consensus region of DNA.
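The idea of overlapping segments forming a contig can be sketched in Python as below. This assumes the fragments are error-free and already ordered left to right, which real assemblers cannot assume. The function names and the `min_overlap` parameter are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def merge_in_order(fragments, min_overlap=3):
    """Merge fragments, already ordered left to right, into a single contig."""
    contig = fragments[0]
    for fragment in fragments[1:]:
        size = overlap_length(contig, fragment, min_overlap)
        if size == 0:
            raise ValueError("no overlap: fragments belong to separate contigs")
        # Append only the part of the fragment that extends the contig
        contig += fragment[size:]
    return contig

# Made-up fragments; each shares four bases with the next
fragments = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
contig = merge_in_order(fragments)
print(contig)
```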
Isolation of total genomic DNA - steps
- Mechanical disruption of cells/tissues (homogeniser, bead beater)
- Lysis of host cells (detergents such as SDS)
- Separation of DNA: enzymatic digestion of proteins, adsorption of the DNA to (and release from) a chromatographic matrix (resin), and deproteinisation of the DNA solution with organic solvents (phenol/chloroform)
- Precipitation of DNA with ethanol or isopropanol
Construction of a shotgun library
- Collections of short segments of DNA generated by digestion of genomic DNA with restriction enzymes (representing the entire genome) are ligated into vector plasmids.
- Millions of different recombinant molecules are generated and these are propagated in bacteria or yeast.
Sanger Sequencing
When DNA polymerase incorporates a dideoxynucleotide (ddNTP), chain elongation stops, because the ddNTP lacks the 3'-OH group needed to add the next nucleotide.
The dideoxynucleotides for each of the four bases can each have a different fluorescent label so the 4 reactions can be run in the same tube.
The reaction is run on a polyacrylamide gel and fluorescence detected by an automated sequencing machine.
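The chain-termination logic above can be sketched in Python as follows. This is a simplification under stated assumptions: the primer is ignored, termination happens once at every position, and the template string and function names are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesised_strand(template):
    """Strand copied from the template, written 5'->3' (its reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def terminated_fragments(template):
    """Each chain-terminated product as (length, labelled terminal base)."""
    new_strand = synthesised_strand(template)
    return [(length, new_strand[length - 1])
            for length in range(1, len(new_strand) + 1)]

def read_by_size(fragments):
    """Order fragments by length, as electrophoresis does, and read the labels."""
    return "".join(base for _, base in sorted(fragments))

# Hypothetical template, written 5'->3'
template = "TACGGA"
# Reverse the list to show that the input order of fragments does not matter
fragments = terminated_fragments(template)[::-1]
sequence = read_by_size(fragments)
print(sequence)
```

Sorting by fragment length recovers the synthesised strand one labelled base at a time, which is what the automated sequencer reads off the gel.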
Next-generation whole-genome sequencing (WGS) techniques
Illumina sequencing
Roche 454
Ion Torrent
Illumina Sequencing
100-150bp reads are used
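Fragments are ligated to adapters and annealed to a slide (flow cell). PCR (bridge amplification) is carried out to form clusters, and the copies are separated into single strands.
The slide is flooded with fluorescently labelled, reversibly terminated nucleotides and DNA polymerase, so only one base is added per cycle.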
An image is taken: in each read location there will be a fluorescent signal indicating the base that has been added
Terminators are removed, allowing the next base to be added. The process is repeated, adding one nucleotide at a time with imaging in between.
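The cycle described above (add one terminated base, image, remove the terminator, repeat) can be sketched in Python. The template strings are made up and are assumed to be written in the direction the polymerase reads them. The "N" for exhausted templates is a convention of this sketch only.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequencing_by_synthesis(templates, cycles):
    """Return one read per cluster, extended by exactly one base per cycle."""
    reads = ["" for _ in templates]
    for cycle in range(cycles):
        # Imaging step: every cluster shows the base it just incorporated
        image = [COMPLEMENT[t[cycle]] if cycle < len(t) else "N"
                 for t in templates]
        # Terminators are then removed, so the next cycle can add one more base
        reads = [read + signal for read, signal in zip(reads, image)]
    return reads

# Hypothetical templates, one per cluster on the slide
templates = ["TACG", "GGCA", "ATTC"]
reads = sequencing_by_synthesis(templates, cycles=4)
print(reads)
```

All clusters are read in parallel, and read length equals the number of cycles.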
Roche 454 sequencing
- DNA is fragmented, adapters added, annealed to beads and amplified by PCR. Each bead is placed in a single well of a slide.
- The slide is flooded with one of the four dNTPs. When a nucleotide is incorporated, pyrophosphate is released and converted enzymatically into a light signal.
- The dNTP is washed away, the next dNTP is added, and the process is repeated, cycling through the four dNTPs.
Ion Torrent: Proton/PGM Sequencing
- Ion Torrent does not make use of optical signals. It relies on the release of an H+ ion each time a dNTP is incorporated.
- DNA is fragmented, adapters added and one molecule is placed on a bead and amplified by PCR. Each bead is placed in a single well of a slide. The slide is flooded with one of the four dNTPs.
- The pH is measured in each well. The release of an H+ ion lowers the pH.
- The dNTPs are washed away and the process is repeated, cycling through the dNTPs.
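Flowing one dNTP at a time, as described above, gives a signal per flow rather than per base. The Python sketch below models that under idealised assumptions: the signal is taken to be exactly the number of bases incorporated, with no noise. The flow order "TACG", the example strand and the function names are assumptions for illustration.

```python
def flowgram(strand_to_build, flow_order="TACG", cycles=2):
    """Signal for each nucleotide flow: number of bases incorporated (0 if none)."""
    position = 0
    signals = []
    for _ in range(cycles):
        for nucleotide in flow_order:
            incorporated = 0
            # A run of identical bases is incorporated within a single flow
            while (position < len(strand_to_build)
                   and strand_to_build[position] == nucleotide):
                incorporated += 1
                position += 1
            signals.append((nucleotide, incorporated))
    return signals

def base_call(signals):
    """Rebuild the sequence from the per-flow signal strengths."""
    return "".join(nucleotide * count for nucleotide, count in signals)

# Hypothetical strand that the polymerase will synthesise
signals = flowgram("TTAGGC")
print(signals)
print(base_call(signals))
```

A run such as "TT" produces one stronger signal rather than two separate ones, which is why homopolymer lengths are harder to call with flow-based methods.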
Methods for long sequence reads
Nanopore technology
SMRT sequencing
Nanopore technology
- A protein nanopore is set in an electrically resistant polymer membrane.
- An ionic current is passed through the nanopore by setting a voltage across this membrane.
- If an analyte passes through the pore or near its aperture, this event creates a characteristic disruption in current
SMRT sequencing
Based on DNA replication:
A fluorescent label on the terminal phosphate of each nucleotide is detected when DNA polymerase incorporates that nucleotide into the DNA
The two most commonly used high-throughput methods of measuring the transcriptome are:
microarrays and RNA sequencing
Advantages of RNA-Seq over microarrays
Comprehensive: microarrays require known sequences and an annotated genome, whereas RNA-seq does not.
Microarrays only reveal information about the sequences represented by their probes (typically ORFs).
RNA-seq covers the entire transcriptome
Detects novel transcripts
Identifies structural variations (gene fusions and alternative splicing)
Illumina, Roche 454 and Ion Torrent generate ____ bp reads
100
RNA-seq workflow
- Library preparation: isolation of RNAs and generation of cDNA, selection of fragment size, and addition of linkers (adapters)
- Illumina paired-end sequencing
Experimental considerations when designing RNA-seq experiments
Library construction: choosing the population of RNA to use
Sequencing depth (required read number and coverage)
Number of technical and biological replicates
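The link between read number and coverage can be sketched with the usual approximation: mean coverage = number of reads x read length / size of the target. The Python below assumes uniform sampling and counts each read of a pair separately. The 3,000,000,000 bp target and 30x depth are example figures, not recommendations from the course.

```python
import math

def mean_coverage(n_reads, read_length, target_size):
    """Average number of times each base of the target is sequenced."""
    return n_reads * read_length / target_size

def reads_needed(coverage, read_length, target_size):
    """Smallest number of reads that reaches the requested mean coverage."""
    return math.ceil(coverage * target_size / read_length)

# Example figures only: 150 bp reads, 30x depth, 3,000,000,000 bp target
n_reads = reads_needed(coverage=30, read_length=150, target_size=3_000_000_000)
print(n_reads)
print(mean_coverage(n_reads, read_length=150, target_size=3_000_000_000))
```

For RNA-seq the target is the expressed transcriptome rather than the genome, and coverage varies strongly between highly and weakly expressed genes, so depth is often stated simply as reads per sample.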