midterm 2 cram Flashcards
gene transfer in bacteria and archaea
foreign DNA can enter a prokaryotic cell in 3 ways (TTC)
transformation: competent cells (cells able to take up free DNA) take up free DNA and incorporate it, bringing about genetic change
- DNA from the environment is captured by pili; one DNA strand is usually degraded, and the other strand passes through the cytoplasmic membrane and into the cell via a multi-protein competence system
transduction: a virus (bacteriophage) carries DNA from one cell to another
- bacteriophage infections happen here!
- viral DNA is packaged into virions, which can bind cells and inject DNA
- in the lytic pathway, phage DNA is replicated using host resources, then the new viruses lyse the host cell and are released to infect new cells
- in the lysogenic pathway, viral DNA is integrated into the host DNA (prophage). The prophage can later be induced, triggering the lytic cycle
- temperate phages can follow either the lytic or the lysogenic pathway
2 types of transduction:
transduction recap: virus (phage) transfers DNA from one cell to another
Generalized transduction
- during the lytic cycle, host cell DNA is accidentally packaged into a viral particle
- DNA injected into new cell
Specialized transduction
- when a prophage is induced, it is sometimes excised imprecisely from the genome, so host DNA next to the integration site is packaged into phage particles along with phage DNA
- DNA can then be injected into a new cell by that phage particle
conjugation
- HGT requiring cell-cell contact
mediated by conjugative plasmids
donor cell uses a conjugative pilus to grab a recipient cell
specific DNA is replicated and transferred from donor to recipient using type 4 secretion system
genetic recombination
physical exchange of DNA between elements
homologous recombination is important
- important DNA repair mechanism used to repair double-strand breaks
- HGT
- foreign DNA with homology to a region of the host chromosome can be inserted into the genome in place of, or in addition to, the native DNA sequence
- HR is also important for genome rearrangements – deletions,
duplications, inversions of segments of genomic DNA
Transposable elements
mobile genetic elements found in almost all
species. Contain transposase gene (and often extra DNA too) flanked
by inverted repeats.
Transposase enzymes are able to:
-Recognize inverted repeats of DNA sequences
-Cleave that DNA to free “transposable element”
-Cleave another DNA (e.g. chromosomal DNA)
-Insert the transposable element into that DNA
-Wow! That’s one impressive enzyme!!!
-This process called “transposition”.
- Many transposable elements are conservative (cut and paste) – move
from one place to another. Others work via a replicative mechanism –
transposon remains and a copy is produced & inserted elsewhere
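A toy sketch of the two mechanisms, treating DNA as plain strings. The function name `transpose` and the `<Tn>` marker are made up for illustration, and real details (inverted repeat recognition, target site duplication) are not modelled.

```python
def transpose(donor, target, element, insert_at, replicative=False):
    """Move (or copy) `element` from `donor` into `target` at index `insert_at`.

    Conservative (cut and paste): the element leaves the donor.
    Replicative: the donor keeps its copy and a new copy is inserted.
    """
    start = donor.find(element)
    if start == -1:
        raise ValueError("element not found in donor DNA")
    new_target = target[:insert_at] + element + target[insert_at:]
    if replicative:
        new_donor = donor
    else:
        new_donor = donor[:start] + donor[start + len(element):]
    return new_donor, new_target


print(transpose("GGGG<Tn>CCCC", "ATATAT", "<Tn>", 3))
# ('GGGGCCCC', 'ATA<Tn>TAT')
print(transpose("GGGG<Tn>CCCC", "ATATAT", "<Tn>", 3, replicative=True))
# ('GGGG<Tn>CCCC', 'ATA<Tn>TAT')
```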
Evolution via horizontal gene transfer
Much acquired DNA will not be evolutionarily useful and will ultimately
be lost, for example through:
o Transposon or recombination-mediated processes
o Random processes/errors during DNA replication or DNA repair
Genes that provide a selective advantage will be maintained, and cells carrying them can
outcompete parental strains that lack this new DNA
Gene names and protein names
Gene names, by convention, are 4 letters. First 3 letters describe function
– 4th letter designates a specific gene.
Gene names are italicized – first three letters lower case, end with upper
case letter (btuC)
Protein names are the same, but start with an upper-case letter and are
NOT italicized (BtuC).
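A quick way to drill the naming rule, as a small sketch. The helper names are invented here, and italics cannot be shown in plain strings, so only the capitalization part of the convention is checked.

```python
def is_conventional_gene_name(name):
    """True for names like 'btuC': three lower-case letters plus one upper-case letter."""
    return (
        len(name) == 4
        and name[:3].isalpha() and name[:3].islower()
        and name[3].isalpha() and name[3].isupper()
    )


def protein_name(gene_name):
    """Turn a gene name such as 'btuC' into the protein name 'BtuC'."""
    if not is_conventional_gene_name(gene_name):
        raise ValueError("expected a gene name like 'btuC'")
    return gene_name[0].upper() + gene_name[1:]


print(protein_name("btuC"))  # BtuC
```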
mutations
Spontaneous mutations: Rely on natural mutations that arise by
random processes (see last lecture!). Need large numbers of bacteria
& powerful methods to isolate mutants of interest.
Induced mutations: Expose your organism to agents that increase
mutation rate. E.g. UV light, or various chemicals that interact with
DNA.
Transposon insertion mutations: Introduce a transposon that
inserts randomly into the genome of your organism (e.g. by
transformation or conjugation). Generally disrupts whatever gene it
inserts into. Transposon carries antibiotic resistance gene to isolate
bacteria with a Tn insertion. (See next slide)
Transposon mutagenesis
Isolating interesting mutants
In some instances, mutants of interest can be isolated by selection –
mutant grows, parent doesn’t
E.g. antibiotic resistance.
Selection is highly efficient – can identify single mutant with a desired
phenotype out of millions (or more) of cells
Auxotrophic mutants: mutants that require a specific nutrient to grow.
Transposon INsertion site sequencing (INseq)
Make a large library of transposon (Tn) mutants - lots of different
bacteria, each with one random Tn insertion.
Sequence the Tn insertion sites - the DNA immediately beside where
the transposon landed. Tells you frequency of each Tn mutant in your
library. Millions of DNA sequence reads (input population)
Expose mutant library to some challenge. Can be anything. E.g. grow
in a medium that lacks a key nutrient. Sequence insertion sites again
(output population)
Comparing input/output populations tells you which genes are important
for surviving that challenge (e.g. enzymes that make the key nutrient).
Transposon Insertion Sequencing (INseq) is a high-throughput technique used to identify essential genes and gene functions in bacterial genomes. This method combines transposon mutagenesis (where transposons are inserted randomly across the genome) with next-generation sequencing to analyze the location and frequency of transposon insertions across a bacterial population.
Key Steps in INSeq:
Transposon Mutagenesis:
A large population of bacterial cells is generated, each containing a transposon inserted at a random location in the genome.
Transposons disrupt genes upon insertion, meaning that if a transposon inserts into an essential gene, the cell may not survive, reducing the frequency of insertions in essential genes in the final dataset.
Growth and Selection:
The population of transposon-mutant bacteria is grown under specific conditions (e.g., in the presence of a stressor, nutrient limitation, or antibiotic) to identify genes important for survival or adaptation under those conditions.
Cells with transposon insertions in non-essential genes will survive, while insertions in genes essential for the selected condition will result in the loss of those cells.
DNA Extraction and Sequencing:
After selection, DNA is extracted from the surviving bacteria, and regions flanking the transposon insertion sites are amplified and sequenced.
By sequencing these regions, researchers determine the precise locations of transposon insertions across the genome.
Data Analysis:
Sequencing data are analyzed to compare insertion frequencies across the genome: regions with few or no insertions (“gaps”) may indicate essential genes, while regions with high insertion frequencies indicate non-essential genes.
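The input/output comparison can be sketched as a per-gene log2 fold change of insertion read frequencies. This is only a minimal illustration: the gene names and read counts are invented, the pseudocount and the -2 cutoff are arbitrary assumptions, and real INseq analysis uses proper statistics.

```python
import math


def log2_fold_changes(input_counts, output_counts, pseudocount=1.0):
    """Per-gene log2(output frequency / input frequency) of Tn insertion reads.

    Counts are divided by the library total so sequencing depth cancels out;
    the pseudocount avoids taking the log of zero.
    """
    in_total = sum(input_counts.values())
    out_total = sum(output_counts.values())
    changes = {}
    for gene, in_reads in input_counts.items():
        in_freq = (in_reads + pseudocount) / in_total
        out_freq = (output_counts.get(gene, 0) + pseudocount) / out_total
        changes[gene] = math.log2(out_freq / in_freq)
    return changes


def depleted_genes(changes, threshold=-2.0):
    """Genes whose mutants dropped out, i.e. genes likely important for the challenge."""
    return sorted(gene for gene, fc in changes.items() if fc <= threshold)


# Hypothetical read counts before (input) and after (output) the challenge.
input_reads = {"geneA": 1000, "geneB": 1000, "geneC": 1000}
output_reads = {"geneA": 1500, "geneB": 1490, "geneC": 10}
print(depleted_genes(log2_fold_changes(input_reads, output_reads)))  # ['geneC']
```

Mutants with insertions in geneC almost vanish from the output population, so geneC is the candidate gene needed to survive the challenge.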
DNA sequencing – First complete genome
Craig Venter was a major name in DNA
sequencing. Used “Shotgun sequencing”
– sequence random bits of DNA, let
computers figure out how it all fits
together.
Faster/more efficient than more
structured approach used originally
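The "let computers figure out how it all fits together" step can be illustrated with a toy greedy overlap assembler. This is a sketch only: the reads are invented, the function names are hypothetical, and real assemblers handle sequencing errors, repeats and both strands, which this does not.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


print(greedy_assemble(["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]))
# ['ATGGCGTACGTTAGC']
```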
DNA sequencing - Sanger
Developed by Frederick Sanger in 1970s – Nobel prize (one of his two!)
o Based on DNA polymerase building a complementary strand using: (i)
mostly normal dNTPs and (ii) rare special dNTPs that lack a 3’OH
and therefore cannot be elongated further
o Special “ddNTPs” each labelled a
different way (different fluorophores)
o Build DNAs of different lengths, each
terminated with a labelled ddNTP
o Determine sequence based on identity of
terminating residues (e.g. – 26 nt
sequence terminated with a “T”, 27 nt
sequence terminated with a “G”, etc.)
Sanger sequencing, also known as chain-termination sequencing, is a method developed by Frederick Sanger in 1977 to determine the nucleotide sequence of DNA. This technique was the gold standard for DNA sequencing for decades and remains widely used for smaller-scale sequencing tasks, such as verifying specific genes or cloning.
How Sanger Sequencing Works
DNA Replication Setup:
Sanger sequencing relies on the principles of DNA replication. The process begins by denaturing (unwinding) the DNA double strand and then synthesizing a complementary strand using a DNA polymerase enzyme.
A single-stranded DNA template, a primer, DNA polymerase, and four standard nucleotides (dATP, dTTP, dCTP, and dGTP) are required, as well as a small proportion of modified nucleotides called dideoxynucleotides (ddNTPs).
Dideoxynucleotides (ddNTPs):
The key to Sanger sequencing is the use of ddNTPs. These nucleotides lack a hydroxyl group at the 3’ carbon, meaning they cannot form a phosphodiester bond with the next nucleotide. When a ddNTP is incorporated into the growing DNA chain, it terminates synthesis at that position.
Each of the four ddNTPs (ddATP, ddTTP, ddCTP, ddGTP) is labeled with a different fluorescent dye, allowing the termination points to be identified by color.
Chain Termination and Fragment Generation:
During DNA synthesis, the polymerase randomly incorporates either a standard nucleotide or a ddNTP. This random incorporation results in a collection of DNA fragments of varying lengths, each ending at a point where a ddNTP was added.
The length of each fragment corresponds to the position of its terminating nucleotide in the newly synthesized strand.
Separation and Detection:
The fragments are then separated by capillary electrophoresis, where shorter fragments move faster and longer fragments move slower through a gel or capillary.
A laser detects the fluorescently labeled ddNTPs at the end of each fragment, and the sequence of colors detected corresponds to the DNA sequence.
Data Output:
The data is displayed as a chromatogram, where each peak represents a nucleotide in the DNA sequence. By reading the chromatogram, researchers can determine the precise sequence of the DNA.
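To see why sorting labelled fragments by length gives the sequence, here is a small simulation sketch. The termination probability, molecule count and example sequence are arbitrary assumptions, and the "read" is the newly synthesized strand (complementary to the template), not the template itself.

```python
import random


def simulate_fragments(new_strand, n_molecules=5000, dd_fraction=0.2, seed=0):
    """Simulate chain termination; return a set of (length, terminating base) pairs.

    At every position a ddNTP is incorporated with probability `dd_fraction`,
    which stops that molecule. Molecules that never terminate carry no label
    and are ignored.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            if rng.random() < dd_fraction:
                fragments.add((position, base))
                break
    return fragments


def read_sequence(fragments):
    """Order fragments from shortest to longest and read the end labels."""
    return "".join(base for _, base in sorted(fragments))


print(read_sequence({(2, "A"), (1, "G"), (3, "T")}))  # GAT
print(read_sequence(simulate_fragments("GATTACAGGCTA")))
```

Sorting by length plays the role of electrophoresis, and reading the base at each length plays the role of the laser detecting each fluorophore.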
What genes are present/absent & the sequences of each gene
o Metabolic capabilities of an organism
o Virulence genes, antibiotic resistance genes, etc
o Unusual mutations that account for unusual phenotypes
o Discover new genes that might be of medical/industrial interest
o etc…
Provides DNA blueprint required for many studies/analyses
o Genetics approaches (e.g. making mutations to genes)
o Transcriptomics, qPCR, etc – studies of RNA expression
o Proteomics – studies of proteins
o INseq (Tn-seq)
Metagenomics
Metagenomics is the study of the complete genetic content of an environmental sample
This approach allows scientists to analyze the genetic diversity and functional potential of entire microbial communities, including bacteria, viruses, fungi, and other microorganisms, in complex environments like soil, ocean water, human gut, or even air.
Key Aspects of Metagenomics
Environmental Sampling:
Instead of isolating and culturing individual organisms, metagenomics collects all DNA from an environmental sample.
DNA is extracted directly from the sample, capturing genetic material from all the organisms present.
DNA Sequencing:
Extracted DNA is sequenced using high-throughput sequencing methods (e.g., Illumina, PacBio) to obtain millions of DNA fragments.
Sequencing strategies include shotgun metagenomics (sequencing all DNA present) or amplicon sequencing (targeting specific genes, like the 16S rRNA gene in bacteria, to identify species diversity).
Bioinformatics Analysis:
Powerful bioinformatics tools are used to assemble the DNA sequences, identify organisms (taxonomic analysis), and predict gene functions.
Metagenomic data analysis can provide insights into microbial community structure, gene abundance, metabolic pathways, and potential interactions between organisms.
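As a tiny illustration of the taxonomic analysis step, community structure can be summarized as relative abundance once each read has been assigned to a taxon. The sketch below assumes that assignment has already been done by some other tool; the taxon labels are example data only.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Fraction of reads assigned to each taxon, most abundant first."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.most_common()}


# Hypothetical per-read taxon assignments from one sample.
reads = ["Bacteroides", "Bacteroides", "Escherichia", "Bacteroides"]
print(relative_abundance(reads))  # {'Bacteroides': 0.75, 'Escherichia': 0.25}
```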
Applications of Metagenomics
Environmental Microbiology:
Metagenomics is widely used to study microbial diversity and ecosystem functions in natural environments, such as oceans, soil, and extreme habitats.
It helps in understanding nutrient cycling, decomposition, and ecosystem health.
Human Health:
In the human microbiome, metagenomics reveals the composition of microbial communities in different body sites, such as the gut, skin, and mouth, and their roles in health and disease.
Metagenomics has linked certain microbial profiles to conditions like obesity, inflammatory diseases, and mental health issues.
Biotechnology and Industry:
By identifying novel enzymes and metabolic pathways, metagenomics has enabled the discovery of new biocatalysts for applications in drug development, biofuel production, and agriculture.
Environmental cleanup efforts, like bioremediation, benefit from metagenomics, as it helps identify microbes capable of degrading pollutants.
Antibiotic Resistance:
Metagenomic studies can track the spread of antibiotic resistance genes in different environments, helping to monitor and manage public health risks.
Advantages of Metagenomics
Culture-Independent: It enables the study of microorganisms that cannot be easily cultured in the lab, which is the vast majority of microbes.
Comprehensive View: Provides a holistic understanding of microbial ecosystems, capturing all genetic material, including viruses and rare species.
Functional Insight: Metagenomics reveals not only what organisms are present but also the genes and metabolic functions they contribute to their environment.
Transcriptomics: RNA-seq
RNA can be converted to DNA using a process called reverse
transcription
Why Convert RNA to cDNA?:
DNA is more stable than RNA, making it easier to handle and sequence.
The reverse transcription step enables researchers to study the full range of RNA in the cell, including both coding and non-coding RNA molecules, by leveraging sequencing technologies optimized for DNA.
Reverse transcription is, therefore, a critical step in RNA-seq, enabling the study of gene expression and transcript structure from the RNA present in cells at any given time.
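At the sequence level, reverse transcription just means building the complementary DNA strand. A minimal sketch (the function name is made up; both sequences are written 5' to 3', which is why the RNA is read in reverse):

```python
DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the first-strand cDNA (5'->3') for an RNA written 5'->3'."""
    return "".join(DNA_PARTNER[base] for base in reversed(rna.upper()))


print(reverse_transcribe("AUGGCU"))  # AGCCAT
```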
Proteomics
mass spectrometry!!!! to identify proteins/protein levels
Proteomics seeks to identify, quantify, and characterize proteins to understand their functions, interactions, and roles in cellular processes.
Transcription initiation:
Promoters
Transcriptional initiation is guided by DNA sequences called promoters - DNA sequences bound by factors that promote
transcriptional initiation.
Reside upstream of (before) genes.
Whether or not a sequence acts as a promoter & if the promoter is active
is dictated by binding of sigma factors (next slide) & regulatory
proteins (future lectures) to the promoter region.
Transcriptional initiation:
Sigma factors
Transcription uses an enzyme called RNA polymerase
A special subunit of RNA polymerase called a sigma factor binds DNA
as an essential step in initiating transcription
Bacteria encode multiple different sigma factors that are produced under
different conditions. They recognize different sequences (promoters) –
see next slide.
The housekeeping (most commonly used/most important) sigma factor is called σ70 (or RpoD) – it recognizes two sequences upstream of the transcriptional start site (the -35 and -10 regions).
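A rough sketch of scanning DNA for a σ70-type promoter. It assumes the textbook E. coli consensus sequences (TTGACA at -35 and TATAAT at -10, separated by 16 to 18 bp), which are not given in these notes; real promoters rarely match the consensus exactly, so an exact-match search like this misses most of them.

```python
import re

# Assumed consensus: -35 box, 16-18 bp spacer, -10 box.
SIGMA70_PATTERN = re.compile(r"(TTGACA)([ACGT]{16,18})(TATAAT)")


def find_consensus_promoters(dna):
    """Return (start of -35 box, start of -10 box) for each exact consensus match."""
    return [(m.start(1), m.start(3)) for m in SIGMA70_PATTERN.finditer(dna.upper())]


dna = "GG" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "CCCCCC"
print(find_consensus_promoters(dna))  # [(2, 25)]
```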
Transcription in bacteria: A bit more detail
RNA polymerase core enzyme made up of 5
subunits: α (2 copies), β, β’, ω. Holoenzyme
also includes σ (sigma) subunit (sigma factor)
o Sigma factor binds promoter region, then
dissociates from core enzyme
o Core enzyme unwinds DNA to expose template
- forms transcription bubble
o Using NTPs (ATP, CTP, GTP, UTP) as
substrates and the template strand as guide, the RNA chain is built one nucleotide at a time
o Ultimately, RNA polymerase will encounter a transcriptional terminator (see next slide) and will dissociate from the template & release the RNA
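The "template strand as guide" idea in code form, as a minimal sketch. The function name is invented; the template is written 3'->5' so the RNA comes out 5'->3', one complementary nucleotide at a time.

```python
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Build the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(RNA_PARTNER[base] for base in template_3to5.upper())


print(transcribe("TACCGAATT"))  # AUGGCUUAA
```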
Termination of transcription
Transcription will (generally) continue until RNA polymerase (RNAP)
encounters a transcriptional terminator. RNAP then dissociates from
DNA, stops making RNA & releases transcript
Intrinsic (rho-independent) terminators: the RNA forms a hairpin structure followed by a string of “U” residues. The U residues act as a pause signal for RNAP, and formation of the hairpin forces RNAP off the template.
Rho-dependent terminators: A protein called Rho binds RNA as it is being transcribed and causes RNA polymerase to dissociate after it encounters certain sequences
Types of transcripts
There are 3 major classes of RNAs
Messenger RNA (mRNA) – converted to protein via translation
Transfer RNA (tRNA) – functional RNAs, used in translation process
Ribosomal RNA (rRNA) - functional RNAs, used in translation process
mRNAs contain both Open reading frames (ORFs) and untranslated regions (UTRs). ORFs are translated to protein, UTRs are parts of the mRNA transcript that are not translated into protein
mRNAs that encode multiple ORFs are polycistronic. Such genes are arranged in an operon. Genes in an operon are cotranscribed.
In a simple mRNA encoding a single open reading frame (ORF):
5’UTR – everything from the first transcribed residue (+1) up to (but not including) the start codon of the gene. Contains ribosome binding site (RBS), more?
ORF – Start codon (e.g. AUG) through stop codon (e.g. UAA)
3’ UTR – everything after the stop codon of the gene through the final transcribed residue. Often contains transcriptional terminator sequences
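The three parts can be pulled out of a simple mRNA with a short sketch. Big assumption, labelled here and in the code: the first AUG is treated as the start codon, whereas a real ribosome picks the start using the RBS. The example sequence is invented.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA (5'->3') into 5'UTR, ORF and 3'UTR.

    Assumes the first AUG is the start codon (a simplification).
    Returns None if there is no AUG or no in-frame stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):  # walk codon by codon
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return {"utr5": mrna[:start], "orf": mrna[start:end], "utr3": mrna[end:]}
    return None


print(split_mrna("GGAGGUUCACCAUGGCUUAACCGCUUUU"))
# {'utr5': 'GGAGGUUCACC', 'orf': 'AUGGCUUAA', 'utr3': 'CCGCUUUU'}
```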
Some ways transcription is different in eukaryotes
Location:
Eukaryotes: Transcription occurs in the nucleus, where DNA is separated from the cytoplasm by a nuclear membrane. After transcription, the mRNA undergoes processing and is transported out of the nucleus for translation in the cytoplasm.
Prokaryotes: Transcription occurs in the cytoplasm since they lack a nucleus, allowing transcription and translation to occur simultaneously.
RNA Polymerases:
Eukaryotes: Have three main RNA polymerases (RNA Polymerase I, II, and III), each dedicated to transcribing different types of RNA:
RNA Polymerase I synthesizes rRNA (ribosomal RNA).
RNA Polymerase II synthesizes mRNA (messenger RNA).
RNA Polymerase III synthesizes tRNA (transfer RNA) and other small RNAs.
Prokaryotes: Use a single RNA polymerase to transcribe all types of RNA.
Promoters and Transcription Factors:
Eukaryotes: Promoters are complex and often include a TATA box along with other regulatory sequences. Transcription initiation requires general transcription factors (TFs)!!!!! and specific activator proteins !!!!! that help RNA polymerase bind to the promoter.
Prokaryotes: Promoters are simpler, typically with -10 and -35 regions recognized by sigma factors!!!!!. These sigma factors are the main initiation proteins, binding RNA polymerase to the promoter directly.
mRNA Processing:
Eukaryotes: The primary mRNA transcript (pre-mRNA) undergoes extensive processing:
5’ Capping: A modified guanine nucleotide is added to the 5’ end for stability and recognition.
3’ Polyadenylation: A poly-A tail is added to the 3’ end to protect mRNA from degradation.
Splicing: Introns (non-coding sequences) are removed, and exons (coding sequences) are joined together to produce a mature mRNA.
Prokaryotes: mRNA is typically not processed, as prokaryotic protein-coding genes generally lack introns. Transcription and translation can occur simultaneously.
Regulation Complexity:
Eukaryotes: Gene expression regulation is complex and occurs at multiple levels (epigenetic, transcriptional, post-transcriptional). Transcription can be influenced by DNA packaging (chromatin structure) and modifications like histone acetylation and methylation.
Prokaryotes: Regulation is generally simpler, primarily at the transcriptional level, often via operons (gene clusters with a single promoter). Gene expression is regulated by repressors, activators, and sigma factors.
Termination:
Eukaryotes: Termination is less defined; transcription may continue well past the end of the gene, and the transcript is cleaved at a specific site, followed by polyadenylation.
Prokaryotes: Termination is more straightforward and occurs by two main mechanisms: Rho-dependent termination, where the Rho protein causes RNA polymerase to dissociate, and Rho-independent (intrinsic) termination, where the RNA forms a hairpin structure that signals transcription to stop.
mRNA Longevity:
Eukaryotes: mRNA is generally more stable, with some transcripts lasting hours or even days, thanks to 5’ caps and poly-A tails.
Prokaryotes: mRNA is typically less stable and degrades rapidly (within minutes), allowing prokaryotes to quickly adjust gene expression in response to environmental changes.
Transcription in Archaea
1. RNA Polymerase:
Archaea: Have a single RNA polymerase that closely resembles eukaryotic RNA polymerase II in structure and function, containing multiple subunits.
Similarity to Eukaryotes: Archaeal RNA polymerase is more complex than bacterial RNA polymerase and includes homologous subunits to those found in eukaryotic RNA polymerase II.
Similarity to Bacteria: Like bacteria, Archaea have only one type of RNA polymerase responsible for transcribing all types of RNA (mRNA, tRNA, and rRNA).
2. Promoters and Transcription Factors:
Promoter Structure: Archaeal promoters often contain TATA boxes and BRE (B recognition element) sequences, similar to eukaryotes.
Transcription Factors: Instead of bacterial sigma factors, Archaea use transcription factors TBP (TATA-binding protein) and TFB (transcription factor B), which are homologous to the eukaryotic transcription factors involved in recruiting RNA polymerase to the promoter.
Similarity to Eukaryotes: These transcription factors bind to the promoter in a manner similar to eukaryotic transcription initiation, where TBP binds to the TATA box and TFB binds to the BRE sequence.
3. mRNA Processing:
Lack of Extensive Processing: Archaeal mRNA generally does not undergo extensive processing like eukaryotic mRNA does. There is usually no 5’ capping, polyadenylation, or splicing since Archaea lack introns in most genes.
Exception: Some Archaea have introns in tRNA and rRNA genes, which are spliced out, but this is far less common than in eukaryotes.
4. Transcription Regulation:
Regulatory Proteins: Archaea use regulatory proteins, such as repressors and activators, similar to bacteria, to control gene expression.
Similar to Bacteria: Many regulatory mechanisms in Archaea resemble bacterial mechanisms, with transcriptional repressors and activators binding directly to DNA to either inhibit or enhance transcription.
Unique to Archaea: Some Archaea have regulatory proteins that are specific to their domain, with functions adapted to extreme environments (e.g., high temperatures, acidic conditions).
5. Transcription Termination:
Mechanisms: Archaeal transcription termination is less well understood but may involve sequences or structures similar to those in bacteria. Evidence suggests that some Archaea have termination sequences resembling the Rho-independent termination of bacteria.
Similarity to Eukaryotes and Bacteria: Unlike bacteria, Archaea do not have a distinct Rho factor; termination often relies on intrinsic mechanisms encoded in the DNA sequence.