Midterm 2 Flashcards
How were bacteria originally classified on Haeckel’s tree? Why?
Moneres (bacteria) were placed very low, near the “trunk” of the tree; they are prokaryotic and were thus considered less important
What were the three major Kingdoms of Haeckel’s tree of life?
Plantae, Protista (unicellular eukaryotes), Animalia
Describe the tree of life that has Kingdom Monera at the bottom (LUCA)
Monera < Protista < Plantae, Fungi, Animalia (each diverged from Protista)
Describe Carl Woese’s contributions to the tree of life
Quantitatively related organisms based on numerical data (shared 16S SSU rRNA genes) instead of morphology
Why were the 16S SSU rRNA genes used to relate organisms?
- universal to all organisms
- does not undergo horizontal gene transfer (conjugation)
- highly conserved (functionally constant and low mutation rate)
- number of mutations acts as a molecular clock to determine divergence time between species
What is the “current” hypothesis for the 3 Domain phylogeny?
Divergence at LUCA into Bacteria and the common ancestor of Archaea and Eukarya, which then diverged into those two branches
Describe the relatedness between Bacteria, Archaea, and Eukarya
Bacteria is the oldest, Archaea and Eukarya are most closely related
What were the overall contributions of Carl Woese?
- quantitative approach to the tree of life
- discovered Archaea
- proposed the 3 Domain system instead of the 5 Kingdoms
- implied that prokaryotes are not evolutionary artifacts; they have evolved alongside us
What is the trend in the number of bacteria species discovered versus time?
Has increased exponentially in part due to next gen sequencing techniques
How many phyla of bacteria have we discovered thus far?
~80, although 1500 is the predicted number
What are the four major phyla in Bacteria? What makes them “major”?
- Proteobacteria
- Actinobacteria
- Firmicutes
- Bacteroidetes
80% of characterized genera (cultured in lab) belong to these phyla
If more phyla could be comfortably cultured, how would our current major phyla change?
If more could be cultured, there would be more than 80 phyla and likely a greater makeup of “major” genera (our current ones would likely be diluted)
What is the most diverse and abundant phylum of cultured bacteria? Why?
Proteobacteria; can survive in culture, have many different metabolic strategies (anoxygenic, chemotrophs, autotrophs, lithotrophs, symbiotic, planktonic, etc.)
What are the classes of Proteobacteria?
Alpha, Beta, Gamma, Delta, Epsilon, and Zeta
Compare the divergence of plastids and mitochondria. Where did mitochondria come from?
Plastids diverged much sooner than mitochondria did; mitochondria diverged from Alphaproteobacteria
Describe the polyphasic approach to describe species
- morphology (cell shape, visible structures, chemical composition, etc.)
- metabolism (energy source [organo-, photo-, litho-, chemo-], carbon source [hetero-, auto-], oxygen requirements, etc.)
- genotype (use DDH or ANI to compare genomes of related species)
- evolution (tree constructed based on SSU rRNA)
How is DDH used to compare genomes of related species?
Known and unknown DNA is hybridized and the degree to which they hybridize determines their relatedness (more hybridized = closer relation)
What is the ANI % cutoff for different species?
Anything below 93% is considered a different species
What is the 16S rRNA sequence similarity % cutoff for different species?
Anything below 97% is a different species
What % of ANI and 16S rRNA must organisms share to be considered the same species?
ANI: above 96%
16S rRNA: above 98.5%
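The cutoffs on the last few cards can be sketched as a small lookup. This is only an illustration: the function and constant names are made up here, and the "inconclusive" label for values that fall between the two cutoffs is an assumption, not something the cards state.

```python
# Cutoffs taken from the flashcards: (different-species cutoff, same-species cutoff), in percent.
ANI_CUTOFFS = (93.0, 96.0)
RRNA_16S_CUTOFFS = (97.0, 98.5)


def classify(similarity, cutoffs):
    """Classify a percent similarity against a (lower, upper) cutoff pair."""
    lower, upper = cutoffs
    if similarity < lower:
        return "different species"
    if similarity > upper:
        return "same species"
    # Assumption: values between the two cutoffs are left undecided.
    return "inconclusive"


print(classify(98.2, ANI_CUTOFFS))       # ANI above 96%
print(classify(92.0, ANI_CUTOFFS))       # ANI below 93%
print(classify(98.0, RRNA_16S_CUTOFFS))  # 16S between 97% and 98.5%
```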
A high 16S rRNA gene sequence similarity indicates what about the two organisms being compared?
They are close evolutionary neighbours (diverged recently)
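A minimal sketch of how a sequence similarity percentage might be computed, assuming the two 16S sequences have already been aligned to the same length with "-" marking gaps. The function name and the choice to skip gapped columns are illustrative assumptions; real tools make these choices in different ways.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped positions where two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: columns containing a gap are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Toy 10-base alignment with one mismatch at the last position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```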
What are the three requirements for the formal validation of a new prokaryotic species?
- Detailed description of characteristics/traits (morphology, metabolism, genotype, evolution)
- Deposition of viable cultures of the organism in at least two international culture collections
- Proposal of a Latin name and publication in the IJSEM (must be this journal)
How does the prokaryotic taxonomic hierarchy differ from the eukaryotic one?
Lacks Kingdoms, replaces them with phyla
What does the genome of a bacterium include?
The chromosomal DNA and plasmids
What is meant by the suffix -ome?
Implies global, collective, totality
What are the four basic -omic sciences?
DNA (genome), RNA (transcriptome), proteins (proteome), and metabolism (metabolome)
What is meant by the prefix meta-?
Implies beyond or transcending; usually looks at the omics of a microbial community rather than a single strain (analysis of at least two genomes)
What is genome annotation?
Converting raw sequence data into a list of genes and other functional sequences present in the genome
What is bioinformatics?
Analyzing sequences and structures of nucleic acids and proteins
What are the steps for genomics?
- Sequencing
- Genome assembly
- Genome annotation
- Bioinformatics
Which step within genomics is considered the bottleneck of the process? Why?
Bioinformatics; slowest process because it takes the most work (analyze data)
What are the two sequencing techniques and how do they compare?
Sanger: more time consuming, fewer bases at a time
Next-gen: greater throughput at a much lower (100,000x) cost, faster
What is a closed vs. a draft genome?
Closed: every bp is known and sequenced, more expensive
Draft: most of the genome is sequenced except for repeats. Only know enough to distinguish between species
What is a contig?
A consensus sequence
What are ORFs?
DNA regions between a start and stop codon predicted to be read by ribosomes on mRNA. Identifying them requires searching 6 reading frames
How are the functions of ORFs predicted?
Searching similar sequences in DNA databases such as GenBank using BLAST
What are the steps for computer identification of possible ORFs?
- Computer finds possible start codons
- Computer finds possible stop codons
- Computer counts codons between start and stop (filters out shorter sequences)
- Computer finds possible RBS (upstream of start)
- Computer calculates codon bias in ORF
- Computer decides if ORF is likely to be genuine
- List of probable ORFs
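The first three steps above (find starts, find stops, filter by length) can be sketched as follows across all 6 reading frames. This is a simplified illustration, not how any particular annotation tool works: it only accepts ATG as a start codon, it skips the RBS and codon bias checks, and the names and the tiny `min_codons` default are assumptions chosen for the toy example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=3):
    """Return (strand, frame, sequence) for each ATG...stop ORF in 6 frames.

    min_codons counts codons before the stop; real pipelines use a far
    larger cutoff than this toy default.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon seen in this frame
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:  # filter out short ORFs
                        orfs.append((strand, frame, seq[start:i + 3]))
                    start = None
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC"))
```

In the toy sequence the only ORF sits on the forward strand in frame 2 (ATG AAA TTT GGG followed by the TAA stop).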
What is codon bias?
The tendency of an organism’s DNA sequence to prefer one of several synonymous (degenerate) codons for an amino acid (ex. E. coli prefers CGU, while fruit flies prefer CGC)
How is codon bias used to distinguish genomic regions?
Some species will have a preference for certain codons for an amino acid, so those codons are more likely to appear in ORFs
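A rough sketch of measuring codon bias: count how often each synonymous codon is used within an ORF. The example uses the six arginine codons written as DNA (CGT corresponds to CGU in mRNA). The function names and the toy ORF are made up for illustration.

```python
from collections import Counter

# The six arginine codons, written as DNA.
ARGININE_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")


def codon_counts(orf):
    """Count codons in an ORF, reading in frame from the first base."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    return Counter(orf[i:i + 3] for i in range(0, usable, 3))


def synonymous_fractions(orf, codons):
    """Fraction of uses of each codon among the given synonymous set."""
    counts = codon_counts(orf)
    total = sum(counts[c] for c in codons)
    if total == 0:
        return {}
    return {c: counts[c] / total for c in codons}


# Toy ORF: ATG CGT CGT CGC AAA CGT TAA
fractions = synonymous_fractions("ATGCGTCGTCGCAAACGTTAA", ARGININE_CODONS)
print(fractions)
```

A region whose preferred codons differ sharply from the rest of the genome is less likely to be a genuine native ORF.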
What percentage of total ORFs detected have a clearly identified function? Why?
70% or less, as many genes are misidentified
What are hypothetical proteins?
Uncharacterized ORFs that encode proteins that likely exist but whose function is currently unknown
What is a major cause of hypothetical proteins?
Lack sufficient amino acid sequence homology with known proteins for identification (cannot be compared to known proteins)
When are hypothetical proteins most common?
Common in genomes of uncultured environmental bacteria, as they cannot be intensively studied
Errors via the annotation telephone are caused by what?
Predicting the function of a gene based on similarity to genes that encode characterized enzymes
What evidence suggests horizontal gene flow is common amongst prokaryotes?
- presence of genes typically found in only distantly related species
- presence of DNA with GC content or codon bias that differs significantly from the rest of the genome
- presence of mobilome genes (transposons, integrases, insertion sequences)
- often encode resistance, virulence functions (non-essential genes)
- often occur in clusters known as Genomic Islands within the genome
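One of the signals above, atypical GC content, can be sketched with a sliding window. This is a toy illustration: the window size, step, and tolerance are arbitrary assumptions, and real genomic island detection combines several lines of evidence.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=1000, step=500, tolerance=0.10):
    """Return (start, gc) for windows whose GC differs from the genome-wide
    GC fraction by more than `tolerance`."""
    overall = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - overall) > tolerance:
            flagged.append((start, gc))
    return flagged


# Toy "genome": AT-rich sequence with a short GC-rich insert in the middle.
toy_genome = "AT" * 50 + "GC" * 10 + "AT" * 50
print(atypical_windows(toy_genome, window=20, step=20, tolerance=0.5))
```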
What are the three pathways of horizontal gene transfer?
Transformation, Transduction, and Conjugation
How many genes can be located in 1Mb of DNA? 1 kb?
1000 genes, 1 gene
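The rule of thumb on this card (about 1 gene per 1 kb) as a quick calculation. The constant and function names are illustrative, and the 4.6 Mb input is just a hypothetical genome size.

```python
AVERAGE_GENE_LENGTH_BP = 1000  # rule of thumb from the card: ~1 gene per kb


def estimate_gene_count(genome_size_bp):
    """Rough gene count for a prokaryotic genome of the given size in bp."""
    return genome_size_bp // AVERAGE_GENE_LENGTH_BP


print(estimate_gene_count(1_000_000))  # 1 Mb
print(estimate_gene_count(1_000))      # 1 kb
print(estimate_gene_count(4_600_000))  # hypothetical 4.6 Mb genome
```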
What is the trend in ORF number as genome size increases?
Positive linear
What does a small genome size (~120,000bp) imply about the lifestyle of the organism?
It is endosymbiotic and dependent on other organisms for survival (may not encode its own ribosomes, etc)
What does a medium genome size (~600,000 - 1,000,000bp) imply about the lifestyle of the organism?
It is parasitic and depends on other organisms for the completion of its life cycle, but may be able to survive for short periods of time on its own
What does a large genome size (>1,200,000bp) imply about the lifestyle of the organism?
It is a free-living organism that is independent
Why are metagenomes so complex?
Since they combine multiple organisms, there are many repeats shared between different genomes. Closely related strains are particularly problematic. Rare genomes also commonly have low coverage (many sequencing gaps)
What do cells need to grow?
- water
- carbon source
- macro + micronutrients
- energy source
- reducing power
What are the two types of carbon sourcing? Explain them
- heterotrophs acquire carbon from existing organic molecules
- autotrophs acquire carbon from generating their own organic molecules out of CO2
What are the three types of energy sourcing? Explain them
- phototrophs acquire energy from light
- chemolithotrophs acquire energy from inorganic molecules
- chemoorganotrophs acquire energy from organic molecules
Catabolism
Energy-releasing metabolic functions that break down molecules
Anabolism
Energy-requiring metabolic functions that synthesize new molecules
Compare delta G to delta G naught
delta G describes actual cellular conditions, whereas delta G naught describes the standardized conditions of a lab
Exergonic reactions
Negative delta G, release free energy, spontaneous
Endergonic reactions
Positive delta G, require free energy, non-spontaneous (require ATP)
What is reducing power?
The ability to donate electrons during a reaction. The more negative the redox potential, the better the molecule will lose electrons (become oxidized) and vice versa
What is the best reducing agent? Why?
Glucose. Has the most negative E
What is the best oxidizing agent? Why?
Oxygen. Has the most positive E
An electron donor is:
Oxidized, a reducing agent
An electron acceptor is:
Reduced, an oxidizing agent
How do you calculate the delta E of a redox reaction?
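E (oxidizing agent / electron acceptor) - E (reducing agent / electron donor) = delta E (positive when electrons flow from a more negative donor to a more positive acceptor)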
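A worked example, using the convention delta E = E (electron acceptor) - E (electron donor). Two things here are not on the cards and are added as assumptions: the relation delta G = -n x F x delta E, and the standard textbook potentials for NADH (-0.32 V) and oxygen (+0.82 V). Function names are illustrative.

```python
FARADAY_KJ_PER_VOLT_MOL = 96.485  # Faraday constant in kJ per volt per mole


def delta_e(e_acceptor, e_donor):
    """Potential difference (V): acceptor potential minus donor potential."""
    return e_acceptor - e_donor


def delta_g(n_electrons, e_acceptor, e_donor):
    """Free energy change (kJ/mol), assuming delta G = -n * F * delta E."""
    return -n_electrons * FARADAY_KJ_PER_VOLT_MOL * delta_e(e_acceptor, e_donor)


# Assumed textbook values: NADH donates 2 electrons to O2.
print(round(delta_e(0.82, -0.32), 2))     # about 1.14 V
print(round(delta_g(2, 0.82, -0.32), 1))  # about -220 kJ/mol
```

A large positive delta E gives a large negative delta G, which matches the card on exergonic reactions.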
Compare coenzymes and cofactors
Coenzymes are organic molecules used to aid enzymes in their reactions. Cofactors are metallic ions used to aid enzymes in their reactions. They are not seen as reactants
What are three good energy conservation molecules? Why?
PEP (phosphoenolpyruvate), ATP, and Acetyl-CoA; hydrolysis of their energy-rich bonds has a large negative delta G
What are the three energy conservation mechanisms?
- substrate-level phosphorylation
- oxidative phosphorylation
- photophosphorylation
What is substrate-level phosphorylation?
Phosphate groups from organic molecules are transferred to ADP
What is oxidative phosphorylation?
Electron flow generates PMF for chemiosmosis using inorganic phosphates to make ATP (ATP synthase)
What is photophosphorylation?
Photons (light) power the formation of PMF for chemiosmosis