Module 7: Microbial Genomics (Variability) Flashcards
Comparative Genomics
The study of evolutionary relationships among organisms based upon DNA sequence (using genomic tools)
Comparative Genomics provides insights into (5):
1) Phylogeny of all life
2) Relationships between species
3) Fundamental processes influencing microbial diversity and evolution
4) Differences between strains of a species
5) Identification of genes for virulence + pathogenicity
Homologs
Genes that share a common evolutionary descent (same ancestral gene) with one another (Evolutionarily linked genes)
Duplication Events
A type of mutation in which a region of DNA containing a gene is duplicated
== 2 copies of the same gene occur within the genome at once!
Paralogs
Homologs that arose from a duplication event of an ancestral gene WITHIN a lineage
How are paralogs able to form from duplication events?
Because when two of the same gene exist within the genome, only ONE must carry out its original function for a given cell to survive
–> As such, one of the copies is “free” to evolve a new function
Paralog Families
Genes (paralogs) that have a similar function but act upon different substrates and thus yield different specific products
What is an example of a paralog family?
ABC transporters
–> All of them have a similar function/are same KIND of protein but each will allow for the transport of different molecules
Orthologs
Homologs (related genes) that have evolved from the same ancestor and have the SAME function in TWO DIFFERENT species
–> Became separated via speciation!
What genes of two species genomes are assumed to be orthologs?
How do we truly test for orthologs?
Genes in 2 genomes with HIGH levels of sequence similarity
True test = whether or not the 2 genes have the same function!
What is an example of paralogs vs example of orthologs (with dehydrogenases)?
Paralogs = Malate dehydrogenase + Lactate dehydrogenase in the SAME genome!
(Many cells have multiple dehydrogenases that act upon different substrates as a result of evolution from duplication event)
Orthologs = Malate dehydrogenase + malate dehydrogenase in two different species genomes!
Horizontal Gene Transfer
HGT
== Sharing of genetic info by microbes
Why are genomes considered as “mosaics”?
Because current genomes have arisen via both evolutionary changes and horizontal gene transfers
What is a main piece of evidence that HGT may have occurred between species?
G-C Content (% GC)
(Genomic Base Pair Composition)
Genomic Base Pair Composition
The ratio or proportion of A-T and G-C base pairs out of the total number of base pairs
(An indicator that a gene or genomic region may have been transferred via HGT)
% GC
(GC content)
The % of the total # of base pairs that are GC in the entire genome
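Worked example: a minimal sketch of the % GC calculation, assuming the genome (or gene) is available as a plain string of bases. The function name gc_percent is a hypothetical helper, not part of any library.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the result.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence: 6 of its 10 bases are G or C.
print(gc_percent("ATGCGCGCTA"))
```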
Why can GC content be used to determine if HGT may have occurred between 2 species?
Because GC content is typically unique per genome + does not vary much WITHIN a single genome
(= any region of a genome that has a different GC content than the majority of the genome is likely not from the organism)
Examples of GC content values for E. coli, Streptomyces, and S. cerevisiae
E. coli = 50% GC
Streptomyces = 72% GC
S. cerevisiae = 38% GC
Genes/regions exhibiting a significant difference in GC content could indicate…
Genes/regions exhibiting a significant difference in GC content could indicate that the specific gene had a distinct evolutionary history from the rest of the genome (and as such was likely transferred)
Strain A (50% GC) and Strain B (70% GC) both have GENE X (and their versions of this gene have a similar sequence), BUT gene X in Strain B has a GC content of 50%. What does this suggest?
Suggests that:
1) Gene X has a distinct evolutionary history == likely not from this organism == Likely a transferred gene!
2) Gene X was likely transferred to Strain B from Strain A!
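Worked example: a rough sketch of how a region with atypical GC content could be flagged by scanning a genome in fixed windows. The toy genome, the window size, and the 10-point cutoff are all illustrative assumptions, and atypical_windows is a hypothetical helper rather than an established tool.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the A/C/G/T bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def atypical_windows(genome: str, window: int, step: int, max_diff: float):
    """Return (start, window_gc) for each window whose GC% differs from
    the genome-wide GC% by more than max_diff percentage points."""
    genome_gc = gc_percent(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        window_gc = gc_percent(genome[start:start + window])
        if abs(window_gc - genome_gc) > max_diff:
            flagged.append((start, window_gc))
    return flagged


# Toy "Strain B": a 70% GC background with a 100 bp, 50% GC insert
# standing in for a transferred Gene X (made-up sequences).
background = "GGCCGGCATA" * 30   # 300 bp at 70% GC
insert = "GCAT" * 25             # 100 bp at 50% GC
toy_genome = background + insert + background

print(atypical_windows(toy_genome, window=100, step=100, max_diff=10.0))
```

Only the window covering the insert is reported. A flagged window is a hint of a distinct evolutionary history, not proof of transfer.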
Other than GC content, what other pieces of evidence are there for HGT events?
1) Differences in nucleotide pair patterns
2) Differences in codon usage patterns
3) Presence of repetitive sequences
4) Gene phylogeny
(When the evolutionary relationship predicted by a gene's DNA sequence does NOT match the one predicted by SSU rRNA)
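Worked example: a simplified sketch of comparing codon usage patterns between a candidate gene and host genes. It uses raw codon frequencies and a sum of absolute differences; both are assumptions made for illustration (real analyses usually compare synonymous codons per amino acid). The sequences are made up, and the function names are hypothetical.

```python
from collections import Counter


def codon_frequencies(cds: str) -> dict:
    """Return the relative frequency of each codon in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def usage_distance(freq_a: dict, freq_b: dict) -> float:
    """Sum of absolute frequency differences over all codons seen in
    either set: 0 means identical usage, 2 means no codons in common."""
    codons = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(c, 0.0) - freq_b.get(c, 0.0)) for c in codons)


host = codon_frequencies("CTGCTGCTGGCG")        # Leu as CTG, Ala as GCG
candidate = codon_frequencies("TTATTATTAGCA")   # Leu as TTA, Ala as GCA

print(usage_distance(host, host))        # identical usage
print(usage_distance(host, candidate))   # same amino acids, different codons
```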
What is a limitation of predicted HGT with GC content analysis?
It may lead to a lack of recognition of HGT occurring between genomes with similar GC content
–> GC content does NOT need to be anomalous in order for HGT to occur
GC content does NOT need to be ________________ for HGT to occur
In actuality…
–> GC content does NOT need to be anomalous in order for HGT to occur
In actuality, HGT is most successful between genomes with similar GC contents!
Genomic Islands
DNA segments of about 10-200 kb that are transferred from one species to another
What are genomic islands typically associated with?
1) tRNA genes
2) Transposable elements
3) Plasmids/bacteriophages
How were genomic islands discovered?
(What was observed?)
They were discovered during comparisons of sequences of related microbes
== Observed many large regions of DNA that were completely present in one of the related species while completely absent in the other
–> Suggests transfer of DNA!
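Worked example: a toy sketch of the comparison described above, assuming each genome has already been reduced to an ordered list of gene identifiers. The gene names and the min_genes cutoff are hypothetical. Real comparisons work on aligned sequences rather than name lists.

```python
def absent_blocks(ordered_genes_a, genes_b, min_genes=3):
    """Return runs of at least min_genes consecutive genes in genome A
    that are all absent from genome B (candidate island regions)."""
    present_in_b = set(genes_b)
    blocks, current = [], []
    for gene in ordered_genes_a:
        if gene not in present_in_b:
            current.append(gene)          # extend the current absent run
        else:
            if len(current) >= min_genes:
                blocks.append(current)    # close a run that is long enough
            current = []
    if len(current) >= min_genes:         # a run may end at the last gene
        blocks.append(current)
    return blocks


# Made-up gene orders for two related strains.
strain_a = ["g1", "g2", "x1", "x2", "x3", "g3", "y1", "g4"]
strain_b = ["g1", "g2", "g3", "g4"]

print(absent_blocks(strain_a, strain_b))
```

The contiguous block x1-x3 is reported, while the isolated gene y1 falls below the cutoff.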
A single genome sequence does NOT exist for microbe species because…
Extensive HGT = highly variable genome!
Metagenomics
Process by which DNA is extracted directly from microbial communities and analyzed as a composite mixture
–> Application of genomic tools for study of microbial communities
What are metagenomic methods typically used for?
To study the genomes + features of UNCULTIVATED microbes
What are some limitations of metagenomics?
1) Targets only a SUBSET of a microbial community (too big of a task to analyze each independent genome in a collected sample)
2) Does not lend itself to confident predictions (of the uncultivated microbes being studied)
3) Does not provide info on microbial interactions (within the community being analyzed)
In what way do metagenomics NOT lend to “confident predictions”?
For uncultivated microbes, their DNA sequences (collected via metagenomics) are not enough to produce predictions about their physical features
What are benefits of metagenomics? (2)
1) Functional Metagenomics (study of gene functions within a metagenomic library)
(Leads to…)
2) Discovery of novel enzymes (many of which have applications in biotech)
Functional Metagenomics
Experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations
–> Clones in the metagenomic library are SCREENED for specific enzymatic functions or proteins produced!
What is Single-Cell Genomics and FACS?
Single-Cell Genomics = Field that studies the unique characteristics and genome of individual cells
FACS = Fluorescence-Activated Cell Sorting
–> A method (used in single-cell genomics) that can be used to recover individual uncultivated cells to amplify and sequence their genomes
For what microbes is FACS utilized?
For microbes that have been detected only via single genes!
Define Metatranscriptomics + Metaproteomics
Metatranscriptomics = DIRECT analysis of RNA from the environment (no cloning)
Metaproteomics = Analysis of environmental proteins DIRECTLY (no cloning)