Bacterial population genetics Flashcards
How to analyse the population structure of bacteria?
Construct phylogenetic trees from DNA sequences of bacterial strains with different phenotypes.
Observe clustering (be cautious, as it may reflect physical barriers or selection)
Methods to identify selection:
- dN/dS
- Homoplasy
- functional homology
What does clustering on the phylogenetic tree represent?
Adapted lineages: Clustering may represent selection to specific niches/ hosts.
Physical barriers and bottlenecks: Clustering may represent genetic bottlenecks and physical barriers to colonization, resulting in genetic drift.
Theoretical models for molecular evolution in biology
Neutral diversification
- Most genetic variation can be explained by genetic drift
- Clusters can form due to transmission barriers
Ecotype model
- Genetic variation can be accounted for by selection resulting in adapted lineages in a given environment.
- Clusters form due to adaptation to niches.
- HGT makes the model more complicated and disrupts clustering.
Processes behind bacterial adaptation
Recombination (HGT): Genetic material is acquired from external sources and incorporated into the chromosome
- Transformation
- conjugation
- transduction
Mutation: DNA replication errors create variation
- Nucleotide substitution (genic) -> novel protein isoforms
- Indel (insertion/deletion) -> loss of function
- Nucleotide substitution (intergenic) -> variation in gene expression, activation/inactivation of gene
The relative rates of mutation and recombination are fundamental in shaping bacterial population genetic structure.
Hypermutator phenotypes
Under strong selection pressures (antibiotics / change in host), hypermutator phenotypes can emerge. These have elevated mutation rates and can therefore adapt faster.
Caused by inactivating mutations in genes involved in DNA repair
Example: In cystic fibrosis, antibiotics create a strong selection pressure within infected hosts, leading to hypermutator bacteria in lung infections.
- mutation in gene involved in repair -> mutS
- decreased virulence but increased mutation rate
- increased rate of switch from acute to chronic -> requires key mutations (Switch genes)
Population models
Clonal population model: Bacteria can be traced back to a single ancestor and there is no HGT.
Non-clonal population model: Bacteria can be traced back to multiple ancestors because HGT/recombination mixes the lineages.
Different levels of clonal signal are observed in different populations depending on the relative rates of mutation and recombination.
Strictly clonal: M. tuberculosis / Staphylococcus aureus
Fully non-clonal: Helicobacter pylori
Genomic change is not always a signal of adaptation…
Not all genetic variation is adaptive and it may reflect bottlenecking/ drift due to isolation from ancestral gene pool.
Genomic changes may also reflect the timescale of host colonisation. Recent colonisation may be associated with new opportunities for HGT (e.g. mobile genetic elements), while more ancient lineages may show evidence of reductive evolution.
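The point about drift after a bottleneck can be illustrated with a minimal Wright-Fisher style simulation. This is only a sketch under simple assumptions (one neutral allele, constant population size, no mutation or selection), and the function name `wright_fisher` and its parameters are made up for this example.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=None):
    """Track the frequency of a neutral allele under pure drift."""
    rng = random.Random(seed)
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Each individual in the next generation inherits the allele
        # with probability equal to its current frequency.
        count = sum(1 for _ in range(pop_size) if rng.random() < freq)
        freq = count / pop_size
        history.append(freq)
    return history


# Compare a bottlenecked population with a large one over the same time.
small = wright_fisher(pop_size=10, start_freq=0.5, generations=100, seed=1)
large = wright_fisher(pop_size=10000, start_freq=0.5, generations=100, seed=1)
print(small[-1], large[-1])
```

In small populations the frequency tends to wander much further from its starting value, so a lineage can become genetically distinct without any selection acting on it.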
Methods to detect selection
- dN/dS
- functional homology
- Homoplasy
Detecting selection: dN/dS
This method compares the frequency of substitutions at synonymous sites (dS), which are presumed neutral (silent), with that at non-synonymous sites (dN), which result in amino acid-replacing codon changes and may be subject to selection.
- A dN/dS <1 is associated with negative or purifying selection, which suppresses protein changes
- A dN/dS >1 is associated with positive selection, which promotes changes in the protein sequence.
- A dN/dS = 1 is consistent with neutral evolution (drift).
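As a rough illustration of how the ratio is obtained, the sketch below estimates dN and dS for two aligned coding sequences using a simplified Nei-Gojobori-style count with a Jukes-Cantor correction. The function names (`syn_sites`, `count_diffs`, `dn_ds`) are made up for this example. It assumes the standard genetic code, gap-free in-frame sequences, and that changes to stop codons count as non-synonymous. Real analyses would normally rely on an established tool rather than this.

```python
from itertools import permutations
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODE = dict(zip(CODONS, AMINO))  # standard genetic code


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    synonymous = 0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    synonymous += 1
    return synonymous / 3


def count_diffs(c1, c2):
    """Synonymous and non-synonymous differences, averaged over all paths."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    if not positions:
        return 0.0, 0.0
    sd = nd = 0.0
    paths = list(permutations(positions))
    for path in paths:
        current = c1
        for pos in path:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODE[nxt] == CODE[current]:
                sd += 1
            else:
                nd += 1
            current = nxt
    return sd / len(paths), nd / len(paths)


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    if p >= 0.75:
        return float("inf")  # saturated, distance undefined
    return -0.75 * log(1 - 4 * p / 3)


def dn_ds(seq1, seq2):
    """Return (dN, dS) for two aligned, in-frame coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned, in frame, equal length")
    syn_total = sd_total = nd_total = 0.0
    n_codons = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODE or c2 not in CODE:
            continue  # skip codons with ambiguous bases
        n_codons += 1
        syn_total += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = count_diffs(c1, c2)
        sd_total += sd
        nd_total += nd
    nonsyn_total = 3 * n_codons - syn_total
    p_s = sd_total / syn_total if syn_total else 0.0
    p_n = nd_total / nonsyn_total if nonsyn_total else 0.0
    return jukes_cantor(p_n), jukes_cantor(p_s)
```

For example, `dn_ds("ATGGGGAAAGGGCCC", "ATGGGAAAAGGGCCC")` gives a dN of zero, because the only difference (GGG to GGA) leaves the amino acid unchanged. The ratio is then dN divided by dS whenever dS is greater than zero.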
Frequently used to look at selection in within-host populations
- Strong positive selection to evade the immune system, plus drift due to bottlenecks
- In free-living populations, purifying selection tends to dominate unless the environment imposes strong selective pressures.
Limitations
- Assumes that selection acts only on protein-coding sequence. Does not consider selection on gene order, distribution of sequences, or codon usage (synonymous mutations may be under selection)
- Does not consider interactions between genes (e.g. closely linked genes may have similar dN/dS values whether or not they confer a functional advantage)
- Non-synonymous (NS) changes can be misinterpreted as synonymous (S) due to frameshifts / incorrect identification of the start codon.
- Population-wide calculations can be inaccurate, as lineages undergo independent purging of founder sequences.
Example:
- The dN/dS method was used to show that there is positive selection on heteroplasmic mitochondrial mutations during breast cancer progression.
- mtDNA mutations are involved in cell proliferation in breast cancer.
Detecting selection: Functional homology
Compare the pathogenic version with the commensal (non-pathogenic / ancestral) version to identify the functions adapted to a pathogenic lifestyle (genomic changes linked to pathogenic emergence = selected for)
Pathogenicity islands (PAIs): Chunks of the genome that confer functionality related to a pathogenic lifestyle.
- Encode virulence factors, protein secretion systems, etc.
Example: CagA found on a 40 kb PAI in Helicobacter pylori
Example: LEE (Locus of Enterocyte Effacement) in some E. coli strains allows formation of attaching and effacing lesions on enterocytes in the gut -> necessary for infection
Detecting selection: Homoplasy
Homoplasy: similar trait in different species that may be under similar selection pressures.
Homoplasy can be a signal that the similarity is not due to shared ancestry (homology) but rather convergent evolution or parallel evolution
Example: Staphylococcus aureus
- Pathogen found in humans and chickens but only causes disease and death in chickens
- Convergent evolution of S. aureus pathogenicity island and phage-related genes in divergent lineages. -> An outgroup chicken lineage has a convergent pathogenicity phenotype
Method: SNPPar is used for efficient detection and analysis of homoplasic SNPs -> combined with whole-genome sequencing (WGS) of many strains for comparison.
- Example: 3 sampled datasets (Elizabethkingia anophelis, Burkholderia dolosa and M. tuberculosis) underwent SNPPar analysis; there was evidence of both individual homoplasies and convergence at the codon and gene levels
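The idea of a homoplasic site can be sketched with Fitch parsimony: if a site needs more changes on the tree than its number of distinct states minus one, the same state must have arisen more than once. This is not SNPPar's own algorithm, just a minimal illustration. The nested-tuple tree format and the names `fitch` and `is_homoplasic` are assumptions made for this example.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes).

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping each leaf name to its base at one site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change.
        return shared, left_cost + right_cost
    # The children disagree: one change is needed on this branch pair.
    return left_set | right_set, left_cost + right_cost + 1


def is_homoplasic(tree, states):
    """True if the site changes more often than its states require."""
    _, changes = fitch(tree, states)
    minimum = len(set(states.values())) - 1
    return changes > minimum


# Hypothetical four-strain tree: (A, B) and (C, D) are sister pairs.
tree = (("A", "B"), ("C", "D"))
print(is_homoplasic(tree, {"A": "T", "B": "T", "C": "G", "D": "G"}))
print(is_homoplasic(tree, {"A": "G", "B": "T", "C": "G", "D": "T"}))
```

The first site follows the tree (one change separates the two clades), while the second needs two independent changes, which is the pattern expected under convergent or parallel evolution.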
Detecting selection: Genome-wide association studies (GWAS)
Many genomes are sequenced from two different groups (e.g. with Alzheimer’s and without), and sequences that are over-represented in one group relative to the other are identified
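Over-representation of a single variant can be checked with a 2x2 table and a one-sided Fisher's exact test, sketched below using only the standard library. The function name `fisher_one_sided` and the counts are hypothetical. A real GWAS would also need to correct for population structure and for testing many variants at once.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """P-value for the variant being over-represented in group 1.

    a: group 1 with variant      b: group 1 without variant
    c: group 2 with variant      d: group 2 without variant
    """
    group1 = a + b          # size of group 1
    with_variant = a + c    # genomes carrying the variant
    without_variant = b + d
    total = a + b + c + d
    # Sum the hypergeometric probabilities of tables at least as extreme.
    numerator = sum(
        comb(with_variant, k) * comb(without_variant, group1 - k)
        for k in range(a, min(group1, with_variant) + 1)
    )
    return numerator / comb(total, group1)


# Hypothetical counts: variant seen in 5 of 5 group 1 genomes, 0 of 5 in group 2.
print(fisher_one_sided(5, 0, 0, 5))
```

A small p-value suggests the variant is associated with group 1, though it says nothing by itself about whether the variant is causal.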
Overview
There are different models for molecular evolution in bacteria
- Neutral model
- ecotype model
Mutation and HGT generate variation in bacterial genomes and lead to adaptation.
This occurs to varying degrees leading to different models
- Clonal vs non-clonal
Must be cautious when interpreting clustering on trees, as it may be due to selection / genetic bottlenecks / isolation.
Identifying adaptive variation in the genome
- dN/dS
- Functional homology and PAIs
- Homoplasy