GWAS and How We Can Use It To Better Understand Bacteria Flashcards

1
Q

What is a Genome-Wide Association Study (GWAS)?

A

A GWAS tests genetic variants (mutations, insertions, deletions) across the genomes of many individuals to identify associations between specific genetic loci and phenotypic traits, including diseases

2
Q

What is the key principle behind GWAS

A

Genetic variants that contribute to a trait or disease will occur more frequently in individuals with that trait (cases) than in those without (controls)

3
Q

What technologies are used in GWAS

A
  • Whole-genome sequencing (WGS)
  • DNA microarrays

Both are used to identify common variants across individuals

4
Q

How is data analysed in GWAS

A

Statistical analysis compares allele frequencies between cases and controls. Variants significantly more frequent in cases are flagged as potentially linked to the trait
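
The comparison described above can be sketched for a single variant with a one-sided Fisher's exact test on a 2x2 table of carriers and non-carriers. This is only an illustrative sketch: the function name and the example counts are hypothetical, and real GWAS tools use more elaborate models.

```python
from math import comb

def fisher_exact_one_sided(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Probability of seeing at least `case_alt` variant carriers among the
    cases if the variant were unrelated to the trait (hypergeometric tail)."""
    n_cases = case_alt + case_ref          # number of cases
    n_alt = case_alt + ctrl_alt            # number of variant carriers overall
    n_total = n_cases + ctrl_alt + ctrl_ref
    denom = comb(n_total, n_cases)
    p = 0.0
    for k in range(case_alt, min(n_cases, n_alt) + 1):
        p += comb(n_alt, k) * comb(n_total - n_alt, n_cases - k) / denom
    return p

# Hypothetical counts: 3 of 4 cases carry the variant, 1 of 4 controls does.
print(fisher_exact_one_sided(3, 1, 1, 3))
```

A small result means the variant is enriched in cases more than chance alone would easily explain; with only eight individuals, as here, the evidence is weak.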

5
Q

What is the significance of the p-value in GWAS

A

The p-value is the probability of seeing an association at least as strong as the one observed if the variant had no real link to the trait (i.e. by chance alone). Lower p-values provide stronger evidence of a genuine association
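
A permutation test makes the idea of a p-value concrete: shuffle the case/control labels many times and count how often chance alone produces a frequency difference at least as large as the observed one. This is a minimal sketch; the function name, the 0/1 encoding and the default of 1000 permutations are assumptions made for illustration.

```python
import random

def permutation_p_value(genotypes, phenotypes, n_perm=1000, seed=0):
    """genotypes: 0/1 per strain (variant absent/present).
    phenotypes: 0/1 per strain (control/case); both classes must be present."""
    def freq_diff(geno, pheno):
        cases = [g for g, p in zip(geno, pheno) if p == 1]
        ctrls = [g for g, p in zip(geno, pheno) if p == 0]
        return abs(sum(cases) / len(cases) - sum(ctrls) / len(ctrls))

    rng = random.Random(seed)
    observed = freq_diff(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # break any genotype-phenotype link
        if freq_diff(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: the variant is present in every case and no control.
geno = [1] * 10 + [0] * 10
pheno = [1] * 10 + [0] * 10
print(permutation_p_value(geno, pheno))
```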

6
Q

What is a haplotype and its role in the GWAS

A

A haplotype is a set of alleles (genetic variants) on the same stretch of DNA that are inherited together. A variant within a haplotype associated with a disease may appear frequently in cases, even if it's not the direct cause.

7
Q

How can GWAS be used to investigate bacterial virulence, antibiotic resistance, and outcome prediction in infectious diseases?

A

Can be used to investigate:
  • What makes some bacterial strains more dangerous (virulent)
  • Why some are resistant to antibiotics
  • Whether we can predict clinical outcomes based on bacterial genetics
8
Q

How are these genome sequences first collected?

A

Many bacterial isolates from different infections have their whole genomes sequenced - these are the genotypes (the genetic content of each bacterial strain)

9
Q

How is this genetic data then defined?

A

By its phenotype:
  • Pathogenicity (the genetic basis of pathogenicity): this helps identify virulence factors, the genes responsible for the ability to cause disease.
  • Antibiotic resistance (the genetic basis of resistance): which strains are resistant to antibiotics, which genes or mutations are responsible, and whether resistance can be predicted from the genome without growing the bacteria in the lab.
10
Q

How does GWAS play a role in studying bacterial infections?

A

GWAS takes the genotype and phenotype data and runs statistical comparisons to find associations between specific genetic features:
- SNPs
- Genes
- Plasmids
and the traits of interest:
- Resistance
- Severity
- Immune evasion

11
Q

Why is GWAS powerful?

A

Discovery of new resistance or virulence genes

Real-time surveillance during outbreaks - tracking the spread of a strain or a resistant clone

Personalised treatment decisions - choosing the best antibiotic based on the bacteria’s genome

12
Q

How is the genome database for these bacteria developing

A

The pace of sequencing is still increasing: the amount of sequence deposited roughly doubles every 18 months, while continued technological development keeps lowering the cost per sequence
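
As a rough worked example of the doubling claim, the sketch below converts a time span into a growth factor. It assumes a constant 18-month doubling time, which is an idealisation, and the function name is hypothetical.

```python
def growth_factor(months, doubling_time_months=18):
    """Fold increase in deposited sequence after `months`,
    assuming steady doubling every `doubling_time_months`."""
    return 2 ** (months / doubling_time_months)

print(growth_factor(36))    # two doublings in 3 years
print(growth_factor(120))   # roughly a hundred-fold in 10 years
```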

13
Q

What are prerequisites for a successful GWAS

A
  • A testable phenotype: this could be binary (yes/no) or quantitative (e.g. MIC, or how much toxin a strain produces)
  • Whole-genome sequenced (WGS) bacterial isolates: the more closely related the strains are, the less interference there is from population structure
  • A scalable phenotype: it's hard to test a phenotype on thousands of strains, so GWAS tends to focus on high-throughput phenotypes - things you can measure quickly and automatically in the lab
  • Effect size: a measure of how strongly a genetic variant is associated with a trait (a single mutation that completely explains a trait has a large effect size, which is ideal for GWAS)
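
For a binary phenotype, one common way to express effect size is the odds ratio from a 2x2 table of carriers and non-carriers. The sketch below is illustrative only: the function name and counts are hypothetical, and adding 0.5 to every cell when a zero is present (the Haldane-Anscombe correction) is an assumption chosen here to avoid division by zero.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, correction=0.5):
    """Odds of carrying the variant in cases divided by the odds in controls."""
    cells = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    if 0 in cells:
        # Assumed continuity correction so the ratio stays finite.
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)

print(odds_ratio(8, 2, 1, 5))    # variant much more common in cases
print(odds_ratio(5, 5, 5, 5))    # no association
```

An odds ratio near 1 means the variant is about equally common in both groups; values far above 1 indicate a large effect.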
14
Q

What is meant by less interference from population structure?

A

There are fewer false associations in the GWAS results caused by genetic relatedness between bacterial strains

You would minimise background noise from unrelated mutations that have nothing to do with the trait you’re studying

You are more likely to find the true genetic causes of the phenotypes you’re testing

15
Q

What is linkage disequilibrium

A

LD refers to how often two genetic variants are inherited together on the same stretch of DNA (haplotype block)

In humans, recombination shuffles genes during reproduction, breaking up these haplotypes over time

So in human GWAS, causal variants can often be distinguished from nearby non-causal variants, because recombination separates them.
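
LD between two biallelic sites is often summarised as r², which is 1 when the two variants always travel together and 0 when they are independent. The sketch below computes it from a list of haplotypes; the function name and the 0/1 allele encoding are assumptions made for illustration.

```python
def ld_r_squared(haplotypes):
    """haplotypes: list of (allele_at_site_A, allele_at_site_B), each 0 or 1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n      # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n      # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                         # raw disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom

# Never separated (as in a clonal lineage): r^2 = 1
print(ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))
# Fully shuffled by recombination: r^2 = 0
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```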

16
Q

Why are GWAS results more challenging to interpret in bacteria than in humans?

A

Bacteria reproduce asexually, so they don’t recombine DNA as frequently as humans.

This means new mutations stay linked to large chunks of the genome, making it harder to pinpoint the actual causal mutation.

Without frequent mixing (recombination), many traits appear genetically linked just due to shared ancestry, not actual causation.

17
Q

How does recombination help identify the true cause of a trait in GWAS?

A

Recombination breaks up long DNA blocks into smaller ones, allowing scientists to separate the causal variant from other nearby, non-causal mutations.

In humans, this happens naturally and frequently.

In bacteria, recombination is rare, so it’s harder to rule out false positives (mutations that are inherited together but not functionally linked to the trait).

18
Q

How does population structure interfere with GWAS in bacteria

A

Closely related bacteria share large chunks of identical DNA.

If a trait is common in one lineage, you may mistakenly associate it with many shared mutations, even if only one is responsible.

This is why understanding genetic backgrounds and controlling for population structure is crucial in bacterial GWAS.

19
Q

What is a homoplasious mutation

A

A mutation that occurs repeatedly at the same site; e.g., bacterial strains could share the same mutation at a particular genomic location not through common ancestry but because the variant arose independently
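
One way to see whether a shared mutation is homoplasious is to count the minimum number of times it must have changed on the strain phylogeny. The sketch below uses Fitch parsimony for this, which is a technique chosen here for illustration and not named in these cards; the tree, strain names and states are hypothetical.

```python
def fitch_changes(tree, states):
    """Return (candidate ancestral states, minimum number of changes) for a
    rooted binary tree given as nested 2-tuples with strain names at the tips."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_n = fitch_changes(tree[0], states)
    right_set, right_n = fitch_changes(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_n + right_n
    # No shared state: at least one change is needed at this node.
    return left_set | right_set, left_n + right_n + 1

tree = (("s1", "s2"), ("s3", "s4"))

# Mutation shared by one clade only: a single origin, not homoplasious.
_, n_clade = fitch_changes(tree, {"s1": 1, "s2": 1, "s3": 0, "s4": 0})
# Mutation present in one strain of each clade: two independent origins.
_, n_homoplasy = fitch_changes(tree, {"s1": 1, "s2": 0, "s3": 1, "s4": 0})
print(n_clade, n_homoplasy)
```

A site that needs more than one change on the tree is a candidate homoplasy.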

20
Q

What are the 3 mechanisms by which homoplasious mutations can be introduced into the genomes of bacterial populations

A
  • Horizontal gene transfer (HGT)
  • Recombination
  • Recurrent mutations
21
Q

Why is population structure important in bacterial GWAS

A

Bacteria reproduce clonally, so all genetic variants in a lineage are inherited together

As a result, it is difficult to tell whether a mutation causes a trait or is just linked to it through shared ancestry

Without proper control for population structure, you risk identifying false associations

22
Q

How does lack of recombination affect GWAS

A

Due to the lack of recombination in bacteria, all fixed mutations in a lineage are passed on together, in linkage disequilibrium.

If a phenotype is present in a lineage, many linked mutations may appear associated, even if only one is causal

23
Q

What is a Linear Mixed Model and how does it help in GWAS

A

LMMs are statistical models that account for relatedness between bacterial strains

They help control for population structure by modelling the background genetic similarities across strains

This improves the ability to detect true associations between specific loci and phenotypes
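
A common textbook form of such a model is sketched below. The notation is an assumption introduced here for illustration: y is the vector of phenotypes, X holds the tested variant (plus any covariates) with fixed effects β, g is a random effect capturing shared genetic background, K is a kinship (relatedness) matrix between strains, and ε is residual noise.

```latex
y = X\beta + g + \varepsilon,
\qquad g \sim \mathcal{N}\!\left(0,\ \sigma_g^{2} K\right),
\qquad \varepsilon \sim \mathcal{N}\!\left(0,\ \sigma_e^{2} I\right)
```

Because related strains have large entries in K, phenotype similarity that comes from shared ancestry is absorbed by g rather than being attributed to the tested variant.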

24
Q

What can LMMs help identify in GWAS?

A
  • Locus specific effects: mutations truly linked to the phenotype
  • Lineage-level differences: broader patterns seen in entire strain groups
  • Helps separate trait-causing mutations from those just carried by related bacteria
25
Q

What is Vancomycin-Intermediate S. aureus (VISA)

A

VISA is a form of S. aureus with intermediate resistance to vancomycin, a last-line antibiotic

Resistance evolves gradually through mutations in multiple genes

26
Q

What did the VISA GWAS study investigate

A
  • It aimed to identify genetic variants associated with vancomycin resistance
  • It compared 49 vancomycin-sensitive (VSSA) strains with 26 VISA strains
  • It analysed over 50,000 high quality SNPs across the 74 strains.
27
Q

Why were many SNPs in the VISA GWAS not useful?

A

Many SNPs were ‘fixed’ and lineage specific, meaning they were shared among closely related bacteria, not necessarily related to resistance

These are structured by population, so they confound association analysis - they don’t distinguish between resistant and sensitive strains

28
Q

What was the key SNP identified in VISA GWAS

A

A non-synonymous mutation at codon 481 of the rpoB gene

Strongly associated with increased vancomycin MIC

Previously shown to contribute to vancomycin resistance in other studies

29
Q

Why is it important to study resistance mechanisms in M. tb

A

Mechanisms of resistance are still incompletely understood

Identifying biomarkers of drug resistance can help with faster diagnosis and better treatment

Genome sequencing and GWAS can link genetic mutations to drug resistance

30
Q

What was the design of the GWAS for M. tb drug resistance?

A

Compared genomes of resistant vs. sensitive strains

Applied GWAS, phylogenetic analysis, and statistical testing to find variants associated with resistance

31
Q

What is evolutionary convergence in antibiotic resistance

A
  • It refers to the independent emergence of the same resistance mutation in different M. tb lineages
  • Suggests that certain mutations provide a strong selective advantage under antibiotic pressure
32
Q

What is the purpose of the phylogenetic convergence test (phyC)

A

A statistical method used to detect repeated mutations that appear more often in resistant strains than sensitive ones

Helps identify true resistance mutations versus background variation

33
Q

What novel findings did the GWAS in M. tb reveal

A

Found positive selection in 39 additional genomic regions among resistant strains

Of these, 11 had known functions, potentially involved in resistance mechanisms

34
Q

What role does the ponA1 gene mutation play in rifampicin resistance

A

ponA1 is involved in peptidoglycan homeostasis

The mutation in ponA1 conferred a fitness advantage in the presence of rifampicin

The mutation is located near the transpeptidase catalytic site, suggesting it may disrupt enzymatic activity

35
Q

What do these results suggest about the evolution of M. tb resistance

A
  • Drug resistance can evolve through a complex, stepwise process
  • May involve cell wall remodelling and compensatory mutations for fitness under drug pressure
  • Shows that resistance is not just about target mutations but may involve multiple pathways
36
Q

What are the three main virulence phenotypes of S. aureus

A
  • Adhesion
  • Toxicity
  • Immune evasion
37
Q

What is the principle behind S. aureus’ pathogenicity

A

Its pathogenicity is tightly regulated by multiple layers of regulatory systems. These systems make sure that virulence factors are only expressed when needed, helping the bacteria survive in different environments (skin, bloodstream, or inside immune cells)

38
Q

What is the Two-Component System (TCS)

A
  • These are signal transduction systems that allow bacteria to sense and respond to environmental changes
39
Q

What does a typical TCS have

A

1) A sensor kinase (membrane bound, detects an external signal)

2) A response regulator (activates or represses gene expression)

40
Q

What do the different TCSs in S. aureus control?

A

It has about 16 TCSs that control different aspects of behaviour, such as:
  • Virulence
  • Antibiotic resistance
  • Biofilm formation
  • Metabolism
41
Q

What is an example of a virulence TCS?

A

Agr system (accessory gene regulator):
  • Master regulator of virulence in S. aureus
  • Controls expression of toxins, enzymes, and surface proteins
  • Works through quorum sensing - detecting population density and activating virulence at high cell density
42
Q

What are transcription factors

A

Proteins that bind DNA and regulate the transcription of specific genes - they can either activate or repress gene expression

The 3 transcription factor families in S. aureus are:
  • Sar family
  • CodY
  • SigB
43
Q

What does the Sar family include

A

SarA, SarR, SarS

Regulates agr and many virulence genes directly

Can fine tune gene expression in response to environmental cues

44
Q

What does the CodY transcription factor do?

A
  • Represses virulence genes under nutrient-rich conditions
  • Acts as a metabolic sensor: when nutrients are scarce, repression is lifted and virulence genes are expressed.
45
Q

What is SigB and what does it do?

A

An alternative sigma factor that helps the cell respond to stress

Promotes survival under harsh conditions

Also controls genes involved in virulence and biofilm formation

46
Q

What is the ClpXP protease system?

A

This is a protein degradation system

ClpX is an ATP-dependent chaperone that unfolds proteins, and ClpP is the protease that degrades them

47
Q

What is the role of ClpXP in S. aureus?

A

It helps maintain protein quality control, especially under stress

It also degrades regulatory proteins, affecting the stability and activity of transcription factors and other regulators

Impacts stress response, virulence, and antibiotic resistance

48
Q

What are sRNAs (Small regulatory RNAs)

A

These are short non-coding RNA molecules that regulate gene expression at the post-transcriptional level.

49
Q

What are the functions of sRNAs in S. aureus?

A

Bind to mRNA to block translation or promote degradation

Can also stabilize certain mRNAs depending on the situation

Important for fine-tuning gene expression in response to stress, growth stage, or environmental signals

50
Q

What is an example of an sRNA

A

RNAIII: this is an effector of the Agr system - regulates toxin and adhesin expression by base pairing with target mRNAs