GWAS and How We Can Use It To Better Understand Bacteria Flashcards

Question 1

Q

What is a Genome-Wide Association Study (GWAS)

Answer

A

A GWAS tests genetic variants across the genomes of many individuals to identify associations between variants at specific genetic loci (mutations, insertions, deletions) and phenotypic traits, including diseases

Question 2

Q

What is the key principle behind GWAS

Answer

A

Genetic variants that contribute to a trait or disease will occur more frequently in individuals with that trait (cases) than in those without (controls)

Question 3

Q

What technologies are used in GWAS

Answer

A

Whole Genome sequencing
DNA microarrays

Both are used to identify common variants across individuals

Question 4

Q

How is data analysed in GWAS

Answer

A

Statistical analysis compares allele frequencies between cases and controls. Variants significantly more frequent in cases are flagged as potentially linked to the trait
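
A minimal sketch of this case/control comparison, assuming a one-sided Fisher's exact test on a 2x2 table of variant counts. The cards do not name a specific test; the function name and the counts below are illustrative only.

```python
from math import comb


def fisher_exact_one_sided(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Probability of seeing at least `case_alt` variant carriers among the
    cases if the variant were unrelated to the trait (hypergeometric tail)."""
    n_case = case_alt + case_ref            # number of cases
    n_alt = case_alt + ctrl_alt             # carriers of the variant overall
    n_total = n_case + ctrl_alt + ctrl_ref  # all individuals
    denom = comb(n_total, n_case)
    p = 0.0
    for k in range(case_alt, min(n_case, n_alt) + 1):
        p += comb(n_alt, k) * comb(n_total - n_alt, n_case - k) / denom
    return p


# Hypothetical counts: the variant is seen in 3 of 4 cases and 1 of 4 controls.
p_value = fisher_exact_one_sided(case_alt=3, case_ref=1, ctrl_alt=1, ctrl_ref=3)
print(f"variant frequency in cases: {3 / 4:.2f}, in controls: {1 / 4:.2f}")
print(f"one-sided p-value: {p_value:.4f}")
```

With so few individuals the difference in frequency is not convincing, which is why GWAS needs many samples.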

Question 5

Q

What is the significance of the p-value in GWAS

Answer

A

The p-value is the probability of seeing an association at least this strong between a genetic variant and the trait if there were no real association (i.e. by chance alone). Lower p-values give stronger evidence for an association
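
Because a GWAS tests very many variants, some small p-values appear by chance alone. A minimal sketch, assuming a Bonferroni correction (a common convention that these cards do not mention); the SNP names and p-values are made up.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-variant significance threshold after Bonferroni correction."""
    return alpha / n_tests


def significant(p_values, alpha=0.05):
    """Return the variants whose p-value passes the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]


# Hypothetical p-values for three variants.
hits = significant({"snp_a": 1e-8, "snp_b": 0.01, "snp_c": 0.04})
print(hits)
```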

Question 6

Q

What is a haplotype and its role in the GWAS

Answer

A

A haplotype is a group of alleles (variants) on the same stretch of DNA that are inherited together. A variant within a haplotype associated with a disease may appear frequently in cases, even if it's not the direct cause.

Question 7

Q

How can GWAS be used to investigate bacterial virulence, antibiotic resistance, and outcome prediction in infectious diseases

Answer

A

Can be used to investigate:
- What makes some bacterial strains more dangerous (virulent)
- Why some are resistant to antibiotics
- Whether we can predict clinical outcomes based on bacterial genetics

Question 8

Q

How are these genome sequences first collected

Answer

A

Many bacterial isolates from different infections have their whole genomes sequenced - these are the genotypes (the genetic content of each bacterial strain)

Question 9

Q

How is this genetic data then defined

Answer

A

By its phenotype:
- The pathogenic phenotype / genetic basis of pathogenicity. This helps identify virulence factors - the genes responsible for the ability to cause disease.
- Antibiotic resistance / genetic basis of resistance. Which strains are resistant to antibiotics? Which genes or mutations are responsible? Can we predict resistance from the genome without growing bacteria in the lab?

Question 10

Q

How does GWAS play a role in bacterial infections

Answer

A

GWAS takes the genotype and phenotype data and runs statistical comparisons to find associations between specific genetic features:
- SNPs
- Genes
- Plasmids
and the traits of interest:
- Resistance
- Severity
- Immune evasion

Question 11

Q

Why is GWAS powerful

Answer

A

Discovery of new resistance or virulence genes

Real-time surveillance during outbreaks - tracking the spread of a strain or a resistant clone

Personalised treatment decisions - choosing the best antibiotic based on the bacteria’s genome

Question 12

Q

How is the genome database for these bacteria developing

Answer

A

The pace of sequencing is still increasing: the amount of sequence deposited roughly doubles every 18 months, while continued technological development keeps reducing the cost per sequence
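
A small worked example of what "doubles every 18 months" implies, assuming simple exponential growth. The function name and starting volume are illustrative, not real database figures.

```python
def projected_volume(initial, months, doubling_months=18.0):
    """Projected amount of deposited sequence after `months`,
    assuming it doubles every `doubling_months`."""
    return initial * 2 ** (months / doubling_months)


# Starting from 1 unit of sequence, project 3 and 6 years ahead.
for years in (3, 6):
    print(years, "years:", projected_volume(1.0, months=12 * years))
```

Under this assumption the volume quadruples in 3 years and grows 16-fold in 6 years.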

Question 13

Q

What are prerequisites for a successful GWAS

Answer

A

A testable phenotype: this could be binary (yes/no) or quantitative (e.g. MIC, or how much toxin a strain produces)
Whole-genome-sequenced (WGS) bacterial isolates: the more closely related the strains are, the less interference from population structure
Phenotype must be scalable: it's hard to test a phenotype on thousands of strains, so GWAS tends to focus on high-throughput phenotypes - things you can measure quickly and automatically in the lab
Effect size: a measure of how strongly a genetic variant is associated with a trait (if a single mutation completely explains a trait, it has a large effect size, which is ideal for GWAS)
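
One common way to put a number on effect size for a binary phenotype is the odds ratio. A minimal sketch, assuming a 2x2 table of strain counts; the function name, the counts, and the 0.5 zero-cell correction (Haldane-Anscombe) are choices made here, not taken from the cards.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, correction=0.5):
    """Odds ratio for carrying the variant in cases versus controls.
    If any cell is zero, `correction` is added to every cell."""
    cells = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    if 0 in cells:
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


# Hypothetical counts: variant in 30 of 40 resistant and 10 of 40 sensitive strains.
print(odds_ratio(30, 10, 10, 30))
# A mutation that fully explains the trait gives a very large odds ratio.
print(odds_ratio(20, 0, 0, 20))
```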

Question 14

Q

What is meant by less interference from population structure

Answer

A

There are fewer false associations in the GWAS results caused by genetic relatedness between bacterial strains

You would minimise background noise from unrelated mutations that have nothing to do with the trait you’re studying

You are more likely to find the true genetic causes of the phenotypes you’re testing

Question 15

Q

What is linkage disequilibrium

Answer

A

LD refers to how often two genetic variants are inherited together on the same stretch of DNA (haplotype block)

In humans, recombination shuffles genes during reproduction, breaking up these haplotypes over time

So in human GWAS, recombination often allows causal variants to be distinguished from variants that are merely physically close to them.
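
A minimal sketch of how LD between two sites can be quantified, assuming biallelic sites coded 0/1 and the standard D and r² statistics (the cards do not name a statistic). The haplotype lists are invented.

```python
def linkage_disequilibrium(haplotypes):
    """haplotypes: list of (allele_at_site_A, allele_at_site_B), alleles 0 or 1.
    Returns (D, r_squared)."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Two sites always inherited together (no recombination between them).
print(linkage_disequilibrium([(1, 1), (1, 1), (0, 0), (0, 0)]))
# Two sites fully shuffled by recombination.
print(linkage_disequilibrium([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

An r² near 1 means the two variants cannot be told apart statistically; an r² near 0 means they vary independently.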

Question 16

Q

Why are GWAS results more challenging in bacteria compared to humans

Answer

A

Bacteria reproduce asexually, so they don’t recombine DNA as frequently as humans.

This means new mutations stay linked to large chunks of the genome, making it harder to pinpoint the actual causal mutation.

Without frequent mixing (recombination), many traits appear genetically linked just due to shared ancestry, not actual causation.

Question 17

Q

How does recombination help identify the true cause of a trait in GWAS?

Answer

A

Recombination breaks up long DNA blocks into smaller ones, allowing scientists to separate the causal variant from other nearby, non-causal mutations.

In humans, this happens naturally and frequently.

In bacteria, recombination is rare, so it’s harder to rule out false positives (mutations that are inherited together but not functionally linked to the trait).

Question 18

Q

How does population structure interfere with GWAS in bacteria

Answer

A

Closely related bacteria share large chunks of identical DNA.

If a trait is common in one lineage, you may mistakenly associate it with many shared mutations, even if only one is responsible.

This is why understanding genetic backgrounds and controlling for population structure is crucial in bacterial GWAS.

Question 19

Q

What is a homoplasious mutation

Answer

A

A mutation that occurs repeatedly at the same site; e.g., bacterial strains could share the same mutation at a particular genomic location not through common ancestry but because the variant arose independently

Question 20

Q

What are the 3 mechanisms by which homoplasious mutations can be introduced into the genomes of bacterial populations

Answer

A

Horizontal gene transfer (HGT)
Recombination
Recurrent mutations
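
One way to spot a homoplasious site is to count how many times the allele must have changed on the strain phylogeny. A minimal sketch, assuming Fitch parsimony on a binary tree written as nested tuples; the tree, strain names, and alleles are hypothetical, and the cards do not prescribe this method.

```python
def fitch_changes(tree, states):
    """Return (possible_ancestral_states, minimum_number_of_changes)
    for a binary tree of nested tuples with strain names at the tips."""
    if isinstance(tree, str):                 # a tip: its state is observed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_n = fitch_changes(left, states)
    right_set, right_n = fitch_changes(right, states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change
        return shared, left_n + right_n
    return left_set | right_set, left_n + right_n + 1


tree = (("A", "B"), ("C", "D"))
# Mutation confined to one clade: a single origin.
print(fitch_changes(tree, {"A": 1, "B": 1, "C": 0, "D": 0})[1])
# Same mutation in strains from different clades: at least two origins.
print(fitch_changes(tree, {"A": 1, "B": 0, "C": 1, "D": 0})[1])
```

A site that needs more than one change on the tree is a candidate homoplasy.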

Question 21

Q

Why is population structure important in bacterial GWAS

Answer

A

Bacteria reproduce clonally, so all genetic variants in a lineage are inherited together

As a result it is difficult to tell if a mutation causes a trait or is just linked due to a shared ancestry

Without proper control for population structure, you risk identifying false associations

Question 22

Q

How does lack of recombination affect GWAS

Answer

A

Due to the lack of recombination in bacteria, all fixed mutations in a lineage are passed on together, in linkage disequilibrium.

If a phenotype is present in a lineage, many linked mutations may appear associated, even if only one is causal

Question 23

Q

What is a Linear Mixed Model and how does it help in GWAS

Answer

A

LMMs are statistical models that account for relatedness between bacterial strains

They help control for population structure by modelling the background genetic similarities across strains

This improves the ability to detect true associations between specific loci and phenotypes
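
For reference, a generic textbook form of a linear mixed model for one tested variant is sketched below. The symbols are assumptions introduced here, not notation from the cards: y is the vector of phenotypes across strains, X holds covariates with effects β, g is the genotype at the tested locus with effect γ, u is the background effect shared by related strains, K is a relatedness (kinship) matrix estimated from the genomes, and ε is residual noise.

```latex
y = X\beta + g\gamma + u + \varepsilon, \qquad
u \sim \mathcal{N}(0, \sigma_g^2 K), \qquad
\varepsilon \sim \mathcal{N}(0, \sigma_e^2 I)
```

The test of interest is whether γ differs from zero once the lineage background u has been accounted for.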

Question 24

Q

What can LMMs help identify in GWAS

Answer

A

Locus specific effects: mutations truly linked to the phenotype
Lineage-level differences: broader patterns seen in entire strain groups
Helps separate trait-causing mutations from those just carried by related bacteria

Question 25

Q

What is Vancomycin-Intermediate S. aureus (VISA)

Answer

A

VISA is a form of S. aureus with intermediate resistance to vancomycin, a last-line antibiotic

Resistance evolves gradually through mutations in multiple genes

Question 26

Q

What did the VISA GWAS study investigate

Answer

A

It aimed to identify genetic variants associated with vancomycin resistance
It compared 49 vancomycin-sensitive (VSSA) and 26 VISA strains
It analysed over 50,000 high-quality SNPs across the 75 strains.

Question 27

Q

Why were many SNPs in the VISA GWAS not useful

Answer

A

Many SNPs were ‘fixed’ and lineage specific, meaning they were shared among closely related bacteria, not necessarily related to resistance

These are structured by population, so they confound association analysis - they don’t distinguish between resistant and sensitive strains
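
A minimal sketch of how such lineage-fixed SNPs could be flagged, assuming each strain has already been assigned to a lineage. The data layout, function name, and toy genotypes are hypothetical and are not taken from the study.

```python
def lineage_fixed_snps(genotypes, lineages):
    """genotypes: {snp: {strain: allele}}; lineages: {strain: lineage}.
    Returns SNPs that are constant within every lineage but differ between
    lineages, so they track ancestry rather than the phenotype."""
    fixed = []
    for snp, calls in genotypes.items():
        alleles_by_lineage = {}
        for strain, allele in calls.items():
            alleles_by_lineage.setdefault(lineages[strain], set()).add(allele)
        constant_within = all(len(s) == 1 for s in alleles_by_lineage.values())
        varies_between = len(set().union(*alleles_by_lineage.values())) > 1
        if constant_within and varies_between:
            fixed.append(snp)
    return fixed


lineages = {"s1": "L1", "s2": "L1", "s3": "L2", "s4": "L2"}
genotypes = {
    "snp1": {"s1": "A", "s2": "A", "s3": "G", "s4": "G"},  # tracks lineage
    "snp2": {"s1": "A", "s2": "G", "s3": "A", "s4": "G"},  # varies within lineages
}
print(lineage_fixed_snps(genotypes, lineages))
```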

Question 28

Q

What was the key SNP identified in VISA GWAS

Answer

A

A non-synonymous mutation at codon 481 of the rpoB gene

Strongly associated with increased vancomycin MIC

Previously shown to contribute to vancomycin resistance in other studies

Question 29

Q

Why is it important to study resistance mechanisms in M. tb

Answer

A

Mechanisms of resistance are still incompletely understood

Identifying biomarkers of drug resistance can help with faster diagnosis and better treatment

Genome sequencing and GWAS can link genetic mutations to drug resistance

Question 30

Q

What was the design of the GWAS for M.tb drug resistance

Answer

A

Compared genomes of resistant vs. sensitive strains

Applied GWAS, phylogenetic analysis, and statistical testing to find variants associated with resistance

Question 31

Q

What is evolutionary convergence in antibiotic resistance

Answer

A

It refers to the independent emergence of the same resistance mutation in different M. tb lineages
Suggests that certain mutations provide a strong selective advantage under antibiotic pressure

Question 32

Q

What is the purpose of the phylogenetic convergence test (phyC)

Answer

A

A statistical method used to detect repeated mutations that appear more often in resistant strains than sensitive ones

Helps identify true resistance mutations versus background variation

Question 33

Q

What novel findings did the GWAS in M. tb reveal

Answer

A

Found positive selection in 39 additional genomic regions among resistant strains

Of these, 11 had known functions, potentially involved in resistance mechanisms

Question 34

Q

What role does the ponA1 gene mutation play in rifampicin resistance

Answer

A

ponA1 is involved in peptidoglycan homeostasis

The mutation in ponA1 conferred a fitness advantage in the presence of rifampicin

The mutation is located near the transpeptidase catalytic site, suggesting it may disrupt enzymatic activity

Question 35

Q

What do these results suggest about the evolution of M. tb resistance

Answer

A

Drug resistance can evolve through a complex, stepwise process
May involve cell wall remodelling and compensatory mutations for fitness under drug pressure
Shows that resistance is not just about target mutations but may involve multiple pathways

Question 36

Q

What are the three main virulence phenotypes of S. aureus

Answer

A

Adhesion
Toxicity
Immune evasion

Question 37

Q

What is the principle behind S. aureus’ pathogenicity

Answer

A

Its pathogenicity is tightly regulated by multiple layers of regulatory systems. These systems make sure that virulence factors are only expressed when needed, helping the bacteria survive in different environments (skin, bloodstream, or inside immune cells)

Question 38

Q

What is the Two-Component System (TCS)

Answer

A

These are signal transduction systems that allow bacteria to sense and respond to environmental changes

Question 39

Q

What does a typical TCS have

Answer

A

1) A sensor kinase (membrane bound, detects an external signal)

2) A response regulator (activates or represses gene expression)

Question 40

Q

What do the different TCSs in S. aureus control

Answer

A

It has about 16 TCSs that control different aspects of behaviour like:
- Virulence
- Antibiotic resistance
- Biofilm formation
- Metabolism

Question 41

Q

What is an example of a virulence TCS

Answer

A

Agr system - Accessory gene regulator
- Master regulator of virulence in S. aureus
- Controls expression of toxins, enzymes, and surface proteins
- Works through quorum sensing - detecting population density and activating virulence at high cell density

Question 42

Q

What are transcription factors

Answer

A

Proteins that bind DNA and regulate the transcription of specific genes - can either activate or repress gene expression

The 3 transcription factor families in S. aureus are:
- Sar family
- CodY
- SigB

Question 43

Q

What does the Sar family include

Answer

A

SarA, SarR, SarS

Regulates agr and many virulence genes directly

Can fine tune gene expression in response to environmental cues

Question 44

Q

What does the CodY transcription factor do

Answer

A

Represses virulence genes under nutrient-rich conditions
Acts as a metabolic sensor - when nutrients are scarce, repression is lifted and virulence genes are expressed.

Question 45

Q

What is SigB and what does it do

Answer

A

An alternative sigma factor that helps the cell respond to stress

Promotes survival under harsh conditions

Also controls genes involved in virulence and biofilm formation

Question 46

Q

What is the ClpXP Protease System

Answer

A

This is a protein degradation system

ClpX is an ATP-dependent chaperone that unfolds proteins, and ClpP is the protease that degrades them

Question 47

Q

What is the role of ClpXP in S. aureus

Answer

A

It helps maintain protein quality control, especially under stress

It also degrades regulatory proteins, affecting the stability and activity of transcription factors and other regulators

Impacts stress response, virulence, and antibiotic resistance

Question 48

Q

What are sRNAs (Small regulatory RNAs)

Answer

A

These are short non-coding RNA molecules that regulate gene expression at the post-transcriptional level.

Question 49

Q

What is the function of sRNAs in S. aureus

Answer

A

Bind to mRNA to block translation or promote degradation

Can also stabilize certain mRNAs depending on the situation

Important for fine-tuning gene expression in response to stress, growth stage, or environmental signals

Question 50

Q

What is an example of an sRNA

Answer

A

RNAIII: this is an effector of the Agr system - regulates toxin and adhesin expression by base pairing with target mRNAs

Question 51

Q