Lecture 19- Pathogen genomics Flashcards
how do clinical and public health microbiology labs help?
- characterise pathogens (what is it, what drugs can be used)
- investigate transmission and sources of infection (where did it come from, how can we stop its spread)
what can whole genome sequencing be used for? why is this better?
used for microbial diagnostics and surveillance
- identify pathogens
- predict drug resistance and virulence phenotypes
- investigate transmission and sources of infection
better-
- faster than running a large battery of traditional techniques
- gives us more information (understand how resistance is spreading and how pathogens are moving in population)
how do we get the sequence data? 2 ways
isolated organism e.g. bacterial culture
culture-free approaches
explain the bacterial culture method of getting sequence data
- take the sample you're interested in
- culture the organism on an agar plate
- extract DNA from the purified culture
- put it through a sequencer (e.g. Illumina)
- sequence data represents only the cultured organism
explain the culture free approach to extracting sequence data
take sample of interest
directly extract DNA from it and sequence everything present (eliminates the culturing step)
- sequence data represents the population of organisms in the sample (not just the organism we want)
what is the drawback to using a culture free approach to getting sequence data?
might be tricky to work out which organism is causing the problem
- sequence data contains population of organisms including human, bacteria, viruses, fungi, parasites
what does the sequence data look like?
millions of short overlapping sequences
2 things we can do with the sequence data
- align the sequences to a reference
- de novo assembly (try to piece together the original DNA sequence)
* use these methods for downstream analysis
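A toy sketch of de novo assembly, assuming error-free reads from one strand: repeatedly merge the two reads with the longest suffix-prefix overlap. The function names and the `min_len` parameter are made up for illustration. Real assemblers use more robust methods and have to cope with sequencing errors and repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Greedily merge the best-overlapping pair of reads until none overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three short overlapping reads (toy data, not a real genome)
toy_reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
contigs = greedy_assemble(toy_reads)
print(contigs)
```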
how to identify pathogens?
by aligning sequences to reference or de novo assembly of sequences
e.g. if reads cover the E. coli reference more deeply than the reference of another bacterium, the pathogen in the sample is more likely to be E. coli
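A minimal sketch of the coverage comparison, assuming exact substring matching stands in for a real read aligner. The reference sequences and the names `species_A` and `species_B` are invented toy data.

```python
def mean_coverage(reads, reference):
    """Average depth: total bases of matching reads / reference length.

    Exact substring matching is a stand-in for real read alignment.
    """
    aligned_bases = sum(len(r) for r in reads if r in reference)
    return aligned_bases / len(reference)


# Hypothetical candidate references (toy sequences)
references = {
    "species_A": "ATGGCGTACGTTAGCCTAGGA",
    "species_B": "TTTTCCCCGGGGAAAATTTCC",
}

# Toy reads from a sample
reads = ["ATGGCG", "GCGTAC", "CGTTAG", "TAGCCT", "CCCCGG"]

coverage = {name: mean_coverage(reads, ref) for name, ref in references.items()}
best = max(coverage, key=coverage.get)
print(coverage, "-> most likely:", best)
```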
what does the method of aligning to reference only work for?
purified cultures
why are metagenomic samples complex?
sequence data represents the population of organisms in the sample
- human, bacteria, viruses, fungi, parasites
- pathogens but also harmless commensal microbes
examples of identifying pathogens with metagenomic sequencing?
e.g. get the sequence data and align it to the malaria parasite genome
what can phenotype tell us?
observable characteristics
- morphology
- pathotype (e.g. HUS)
- resistance to a specific antibiotic (treatment failure likely)
what can genotype tell us?
genetic basis for phenotype
what are 2 ways bacteria can become resistant?
- spontaneous mutation (protein changes shape, drug can no longer bind to protein)
- acquisition of resistance genes (move around between bacterial strains and spread resistance)
genotype for HUS phenotype?
shiga toxin genes (carried by E coli)
genotype for resistance to fluoroquinolones
mutations in the gyrA or parC genes- spontaneous
genotype for resistance to chloramphenicol
chloramphenicol acetyltransferase gene- acquisition
how can we detect a resistance genotype?
through aligning data to reference or de novo assembly
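A rough sketch of both checks, assuming the sample gene is already aligned to the reference with no indels: compare translated codons to spot a mutation, and search an assembly for an acquired gene. All sequences are short invented examples, not real gyrA, parC or chloramphenicol acetyltransferase sequences. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA to protein, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def amino_acid_changes(ref_gene, sample_gene):
    """List substitutions such as 'S2L' (reference residue, position, sample residue)."""
    ref_p, sam_p = translate(ref_gene), translate(sample_gene)
    return [f"{r}{pos}{s}"
            for pos, (r, s) in enumerate(zip(ref_p, sam_p), start=1)
            if r != s]


def carries_gene(assembly, gene):
    """Exact-match search; real tools tolerate mismatches."""
    return gene in assembly


# Mutation-based resistance: one base change alters one amino acid (toy gene)
changes = amino_acid_changes("ATGTCGGCT", "ATGTTGGCT")
print(changes)

# Acquired resistance: is the (toy) resistance gene present in the assembly?
found = carries_gene("CCATGAAATTTGG", "ATGAAATTT")
print(found)
```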
is genotype always a good predictor of phenotype?
no- not 100% concordant
why might there be a mismatch between genotype and phenotype?
- antimicrobial resistance genes are there but might not be expressed
- resistance proteins are being inhibited
- unknown resistance mechanisms
how to use phylogenetics to track outbreaks?
- take samples from subset of people
- sequence DNA of organism we think is causing disease
- align DNA, find variations (SNPs)
- assume that all these samples descended from recent common ancestor
- model DNA substitution (probability for the mutations, transversion and transition rates)
- take information and infer phylogeny
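A small sketch of the alignment and substitution-model steps, assuming the sequences are already aligned and equal in length. It finds SNPs, counts transitions and transversions, and applies the Jukes-Cantor correction, used here as one simple example of a substitution model. Sample names and sequences are invented.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}


def snp_positions(a, b):
    """0-based positions where two aligned sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def ts_tv(a, b):
    """Count transitions (purine<->purine, pyrimidine<->pyrimidine) and transversions."""
    ts = tv = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            ts += 1
        else:
            tv += 1
    return ts, tv


def jukes_cantor(a, b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), where p is the fraction of differing sites."""
    p = len(snp_positions(a, b)) / len(a)
    return -0.75 * math.log(1 - 4 * p / 3)


# Toy alignment (hypothetical samples)
samples = {
    "patient_1": "ACGTACGTAC",
    "patient_2": "ACGTACGTAT",
    "device": "ACGTACGTAC",
    "unrelated": "ATGTTCGGAC",
}

for (n1, s1), (n2, s2) in combinations(samples.items(), 2):
    print(n1, n2, "SNPs:", len(snp_positions(s1, s2)), "ts/tv:", ts_tv(s1, s2))
```

A distance matrix like this is the kind of input a tree-building method uses; samples with few or no SNPs between them end up separated by very short branches.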
what can phylogeny tell us?
how the samples are related to each other
how the pathogen is being spread e.g. between hospitals
what do the tips of a phylogenetic tree represent?
the samples
what do internal nodes represent? what do the numbers mean?
hypothetical ancestor
numbers are support values from 0 to 1
- high values mean strong support (more confidence that the node is correct)
what do the horizontal branches mean?
amount of genetic change between samples
- long branch, more genetic change
- shorter branch, less genetic change
what do the vertical branches mean?
nothing- just help connect the samples
what does the root represent?
hypothetical ancestor of all samples
example of WGS comparisons of samples from patients in US, Aus and NZ plus from heater-cooler units
the isolates were grouped together
- very short branches separate clinical and heater-cooler unit samples
- suggests heater-cooler unit as source of global outbreak
global data sharing via?
GenomeTrakr
what was GenomeTrakr used for initially?
sharing genomes for species that cause food-borne disease
- public health data sharing initiative started by the US Food and Drug Administration
positives of GenomeTrakr?
publicly available
share genome data from labs around the world
can see how pathogen genomes are moving around the world
limitation of GenomeTrakr?
need data storage space in database
challenges for using WGS in terms of identifying pathogens?
- culture based approaches still not fast enough (delay for diagnosing samples)
- culture free metagenomic samples are complex and suffer from contamination (human DNA)
challenges for using WGS in terms of detecting resistance?
genotype may not accurately reflect phenotypes, particularly if there is a novel resistance mechanism
challenges for using WGS in terms of tracking outbreaks?
- need to combine genomic data with epidemiological information (e.g. geographical info, spatial info, patient/equipment movements)
- large-scale comparisons have high data storage and computational requirements
challenges for WGS in terms of understanding results?
need specialist training to run analyses and interpret results