Bacterial Genomics Flashcards

1
Q

How do you generate a chromosomal library?

A
  1. Extract chromosomal DNA from the strain
  2. Fragment the DNA by digestion (each fragment can contain a single gene, multiple genes, or part of a gene)
  3. Clone each fragment into a vector (plasmid) [pSUM36 in the example]
  4. Transform each plasmid into bacteria and plate them so they grow into separate colonies on solid medium (generally agar).
  5. Library = the collection of colonies, each carrying a different piece of that strain's genome
2
Q

How was a chromosomal library used in the Mycobacterium fortuitum study?

A
  • A gene carrying a resistance mutation confers resistance on the plasmid that carries it
  • The plasmid in turn confers resistance on the colony formed after transformation
  • The library can then be plated on selective (strep+) medium containing a lethal conc. of streptomycin
    o Only resistant clones (carrying the resistance gene) are able to grow
  • Select the colonies that grow and isolate the plasmids from their cells. [pAC5, pAC6]
    o At this point you have isolated a few DNA fragments that contain a gene conferring strep resistance, but each fragment may carry multiple genes, so you need to determine which one is responsible
  • Sequence each fragment (Sanger sequencing) and find the overlapping sequence (= the position of the resistance gene)
  • Found a 2.5 kb region with 3 complete genes [orfB, orfC, orfD]
  • Researchers cut out a 1 kb fragment (orfC) and cloned it into plasmid pSAN26
  • Original 2.5 kb fragment (all 3 genes) cloned into plasmid pSAN19
  • Transformed both into bacteria, grew them on strep+ medium, and measured the conc. of strep at which bacterial growth stopped.
  • Both transformants survived up to the same conc., showing that orfC was the gene responsible (orfC alone gave the same result as all 3 genes together)
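The overlap step above (several resistance-conferring inserts, one shared region) can be sketched as an interval intersection. This is only an illustration: the function name and the genome coordinates are hypothetical, not values from the study.

```python
def common_region(fragments):
    """Return the (start, end) interval shared by every fragment, or None.

    fragments: list of (start, end) genome coordinates, half-open.
    The coordinates used below are made up for illustration.
    """
    start = max(s for s, _ in fragments)  # latest start among the inserts
    end = min(e for _, e in fragments)    # earliest end among the inserts
    return (start, end) if start < end else None


# Three hypothetical resistance-conferring inserts
inserts = [(1000, 6000), (3500, 9000), (2000, 6000)]
print(common_region(inserts))  # (3500, 6000): a 2.5 kb shared region
```

The shared region can still hold more than one gene, which is why the subcloning comparison (orfC alone vs all 3 genes) is needed afterwards.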
3
Q

Steps in Sanger Sequencing

A
  1. DNA is digested and cloned into plasmids
  2. Extracted plasmid DNA is incubated with a mixture of DNA primers, free nucleotides (dNTPs), DNA polymerase and ‘terminator bases’ (fluorescently labelled dideoxynucleotides, ddNTPs).
    a. Each dideoxynucleotide base is labelled with a different color (e.g. A = Green etc.)
  3. Mixture heated to 96 °C – denatures the DNA (strands separate)
  4. Cooled to 50 °C – allows the DNA primer to anneal to the plasmid DNA at the start of the insert DNA
  5. Temp increased to 60 °C – DNA polymerase binds at the primer and adds bases until a terminator base is incorporated.
  6. Everything then reheated to 96 °C to separate the new strand from the original strand
  7. Cycle repeated many times – forms fragments of every length, each ending with a terminal (fluorescently labelled) base.
  8. Fragments separated by size (length) via electrophoresis
    a. Capillary tube lowered into each well of the plate and a charge is applied, which draws DNA through the porous gel in the tube (smaller fragments go faster)
    b. At the end of the capillary is a laser which causes the terminator bases to fluoresce
    c. Color of each base is detected by a camera and recorded.
  9. Because terminators are incorporated at random positions, fragments of every length exist (given enough cycles), so the order in which terminal bases reach the end of the capillary tube is the order of bases extending from the DNA primer [i.e. the sequence of the insert DNA].
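A toy simulation may help show why reading terminal bases in order of fragment length recovers the sequence. It is a sketch under simplifying assumptions: `template` stands for the newly synthesised strand read outward from the primer, and the function name, molecule count and termination probability are illustrative choices, not real reaction parameters.

```python
import random


def sanger_read(template, n_molecules=5000, p_terminate=0.1, seed=0):
    """Simulate chain termination, then read bases back by fragment length."""
    rng = random.Random(seed)
    terminal_base_by_length = {}
    for _ in range(n_molecules):
        # Extend one strand base by base; at each position a labelled
        # terminator (ddNTP) may be incorporated instead of a normal base.
        for length, base in enumerate(template, start=1):
            if rng.random() < p_terminate:
                terminal_base_by_length[length] = base  # colour of this fragment
                break
    # Electrophoresis: shortest fragments reach the detector first.
    # A length with no terminated fragment is reported as "N".
    return "".join(terminal_base_by_length.get(n, "N")
                   for n in range(1, len(template) + 1))


print(sanger_read("ATGCGTACGT"))  # ATGCGTACGT
```

With too few molecules, long fragments become rare and positions far from the primer drop out as "N", which mirrors the loss of quality at long read lengths.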
4
Q

Biochemistry of Sanger Sequencing

A

Relies on chain-terminating dideoxynucleotides, which interrupt DNA synthesis by blocking formation of the phosphodiester linkage between the new strand and the next incoming nucleotide.
o In a dideoxynucleotide the 3’ hydroxyl group is replaced with H (di-deoxy…).
o The 3’ hydroxyl group is required for the phosphodiester linkage, so synthesis stops.

5
Q

What are the limitations of Sanger Sequencing?

A
  • Can only sequence between 300-1000 bp:
    o Quality isn’t good in first 15-40bp (Primer binding)
    o Sequence quality degrades after 700-900bp.
  • Relatively expensive.
  • Labor-intensive
  • Bias against genes toxic to host
    o [because of the large insert size, full-length genes can be included; if a toxic gene is expressed it kills the host, so that clone is lost]
6
Q

What are the steps in Illumina Sequencing? (Just name)

A
  1. Library Preparation
  2. Cluster Generation
  3. Sequencing
  4. Data Analysis
7
Q

Illumina Sample Prep/Library preparation

A

a. Genome fragmented
i. Done because the instruments can only deal with shorter fragments

b. Adaptors/linkers attached to ends of DNA fragments (2 different oligonucleotide adaptors, one on each end)
i. Are just small fragments of DNA with a known sequence
ii. Added so we can manipulate the fragments of DNA directly (attaching to flow cell, annealing of primers etc.)

c. Reduced cycle amplification introduces additional motifs (e.g. the sequencing primer binding site)

8
Q

Illumina Cluster Generation

A

a. Fragments added to flow cell (glass slide with lanes). Each lane is a channel coated with 2 types of oligonucleotides (complementary to the adaptors)
i. Called flow cell because different reagents are flowed over the surface during the reaction

b. Hybridization occurs by complementary base pairing between an oligo on the surface and the complementary adaptor on the fragment.
i. Oligos on the flow cell are designed to be complementary to the adaptors

c. Bridge Amplification (Clonal amplification of fragment):
i. Strand bends over and hybridizes to the second type of oligo on the flow cell (complementary to the second adaptor on the fragment)
ii. Polymerases generate the complementary strand -> dsDNA bridge
iii. Bridge is denatured -> 2 single-stranded copies of the DNA molecule, tethered to 2 different oligos on the flow cell (in opposite orientations)

d. Reverse strands are cleaved and washed away -> leaving only the forward strands
i. Cleavage targets one of the two surface oligo types, so only the strands anchored by that oligo are released

e. Results in many copies of the same piece of DNA grouped in one spot (cluster) on the flow cell – appears as one much stronger signal when it fluoresces

9
Q

Illumina Sequencing Step

A

a. Primer binds to the adaptor at the sequencing primer binding site (primer made to be complementary to the adaptor)

b. Sequencing occurs with fluorescently labelled bases -> forms complementary fluorescent strand
i. Bases are also modified (blocked) so only one base can be added at a time. One fluorescent base is added to every strand in the same cluster (should be the same base) and that color is read; then the cycle repeats.
ii. A chemical step then removes the block so the chain can be extended by the next base. {REVERSIBLE TERMINATION}
iii. i.e. after each base is added, extension is paused while the color is read for every cluster; then the fluorescent label is removed, termination is reversed, and the next base is added.

c. Clusters excited with laser, and the sequence of colors is captured.

d. Called ‘Sequencing by synthesis’
e. Read product (fluorescent strand) is washed away

10
Q

Illumina Data Analysis Step

A

a. Step 1: Genome assembly
b. Step 2: Alignment of reads to reference genome
c. De novo assembly: assembling reads without a reference, using only their overlapping sequences.
i. Used when there is no reference genome to align reads to
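As a rough illustration of assembling from overlaps alone, here is a minimal greedy sketch: repeatedly merge the two reads with the longest suffix/prefix overlap. Real assemblers use far more robust graph-based methods; the function names and the `min_len` cutoff are assumptions made for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # remaining contigs cannot be joined
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))  # ['ATGGCGTACCTGA']
```

If a sequence longer than the reads is repeated in the genome, several merges look equally good, which is one way to see why short reads struggle with repeat regions.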

11
Q

Illumina error rate and coverage

A

i. Higher error rate compared to Sanger sequencing
ii. Confidence depends on coverage [the number of overlapping reads that cover the same position]
iii. The higher the coverage in a particular region, the more confident you can be in the base call assigned there.

e. Doesn’t deal well with repeat regions:
i. Reads are quite short -> if a sequence is repeated in multiple places in the genome you can’t tell the copies apart, because reads from every copy map equally well to the same place
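To make the link between coverage and confidence concrete, the sketch below counts how many reads cover each position and takes a majority vote per position. It assumes the reads are already aligned (each given as a 0-based start plus sequence); the function names and the example reads are made up.

```python
from collections import Counter


def pileup(ref_len, aligned_reads):
    """Count the bases observed at each reference position.

    aligned_reads: list of (start, sequence), start is 0-based.
    """
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                columns[pos][base] += 1
    return columns


def coverage_and_consensus(ref_len, aligned_reads):
    """Return per-position depth and the majority base ("N" if uncovered)."""
    columns = pileup(ref_len, aligned_reads)
    depth = [sum(c.values()) for c in columns]
    consensus = "".join(c.most_common(1)[0][0] if c else "N" for c in columns)
    return depth, consensus


reads = [(0, "ACGTA"), (2, "GTACG"), (2, "GAACG"), (4, "ACGA")]
depth, consensus = coverage_and_consensus(8, reads)
print(depth)      # [1, 1, 3, 3, 4, 3, 3, 1]
print(consensus)  # ACGTACGA
```

The third read carries an error at position 3 (A instead of T); with a depth of 3 it is outvoted, whereas at depth 1 the same error would go straight into the consensus.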

12
Q

What is genome annotation?

A

‘Determining the structural and functional properties of the genome’

Structural – genes, promoters, pseudogenes, untranslated regions etc.
Functional – What role do the structural features play?
Additional elements – Origin of replication, mobile elements, pathogenicity islands etc.

13
Q

What methods of genome annotation are there?

A
  1. Manual curation
    a. Most accurate but very slow
    b. A person uses their own knowledge to curate a genome
  2. Automated computational pipelines
    a. Large amount of data available
    b. Relies on accurate functional annotation of the genomes already in databases; otherwise errors are propagated
14
Q

Steps in Genome Annotation

A
  1. Structural annotation:
    o Uses info from outside the genome, from related organisms, and from the properties of the genome itself to determine the structural features.
  2. Functional Annotation:
    o Underlying assumption: similar/conserved sequences share the same function because they are related by ancestry
    o Homologue – 2 genes/proteins that share the same ancestry
      • Identified by looking for similar/conserved sequences
      • Orthologue – homologues in different species (arising from speciation)
      • Paralogue – homologues arising from a duplication event
15
Q

What is global alignment?

A

Method of pairwise sequence alignment

  • Matches as many positions as possible over the entire length of both sequences
  • Compares an annotated sequence from a database to the query (“new”) sequence
  • Works best for similar sequences of roughly equal length
  • Compares each position in one sequence to the corresponding position in the query sequence, end to end
  • Not always useful: if the query sequence differs greatly in length from the annotated sequence, a forced end-to-end alignment is not informative
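End-to-end alignment is classically done with Needleman-Wunsch dynamic programming. The sketch below is a minimal version with a simple linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) and the function name are arbitrary choices for illustration, not what any particular tool uses.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: best end-to-end alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalised: both sequences must be aligned in full.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    i, j, top, bottom = len(a), len(b), [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = global_align("GATTACA", "GATACA")
print(best)   # 5 (six matches, one gap)
print(row_a)
print(row_b)
```

Because every position of both sequences must appear in the alignment, a short query against a much longer sequence mostly accumulates gap penalties, which is the weakness noted above.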
16
Q

What is local alignment?

A

Method of pairwise sequence alignment

  • Focus on the best matching regions of the sequence
  • Regions of similarity, need not be the same length
  • Match region to region rather than end to end
  • Useful for looking for similar genes in different organisms
    o Genomes may have very little similarity but a specific region may still be highly similar and related.
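Region-to-region matching is classically done with Smith-Waterman dynamic programming, sketched below. The key difference from end-to-end alignment is that scores are floored at zero, so a good region can start and end anywhere. The scoring values and the function name are illustrative assumptions.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman: best-scoring pair of matching regions in a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets an alignment restart instead of carrying a bad prefix.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end_i], b[j:end_j]


print(local_align("TTTTGATTACATTTT", "CCCGATTACACCC"))
# (14, 'GATTACA', 'GATTACA')
```

The two example sequences share nothing outside the central region, yet the shared region is recovered with a high score, matching the point that overall dissimilar genomes can still contain a highly similar region.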
17
Q

What is BLAST?

A
  • Basic Local Alignment Search Tool
  • Common tool used for genome annotation
  • Compares a ‘Query sequence’ against a database and finds similarity
  • Web based interface
  • Output is numerical value – measure of how similar your query sequence/region of query sequence is to a sequence in the database
  • Not only used for genome annotation – often used to identify regions of similarity in query sequences.
18
Q

Shuttle Plasmids Genomic Information

A
  • 1 of the 2 ways to transform mycobacterial cells with DNA (CaCl2 doesn’t work)
    o Other one is using an electric pulse
  • Plasmids that have been modified by incorporating elements from mycobacterial viruses (specifically infect mycobacteria)
  • Portion of mycobacteriophage genome required for replication and packaging is cloned into E. coli vector with E. coli ORI
    o Allows phagemid to be manipulated in E. coli
    o In E. coli viral particles cannot be made – allows researcher to make lots of phagemids with this viral DNA
  • Phagemid is then introduced into a mycobacterial host, where it can produce viral particles that contain the phagemid DNA.
19
Q

Conditionally Replicating Mycobacteriophages Genomic Information

A
  • DNA modified so we can stall replication at non-permissive temps.
  • Phagemids incubated with mycobacterial strain at different temps:
    o 30 degC (permissive) allows replication of the phagemid and production of viral particles
    o Bacteria lyse and viral particles are released
  • Cultured at 42 degC (non-permissive)
    o Prevents replication of the phagemid
20
Q

Phage Transposon Mutagenesis

A
  • Transposon inserted into the phagemid
  • Himar1: eukaryotic transposon – most used in these experiments
    o Has transposase and inverted repeats
    o When the transposase gene is expressed, the enzyme binds to the inverted repeats and introduces ds breaks
    o Following excision, the transposon can then be inserted at a recognition site (TA dinucleotide for this transposon).
  • Process:
    o Phagemid with transposon introduced into mycobacterial host (M. smegmatis) and cultured at 42 degC -> no phagemid replication, no viral particles produced. Because the cell isn’t lysed we can see the effect of transposon insertion
    o Expression of transposase causes transposon to jump from phagemid into chromosome
      • Transposon can jump into any TA site in the chromosome – disrupting whichever gene it lands in
    o Results in each mycobacterial colony on the slide representing a different mutant (a different gene is interrupted by transposon insertion)
    o Forms a transposon mutant library
      • Selected using the antibiotic resistance marker carried on the transposon.
21
Q

What modifications does Himar1 have that make it useful for transposon mutagenesis?

A
  1. Kanamycin resistance gene incorporated for selection
  2. T7 promoters incorporated to allow for mapping of insertion sites

22
Q

Process of insertion site mapping for a single transposon mutant using Himar1

A
  • Himar1 transposons have known internal sequences at the flanks of the transposon
  • Chromosomal DNA digested via restriction enzymes
  • Adapters ligated to ends of fragments
    o Allows addition of a specific sequence
  • PCR amplification performed using a primer for the internal transposon sequence and a primer for the adapter sequence
    o Allows specific amplification of DNA directly flanking the transposon (BLUE, transposon also amplified in fragment).
  • Generates ‘reads’
  • Height of each line represents the number of reads at that TA site (Himar1 insertion site) in the genome
    o i.e. how well represented insertions at that TA site are among the mutants in the library
  • Areas with few/no insertions predict genes that are essential under the selection conditions
    o Insertion into those genes leaves the mutant with a nonfunctional gene, so it dies under the selection conditions and is absent from the mutant library
  • Could also be used to identify genes whose disruption results in a growth defect or advantage
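The logic of calling candidate essential genes from insertion reads can be sketched as follows: list every TA site, then flag genes that contain TA sites but have no reads at any of them. The sequence, gene coordinates and read counts are invented for illustration, and real analyses use statistical models rather than a simple zero-read rule.

```python
def ta_sites(genome):
    """0-based positions of every TA dinucleotide (possible Himar1 sites)."""
    return [i for i in range(len(genome) - 1) if genome[i:i + 2] == "TA"]


def candidate_essential_genes(genome, genes, read_counts):
    """Genes that contain TA sites but show no insertion reads at any of them.

    genes: {name: (start, end)}, half-open coordinates (hypothetical).
    read_counts: {ta_position: number_of_reads}.
    """
    sites = ta_sites(genome)
    result = []
    for name, (start, end) in genes.items():
        in_gene = [s for s in sites if start <= s < end]
        hit = [s for s in in_gene if read_counts.get(s, 0) > 0]
        if in_gene and not hit:  # insertion was possible but never recovered
            result.append(name)
    return result


genome = "GGTACCTAGGTTAACCGGTACG"
genes = {"geneA": (0, 10), "geneB": (10, 16), "geneC": (16, 22)}
reads = {2: 15, 6: 3, 18: 8}
print(ta_sites(genome))                                 # [2, 6, 11, 18]
print(candidate_essential_genes(genome, genes, reads))  # ['geneB']
```

A gene with no TA sites at all is not flagged, since the absence of insertions there says nothing about essentiality.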
23
Q

What are Non-Tuberculous Mycobacteria (NTM)?

A

o Those species that don’t cause TB or leprosy

o Opportunistic pathogens – cause infections of lungs, soft tissue and bones

24
Q

What is NTM Pulmonary Disease?

A

o Occurs mainly in people that are immune compromised or have underlying lung conditions
o Organisms commonly associated with NTM-PD = M. avium & M. abscessus.

25
Q

M. abscessus

A

o Rapid growth – colonies in <7 days
o Evidence of transmission between patients
o Traditional molecular epidemiological tools:
• Pulsed-field gel electrophoresis (PFGE)
• Multi-locus sequence typing (PCR and sequencing of selected genes)
o NGS gives higher resolution because it looks at changes over the entire genome

26
Q

What is Pulsed-field Gel Electrophoresis (PFGE)?

A
  • Genomic DNA extracted from strain
  • Digested with restriction endonuclease (Usually one that cuts infrequently)
  • Fragments are too large to be separated by traditional electrophoresis
  • Applies electric current at an angle to the sample, and applies it in pulses
    o Causes DNA to migrate in zigzag manner – increases path that DNA needs to migrate through
  • Relies on bp changes occurring at restriction endonuclease cut sites
    o All strain typing (differentiating strains from each other) relies on accumulation of random mutations within genome over time
    o This method requires those changes to occur at specific endonuclease cut site
  • This means that banding patterns can be used to make phylogenetic trees
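The dependence on cut sites can be illustrated with a toy digest: a single base change inside a recognition site removes a cut and changes the banding pattern, while a change elsewhere would be invisible. This sketch treats the genome as a linear string and cuts at the start of each site, which is a simplification; the sequences are made up, with GAATTC (the EcoRI site) used only as a familiar example.

```python
def digest(genome, site):
    """Fragment lengths (largest first) after cutting at every site occurrence.

    Simplifications: linear sequence, cut placed at the start of the site.
    """
    cuts = []
    pos = genome.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = genome.find(site, pos + 1)
    bounds = [0] + cuts + [len(genome)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:]) if b > a),
                  reverse=True)


strain_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
strain_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"  # one bp changed

print(digest(strain_a, "GAATTC"))  # [14, 8, 4]
print(digest(strain_b, "GAATTC"))  # [22, 4]
```

The two strains differ by one base, yet the band pattern changes from three bands to two because that base lies in a cut site.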
27
Q

Study: Transmission of M. abscessus - Results

A
  1. The different subspecies separate on the tree (as expected)
  2. Strains from a single individual often clustered together
    a. Indicates that these individuals most likely acquired their strains independently from the environment, and that the strains were quite genetically diverse between individuals
  3. BUT there is also grouping of strains from different individuals
  4. Antibiotic Resistance:
    a. 2 mechanisms of Azithromycin resistance:
    i. Inducible mechanism – elevated expression of the Erm gene (deleted in massiliense subspecies) in response to Azithromycin results in resistance
    ii. Point mutation – A to C change at position 2058 of the 23S rRNA gene
    b. Found 3 individuals with Azithromycin resistance who had never been exposed to the antibiotic
    c. Amikacin resistance: G/C bp change at position 1408/1409 of the 16S rRNA gene
  5. Plotted timelines for each individual of where they could have contracted the infection – hospital visits or admissions etc.
  6. Could compare those timelines to see when strains could have been transmitted between patients.
    a. Found 4 patients with no opportunities for transmission, yet their strains had very high similarity
    b. Decided their similarity was not due to transmission but to a dominating circulating strain.
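One common way to quantify "very high similarity" between whole-genome sequences is the number of single-base differences (SNPs) between each pair of isolates. The sketch below assumes the sequences are already aligned and of equal length; the isolate names and sequences are invented, and it is not a reconstruction of the study's own pipeline.

```python
from itertools import combinations


def snp_distance(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_snps(isolates):
    """SNP distance for every pair of isolates, keyed by (name, name)."""
    return {(p, q): snp_distance(isolates[p], isolates[q])
            for p, q in combinations(sorted(isolates), 2)}


isolates = {
    "patient1": "ACGTACGTAC",
    "patient2": "ACGTACGTAT",
    "patient3": "TCGAACCTAG",
}
for pair, dist in pairwise_snps(isolates).items():
    print(pair, dist)
# ('patient1', 'patient2') 1
# ('patient1', 'patient3') 4
# ('patient2', 'patient3') 4
```

A small distance alone does not prove transmission, as the result above shows: it has to be read alongside the patient timelines.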
28
Q

Study: Transmission of M. abscessus - Conclusion

A
  • 3 modes of infection of cystic fibrosis patients:
    o Independent acquisition of genetically diverse strains
    o Independent acquisition of dominant strain (related)
    o Transmission
  • Mechanism of transmission is unknown