Lecture 12 - Microbial identification and genomics Flashcards

1
Q

Who were the first people to categorize life?

A

Aristotle categorized life into two fundamental
groups, Animals and Plants

In 1866 Ernst Haeckel proposed a third group,
Protista, to classify all microscopic life-forms
* later, Protista was subdivided into eukaryotic
microorganisms and bacteria

Up to this point, all classification was based on
physiological differences

2
Q

Classic taxonomy refers to what type of differences?

A

Classification based on physiological
differences

3
Q

Classic Taxonomy: What are physical differences?

A
  • cell shape
  • structure of cell envelope (Gram stain, etc.)
  • flagella / motility
  • endospore formation
4
Q

Classic Taxonomy: What are metabolic differences?

A
  • ability to metabolize various compounds
    such as carbohydrates, amino acids, lipids
5
Q

Name 3 examples of tests used in Classic taxonomy

A
  1. Glucose catabolism
  2. Blood agar and hemolysis
  3. Phenotype microarray
6
Q

What is classification based on in Modern molecular taxonomy?

A

Classification based on direct comparison of gene sequences
* not all genes are suitable for taxonomy, though

7
Q

2 characteristics of an ideal gene for molecular taxonomy

A
  • gene is present in all organisms
  • gene’s DNA sequence is very well conserved across all organisms
8
Q

What happens if a gene is missing in certain organisms?

A
  • If a gene is missing in certain organisms, that gene cannot be used to construct a full phylogenetic tree for all species
  • e.g. a heterocyst-specific gene cannot be used to elucidate the evolutionary relationship between a cyanobacterium
    and a respiratory pathogen
9
Q

What happens if a gene contains too many mutations?

A

If a gene contains too many mutations, it may not be useful for constructing an accurate phylogeny because the information contains too much noise

10
Q

What did Carl Woese first use for molecular taxonomy?

A

Carl Woese first used the small subunit
ribosomal RNA (SSU rRNA) for molecular
taxonomy

11
Q

What is SSU rRNA?

A

SSU rRNA = small subunit
ribosomal RNA

  • major component of small ribosomal subunit
  • critical function for all forms of life
  • 16S rRNA for bacteria (and archaea)
  • 18S rRNA for eukaryotes
  • coded by the ‘rDNA’ genes
12
Q

Because of its crucial function, SSU rRNA is:

A
  • universally present in all cellular organisms
  • very well conserved between organisms
13
Q

SSU rRNA has two types of regions:

A
  • conserved regions are extremely well
    conserved between different organisms
  • variable regions show more differences
14
Q

Which region in SSU rRNA is used for taxonomic analysis and why?

A

Variable regions are used for taxonomic
analysis
* have enough differences to classify organisms at the genus and species level

Conserved regions are too similar for
taxonomy
* used to design universal primers which can anneal to and amplify SSU rDNA genes from many organisms within the same domain of life
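
A minimal sketch of how conserved and variable positions could be told apart, assuming the sequences are already aligned. The function name `column_conservation` and the toy sequences are hypothetical illustrations, not real 16S data or a standard tool.

```python
def column_conservation(aligned_seqs):
    """Fraction of sequences sharing the most common base at each alignment column."""
    length = len(aligned_seqs[0])
    scores = []
    for i in range(length):
        column = [seq[i] for seq in aligned_seqs]
        most_common = max(set(column), key=column.count)
        scores.append(column.count(most_common) / len(column))
    return scores

# Hypothetical, pre-aligned toy sequences (not real rRNA gene data).
toy_alignment = [
    "ACGTACGGTTAC",
    "ACGTACGCATAC",
    "ACGTACGAGTAC",
]

scores = column_conservation(toy_alignment)
# Columns identical in every sequence: candidate sites for universal primers.
conserved = [i for i, s in enumerate(scores) if s == 1.0]
# Columns that differ between sequences: the informative sites for taxonomy.
variable = [i for i, s in enumerate(scores) if s < 1.0]
print("conserved columns:", conserved)
print("variable columns:", variable)
```

In this toy alignment only two columns differ between the sequences; those are the positions that would separate the three "organisms", while the flanking identical columns are where a shared primer could bind.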

15
Q

What is used to build the Tree of Life, and for which organisms?

A

SSU rRNA can create a comprehensive ‘Tree of Life’ for
organisms with ribosomes

16
Q

Do viruses have any universal genes which can be used for comparison?

A

Viruses do not have any universal genes which can be used for comparison

17
Q

How is viral classification done?

A

Viral classification is frequently done using genes commonly found within the same Baltimore Class
* reverse transcriptase for retroviruses
* RNA-dependent RNA polymerase (RdRp) for RNA viruses
* capsid proteins within the same class, etc.

18
Q

In the Tree of Life, where are CpV BQ1 and CpV BQ2?

A

Phylogeny of CpV BQ1 and BQ2 using the DNA polymerase (polB) gene of mimiviruses and relatives

  • BQ2 is a closer relative to Mimiviridae whereas BQ1 belongs to a related family, Phycodnaviridae
19
Q

What is Genomics?

A

determination and study of
complete genome sequences

20
Q

Multiple uses of genomics? (Hint: link, compare, generate)

A
  • link genetic characteristics of individual microbes with their physiological properties and ecological roles
  • compare genome sequences between related species and strains of organisms to uncover basis of pathogenicity etc.
  • generate hypotheses from the genome sequence and then confirm them experimentally
21
Q

Which bacterium was mistaken as the causative agent of influenza?

A

Haemophilus influenzae

  • first bacterium to have its genome completely sequenced
  • Gram-negative coccobacillus
  • opportunistic pathogen in human respiratory tract
  • was mistaken as the causative agent of
    ‘influenza’ during early days of microbiology
  • still carries the name of the disease as a historical artefact
22
Q

What does TIGR stand for?

A

The Institute for Genomic Research

23
Q

How was the Haemophilus influenzae genome sequenced?

A

Sanger sequencing combined with the shotgun approach

24
Q

Explain Sanger sequencing

A
  • Sanger sequencing: a DNA sequencing method which uses
    dideoxynucleotides (ddNTPs) to prematurely stop a DNA polymerization reaction
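
A toy sketch of the chain-termination idea, under simplifying assumptions: every possible termination point is represented exactly once, and the terminal base of each fragment can be read (as with labelled ddNTPs). The function names and the example strand are hypothetical.

```python
import random

def terminated_fragments(new_strand):
    """All products of premature termination: one fragment ending at each position."""
    return [new_strand[:end] for end in range(1, len(new_strand) + 1)]

def read_by_size(fragments):
    """Order fragments by length (as electrophoresis would) and read each terminal base."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)

strand = "GATTACA"  # hypothetical newly synthesized strand
frags = terminated_fragments(strand)
random.Random(0).shuffle(frags)  # fragments start out as an unordered mixture
print(read_by_size(frags))
```

Sorting by length and reading only the last base of each fragment is enough to recover the strand, which is why a single mixed reaction can reveal the whole sequence.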
25
Q

Explain Shotgun sequencing

A
  • Shotgun sequencing: a genome-sequencing approach which reads the genome in small, fragmented pieces and then re-assembles the pieces afterwards
26
Q

What are the 3 steps of Shotgun sequencing using Sanger?

A
  1. Generate a DNA library from a genome
    * fragment the genome by sonication etc.
      and clone the fragments into plasmids
    * entire genome is represented in multiple
      fragments cloned in individual plasmids
  2. Use Sanger sequencing to sequence
    the individual fragments in the library
    * numerous short sequences (~600 bp long) are generated, each representing a tiny portion of the genome
  3. Assemble the short sequences into
    one piece based on overlapping
    regions
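
Step 3 can be sketched as a greedy overlap merge. This is only an illustration under strong assumptions (error-free reads, all from the same strand, no repeats); real assemblers are far more elaborate. The function names, the `min_len` parameter, and the toy reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; remaining pieces stay separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Hypothetical toy reads covering a 20-base "genome".
toy_reads = ["ATGCGTACG", "TACGTTAGCC", "TAGCCGATAG"]
contigs = greedy_assemble(toy_reads)
print(contigs)
```

The all-against-all comparison in the inner loops also hints at why assembling millions of short reads demands substantial computing power.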
27
Q

Explain high throughput sequencing

A

High throughput sequencing allows a mixture of multiple different fragments to be read in a single reaction, simultaneously
* Illumina
* Ion-torrent, etc.

The methods above typically produce a huge number of accurate, short reads (200 – 700 bases)
* assembling these short reads requires huge computing power

Made genomics extremely affordable

28
Q

What is a limitation of Sanger?

A

Sanger sequencing is limited to reading one sample of DNA (one plasmid, etc.) in a single reaction

29
Q

Explain Nanopore sequencing

A

Sequencing by ‘detecting shape’ of the
nucleic acid bases

A tiny channel (nanopore) is set up in a lipid bilayer with an electrical circuit connected
* ssDNA is passed through the channel
* each DNA base causes different electrical
fluctuations while passing through the channel
* these different electrical fluctuations are used to determine the sequence of ssDNA

Can read very long pieces of DNA at once,
but is less accurate than short-read methods
* 10000 – 100000+ bases

30
Q

What was originally used as the Nanopore?

A

A Staphylococcus aureus toxin was originally used as the ‘nanopore’

31
Q

What is alpha-hemolysin, and what is its pore diameter?

A
  • ɑ-hemolysin (ɑ-HL)
  • pore-forming protein which inserts itself into host’s lipid membrane to disrupt the permeability barrier
  • minimum diameter is ~1.4 nm
  • allows ssDNA or ssRNA to pass through, but not dsDNA
    (dsDNA has about 2 nm diameter)
32
Q

What are the 3 steps to make the nanopore?

A
  • express and purify ɑ-HL
  • insert ɑ-HL into a lipid bilayer
  • place an anode on one side of the lipid bilayer and a cathode on the other to produce an electric field
33
Q

What do bioinformatics and annotation of genes refer to?

A

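Genome sequencing produces a large volume of sequence data, which needs to be analyzed computationally (bioinformatics) and labelled with predicted genes and their functions (annotation)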

34
Q

Explain the Identification of potential protein CDS in the genome

A
  1. translate the entire genomic sequence in all six reading frames
  2. look for instances where a sequence gets translated into a long polypeptide (50 - 100+ amino acids) without getting interrupted by a stop codon
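
A minimal sketch of this scan, assuming the standard stop codons (TAA, TAG, TGA), six reading frames, and a plain uppercase DNA string. It only looks for stop-free stretches and does not require a start codon, matching the simplified description above. The function names and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def open_stretches(dna, min_codons=50):
    """Return (strand, frame, start, codons) for each stop-free run of at least min_codons."""
    results = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            run_start, run_len = frame, 0
            for pos in range(frame, len(seq) - 2, 3):
                if seq[pos:pos + 3] in STOP_CODONS:
                    if run_len >= min_codons:
                        results.append((strand, frame, run_start, run_len))
                    run_start, run_len = pos + 3, 0  # next run begins after the stop
                else:
                    run_len += 1
            if run_len >= min_codons:  # run that reaches the end of the sequence
                results.append((strand, frame, run_start, run_len))
    return results

# Hypothetical toy sequence: 60 codons flanked by stop codons.
toy = "TAA" + "GCT" * 60 + "TAA"
for hit in open_stretches(toy):
    print(hit)
```

Start positions for the "-" strand are coordinates on the reverse-complemented sequence. Each reported stretch is only a candidate CDS; assigning a function still needs a database comparison.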
35
Q

How do you Assign putative function to potential protein CDS?

A
  • BLAST search the sequence against existing proteins in the database
  • many putative CDS will have no assignable function
36
Q

What is comparative genomics?

A

studies relationships between different species
* insight into phylogeny of all life and diversity
* also used to investigate differences at a ‘smaller’ scale, comparing pathogens to their non-pathogenic
relatives, etc.

37
Q

What is a Pan genome?

A

The complete set of genes found across all related strains (variants) of a specific
organism. The subset of genes shared by every strain is the core genome.

38
Q

Is there a difference in genome size even within the same species of an organism?

A

Yes, and the difference can be large.
* For example, there can be over 30 percent difference in genome size between different strains of E. coli
* All of these strains still share the same core set of E. coli genes (the core genome)
* The remaining differences in these strains represent the different physiology of each E. coli strain

39
Q

What is a very virulent strain of E. coli that is a major cause of food poisoning, and how does its genome size compare to K12?

A

E. coli O157:H7

The O157 genome is about 15 % larger
compared to K12
* large segments of the genome exist in one strain but not the other
* the two strains shared a common ancestor about 4.5 million years ago
* 4.1 Mb of DNA contains genes which are similar between these strains

40
Q

What is a non-pathogenic lab strain of E. coli?

A

E. coli K12

41
Q

What are O-islands?

A
  • unique segments of DNA only found in O157
  • 177 O-islands in total, 1.34 Mb DNA
42
Q

What are K-islands?

A
  • unique segments of DNA only found in K12
  • 234 K-islands, 0.53 Mb DNA
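
As a rough consistency check on these figures, the sketch below assumes each genome is simply the 4.1 Mb of shared DNA plus that strain's own islands. That decomposition is a simplifying assumption for illustration, and the variable names are hypothetical.

```python
# Figures quoted on the cards (in megabases).
shared_mb = 4.1      # DNA similar between O157 and K12
o_islands_mb = 1.34  # found only in O157
k_islands_mb = 0.53  # found only in K12

# Assumption: genome size = shared backbone + strain-specific islands.
o157_total = shared_mb + o_islands_mb
k12_total = shared_mb + k_islands_mb
percent_larger = (o157_total / k12_total - 1) * 100

print(f"O157 ~ {o157_total:.2f} Mb, K12 ~ {k12_total:.2f} Mb")
print(f"O157 is ~{percent_larger:.0f} % larger")
```

Under this assumption the island sizes give a difference in the same range as the quoted "about 15 % larger", so the numbers on the cards are broadly consistent with each other.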
43
Q

Name the 5 virulence associated genes found in O islands

A
  • intimin (adhesion to intestine)
  • type III secretion system (injects bacterial effector proteins into host cells)
  • iron uptake
  • toxins, including the Shiga toxin
  • antibiotic resistance
44
Q

What can comparative genomics predict? and what can it be used to generate?

A
  • Comparative genomics can predict
    potential virulence-associated factors before those genes are investigated in a wet lab
  • Comparative genomics can be used to
    generate hypotheses to drive experiments
45
Q

What is responsible for new genetic capability arising in a genome?

A

Homologs

46
Q

What are homologs?

A

Genes which share a common ancestor

47
Q

Explain how some homologs arose from gene duplication events

A

Some homologs have arisen from gene
duplication events within the same genome
* two copies of the gene exist in the genome after duplication
* one of these copies is free to evolve a new function

48
Q

What are paralogs?

A

Homologs which arise from duplication
events

  • paralogs within the same genome have often evolved to perform related but different functions
  • for example, an organism may have various different ABC-transporters to take up different nutrients
  • these ABC-transporters are evolutionarily related
    variants of each other, sharing a common ancestor
49
Q

What are orthologs?

A

homologs found in different
organisms which perform the same function

50
Q

Why is metagenomics needed?

A

Cultivation of the organism is mandatory in classic molecular biology
* however, many organisms are not cultivatable

51
Q

What is metagenomics?

A

DNA is extracted directly from microbial communities and analyzed as a mixture
* Use of next-generation sequencing allows scientists to analyze a mixed sample of microbes without isolation

Produces data on uncultivatable organisms in various environments
* marine
* soil
* human gut

Investigation of diversity using various genes such as SSU rRNA