The minimal genome Flashcards

1
Q

What is the concept of the minimal genome?

A

A hypothetical minimal auto-replicative system

  • Aims to strip a present-day bacterium down to its minimum essential components for replication, transcription and translation.
  • Aims to understand the basic components of the cell that make it living.
  • Provides a template genome that can be used to recreate life
  • A less complex cell that can be reliably modeled and engineered to meet our requirements (aka Synthetic Biology, Synthetic Genomics).
  • until recently the search for the minimal gene set was the domain of computer-aided comparative genomics
  • the creation of an artificial cell with a minimal genome is far from straightforward (despite G. Venter’s “synthetic cell” approach)
  • gene counts of completely sequenced organisms:
    • ~ 500 to 10’000 genes (Prokaryotes)
    • ~ 2’000 to 30’000 genes (Eukaryotes)
  • the smallest genome of a free-living organism is that of the bacterium Pelagibacter ubique, with 1’354 protein-coding genes & 35 ncRNA-coding genes (1.3 Mb)

→ the minimal gene set needed for autonomous cellular life is surprisingly small
(parasitic & symbiotic bacteria have even smaller genomes!)

2
Q

What is the classical view of genome evolution in prokaryotes, and what is the more realistic picture?

A

Classical view on genome evolution:
point mutations accumulate slowly

Bacterial genomes are much more dynamic and can change rapidly:

  • rapid loss of large DNA fragments (via homologous recombination)
  • integration of horizontally acquired DNA (lateral gene transfer)
  • genome rearrangements (transposons, insertion sequences, inversions):
    • e.g. Inversion via homologous recombination
  • Gene duplication:
    • e.g. due to “illegitimate crossing over” during replication
  • etc
3
Q

Why is it not enough to sequence only a single strain of a species in prokaryotes?

A
  • Bacterial strains belonging to the same species vary considerably in gene content (e.g. different E. coli strains).
  • thus sequencing a single strain of a species reveals only an incomplete picture of the species’ gene content
  • The genetic repertoire of a given species (its “pan-genome”) is much larger than the gene content of individual strains
  • Example: E. coli
    • Pan-genome: 10’131 - 20’000 genes
    • Average E. coli genome: 4’721 genes
    • Core genome: 2’167 genes
4
Q

What is a pan-genome?

A

The full complement of genes found in all strains of a species

  • the pan-genome grows as more sequenced genomes are considered
  • its strain-specific (non-core) genes usually code for functions of the cell surface, signal transduction, or pathogenicity
    → thus for functions crucial to conquer rapidly changing environments or niches
5
Q

What is the core-genome?

A

The gene set found in all genomes of a species

  • the more divisions/species/strains are compared, the smaller the “core genome” gets
  • these genes are more “stable” and code for functions such as translation and amino acid synthesis
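
The pan-genome and core-genome definitions map directly onto set union and set intersection. The following is a minimal sketch, assuming each strain is represented as a set of gene identifiers; the strain names and gene names are hypothetical toy data, not real annotations.

```python
# Toy gene content per strain (hypothetical identifiers, for illustration only).
strains = {
    "strain_A": {"rib1", "trp1", "adh1", "tox1"},
    "strain_B": {"rib1", "trp1", "adh1"},
    "strain_C": {"rib1", "trp1", "hly1"},
}


def pan_genome(gene_sets):
    """Union: every gene found in at least one strain."""
    return set().union(*gene_sets)


def core_genome(gene_sets):
    """Intersection: only genes found in every strain."""
    return set.intersection(*list(gene_sets))


pan = pan_genome(strains.values())
core = core_genome(strains.values())
accessory = pan - core  # strain-specific (non-core) genes

print(f"pan-genome:  {len(pan)} genes")
print(f"core genome: {len(core)} genes")
print(f"accessory:   {sorted(accessory)}")
```

Adding a further strain can only enlarge the union or leave it unchanged, and can only shrink the intersection or leave it unchanged, which is why the pan-genome grows and the core genome shrinks as more genomes are compared.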
6
Q

What are genomic islands?

A
  • relatively large DNA regions (10 to >100 kb) that can integrate into genomes (e.g. pathogenicity islands)
  • in this way several genes are integrated at once (via HGT)
  • often associated with integrase genes, and they have an unusual nucleotide composition (such as GC content or codon usage → hallmarks of foreign origin)

→ primary strategy for gene acquisition in prokaryotes
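
The "unusual nucleotide composition" hallmark can be illustrated with a sliding-window GC scan. This is only a sketch under simple assumptions: the window size, step and deviation threshold are arbitrary illustrative values, the genome is a synthetic toy sequence, and real island-prediction tools combine several signals (codon usage, integrase genes, flanking repeats).

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=1000, step=500, max_deviation=0.2):
    """Return (start, gc) for windows whose GC fraction deviates from the
    genome-wide average by more than max_deviation (illustrative threshold)."""
    background = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - background) > max_deviation:
            hits.append((start, gc))
    return hits


# Synthetic toy genome: 50% GC "host" DNA with an AT-only 1 kb insert.
host = "ACGT" * 500      # 2000 bp, 50% GC
island = "AATT" * 250    # 1000 bp, 0% GC
genome = host + island + host

hits = atypical_windows(genome)
print(hits)
```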

7
Q

What are Borgs?

A
  • new type of archaeal extrachromosomal element discovered in a metagenomics study
  • extraordinarily large (up to 1 Mbp) DNA
  • linear extrachromosomal DNA that carries repetitive sequences at the ends and throughout
  • their genes were assimilated from methane-oxidizing Methanoperedens archaea
8
Q

What is a bacterial IS element?

A

Insertion Sequence elements

  • Central region encodes 1 or 2 enzymes required for transposition (transposase)
  • It is flanked by inverted repeats of characteristic sequence
  • The 5’ and 3’ short direct repeats are generated at the target-site DNA during the insertion
  • Transposition is catalyzed by one or both of the flanking IS-encoded transposases
  • Example: Sulfolobus sp.
    • has an enormous single-gene translocation rate
    • → Sulfolobus known for its abundant transposons (Insertion Sequence elements)
    • ~12% of the 3 Mb genome are IS-elements
  • a correlation exists between the number of repetitive sequences (e.g. IS-elements) and the rate of single gene translocation
  • thus transposons and other repetitive elements are hot-spots for genomic rearrangements
    → play a key role in genome evolution
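
The flanking inverted repeats can be made concrete: the left end of the element reads the same as the reverse complement of its right end. The sketch below measures the length of a perfect terminal inverted repeat; the sequences are invented toy strings, and real IS elements usually carry imperfect repeats that this simple check would truncate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element, max_len=50):
    """Length of the longest perfect inverted repeat shared by both ends."""
    element = element.upper()
    flipped = reverse_complement(element)  # starts with the revcomp of the right end
    limit = min(max_len, len(element) // 2)
    length = 0
    for left_base, right_base in zip(element[:limit], flipped[:limit]):
        if left_base != right_base:
            break
        length += 1
    return length


# Toy element: inverted repeat + stand-in for the transposase region + inverted repeat.
left_ir = "GGAAGGTG"
central_region = "ATGAAACCCTTTTAA"  # hypothetical placeholder, not a real gene
element = left_ir + central_region + reverse_complement(left_ir)

ir_length = terminal_inverted_repeat_length(element)
print(ir_length)
```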
9
Q

What is Pelagibacter ubique?

A

One of the most abundant microbes in the ocean

  • 0.37 – 0.89 μm long & 0.12 – 0.20 μm in diameter
  • 30% of the cell’s volume is taken up by its genome
  • the smallest genome (1.3 Mb) of any free-living organism
  • no duplicated gene copies, no viral genes, and little ncDNA
  • genome has been streamlined
  • low GC content of only 30%
10
Q

What is Mycoplasma genitalium?

A
  • 0.58 Mb genome
  • 470 protein-coding genes
  • intracellular parasite in humans
  • smallest known genome of an organism capable of independent growth
  • lacks a cell wall
  • Mycoplasma sp. originate from a Gram positive ancestor via reductive evolution
  • evolved by massive genome reduction
  • obligate parasites
  • lack genomic redundancy

→ Appears to be ideally suited to look for essential genes

11
Q

What is Nanoarchaeum equitans?

A
  • 0.49 Mb genome; 522 protein-coding genes
  • exists only in co-culture with Ignicoccus sp. (parasite or symbiont?)
  • only 400 nm diameter (smaller than some viruses)
  • so far the only representative of the phylum Nanoarchaeota
  • lacks genes for the synthesis of amino acids, nucleotides and lipids
  • carries ‘split genes’ → ancestral gene structure of multi-domain proteins/RNAs?
  • living fossil
12
Q

How can comparative genomics help with finding the minimal genome?

A
  • Example:
    • comparative genomics: E. coli uses 243 genes for energy metabolism, while H. influenzae uses only 112, and M. genitalium only 31
    • the parasitic M. genitalium has even eliminated essentially all genes of the amino acid synthesis pathways
  • genes for translation, however, cannot be reduced
  • also protein folding genes are resistant to elimination
  • comparative genomics: under optimal growth conditions (all essential nutrients present, no stress, no competition) the “minimal genome” consists of ~ 256 genes

→ In fact it is more appropriate to speak of a minimal set of essential functional niches rather than of minimal sets of genes.

  • comparison of ~100 genomes revealed that only ~63 genes (!) were not replaced by non-orthologous gene displacement (NOGD) and were thus present in all of the investigated genomes
  • > 50 thereof are components of translation machinery
13
Q

What is Non-orthologous gene displacement?

A

Displacement of a gene responsible for a particular biological function in a certain set of species by a non-orthologous (unrelated) gene in a different set of species.
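
Why NOGD shrinks a purely gene-based comparison can be shown with a toy example: intersecting gene identifiers misses a function that is filled by unrelated genes in different species, whereas intersecting the functions ("functional niches") keeps it. The species, gene names and annotations below are hypothetical.

```python
# Hypothetical annotations: gene identifier -> function it provides.
genomes = {
    "species_1": {"rib1": "ribosomal protein", "ligA": "DNA ligation"},
    "species_2": {"rib1": "ribosomal protein", "ligA": "DNA ligation"},
    # species_3 fills the ligation niche with an unrelated (non-orthologous) gene.
    "species_3": {"rib1": "ribosomal protein", "ligX": "DNA ligation"},
}

# Gene-based comparison: identifiers present in every genome.
shared_genes = set.intersection(*(set(g) for g in genomes.values()))

# Niche-based comparison: functions present in every genome.
shared_functions = set.intersection(*(set(g.values()) for g in genomes.values()))

print("shared genes:    ", sorted(shared_genes))
print("shared functions:", sorted(shared_functions))
```

The gene-based intersection drops DNA ligation even though every species performs it, which is the sense in which a minimal set of functional niches is a more robust target than a minimal set of genes.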

14
Q

What experimental evidence is there regarding the minimal genome?

A
  • Genome-wide analyses of gene knock-outs via transposon insertion, plasmid insertion or antisense RNA
  • in these studies > 50% of the genes of the respective organism have been studied
  • the experimental approaches revealed a surprisingly similar number for the minimal gene set as comparative genomics did, namely ~300-350 genes
  • the reason why H. influenzae and E. coli require more genes is unclear; Gram-negative bacteria may need more genes to build their cell wall and for transport through it
  • Used species: M. genitalium/M. pneumoniae, B. subtilis, H. influenzae, E. coli, S. cerevisiae, C. elegans
  • not only the number but also the functions of these “minimal gene set” genes were similar between the experimental and computational studies
  • minimal gene set: many genes for information-processing (especially translation), only very few genes with unknown functions
15
Q

What are problems when defining the minimal gene set?

A

Essential Genes – A conclusive list??

  • Different studies came up with a different number of essential genes.
  • Computation: Underestimates the minimal gene set, because it accounts mainly for those genes that have been conserved in evolution.
  • Transposon mutagenesis: Overestimates the genes – Classifies genes that slow down growth as essential and essential genes that tolerate mutation as non-essential.
  • Most mutants produced are single mutants – synthetic lethality largely not considered

Construction of a single cell with systematic combination of all the mutations in a single strain is beyond the scope of present day technology

What is an essential gene?

16
Q

What are problems with defining an essential gene?

A
  • In these studies random or directed single gene inactivation was performed
  • But: simultaneous elimination of two individually non-essential genes may be lethal (aka ‘synthetic lethality’)
  • Conversely, some genes may be individually essential but, in combination with a second disruption, the first becomes tolerated (e.g. toxin–antitoxin genes)
  • environmental context of experiment affects the outcome → many ‘minimal cells’ with different ‘minimal genome’ versions.
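
Synthetic lethality can be sketched with a toy viability model: a cell is viable as long as every essential function is still covered by at least one intact gene. The functions and gene names are hypothetical, and the model ignores growth-rate effects and environmental context.

```python
from itertools import combinations

# Hypothetical essential functions -> genes able to provide each one.
essential_functions = {
    "translation": {"g1"},               # single provider
    "membrane synthesis": {"g2", "g3"},  # redundant pair
}
all_genes = ["g1", "g2", "g3", "g4"]     # g4 serves no essential function


def viable(knocked_out):
    """True if every essential function keeps at least one intact gene."""
    return all(providers - set(knocked_out)
               for providers in essential_functions.values())


# Single knock-outs: what a one-gene-at-a-time screen would call essential.
essential_singles = [g for g in all_genes if not viable({g})]

# Pairs of individually dispensable genes whose joint loss is lethal.
dispensable = [g for g in all_genes if g not in essential_singles]
synthetic_lethal = [pair for pair in combinations(dispensable, 2)
                    if not viable(pair)]

print("essential in single knock-outs:", essential_singles)
print("synthetic lethal pairs:", synthetic_lethal)
```

A single-gene screen would report only g1 as essential here, so a "minimal genome" built by deleting every individually dispensable gene would not be viable.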
17
Q

How are the minimal genome and the last universal common ancestor of life (LUCA) related?

A
  • comparative genomics can also contribute to questions of the evolution and LUCA
  • LUCA was likely not a single species but a community of organisms that rapidly exchanged genetic material → the more primitive an organism is, the more unstable its genome
  • the minimal gene set, or better the ‘minimal set of essential functional niches’, can be used to define the gene set of LUCA
    (problem: rates of HGT and gene loss unknown)
  • it was examined if and how the functional niches in various phylogenetic tree variants are filled
    → LUCA had a simple genome (~600 genes), thus smaller than that of any present-day free-living bacterium
  • interesting detail: LUCA had no DNA polymerase and no DNA helicase → evidence that life did not start with DNA genomes but probably with RNA genomes → RNA world theory
  • no ‘Tree of Life’ but more likely a ‘Shrub of Life’ or ‘Ring of Life’
18
Q

What is Buchnera aphidicola?

A
  • a primary endosymbiont of the aphid Cinara cedri
  • B. aphidicola has a circular genome of only 416’380 bp (plus a plasmid of 6’045 bp)
  • thus genome is smaller than some plant mitochondrial genomes (< 600 kb)
  • bacteria live inside special host cells (Bacteriocytes)
  • dramatic genome reduction → ~ 75% of its genome was lost during evolution as endosymbiont
  • has only 362 genes (35% for translation & transcription)
  • lost many biosynthesis genes (nt, cofactors, vitamins) → depends on host
  • aphid needs 10 essential amino acids from symbiont
  • but: B. aphidicola has lost Trp biosynthesis genes!?
    • Trp is synthesized by a secondary symbiont Serratia symbiotica
  • B. aphidicola’s massive genome degradation might lead to its replacement by the secondary symbiont in the near future
19
Q

What is Carsonella ruddii?

A
  • endosymbiont of psyllids (sap-feeding insects related to aphids)
  • belongs to γ-proteobacteria
  • also lives in bacteriocytes (like organelles, C. ruddii cells are transmitted vertically & maternally across host generations)
  • has a circular genome of only 159’662 bp
  • has only 182 ORFs (35% translation; 18% amino acid metabolism)
  • thus its genome is only about one third the size of that of the archaeal parasite Nanoarchaeum equitans
  • C. ruddii has an unusually low GC content of 16.5 %
  • host needs amino acids from symbiont
  • C. ruddii has a very densely packed genome (97.3 % of the genome is coding; 90% of the genes overlap)
  • The genome also lacks many genes for bacterium-specific processes.

→ have they been transferred to the host genome?
→ is C. ruddii on its way to becoming a true organelle?

20
Q

What is Nasuia deltocephalinicola?

A
  • β-proteobacteria
  • endosymbiont of Macrosteles quadrilineatus (a leafhopper)
  • smallest bacterial genome yet sequenced: 112 kb and 137 protein-coding genes
  • UGA has been reassigned from a stop codon to a Trp codon (occurs very rarely in evolution)
  • massive form of genome erosion
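
The genome sizes and gene counts quoted in the cards above can be put side by side with a little arithmetic. This sketch uses only figures from this deck; note that the gene counts mix definitions (protein-coding genes, ORFs, total genes), so the per-gene values are rough comparisons only.

```python
# (genome size in bp, gene count) as quoted in the cards of this deck.
genomes = {
    "Pelagibacter ubique": (1_300_000, 1354),
    "Mycoplasma genitalium": (580_000, 470),
    "Nanoarchaeum equitans": (490_000, 522),
    "Buchnera aphidicola": (416_380, 362),
    "Carsonella ruddii": (159_662, 182),
    "Nasuia deltocephalinicola": (112_000, 137),
}

# Average number of base pairs per gene for each organism.
bp_per_gene = {name: size / genes for name, (size, genes) in genomes.items()}

# Organism with the smallest genome in this list.
smallest = min(genomes, key=lambda name: genomes[name][0])

# Carsonella genome relative to Nanoarchaeum ("about one third").
carsonella_ratio = genomes["Carsonella ruddii"][0] / genomes["Nanoarchaeum equitans"][0]

# If ~75% of the Buchnera genome was lost, the present genome is the remaining ~25%.
buchnera_ancestral_bp = genomes["Buchnera aphidicola"][0] / (1 - 0.75)

for name, value in bp_per_gene.items():
    print(f"{name:28s} {value:7.0f} bp per gene")
print(f"smallest genome: {smallest}")
print(f"Carsonella / Nanoarchaeum size ratio: {carsonella_ratio:.2f}")
print(f"estimated ancestral Buchnera genome: {buchnera_ancestral_bp / 1e6:.2f} Mb")
```

Across a more than tenfold range in genome size, the average spacing stays at roughly one gene per kilobase, consistent with genome reduction proceeding mainly by loss of whole genes.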
21
Q

What is Genome erosion in symbiotic bacteria & relatives?

A
  • tiniest genomes evolve exclusively in maternally inherited symbionts that are obligate mutualists and that have co-diversified with hosts over millions of years
  • discovery of these extreme genomes challenges the premise of the minimal genome concept, that there is a lower limit of the genome size of a cellular organism
  • tiny genomes show ongoing gene loss, no gene uptake, and an almost complete absence of gene rearrangements
22
Q

What genes are lost?

A
  • Genes are lost in all functional categories
  • core genes for central informational processes (replication, transcription, translation) are mostly retained (e.g. ribosomal proteins)
  • cell envelope component genes are especially depleted
  • DNA repair genes are one of the most depleted functional categories
  • Despite this trend, losses of seemingly essential genes have occurred, raising the question of how replication and growth are possible (e.g. oxphos is sometimes missing)

Remarkably: some host genes critical to bacteriocyte function were themselves horizontally transferred into the host genome from bacterial donors!!

23
Q

At which stage should we call an endosymbiont an organelle?

A
  • Import of nuclear-encoded proteins into organelles is well known → hallmark of true endosymbiosis
  • but no evolutionary intermediate stages of endosymbiosis were known (until this report):
    • RlpA is a lipoprotein whose gene was acquired by the aphid via HGT (from an unrelated bacterium)
    • expressed solely in the aphid’s maternal bacteriocytes (mb)
    • RlpA is imported into the cytoplasm of the Buchnera endosymbiont!!
  • Genetically integrated photosynthetic organelles evolved twice via the endosymbiotic uptake of a cyanobacterium
  • After stable establishment in cytoplasm of host: genome reduction (gene loss & gene transfer to the nucleus via EGT)
  • import of nuclear-encoded proteins of various phylogenetic origins (EGT & HGT) compensated for essential genes that were lost from the endosymbiont

Is protein import of nuclear-encoded proteins then the decisive event?