Features of bacterial genomes and analyses Flashcards

1
Q

Features of bacterial genomes

A
  • Haploid single chromosome
  • Diverse range of sizes – lifestyle dependent
  • Genomes highly compact – few pseudogenes / non-coding regions
  • Highly structured
  • Large pan genome – extensive mobilome
  • Variability in GC content
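
The last feature, variability in GC content, is simple to quantify. The following is a minimal sketch, assuming a genome is available as a plain string of bases; the function name `gc_content` and the toy sequences are hypothetical and not taken from any real organism.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous bases such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


# Hypothetical toy sequences, not real genomes
at_rich = "ATATTAGCATTA"
gc_rich = "GCGCCGATGCGC"
print(f"{gc_content(at_rich):.2f}")  # 0.17
print(f"{gc_content(gc_rich):.2f}")  # 0.83
```

Comparing this value between species, or in sliding windows along one genome, is one simple way to see the variability described above.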
2
Q

History of genome sequencing

A

First generation sequencing: shotgun Sanger sequencing
-> Produces complete, assembled genomes with annotations

Parallel next generation sequencing: Illumina (short reads), PacBio (long reads)
-> Illumina produces many short reads that are pieced together; PacBio produces long reads

Following the invention of second generation sequencing, the price of sequencing dropped dramatically, allowing rapid growth in genome sequencing projects. (The Human Genome Project itself was completed mainly with Sanger sequencing.)

3
Q

Short read sequencing

A

Analysis of short sequence data:

1) Mapping
- Comparison of short reads to a reference

2) Assembly
- De novo assembly and comparison
- Reference free assembly using k-mers
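
The k-mer idea in the second approach can be sketched in a few lines. This is an illustrative sketch only, assuming reads are plain strings; the helper names `kmers`, `count_kmers` and `jaccard`, and the toy reads, are hypothetical. Jaccard similarity is used here as one possible reference-free comparison, not as the method of any particular tool.

```python
from collections import Counter


def kmers(read: str, k: int):
    """Yield every overlapping substring of length k from a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_kmers(reads, k: int) -> Counter:
    """Count k-mers across a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def jaccard(reads_a, reads_b, k: int) -> float:
    """Share of distinct k-mers common to two samples (no reference needed)."""
    a = set(count_kmers(reads_a, k))
    b = set(count_kmers(reads_b, k))
    return len(a & b) / len(a | b)


print(list(kmers("ATGCA", 3)))            # ['ATG', 'TGC', 'GCA']
print(jaccard(["ATGCGT"], ["ATGCGA"], 4))  # 0.5
```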

4
Q

Genome annotation

A

Once the genome has been assembled using short and long read sequencing, it is annotated.

Genome annotation:
a) Location e.g. which strand
b) Feature type e.g. protein coding, tRNA coding, stop codon etc.
c) Attributes e.g. product produced, whether it is an enzyme, location of the product in the membrane
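
The three parts of an annotation (location, feature type, attributes) map naturally onto a small record. The sketch below is illustrative only: the `Feature` class, its field names and the example gene are hypothetical, and the output follows the nine tab-separated columns of the GFF3 format (seqid, source, type, start, end, score, strand, phase, attributes) in a simplified way.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    # a) Location
    seqid: str   # contig or chromosome name
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"
    # b) Feature type
    feature_type: str  # e.g. "CDS", "tRNA"
    # c) Attributes
    attributes: dict = field(default_factory=dict)

    def to_gff3(self, source: str = "example") -> str:
        """Format the feature as one simplified GFF3-style line."""
        phase = "0" if self.feature_type == "CDS" else "."
        attrs = ";".join(f"{k}={v}" for k, v in self.attributes.items())
        columns = [self.seqid, source, self.feature_type, str(self.start),
                   str(self.end), ".", self.strand, phase, attrs or "."]
        return "\t".join(columns)


# Hypothetical feature, not from a real annotation
gene = Feature("contig_1", 100, 400, "+", "CDS",
               {"ID": "gene0001", "product": "hypothetical protein"})
print(gene.to_gff3())
```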

Software to do this:
- Prokka
- pubMLST
- EggNOG -> look at the evolutionary history of a gene

5
Q

Bacterial genome features

A

1) Bacteria have an open pangenome made up of the core genome and the accessory genome (the accessory genome is acquired by horizontal gene transfer, HGT)

Bacterial genome content:
 Core genome e.g. DNA replication
 Accessory genome e.g. alternative metabolic routes – bring fitness advantages to a strain but are not present in all strains
 Mobile elements
 Parasitic elements e.g. toxins

(Mobile elements and parasitic elements are part of the accessory genome)

2) High GC variability across bacteria
- GC base pairs are more stable than AT base pairs (three hydrogen bonds versus two)

3) Large scale rearrangements in bacteria (inversions, translocations, genomic islands)
- The same gene can be found in different regions in different individuals
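
The core / accessory split in point 1 can be expressed with set operations. This is a minimal sketch, assuming each strain is already reduced to a set of gene names; the function `partition_pangenome`, the strain names and the gene lists are hypothetical.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    strain_genes: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pangenome, core genome, accessory genome) for a set of strains."""
    if not strain_genes:
        raise ValueError("at least one strain is required")
    gene_sets = list(strain_genes.values())
    pan = set().union(*gene_sets)         # genes found in any strain
    core = set.intersection(*gene_sets)   # genes found in every strain
    accessory = pan - core                # genes found in only some strains
    return pan, core, accessory


# Hypothetical strains and gene content
strains = {
    "strain_A": {"dnaA", "gyrB", "toxA"},
    "strain_B": {"dnaA", "gyrB", "blaZ"},
    "strain_C": {"dnaA", "gyrB"},
}
pan, core, accessory = partition_pangenome(strains)
print(sorted(core))       # ['dnaA', 'gyrB']
print(sorted(accessory))  # ['blaZ', 'toxA']
```

In an open pangenome, adding more strains keeps enlarging `pan` while `core` stays the same or shrinks.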

6
Q

Comparing the genome of different types of bacteria

A

The size and features of bacterial genomes depend on their biology.

Free living:
- large genome / pangenome
- stable structure (few pseudogenes / transposable elements (TEs), frequent HGT)
- e.g. soil bacteria

Facultative / recent pathogen:
- smaller genome / pangenome
- many pseudogenes / TEs / repeats, many selfish genetic elements, unstable structure
- e.g. Neisseria, Streptococcus

Obligate symbionts:
- very small genome, few genes
- no pseudogenes / transposons, stable structure
- rare HGT
- e.g. Buchnera / Chlamydia

7
Q

Short read assembly: mapping

A

Reads are aligned to a reference genome using mapping software, producing a ‘pile-up’ of reads at each position.

Variants are then called from the pile-up.

Advantages:
- Rapid
- Accurate
- Comparable and reproducible

Disadvantages:
- requires high quality reference genome
- mapping cannot identify genes not in the reference
- repeating regions are problematic
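
The mapping, pile-up and variant calling steps can be illustrated with a toy example. This is a deliberately naive sketch, not how real mapping software works: it tries every position by brute force, and the function names, the `max_mismatches` and `min_depth` parameters, and the sequences are all hypothetical.

```python
from collections import Counter


def best_position(read, reference, max_mismatches=1):
    """Return the start of the best ungapped alignment, or None if unmapped."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return None if best is None else best[0]


def pileup(reads, reference, max_mismatches=1):
    """Count the bases that the mapped reads place on each reference position."""
    columns = [Counter() for _ in reference]
    for read in reads:
        pos = best_position(read, reference, max_mismatches)
        if pos is None:
            continue  # unmapped, e.g. sequence that is absent from the reference
        for offset, base in enumerate(read):
            columns[pos + offset][base] += 1
    return columns


def call_variants(columns, reference, min_depth=2):
    """Report positions where the majority base differs from the reference."""
    variants = []
    for i, column in enumerate(columns):
        depth = sum(column.values())
        if depth < min_depth:
            continue
        base, count = column.most_common(1)[0]
        if base != reference[i] and count / depth > 0.5:
            variants.append((i, reference[i], base))
    return variants


# Hypothetical reference and reads from a sample with one A->G change at index 7
reference = "ACGTTGCATGCCAGT"
reads = ["ACGTTGCG", "TTGCGTGC", "GCGTGCCA", "GTGCCAGT"]
print(call_variants(pileup(reads, reference), reference))  # [(7, 'A', 'G')]
```

A read made of sequence that is not in the reference (for example `"TTTTTTTT"` here) is simply left unmapped, which illustrates the second disadvantage listed above.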

8
Q

Short read assembly: De novo assembly

A

‘K-mer’ approach: reference-free assembly and comparison, independent of biological information

1) Overlap-layout-consensus method
- All of the overlaps between reads are determined, then the reads and overlaps are laid out in a graph and a consensus sequence is identified

2) De Bruijn method
- Reads are broken into shorter fragments called k-mers, followed by construction of a de Bruijn graph in which overlapping k-mers are connected by edges
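
The de Bruijn method can be sketched as follows. This toy version follows the description above, with k-mers as nodes and an edge wherever the last k-1 bases of one k-mer equal the first k-1 bases of another (many assemblers instead use (k-1)-mers as nodes and k-mers as edges). The function names and reads are hypothetical, and the walk only handles a simple unbranched path with no sequencing errors.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Map each distinct k-mer to the k-mers that overlap it by k-1 bases."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    by_prefix = defaultdict(set)
    for kmer in kmers:
        by_prefix[kmer[:-1]].add(kmer)
    return {kmer: sorted(by_prefix.get(kmer[1:], ())) for kmer in kmers}


def walk_unbranched(graph, start):
    """Spell a contig by following edges while there is exactly one successor."""
    contig = start
    current = start
    visited = {start}
    while len(graph[current]) == 1:
        nxt = graph[current][0]
        if nxt in visited:  # stop rather than loop forever on a cycle
            break
        contig += nxt[-1]
        visited.add(nxt)
        current = nxt
    return contig


# Hypothetical overlapping reads from the toy sequence ATGGCGTACC
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
graph = build_graph(reads, k=4)
print(walk_unbranched(graph, "ATGG"))  # ATGGCGTACC
```

When a k-mer has more than one successor, as happens inside repeats, the walk stops; that is the point at which real assemblers break contigs.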

Advantages:
- Reference free
- New genes can be identified
- Used to identify large genomic sequence variants

Disadvantages:
- struggles to resolve repetitive regions
- expensive and time consuming

9
Q

Limitations to short read sequencing

A

Struggles to map:

- Low complexity / repeat regions where the read is shorter than the region it needs to span

- Intermittent identical repeats

Solution:
- Use a combination of short and long read sequencing (e.g. PacBio for the long reads)
- Hybrid assembly combines the base calling accuracy of short-read sequencing with the scaffolding power of long reads to solve genomic features that are unresolvable by short reads alone
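
The repeat problem, and why longer reads help, can be shown with exact string matching. This is an illustrative sketch only; the function `exact_positions` and the toy reference, which contains the same 8-base repeat twice, are hypothetical.

```python
def exact_positions(read, reference):
    """Return every start position at which the read matches exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


# Hypothetical reference: unique flanks around two copies of one repeat
repeat = "GGTCAAGT"
reference = "ATCGA" + repeat + "TTCAG" + repeat + "CCATA"

short_read = repeat[1:7]      # lies entirely inside the repeat
long_read = reference[2:16]   # spans the first repeat plus unique flanks

print(exact_positions(short_read, reference))  # [6, 19] -> ambiguous
print(exact_positions(long_read, reference))   # [2]     -> unique
```

The short read fits equally well in both copies of the repeat, while the long read is anchored by the unique sequence on either side.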

10
Q

What does sequencing reveal about bacteria?

A
  • A single genome is clearly inadequate to describe a species, because bacteria have an extensive pangenome and individuals of the same species have different accessory genomes.
  • For many bacterial species, multiple strains must be sequenced.
  • The degree of HGT varies -> some pathogens are monomorphic (very clonal, with little genetic variation between strains), while most are not.
11
Q

Overview

A

The size and features of bacterial genomes depend on their lifestyle (free living, facultative, obligate), but certain key features remain (compact, haploid, highly structured, large rearrangements).

They have a large pangenome, which makes describing a species hard (multiple strains must be sequenced for many bacterial species).

Types of sequencing:

Short read:
- Lots of short reads are pieced together to produce the final genome
-> Mapping
-> Assembly
--> Overlap layout consensus
--> De Bruijn graph method

Long read:
- Reads can span the entire length of low complexity regions
- e.g. PacBio

Both methods have advantages and disadvantages

Hybrid method:
- Hybrid assembly combines the base calling accuracy of short-read sequencing with the scaffolding power of long reads to solve genomic features that are unresolvable by short reads alone

Annotation:
- Once a genome is assembled it must be annotated to be understood
- Location, feature type, attributes
