Lecture 11 - Bacterial genomics I - Features of bacterial genomes Flashcards
2 approaches to organising short read data
mapping, involving aligning reads to a reference genome
de novo assembly, which builds the genome from the reads alone, with no reference genome
pros and cons of mapping
rapid and easy to visualise, reproducible
requires a reference genome which may not exist, can’t identify genomic events such as translocations and rearrangements
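As a rough illustration of what mapping means, the sketch below places each read on a reference by exact string matching. It is a toy under stated assumptions: the function name `map_reads` and the sequences are made up for this example, and real read mappers use indexed references and tolerate mismatches and gaps, which this does not.

```python
def map_reads(reference, reads):
    """Return {read: [0-based start positions]} using exact matching only."""
    positions = {}
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            # Keep searching from the next base to catch repeated matches.
            start = reference.find(read, start + 1)
        positions[read] = hits
    return positions


# Hypothetical reference and reads, for illustration only.
reference = "ATGGCGTACCGGTA"
reads = ["GGCG", "GGTA", "TTTT"]
mapped = map_reads(reference, reads)
```

A read with an empty hit list (here `"TTTT"`) has no place on the reference, which mirrors the limitation noted above: sequence absent from the reference cannot be placed by mapping.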
pros and cons of de novo assembly
good for novel sequences and identifying large genomic changes
struggles to resolve repetitive regions, time consuming, often misses parts of the genome
phred score
quality score for each base call, Q = -10 log10(P), where P is the probability that the call is wrong; used to filter out unreliable bases when analysing SNPs
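The relationship between a Phred score and the error probability can be sketched in a few lines. The function names are chosen for this example, and the `offset=33` default assumes the common Phred+33 FASTQ encoding, which the lecture notes themselves do not specify.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score corresponding to error probability p (p > 0)."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes Phred+33 encoding unless another offset is given.
    """
    return [ord(ch) - offset for ch in qual]


scores = decode_fastq_quality("I5!")
```

So Q20 corresponds to a 1 in 100 chance of a wrong call and Q30 to 1 in 1000, which is why a minimum-quality threshold is often applied before trusting a candidate SNP.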
how does de novo assembly work (2 methods)
'overlap-layout-consensus' - all overlaps between reads are determined, the reads are laid out according to those overlaps, and a consensus sequence is determined
alternatively, the de Bruijn graph method - reads are broken into shorter k-mer fragments, and the best path through the links between overlapping k-mers is determined to assemble a genome
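The de Bruijn idea can be sketched as follows: split reads into k-mers, link each k-mer's prefix to its suffix, then walk the graph. This is a minimal sketch, not a real assembler; the helper names (`kmers`, `build_de_bruijn`, `walk`) and the reads are invented for illustration, and the walk simply stops at any branch or cycle, whereas real assemblers also handle sequencing errors, coverage and repeats.

```python
from collections import defaultdict


def kmers(read, k):
    """All overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_de_bruijn(reads, k):
    """Nodes are (k-1)-mers; each k-mer adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk(graph, start):
    """Extend a contig from start while the next node is unambiguous."""
    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:  # stop rather than loop around a cycle
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads, for illustration only.
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
contig = walk(build_de_bruijn(reads, 4), "ATG")
```

With these three reads and k = 4 the walk recovers `ATGGCGTACC`. A repeated (k-1)-mer would create a branch and end the walk early, which is one way to see why repetitive regions are hard to resolve.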
example of long-read sequencing technologies
MinION (Oxford Nanopore)
important things to note in gene annotation
location, feature type, attributes (e.g. location of protein, enzyme code)
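Annotations are often exchanged as tab-separated GFF3 records, which carry exactly these three things: coordinates, a feature type, and a free-form attributes column. The sketch below assumes that format (the lecture notes do not name one); the function name, the example record, and attribute keys such as `ec_number` are made up for illustration.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into location, type and attributes."""
    cols = line.rstrip("\n").split("\t")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    # The attributes column is a ';'-separated list of key=value pairs.
    attributes = dict(pair.split("=", 1) for pair in attrs.split(";") if pair)
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),  # 1-based, inclusive in GFF3
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Hypothetical annotation record, not taken from a real genome.
example = "contig_1\texample\tCDS\t337\t2799\t.\t+\t0\tID=gene_0001;product=kinase;ec_number=2.7.2.4"
record = parse_gff3_line(example)
```

Here `record["type"]` gives the feature type, `start`/`end`/`strand` give the location, and `record["attributes"]` holds the descriptive fields such as the product name and enzyme code.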
approximate size of bacterial genomes
2-5 Mb
what is in the ‘core genome’ of bacteria
DNA replication, cell envelope, some metabolic genes
example of accessory genes
alternative metabolic pathways, transport systems, antibiotic resistance genes
what is a pangenome
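the full set of genes found across all members of a group: the core genome (genes shared by every member) plus the accessory genome (genes present in only some members)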
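Treating each genome as a set of gene names, the core genome is the intersection of those sets, the pangenome is their union, and the accessory genome is the difference between the two. The sketch below assumes gene presence/absence is already known; the function name, strain labels and gene lists are illustrative only.

```python
def summarise_gene_content(genomes):
    """Return (core, accessory, pangenome) for a dict of strain -> genes.

    Assumes at least one genome is supplied.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    pangenome = set().union(*gene_sets)      # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    accessory = pangenome - core             # genes present in only some
    return core, accessory, pangenome


# Hypothetical gene content, for illustration only.
strains = {
    "strain_A": {"dnaA", "gyrB", "ftsZ", "blaTEM"},
    "strain_B": {"dnaA", "gyrB", "ftsZ", "lacY"},
    "strain_C": {"dnaA", "gyrB", "ftsZ"},
}
core, accessory, pangenome = summarise_gene_content(strains)
```

In this toy data the replication and cell-division genes fall in the core, while the resistance and transport genes end up in the accessory set, in line with the examples on the cards above.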