Week 4 - Bacterial Genomics Flashcards

Question

454 sequencing system

Answer 1

• recent technological advance
• generates data 100x faster than the Sanger method
• relies on 2 major advances: massively parallel liquid handling and pyrosequencing
  - light is released each time a base is added to the DNA strand
  - the instrument actually measures the release of light
  - can only handle short stretches of DNA

Answer 2

shotgun sequencing • entire genome is cloned and resultant clones are sequenced • much of the sequencing is redundant • generally 7- to 10-fold coverage * computer algorithms used to look for replicate sequences and assemble them * occasionally assembly isn't possible * closure can be pursued using PCR to target areas of the genome
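The 7- to 10-fold coverage figure can be tied to a read count with the usual coverage arithmetic (coverage = number of reads x read length / genome size). A minimal sketch: the 4.6 Mbp genome and the 700 bp read length are illustrative assumptions, not values from this card.

```python
import math

def reads_needed(genome_bp: int, read_len_bp: int, coverage: float) -> int:
    """Reads required so that total sequenced bases = coverage x genome size."""
    return math.ceil(coverage * genome_bp / read_len_bp)

def mean_coverage(n_reads: int, read_len_bp: int, genome_bp: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * read_len_bp / genome_bp

genome = 4_600_000  # assumed: an E. coli-sized genome
read_len = 700      # assumed: a typical Sanger-style read length

for cov in (7, 10):
    print(f"{cov}-fold coverage needs about {reads_needed(genome, read_len, cov)} reads")
```

Because reads land at random, even 10-fold average coverage can leave gaps, which is why closure by targeted PCR is sometimes needed.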

Answer 3

* closed genome relies on manpower * more expensive * more information

Answer 4

converting raw sequence data into a list of genes present in the genome

Answer 5

an open reading frame that encodes a protein • computer algorithms are used to search for ORFs - they look for start/stop codons and Shine-Dalgarno sequences • ORFs can be compared to ORFs in other genomes

Answer 6

as many as 10% of annotated genes are incorrectly annotated

Answer 7

* chain termination method | * best for small DNA segments

Answer 8

* sequence human genome | * fragments larger DNA strand to make manageable chunks

Answer 9

* sequence by synthesis | * accurate and fast

Answer 10

* science that applies powerful computational tools to DNA and protein sequences * for the purpose of analyzing, storing, and accessing the sequences for comparative purposes

Answer 11

• on average a prokaryotic gene is 1,000 bp long - ~ 1,000 genes per megabase (1Mbp = 1,000,000 bp) - as genome size increases, gene content proportionally increases
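The rule of thumb of about 1,000 genes per megabase can be checked against the genome sizes and gene counts quoted elsewhere in this deck. A small sketch (the function and variable names are illustrative):

```python
GENES_PER_MBP = 1_000  # rule of thumb from the card: ~1 gene per 1,000 bp

def estimated_genes(genome_mbp: float) -> int:
    """Estimate gene count from genome size using the rule of thumb."""
    return round(genome_mbp * GENES_PER_MBP)

# (genome size in Mbp, annotated gene count) as quoted in this deck
GENOMES = {
    "Mycoplasma genitalium": (0.58, 482),
    "Prochlorococcus marinus": (1.67, 1696),
    "Anabaena": (6.37, 6132),
    "Streptomyces coelicolor": (8.7, 7825),
}

for name, (size_mbp, annotated) in GENOMES.items():
    est = estimated_genes(size_mbp)
    print(f"{name}: estimated {est}, annotated {annotated}, ratio {annotated / est:.2f}")
```

The ratios stay fairly close to 1 across a 15-fold range of genome sizes, which is the proportionality the card describes.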

Answer 12

1995 | • now routine and many hundreds of bacterial genomes have been sequenced

Answer 13

* "environmental genome sequencing" - sequence DNA from an environmental sample, without isolating and culturing strains first * "RNA sequencing" - "deep sequencing" of RNA to reveal the frequency of different RNA molecules

Answer 14

parasitic or endosymbiotic prokaryotes • obligate parasites range from 490 kbp (Nanoarchaeum equitans) to 4,400 kbp (Mycobacterium tuberculosis) • endosymbionts can be smaller (eg 160 kbp genome of Carsonella ruddii) • estimates suggest that the minimum number of genes for a viable cell is 250-300 genes

Answer 15

from 490 kbp (Nanoarchaeum equitans) | to 4,400 kbp (Mycobacterium tuberculosis)

Answer 16

can be smaller | eg 160 kbp genome of Carsonella ruddii

Answer 17

250-300 genes

Answer 18

Sorangium cellulosum (bacteria) • largest prokaryotic genome to date is 12.3 Mbp • largest archaeal genomes tend to be smaller (~5 Mbp)

Answer 19

an organism's lifestyle

Answer 20

sequence similarity to genes found in other organisms (comparative analysis)

Answer 21

predictions of metabolic pathways and transport systems | • eg Thermotoga maritima

Answer 22

* 4.6 Mbp | * 4405 genes

Answer 23

* 8.7 Mbp | * 7825 genes

Answer 24

* 0.58 Mbp | * 482 genes

Answer 25

* 1.66 Mbp | * 1738 genes

Answer 26

* 1.67 Mbp | * 1696 genes

Answer 27

* 6.36 Mbp | * 6132 genes

Answer 28

stable plasmids - much smaller circular DNA molecules, usually with a few genes

Answer 29

* Mycoplasma genitalium - 0.58 Mbp
* Streptomyces coelicolor - 8.7 Mbp
* Escherichia coli is fairly average - 4.60 Mbp, with a circular chromosome about 1.44 mm in circumference (a diameter of 0.45 mm if laid out as a circle; the E. coli cell is 4 micrometers long)
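The diameter quoted for the E. coli chromosome follows from its circumference (d = C / pi). A quick check using only the numbers on this card:

```python
import math

def circle_diameter(circumference: float) -> float:
    """Diameter of a circle with the given circumference (same units)."""
    return circumference / math.pi

circumference_mm = 1.44  # chromosome circumference, from the card
cell_length_mm = 4e-3    # 4 micrometers, from the card

d = circle_diameter(circumference_mm)
print(f"diameter: about {d:.2f} mm")
print(f"chromosome length is about {circumference_mm / cell_length_mm:.0f}x the cell length")
```

The result is roughly 0.46 mm, in line with the 0.45 mm on the card, and it shows why the DNA has to be tightly folded to fit inside the cell.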

Answer 30

of its chromosome per cell - or 2 copies when the cell is about to divide

Answer 31

multiple copies of the chromosome • eg cyanobacteria typically have about 10 copies of the chromosome in every cell • eg a Synechocystis cell is about 3 micrometers in diameter and each cell contains DNA with a total length of about 11 mm

Answer 32

tightly folded and packed into an irregular structure in the cytoplasm - the nucleoid

Answer 33

* by weight about 60% DNA, 30% RNA, 10% protein * RNA and proteins probably help to fold DNA into a compact structure * with very rare exceptions, no surrounding membrane - in bacteria DNA is freely exposed to the cytoplasm * BUT the nucleoid is usually attached to the plasma membrane at one point

Answer 34

• starts from a single, defined origin • is bidirectional (origin of replication, replication forks (2, theta), 2 new double-stranded circular DNA molecules)

Answer 35

multiple loci along the chromosome

Answer 36

only start at one point

Answer 37

30 minutes for replication to be complete (depending on the genome size) • BUT the mean doubling time for some bacteria is less than this, under optimal conditions - how?

Answer 38

are very unlikely to occur by chance | • such a sequence is known as an ORF and is potentially a sequence coding for a protein (a gene)
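Why long stop-free stretches are unlikely by chance can be shown with one line of arithmetic. Under the simplifying assumption of random sequence with equal base frequencies, 3 of the 64 codons are stops, so the probability that n consecutive codons contain no stop is (61/64)^n. A sketch under that assumption:

```python
def p_no_stop(n_codons: int) -> float:
    """Probability that n random codons contain no stop codon,
    assuming all 64 codons are equally likely (a simplification)."""
    return (61 / 64) ** n_codons

# 333 codons is roughly a 1,000 bp gene
for n in (50, 100, 333):
    print(f"{n} codons: {p_no_stop(n):.2e}")
```

Real genomes have biased base composition, so the exact numbers differ, but the exponential drop with length is the reason a long ORF is treated as a candidate gene.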

Answer 39

* ATG | * GTG

Answer 40

* TAA * TAG * TGA
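The start codons (ATG, GTG) and stop codons (TAA, TAG, TGA) listed in this deck are enough for a toy ORF scan. A minimal sketch, not a real gene finder: it scans one strand only, the `min_codons` threshold is a hypothetical parameter, and it does not score Shine-Dalgarno sites or other control sequences.

```python
START_CODONS = {"ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end, frame) tuples for ORFs on the given strand.

    start/end are 0-based and end-exclusive; the span includes the stop codon.
    For each ORF, only the first start codon after the previous stop is used.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close it and look for the next start
    return orfs

print(find_orfs("CCATGAAATTTTAGGG"))
```

Because genes can lie on either strand, a fuller scan would run the same function on the reverse complement as well.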

Answer 41

* control sequences upstream of the ORF promote binding of RNA polymerase * hence transcription to RNA followed by translation of the RNA to make protein * but those control sequences are very hard for us to recognize

Answer 42

``` 5' • regulatory sequences • RNA polymerase binding • leader sequence (RNA - ribosome binding) • Coding region ORF (RNA - coding region ORF) • trailer (RNA - trailer) terminator 3' ```

Answer 43

3186 ORFs predicted in total | • genes can be on either strand of the DNA

Answer 44

1. numbers of genes, relationship to complexity of the organism
2. a possible minimum set of genes - identify the common minimal set of genes needed for viability?
3. dense packing of genes in bacterial chromosomes
4. organization of genes in operons
5. evolutionary diversity
6. evolutionary relationships
7. large number of unknown genes (40-60%)

Answer 45

rough correspondence between genome size and complexity of lifestyle
• Mycoplasma genitalium (0.58 Mbp, 482 genes) - parasite with very small cells and simple metabolism
• Streptomyces coelicolor (8.7 Mbp, 7825 genes) - soil bacterium with very versatile metabolism, complex structure (branched network of filaments), sporulation
• Prochlorococcus marinus and Anabaena cylindrica are both cyanobacteria
  - Prochlorococcus (1.67 Mbp, 1696 genes) - has small, simple cells
  - Anabaena (6.37 Mbp, 6132 genes) - filamentous, multiple cell types

Answer 46

parasite with very small cells and simple metabolism

Answer 47

soil bacterium with very versatile metabolism, complex structure (branched network of filaments), sporulation

Answer 48

cyanobacteria - Prochlorococcus (1.67 Mbp, 1696 genes) - has small, simple cells - Anabaena (6.37 Mbp, 6132 genes) - filamentous, multiple cell types

Answer 49

• Craig Venter's plan to further strip down the genome of Mycoplasma genitalium to create a minimum living organism of about 300 genes

Answer 50

• bacteria: typically about 1 gene per 1,100 bases • H. sapiens: about 1 gene per 30,000 bases • bacteria have dense clustering of genes - very different from eukaryotes
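Putting the two densities on a common scale (genes per Mbp) makes the contrast concrete. A short calculation using only the figures on this card:

```python
def genes_per_mbp(bases_per_gene: float) -> float:
    """Convert 'one gene per N bases' into genes per million bases."""
    return 1_000_000 / bases_per_gene

bacteria = genes_per_mbp(1_100)
human = genes_per_mbp(30_000)

print(f"bacteria:   about {bacteria:.0f} genes per Mbp")
print(f"H. sapiens: about {human:.0f} genes per Mbp")
print(f"bacterial genomes are about {bacteria / human:.0f}x more gene-dense")
```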

Answer 51

1,100 bases

Answer 52

30,000 bases

Answer 53

* clusters of genes on the same DNA strand with related functions likely to be operons * genes are co-transcribed (ie 1 mRNA molecule for the whole operon)

Answer 54

probably 2 reasons a. bacterial metabolic diversity - different bacterial species may have fundamentally different metabolism, hence the need for quite different sets of genes b. deep evolutionary roots - bacteria have been on the planet much longer than other life forms - hence greater time for evolutionary divergence

Answer 55

comparing related species of pathogenic bacteria - can we track pathogen evolution, and can we identify specific genes that are important for specific pathogenicities?
• Mycobacterium bovis - bovine tuberculosis - 3952 genes
• Mycobacterium tuberculosis - human tuberculosis - 4238 genes
• Mycobacterium leprae - leprosy - 2768 genes
• classical microbiology shows different host range, virulence, and physiology - but what is the genetic basis of the differences? And how are the 2 species related? Did M. bovis jump the species barrier from cattle to humans when cattle were domesticated 10,000 - 15,000 years ago?

Answer 56

* very closely related >99.95% sequence identity * nearly all ORFs are conserved, and are in the same order on the chromosome - no rearrangements * therefore recent divergence * but M. bovis has a slightly smaller genome, and a series of deletions resulting in about 300 fewer genes. It looks as though M. tuberculosis is closer to the common ancestor - did cows catch TB from us?
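Percent sequence identity of the kind quoted above is computed over aligned positions. A minimal sketch, assuming the two sequences are already aligned to equal length with '-' marking gaps; columns with a gap in either sequence are skipped here, which is one convention among several. The example strings are toy data, not real Mycobacterium sequence.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of ungapped aligned columns where the two sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Toy alignment: 9 of 10 positions match
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

At 99.95% identity, two genomes differ at about 1 position in every 2,000, which is why the card concludes that the divergence is recent.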

Answer 57

so one of the main lessons from genome sequencing is how much we don't know about bacterial biology

Answer 58

* genetic information is stored in the order or sequence of nucleotides in DNA
* chain termination sequencing is the standard method for the determination of nucleotide sequence
* dideoxy chain termination sequencing has been facilitated by the development of cycle sequencing and the use of fluorescent dye detection
* alternative methods are used for special applications, such as pyrosequencing (for resequencing and polymorphism detection) or bisulfite sequencing (to analyze methylated DNA)

(83 cards)