microbial genomics Flashcards
what is the Human Genome Project?
started in 1990 - goal was to sequence all 3 × 10^9 base pairs of human DNA
- completed in 2003, sooner than expected (originally planned as a 15-year project).
who carried out the HGP?
public: US government, led by Francis Collins
private: Craig Venter (The Institute for Genomic Research, then Celera Genomics)
what was the first completed genome sequence of a free-living organism?
Haemophilus influenzae
- causes respiratory disease. took 8 years, through private funding
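Venter's approach?
- break the DNA into pieces and sequence them all at the same time (shotgun sequencing). a computer finds where the pieces overlap and joins them at the overlaps.
= contigs: contiguous stretches assembled from overlapping pieces
= less accurate because of gaps between regions + repetitive sequences,
but cheap + fast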
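The overlap-and-join step can be sketched as a greedy merge. This is only a minimal illustration, assuming error-free reads from one strand; the function names and the `min_len` cutoff are made up for the example, and real assemblers are far more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two pieces with the largest overlap until none overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs  # whatever is left are separate contigs (gaps between them)
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]


# Three reads overlap into one contig; the fourth shares nothing and stays separate.
reads = ["ATGGCGT", "GCGTACC", "TACCTGA", "GGGTTTAAA"]
contigs = greedy_assemble(reads)
```

The unjoined piece shows why gaps remain between contigs. Repetitive sequences cause trouble for the same reason the merge works: identical overlaps can join reads that do not actually sit next to each other.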
Collins's approach?
simple, step-by-step approach
- sequence a piece of DNA, find the end of the gene, then create a probe that extends from the end of that piece to the start of the next piece
- slow, systematic. a new probe had to be created after each piece was found.
sequencing DNA
- sequence thousands of clones at once, giving 7-10 times coverage of the entire genome
- assemble contigs using computer algorithms
- fill in gaps using targeted methods, e.g. chromosome walking
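A rough worked example of what "7-10 times coverage" means: coverage is (number of reads × read length) / genome size. The genome size and read length below are illustrative assumptions, not figures from these notes. The uncovered-fraction estimate uses the standard idealised model in which reads land at random (Poisson).

```python
import math


def coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


def expected_fraction_uncovered(cov: float) -> float:
    """Idealised estimate: with randomly placed reads, about e^(-coverage) of bases are missed."""
    return math.exp(-cov)


# Assumed values for illustration: a 1.8 Mb bacterial genome and 500 bp reads.
genome_size = 1_800_000
read_length = 500
n = reads_needed(8, read_length, genome_size)
cov = coverage(n, read_length, genome_size)
missed = expected_fraction_uncovered(cov)
```

Even at 8 times coverage a small fraction of bases is expected to be missed, which is why the remaining gaps are filled with targeted methods.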
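what’s Sanger sequencing?
- DNA polymerase in solution copies the template, starting from a primer.
- dideoxynucleotides (ddNTPs, no 3'-OH) are added to the solution. DNA polymerase can’t extend a strand that ends in a nucleotide with no 3'-OH.
- separate the fragments by size. the dideoxynucleotide at the end of each fragment tells you the nucleotide at that position.
- shorter strands correspond to the beginning of the sequence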
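The read-out logic can be sketched in a few lines. This is a toy model only: it assumes exactly one terminated fragment per position and perfect size separation, and the function names are invented for the example.

```python
import random


def chain_terminated_fragments(new_strand: str) -> list[tuple[int, str]]:
    """One fragment per position, recorded as (fragment length, terminating ddNTP)."""
    return [(pos + 1, base) for pos, base in enumerate(new_strand)]


def read_by_size(fragments: list[tuple[int, str]]) -> str:
    """Sort fragments shortest first and read off the terminating base of each."""
    return "".join(base for _, base in sorted(fragments))


new_strand = "ATGCCGTA"                   # the strand being synthesised from the primer
fragments = chain_terminated_fragments(new_strand)
random.Random(0).shuffle(fragments)       # in solution the fragments are all mixed together
recovered = read_by_size(fragments)       # size separation restores the order
```

The shortest fragment ends at the first base after the primer, which is why short strands report the beginning of the sequence.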
next gen sequencing
- large increase in the amount of genome/metagenome sequence (many sequences read at a time)
- non-specialist labs are able to use it (a signal is detected when nucleotides are incorporated; quick, cheap)
- useful for resequencing (comparative, e.g. looking for disease-associated differences)
- relies on shotgun sequencing + computer power.
structure of ORF
ORF = open reading frame
- start codon followed by approx. 300 bp (about 100 codons) before a stop codon.
- ORF =/= gene. it may be one, but doesn’t need to be. if it is transcribed + translated, it is a gene.
using ORF
- computer finds possible start codons
- computer finds possible stop codons
- computer counts codons between start + stop.
- computer finds possible RBS (ribosome binding sites)
- computer calculates codon bias in the ORF
- computer decides if the ORF is likely to be genuine
- computer gives a list of probable ORFs
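The first few steps of this list (start codons, stop codons, counting codons in between) can be sketched as below. It is a simplified illustration under stated assumptions: ATG is the only start codon considered, only the forward strand is scanned, and the RBS and codon-bias checks are left out. `find_orfs` and `min_codons` are names made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 100) -> list[tuple[int, int, int]]:
    """Return (start, end, n_codons) for candidate ORFs on the forward strand.

    start is the 0-based index of the ATG, end is the index just past the stop
    codon, and n_codons counts codons from the ATG up to (not including) the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):                      # three reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # first possible start codon in this stretch
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:      # keep only ORFs that are long enough
                    orfs.append((start, i + 3, n_codons))
                start = None                    # look for the next start codon
    return sorted(orfs)


# Tiny demo sequence: ATG + 4 codons + TAA, so the length cutoff is lowered to 3.
demo = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
short_orfs = find_orfs(demo, min_codons=3)
default_orfs = find_orfs(demo)                  # default cutoff of 100 codons rejects it
```

The default cutoff of 100 codons matches the rough 300 bp figure; a full scan would also run the same search on the reverse complement.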
ORF content to genome size
greater genome size = more ORF content
lifestyles of bacteria
endosymbiotic: live inside another cell. use host DNA, but also have their own
parasitic: may grow inside a cell. can’t grow without host DNA
free-living: independent. don’t need other organisms. usually have larger genomes than the others, at a fitness cost
gene annotation
- compare each ORF with known sequences. an ORF with a similar sequence is annotated as having a similar function.
- gene annotations help reconstruct metabolic pathways + determine gene complement
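Annotation by similarity can be sketched as "copy the label of the best match if it is similar enough". Everything here is a toy assumption: the sequences and labels are invented, the sequences are treated as already aligned and equal in length, and the 70% cutoff is arbitrary. Real annotation relies on alignment search tools rather than position-by-position comparison.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def annotate(query: str, known: dict[str, str], threshold: float = 70.0) -> tuple[str, float]:
    """Transfer the label of the most similar known sequence, if it passes the cutoff."""
    best_label, best_identity = "unknown function", 0.0
    for label, seq in known.items():
        identity = percent_identity(query, seq)
        if identity > best_identity:
            best_label, best_identity = label, identity
    if best_identity < threshold:
        return "unknown function", best_identity
    return best_label, best_identity


# Hypothetical reference set; labels and sequences are made up.
known = {
    "hypothetical kinase": "MKTAYIAKQR",
    "hypothetical transporter": "MSLLPVVGAT",
}
hit = annotate("MKTAYLAKQR", known)    # one mismatch against the first entry
miss = annotate("GGGGGGGGGG", known)   # nothing similar enough
```

A transferred label is only a prediction: it says the sequences look alike, not that the function has been confirmed.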
problem with gene annotation?
- 2 similar sequences may have different functions
- 2 proteins without similar sequences may still have the same function
pathogens + growth factors
- growth factors are obtained from the host cell.
- no genes for amino acid biosynthesis
- no genes
what is a URF?
unidentified reading frame: an ORF with unknown function
reconstructed genome map
ORFs can run in opposite directions (genes are encoded on both strands).