Lecture 10 Flashcards
genome construction
“shotgun” approach
3 steps: fragmentation, sequencing, and assembly
fragmentation
cut the genome into small pieces
physical shearing and enzymatic methods
physical shearing
nebulization (DNA is forced through a size-adjustable hole, which shreds it & most commonly used method)
sonication (sound waves that fracture the backbone of DNA, won’t lose any DNA)
Enzymatic methods
a chemical method
nuclease mixes (Fragmentase)
cuts DNA
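A minimal sketch of the fragmentation step, assuming random break points stand in for shearing or nuclease cuts. The helper name `shear`, the toy genome, and the fragment sizes are all made up for illustration; real fragments are hundreds to thousands of bp.

```python
import random

def shear(genome, n_fragments, min_len, max_len, seed=0):
    """Return randomly positioned fragments of the genome (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the toy run is repeatable
    fragments = []
    for _ in range(n_fragments):
        length = rng.randint(min_len, max_len)            # fragment size
        start = rng.randint(0, len(genome) - length)      # random start position
        fragments.append(genome[start:start + length])
    return fragments

genome = "ATGCGTACCTGAAGCTTAGGCTAACGT"  # toy genome, 27 bp
fragments = shear(genome, n_fragments=10, min_len=5, max_len=8)
print(fragments)
```

Because the start positions are random, fragments overlap each other, which is what the assembly step later relies on.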
DNA sequencing
determining the nucleotide order (sequence) of DNA
first, second, and third generations
first generation sequencing
developed by Dr. Fred Sanger in 1977
PCR with dideoxynucleotides (the 3′ OH is replaced with H, which terminates DNA replication)
manual: four separate reactions one for each dideoxynucleotide & gel electrophoresis
automated: single reaction, uses fluorescent markers, and laser detection of sequence
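The manual version can be sketched as follows: each of the four reactions yields fragments that end wherever its dideoxynucleotide was incorporated, and reading the gel from shortest to longest fragment gives the sequence. This is a simplified illustration only; the function names and the toy strand are assumptions, not part of the lecture.

```python
def chain_termination_fragments(new_strand):
    """For each ddNTP reaction, list the lengths of the terminated fragments."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # A fragment of this length exists in the lane of the terminating base.
        lanes[base].append(position)
    return lanes

def read_gel(lanes):
    """Read the sequence from the shortest fragment to the longest."""
    by_length = {length: base
                 for base, lengths in lanes.items()
                 for length in lengths}
    return "".join(by_length[length] for length in sorted(by_length))

lanes = chain_termination_fragments("GATTACA")
sequence = read_gel(lanes)
print(lanes)
print(sequence)
```

The automated version replaces the four lanes with four fluorescent colors in a single reaction, but the shortest-to-longest readout is the same idea.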
second generation sequencing
Adv: massively parallel (different sequences run together) and high throughput (100 times faster & 33 times cheaper)
(Still) requires amplification of DNA samples
Methods: emulsion PCR and bridge PCR
bridge PCR
still used today
a flat piece of glass and bits of DNA attached to glass (called flowcell)
Illumina platform sequencing (a fluorescent signal is given off where a nucleotide is added and is read by computer; technology is advancing rapidly)
third generation sequencing
adv: massively parallel (many sequences at the same time), extremely high throughput (“real-time” results), and does NOT require amplification of DNA samples (can sequence DNA immediately)
methods: SMRT technology (DNA polymerase) and nanopore technology (transmembrane proteins)
Assembly
reconstructing genomes
combine short, overlapping DNA sequences
computer aided sorting
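One way to picture computer-aided assembly is a greedy merge: find the two reads with the longest suffix-to-prefix overlap, join them, and repeat. This is a toy sketch under the assumption of error-free reads from one strand; real assemblers use more elaborate graph-based methods, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: the reads stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

contigs = greedy_assemble(["CGTACC", "ATGCGT", "ACCTGA"])
print(contigs)
```

If the loop ends with more than one contig, the reads did not overlap enough to join everything, which corresponds to gaps in the assembly.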
Types of assembly
reference alignment
- comparison to known genome
- must be a closely related organism
De Novo assembly
- novel genome construction
- no close relative required
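Reference alignment can be sketched as placing each read where it matches a known genome. The example below assumes exact matches only, which is a simplification: real aligners tolerate mismatches, which is why the reference must be closely related. `map_reads` and the toy sequences are illustrative names and data.

```python
def map_reads(reference, reads):
    """Return {read: 0-based start position in the reference, or None}."""
    positions = {}
    for read in reads:
        index = reference.find(read)  # str.find returns -1 when absent
        positions[read] = index if index != -1 else None
    return positions

reference = "ATGCGTACCTGA"  # toy "known genome"
placements = map_reads(reference, ["CGTACC", "CCTGA", "GGGG"])
print(placements)
```

A read that maps nowhere (here `GGGG`) is the kind of sequence that de novo assembly would have to place using overlaps with other reads instead.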
completion
- how much of the genome is sequenced
- closed (complete sequence, no gaps) v. draft (incomplete sequence, gaps)
coverage
- how many times genome is sequenced (# of reads per nucleotide)
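Coverage can be estimated as (number of reads × read length) ÷ genome size, and checked per nucleotide by counting how many reads span each position. A rough sketch, where the read counts, the 150 bp read length, and the function names are illustrative assumptions:

```python
def average_coverage(n_reads, read_length, genome_size):
    """Average number of reads covering each nucleotide."""
    return n_reads * read_length / genome_size

def per_base_depth(genome_size, read_starts, read_length):
    """Count how many reads cover each position of a toy genome."""
    depth = [0] * genome_size
    for start in read_starts:
        for pos in range(start, min(start + read_length, genome_size)):
            depth[pos] += 1
    return depth

avg = average_coverage(n_reads=100_000, read_length=150, genome_size=5_000_000)
depth = per_base_depth(genome_size=10, read_starts=[0, 2, 4], read_length=4)
print(avg)
print(depth)
```

Positions with a depth of 0 are gaps, so a genome with any of them would count as a draft rather than a closed sequence.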
bioinformatics
analyzing and storing DNA/protein sequences
- utilizes powerful computational tools
comparative analyses
- genome size, content, and organization
bottleneck
Small genomes
140,000 to 1,000,000 bp
170 to 1,000 ORFs
- endosymbionts
- parasites
Large genomes
5,000,000 to 13,000,000 bp
5,000 to 9,000 ORFs
free-living organisms
- unpredictable environment
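As a quick comparative check, the size and ORF ranges above can be turned into an approximate spacing of bp per ORF. Pairing the low end of each size range with the low end of its ORF range (and high with high) is an assumption made only for this illustration.

```python
def bp_per_orf(genome_size_bp, n_orfs):
    """Average number of base pairs per ORF."""
    return genome_size_bp / n_orfs

# (genome size in bp, number of ORFs), taken from the ranges in the cards
ranges = {
    "small, low end": (140_000, 170),
    "small, high end": (1_000_000, 1_000),
    "large, low end": (5_000_000, 5_000),
    "large, high end": (13_000_000, 9_000),
}

density = {label: bp_per_orf(size, orfs) for label, (size, orfs) in ranges.items()}
for label, value in density.items():
    print(f"{label}: about {value:.0f} bp per ORF")
```

Under this pairing the spacing stays on the order of 1,000 bp per ORF, so ORF count scales roughly with genome size across both groups.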