Chapter 6- Genomics Flashcards
Genome
Entire complement of genetic information
- includes genes, regulatory sequences, and noncoding DNA
Genomics
Discipline of mapping, sequencing, analyzing and comparing genomes
Sequencing
determining the precise order of nucleotides in a DNA or RNA molecule
generation
refers to successive major changes in sequencing technology that confer:
- significant increases in speed
- drop in the cost of sequencing
Sanger Dideoxy Method
- first generation sequencing
- Dideoxy analogs of dNTPs used in conjunction with dNTPs
- Analog prevents further extension of DNA chain
- Bases are labeled with radioactivity
- Gel electrophoresis is then performed on products
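A minimal sketch of why chain termination reveals the sequence, assuming the classic layout of one reaction (lane) per dideoxy base. The function names `sanger_fragments` and `read_gel` are hypothetical, and the model ignores real-world details such as primers and band resolution.

```python
def sanger_fragments(new_strand):
    """Group chain-terminated fragment lengths by the dideoxy base that ended them."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # A dideoxy analog incorporated at this position stops extension here,
        # leaving a fragment that is `position` nucleotides long.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the sequence by ordering fragments from shortest to longest."""
    by_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            by_length[length] = base
    return "".join(by_length[length] for length in sorted(by_length))


lanes = sanger_fragments("GATTACA")
print(lanes)
print(read_gel(lanes))
```

Each fragment length appears in exactly one lane, so sorting the fragments by size (what the gel does) recovers the order of bases in the newly synthesized strand.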
Second Generation DNA sequencing
- DNA is broken into small segments
- DNA is amplified using PCR
- Light is released each time a base is added to DNA strand
- instrument actually measures the release of light
- can handle only short stretches of DNA
- 454 sequencing system
Shotgun Sequencing
Entire genome is cloned and the resultant clones are sequenced
- much of the sequencing is redundant (to reduce/catch errors)
- generally 7- to 10-fold coverage of genome
- computer algorithms are used to look for replicate sequences and assemble them
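Fold coverage can be estimated as (number of reads x read length) / genome size. The sketch below works that arithmetic through; the genome size and read length are made-up illustration values, and the function names are hypothetical.

```python
import math


def fold_coverage(num_reads, read_length, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target fold coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Illustrative values only: a 4.6 Mbp genome and 500-base reads.
GENOME_SIZE = 4_600_000
READ_LENGTH = 500

for target in (7, 10):
    n = reads_needed(target, READ_LENGTH, GENOME_SIZE)
    print(target, n, fold_coverage(n, READ_LENGTH, GENOME_SIZE))
```

Under these assumed numbers, 7-fold coverage takes 64,400 reads and 10-fold takes 92,000, which shows how much of the sequencing effort is deliberately redundant.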
Genome assembly
consists of connecting the DNA fragments in the correct order and eliminating overlaps
(usually done by computer)
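One simple way to picture assembly is a greedy merge: repeatedly join the two fragments with the longest suffix-to-prefix overlap. This is only a toy sketch, not how any particular assembler works; the names `overlap` and `greedy_assemble` and the `min_len` cutoff are assumptions, and it ignores sequencing errors, reverse complements, and repeats.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise by best overlap until no overlaps remain."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicate reads
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing left to join: the remaining pieces are separated by gaps
        merged = frags[i] + frags[j][size:]  # keep the overlapping bases only once
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
print(greedy_assemble(["AAAAC", "GGGGT"]))
```

The first call collapses three overlapping fragments into one contiguous sequence; the second returns two unjoined pieces, which is the situation a draft genome is left in.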
closed genome vs draft genome
closed = entire genome sequence obtained; draft = some small gaps remain
Annotation
converting raw sequence data into a list of genes present in the genome
Bioinformatics
science that applies powerful computational tools to DNA and protein sequences for the purposes of analyzing, storing, and accessing the sequences for comparative purposes
Functional ORF
an open reading frame that encodes a protein
Hypothetical proteins
proteins encoded by uncharacterized ORFs; they likely exist, but their function is currently unknown; likely encoded by nonessential genes
Noncoding RNA
RNA that does not code for protein; lacks start codons and has multiple stop codons
Steps to finding probable ORFs
- computer finds possible start codons
- Computer finds possible stop codons
- Computer counts codons between start and stop
- Computer finds possible RBS
- Computer calculates codon bias in ORF
- Computer decides if ORF is likely to be genuine
- List of probable ORFs
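The first three steps above can be sketched as a scan of the three forward reading frames of one strand. This is a simplified illustration: it assumes ATG as the only start codon, skips the RBS and codon-bias checks, does not look at the complementary strand, and uses a hypothetical `min_codons` cutoff that is far smaller than a realistic one.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start_index, end_index, codon_count) for candidate ORFs on one strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == START:
                start = pos  # first possible start codon in this frame
            elif start is not None and codon in STOPS:
                # Count codons from the start up to, but not including, the stop.
                codons = (pos - start) // 3
                if codons >= min_codons:
                    orfs.append((start, pos + 3, codons))
                start = None  # look for the next start after this stop
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC"))
```

In the example the only candidate is ATG AAA TTT GGG followed by the stop codon TAA. Raising `min_codons` discards short candidates, which is the same idea as rejecting ORFs too short to be plausible genes.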
As genome size increases, gene content…
proportionally increases
eukaryotic genomes contain
a large fraction of noncoding DNA
Smallest genomes belong to
parasitic or endosymbiotic prokaryotes
Minimum number of genes for a viable cell is
250-300 genes
comparative analysis
many genes can be identified by sequence similarity to genes found in other organisms
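As a rough illustration of "sequence similarity", the sketch below computes percent identity between two sequences that are assumed to be already aligned and of equal length. Real comparisons rely on alignment tools such as BLAST; the function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and pre-aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Made-up sequences that differ at a single position out of ten.
print(percent_identity("ATGGCGTACC", "ATGGCTTACC"))
```

A high identity to a gene of known function in another organism is the kind of evidence used to propose a function for a newly sequenced ORF.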
most abundant class of genes
metabolic genes
genes for protein synthesis are also abundant
Gene distribution in Archaea
typically devote a higher percentage of their genomes to energy and coenzyme production than Bacteria
contain fewer genes for carbohydrate metabolism or cytoplasmic membrane functions than Bacteria
Transcriptome
the entire complement of RNA produced under a given set of conditions
Microarrays
small solid-state supports to which genes or portions of genes are fixed and arrayed spatially in a known pattern
(sophisticated northern blot)
- can have high background (cross hybridization)
- requires/relies on known information on the gene