Chapter 18 Flashcards
Genomics, Bioinformatics, and Proteomics
Genomics allows sequencing of entire genomes
Genome: complete set of DNA in a single cell of an organism
Genomics: the study of genomes
- Structural genomics
- Functional genomics
- Comparative genomics
- Metagenomics
Structural genomics
- sequencing genomes
- analyzing nucleotide sequences to identify genes and sequences such as gene regulatory elements
Shotgun cloning
whole genome sequencing (shotgun cloning)
- the most widely used strategy for sequencing and assembling an entire genome
1. Genomic DNA is cut into overlapping fragments (contigs)
2. fragments are sequenced and aligned based on identical DNA sequences where they overlap
3. entire chromosome is assembled by computer program from the aligned fragments
Contigs
contigs - contiguous fragments
- overlapping fragments from adjoining segments that collectively form one continuous DNA molecule within a chromosome
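Example (sketch): the idea of aligning fragments by their identical overlapping ends can be illustrated with a toy greedy merge. This is a minimal sketch, not how production assemblers work; the function names, the `min_len` parameter, and the reads are made up for illustration, and real reads would also contain errors and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads that overlap end to end
reads = ["ATGCGTAC", "GTACGGA", "GGATTC"]
assembled = greedy_assemble(reads)
print(assembled)
```

Here the three reads collapse into one continuous sequence because each read's end matches the next read's start.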
Bioinformatics
uses computer based approaches to organize, share, and analyze data related to:
- gene structure
- gene sequence and expression
- protein structure and function
Applications for bioinformatics
- comparing DNA sequences
- finding gene regulatory regions (promoters and enhancers)
- predicting amino acid sequences
- deducing evolutionary relationships between genes
Hallmark Characteristics of a Gene Sequence Can Be Recognized by Bioinformatics Tools
- several hallmark characteristics of genes (prokaryotes and eukaryotes) can be searched for using bioinformatics software
- Gene-regulatory sequences found upstream of genes include identifiable sequences such as promoters, enhancers, and silencers.
Open reading frames (ORF)
- sequences of triplet nucleotides translated into amino acid sequences of a protein
- suggestive of a protein-encoding gene
- typically begin with initiation sequence ATG
- end with termination sequence (TAA, TAG, TGA)
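Example (sketch): an ORF scan can be written as a walk through each reading frame, looking for ATG followed in-frame by a stop codon. This is a simplified illustration that scans only the given strand and ignores introns; `find_orfs`, `min_codons`, and the example sequence are hypothetical, not taken from any specific bioinformatics package.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[str]:
    """Return ORFs (ATG through stop codon, inclusive) in the three forward frames.

    min_codons is the minimum number of codons before the stop codon,
    counting the ATG itself.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append(seq[start:pos + 3])
                start = None  # look for the next ORF in this frame
    return orfs


# Made-up sequence containing one ORF in the third frame
example = "CCATGGCTTGGTAAGG"
print(find_orfs(example))
```

A long ORF is only suggestive of a gene; real gene-finding software combines this signal with others such as regulatory sequences.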
Functional genomics
- Study of gene functions based on resulting RNAs or possible proteins they encode as well as regulatory elements
- Attempts to Identify Potential Functions of Genes and Other Elements in a Genome
Predicting Gene and Protein Functions by Sequence Analysis
Similarity searches
–Screen databases and compare sequence to known sequence
–A genome sequence statistically similar to a gene with known function likely encodes a protein with similar function.
–Example: comparing portions of the human leptin gene (LEP) with its homolog in mice (ob/Lep)
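Example (sketch): the core of a similarity search is scoring a query against each known sequence and reporting the closest match. The sketch below uses plain percent identity on equal-length, already aligned toy sequences. Real search tools use alignment and statistical scoring instead, and the sequences and names here are invented (they are not actual LEP or ob sequences).

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with the same base in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_match(query: str, database: dict[str, str]) -> tuple[str, float]:
    """Return the name and score of the database sequence most similar to query."""
    scores = {name: percent_identity(query, seq) for name, seq in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical database of sequences with known function
known = {"gene_A": "ATGCATTGGC", "gene_B": "TTGCAAAGGC"}
query = "ATGCATTGGG"
hit = best_match(query, known)
print(hit)
```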
Homologous genes
- Genes that are evolutionarily related
- Similarity searches are able to identify homologous genes.
Orthologs - Genes from different species thought to have descended from common ancestor
Human Genome Project (HGP)
- Coordinated effort to sequence and identify all genes of human genome
- Illustrates that humans and all other species share common set of genes essential for cellular function and reproduction
Major features of human genome project
Two biggest surprises discovered from HGP
–Less than 2% of the genome codes for proteins.
–There are probably only about 20,000 protein-coding genes.
The number of genes is lower than the number of predicted proteins.
Alternative splicing
Many genes code for multiple proteins through alternative splicing.
- Alternative splicing produces incredible diversity of proteins beyond number of human genes
- Alternative splicing patterns can generate multiple mRNA molecules = multiple proteins.
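Example (sketch): one way to see how a single gene can yield several mRNAs is to enumerate the exon combinations. The sketch assumes a simplified model in which each optional exon is independently included or skipped; real splicing is regulated and not every combination occurs. Exon names are hypothetical.

```python
from itertools import product


def isoforms(exons: list[str], optional: set[str]) -> list[tuple[str, ...]]:
    """List every exon combination when optional exons may be skipped.

    exons: exon names in genomic order; optional: names that can be left out.
    """
    # Each exon contributes either (exon,) or (exon, None) as its choices
    choices = [(e, None) if e in optional else (e,) for e in exons]
    return [
        tuple(e for e in combo if e is not None)
        for combo in product(*choices)
    ]


# One gene with four exons, two of which are optional
variants = isoforms(["E1", "E2", "E3", "E4"], {"E2", "E3"})
for v in variants:
    print("-".join(v))
```

Under this model a gene with n optional exons can give up to 2^n mRNAs, which is why the number of proteins can exceed the number of genes.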
Human genome
Human genomic sequence is 99.9% the same between individuals, with most genetic differences resulting from
- Single-nucleotide polymorphisms (SNPs)
- Single-base changes in genome
- Variations associated with disease conditions
- Copy number variations (CNVs)
- Segments of DNA duplicated or deleted
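Example (sketch): SNPs can be pictured as the single-base positions where two aligned sequences differ. The helper below is a toy comparison of equal-length sequences; the name `find_snps` and the sequences are made up, and real variant calling works from many sequencing reads with quality scores.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length (pre-aligned)")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Hypothetical sequences differing by a single base
snps = find_snps("ATGCCGTA", "ATGTCGTA")
print(snps)
```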
Comparative genomics
- Compares genomes of different organisms to answer questions about genetics and other aspects of biology
–Incorporates study of gene and genomic evolution
–Explores relationship between organisms and environment
–Studies differences and similarities between organisms and how differences contribute to phenotype, life cycles, and so on.
Eukaryotic genomes
Features of eukaryotic genomes not found in prokaryotes
–Gene density: Varies from chromosome to chromosome
–Introns: Variation in genomes and in genes
–Repetitive sequences: About half of human genome is repetitive DNA.
Sea urchin genome
In 2006, researchers completed the 814 million bp genome sequence of the sea urchin Strongylocentrotus purpuratus.
- Sea urchins have an estimated 23,500 genes.
- Contain many genes with important functions in humans
- Have nearly 1000 genes for sensing light and odor
- Sea urchins and humans share approximately 7000 orthologs.
Dog genome
Dog genome completed in 2005
– Humans share 75% of their genes, as well as many genetic disorders, with dogs.
– Over 400 single-gene disorders
–Sex-chromosome aneuploidies
–Multifactorial diseases (e.g., epilepsy)
–Behavioral conditions (e.g., obsessive-compulsive disorder)
The Chimpanzee Genome
Comparison of human and chimpanzee sequences
–Differ by < 2%
–Share 98% of same genes
–Analysis indicates that genome evolution, speciation, and gene expression are interconnected.
The Neanderthal Genome and Modern Humans
Neanderthal (Homo neanderthalensis)
– Rough draft of the Neanderthal genome encompassed 3 billion bp of Neanderthal DNA, covering about 2/3 of the genome.
– Comparative genomic analysis was used to identify where humans have undergone rapid evolution since diverging from Neanderthals.
–99% identical to humans
–78 new protein-coding sequences since divergence
transcriptome analysis
Transcriptome analysis (global analysis of gene expression)
–Studies expression of genes by genome qualitatively and quantitatively
1. Qualitatively: identifies which genes are expressed and which are not
2. Quantitatively: measures varying levels of expression of different genes
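Example (sketch): the qualitative and quantitative questions can both be asked of a table of per-gene read counts. The sketch assumes counts from a sequencing-based experiment and uses counts per million (CPM) as a simple normalization; the gene names, counts, and threshold are invented for illustration.

```python
def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts so that they sum to one million."""
    total = sum(counts.values())
    return {gene: count / total * 1_000_000 for gene, count in counts.items()}


def expressed_genes(counts: dict[str, int], threshold: int = 0) -> list[str]:
    """Qualitative view: genes whose count is above the threshold."""
    return [gene for gene, count in counts.items() if count > threshold]


# Hypothetical read counts for three genes in one sample
raw = {"geneA": 500, "geneB": 0, "geneC": 1500}
on = expressed_genes(raw)        # which genes are expressed
cpm = counts_per_million(raw)    # how strongly each is expressed
print(on, cpm)
```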
DNA microarray analysis
Microarrays (gene chips)
- Single-stranded DNA molecules attached to a glass microscope slide; the attached sequences are complementary to all potential mRNA sequences from expressed genes
Proteomics Identifies and Analyzes the Protein Composition of Cells
Proteomics
–Identification, characterization, and quantitative analysis of all proteins (proteome) encoded by genome of cell, tissue, or organism
–Used to reconcile differences between number of genes in genome and number of different proteins produced
–Allows comparison of proteins in normal and diseased tissue
Proteomics Technologies: Two-Dimensional Gel Electrophoresis for Separating Proteins
Proteomics technologies: two-dimensional gel electrophoresis (2DGE)
–Technique for separating hundreds to thousands of proteins with high resolution
1. Proteins isolated from cell or tissue
2. Loaded on polyacrylamide tube gel
3. Separated by isoelectric focusing, which causes proteins to migrate based on charge
SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis)
–Second dimension of separation (after isoelectric focusing)
–Proteins separated by molecular mass
–Electric current applied to gel
Systems biology
Systems biology
–Incorporates data from genomics, transcriptomics, proteomics, and other areas of biology
–Interprets genomic information in context of structure, function, and regulation
–Network map: sketch showing interacting proteins, genes, and other molecules