13- Chapter 9 Flashcards
What are the definitions of genome and genomics?
Genome- entire complement of genetic information (includes genes, regulatory sequences, noncoding DNA)
Genomics- discipline of mapping, sequencing, analyzing, and comparing genomes
Look at slides 5-6 for genome sizes
What is genome sequencing?
What does "generation" mean in sequencing?
Determining the precise order of nucleotides in a DNA or RNA molecule
First generation sequencing is Sanger method
Most labs use second generation sequencing now
There is also third and fourth generation sequencing
Generation refers to successive major changes in sequencing technology that increase the speed and lower the cost of sequencing
What is the Sanger sequencing method?
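DNA polymerase extends a primer by linking the phosphate of each incoming deoxynucleotide to the 3'-OH of the previous deoxynucleotide
Dideoxynucleotides (ddNTPs) lack the 3'-OH, so incorporating one terminates the chain
The terminated fragments are separated by size and the sequence is read from the terminal base of each fragment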
Slide 8-9
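The read-out logic of chain termination can be sketched in a few lines. This is a toy illustration, not a lab protocol: it assumes enough template copies that a ddNTP (which lacks the 3'-OH) stops synthesis at every position at least once, and the function names are made up for this example.

```python
def terminated_fragments(new_strand):
    """Toy model: one chain-terminated fragment ending at each position.

    Assumption: with many template copies, a ddNTP is incorporated
    at every position at least once, so every prefix length appears.
    """
    return {new_strand[: i + 1] for i in range(len(new_strand))}


def read_sequence(fragments):
    """Sort fragments by length (as a gel or capillary would) and
    read the terminal, labeled base of each fragment in order."""
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


fragments = terminated_fragments("ATGCCGTA")
print(read_sequence(fragments))
```

The shortest fragment gives the first base and the longest gives the last, which is why separating by size is enough to read the sequence.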
What is second generation DNA sequencing?
Generates data 100x faster than the Sanger method
Massively parallel (large # of samples sequenced side by side)
Uses increased computer power and miniaturization (454 life sciences pyrosequencing)
Slide 11
What are the 7 steps of the 454 sequencing system?
- DNA is broken into fragments of 400-600 bp
- DNA adaptors are added to both ends of each fragment
- Each fragment is immobilized on a bead and amplified by PCR using primers that anneal to the adaptors at both ends
- Each bead is put in a well with sequencing enzymes
- dNTPs are added; each time one is incorporated, pyrophosphate is released
- The pyrophosphate is converted to ATP, which the luciferase enzyme uses to emit light
- Instrument measures the release of light
(Can only handle short stretches of DNA)
Slide 11
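The light-measurement steps above can be sketched as a toy "flowgram". This assumes one dNTP type is flowed over the well at a time and that the light signal scales with the number of bases incorporated in that flow; the flow order `TACG` and the function names are assumptions for illustration.

```python
def flowgram(new_strand, flow_order="TACG", max_cycles=100):
    """Return a list of (base, signal) pairs, one per dNTP flow.

    signal = number of consecutive bases incorporated in that flow
    (0 means no light was emitted).
    """
    pos = 0
    flows = []
    for _ in range(max_cycles):
        for base in flow_order:
            n = 0
            while pos < len(new_strand) and new_strand[pos] == base:
                n += 1
                pos += 1
            flows.append((base, n))
            if pos == len(new_strand):
                return flows
    return flows  # gave up: strand not fully read within max_cycles


def call_bases(flows):
    """Convert light signals back into a base sequence."""
    return "".join(base * n for base, n in flows)


flows = flowgram("TTACGGA")
print(flows)
print(call_bases(flows))
```

A signal of 2 in one flow means the same base occurred twice in a row, so the sequence is recovered from which flow gave light and how much.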
What is genome assembly?
Connects the DNA fragments in the correct order and eliminates overlaps
Done by a computer, which examines the short sequenced fragments, deduces their order from overlaps, and generates a genome for annotation
Can have a closed (complete) genome or a draft (small gaps) genome
Slide 14
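A minimal sketch of overlap-based assembly, assuming error-free reads from one strand and a simple greedy merge (real assemblers are far more elaborate). The function names and the `min_len` parameter are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: the remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

One contig returned corresponds to a closed genome in this toy setting; more than one contig corresponds to a draft with gaps.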
What is annotation and bioinformatics?
Annotation- converting raw sequence data into a list of genes present in the genome (genome sequence is just letters, needs to be annotated by computers)
Bioinformatics- science that applies powerful computational tools to DNA and protein sequences for the purpose of analyzing, storing, and accessing the sequences for comparative reasons
What are ORFs?
Open reading frames that encode proteins
Computer algorithms look for start/stop codons and Shine-Dalgarno sequences (ribosome-binding sites)
Slide 17
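A rough sketch of how an algorithm might scan for ORFs. It assumes ATG as the only start codon, searches the three reading frames of one strand only, and does not check for a Shine-Dalgarno sequence; `find_orfs` and `min_codons` are names invented for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ORF on the given strand.

    min_codons counts the start and stop codons as part of the ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))
```

Real gene finders also scan the reverse strand, accept alternative start codons, and score candidates, so this only shows the start-to-stop idea.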
What are hypothetical proteins?
Uncharacterized ORFs; proteins that likely exist but whose function is unknown
Likely encoded by nonessential genes
What is the genome size and content of prokaryotic genomes compared to eukaryotic?
Eukaryotic genomes contain a large fraction of noncoding DNA while prokaryotic genomes do not
Prokaryotic genomes range in size from that of large viruses to that of eukaryotic microbes
Correlation between genome size and ORFs (slide 18)
As genome size increases, gene content proportionally increases
Genome sizes on slide 20
How are genes distributed in prokaryotes?
Metabolic genes are most abundant
DNA replication and transcription genes make up minor fraction of genome
Number of genes with a clearly identifiable role in a given genome is 70% or less of total ORFs; the others are hypothetical proteins
Graph on slide 23
How does gene distribution reflect lifestyle?
Archaea devote a higher percentage of their genomes to energy and coenzyme production than bacteria
Archaea contain fewer genes for carbohydrate metabolism or cytoplasmic membrane function than bacteria
Look at graph slide 24
What do homologous, gene families, paralogs, and orthologs mean?
Homologous- related in sequence to an extent that implies common genetic ancestry
Gene families- groups of gene homologs
Paralogs- genes within an organism whose similarity to one or more genes in the same organism is the result of gene duplication
Orthologs- genes found in one organism that are similar to those in another organism but differ because of speciation
Diagram showing these on slide 27
What are gene duplications or gene deletions?
Gene duplications- mechanism for evolution of most new genes; keeping 2 genes that do the same thing is wasteful, so the duplicate copy eventually disappears or evolves a new function
Gene deletions- eliminate genes that are no longer needed
Slide 29 example
What is vertical and horizontal gene transfer?
Vertical- parent cells pass genes down to the next generation (genome replication and cell division)
Horizontal- genes are transferred between cells that are not parent and offspring
What are the 3 types of horizontal gene transfer?
Transformation- encountering DNA in environment and picking it up
Transduction- a virus carries genetic information from one host to another host
Conjugation- an organism transfers part of its genome into another organism through direct cell-to-cell contact
Mobile elements include plasmids, phage, transposons, and insertion sequences
Slides 30-31
What is the core genome and the pan genome?
Core genome- shared by all strains of a species (all members of the species)
Pan genome- the core genome plus all the optional extras present in some but not all strains of the species
Slides 33-35
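Core and pan genome map directly onto set operations. The sketch below assumes each strain is represented by a set of gene names; the strain labels and gene lists are made up for illustration, not real annotation data.

```python
def core_and_pan(strain_genes):
    """Return (core, pan) gene sets from a dict of strain -> gene names."""
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)  # present in every strain
    pan = set.union(*gene_sets)          # present in at least one strain
    return core, pan


# Hypothetical strains and gene content, for illustration only
strains = {
    "strain_A": {"dnaA", "rpoB", "lacZ"},
    "strain_B": {"dnaA", "rpoB", "stx"},
    "strain_C": {"dnaA", "rpoB"},
}
core, pan = core_and_pan(strains)
print(sorted(core))        # genes shared by all strains
print(sorted(pan - core))  # optional extras found in only some strains
```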
Study the functional genomics terms and systems biology on slide 36
Okay
What is a metagenome?
Total gene content of the organisms present in an environment
Used in microbiome studies
Slide 37
What is the transcriptome?
What are the two ways to study it? (hybridization, RNA-Seq)
The entire complement of RNA produced under a given set of conditions
Hybridization can be used in conjunction with genomic sequence data to measure gene expression (microarrays)
RNA sequencing is deep sequencing of cDNAs that allows comprehensive quantitation of all RNAs in a cell
Slide 40
How do microarrays work with the transcriptome?
DNA segments/fragments on arrays are hybridized with mRNA from cells grown under specific conditions and analyzed to determine patterns of gene expression
Arrays are large and dense enough that the transcription pattern of an entire genome can be analyzed
Slide 41
How is RNA sequencing used in transcriptome?
Replaces microarrays for the analysis of gene expression
All RNA molecules of the cell are sequenced
Requires high throughput sequencing (2nd or 3rd generation sequencing)
rRNA is very abundant, so mRNA must be enriched from the total RNA pool
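A simplified sketch of turning sequenced cDNA reads into an expression profile. It assumes each read has already been assigned to a gene, and it drops rRNA reads computationally as a stand-in for the enrichment step; the function name and gene names are hypothetical.

```python
from collections import Counter


def expression_profile(read_gene_ids, rrna_genes):
    """Fraction of non-rRNA reads assigned to each gene."""
    counts = Counter(g for g in read_gene_ids if g not in rrna_genes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n / total for gene, n in counts.items()}


# Hypothetical read assignments: rRNA dominates the raw pool
reads = ["rrsA"] * 6 + ["lacZ"] * 3 + ["recA"] * 1
print(expression_profile(reads, rrna_genes={"rrsA"}))
```

Comparing such profiles from cells grown under different conditions is one simple way to see which genes change expression.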
What are the 5 things that can be derived from microarrays and RNA sequencing?
- Global gene expression
- Expression of specific groups of genes under different conditions
- Expression of genes with unknown function; can yield clues to possible roles
- Comparison of gene expression in closely related organisms
- Identification of specific strains
What is proteomics?
What are the two ways to study it?
Genome-wide study of the structure, function, and regulation of an organism's proteins
1. 2D polyacrylamide gel electrophoresis- separates, identifies, and measures all proteins; first dimension (horizontal) separates by isoelectric point, second dimension (vertical) separates by size
2. High-pressure liquid chromatography (HPLC)
Slide 44
What are interactomes?
Complete set of interactions among molecules
Data expressed in the form of network diagrams
Slide 45 picture
What is the metabolome?
Complete set of metabolic intermediates and other small molecules produced in an organism
Mass spectrometry is one of primary techniques for monitoring metabolites
Slide 46
What is systems biology?
Integration of different fields of “omics” research
Genomics
Proteomics
Metabolomics
Compares data and builds a computer model of the system being studied
Slide 47-48