Lecture 15 Flashcards
genome
the complete set of genetic material present in a cell or organism
genomics
the cloning and molecular characterization of entire genomes
a haplotype
The specific set of SNPs and other genetic variants observed on a single chromosome or part of a chromosome
linkage disequilibrium
The nonrandom association between genetic variants within a haplotype
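One common way to quantify this nonrandom association is the coefficient D = p(AB) - p(A)p(B) and its normalized form r². The sketch below is illustrative only: the function name and the two-letter haplotype encoding (allele at locus 1, then allele at locus 2) are assumptions, not part of the lecture.

```python
from collections import Counter

def linkage_disequilibrium(haplotypes):
    """Return (D, r2) for two biallelic loci.

    haplotypes: list of 2-character strings, e.g. "AB" (hypothetical encoding).
    The alleles of the first haplotype are used as the reference alleles.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    a, b = haplotypes[0][0], haplotypes[0][1]
    p_a = sum(1 for h in haplotypes if h[0] == a) / n   # allele frequency, locus 1
    p_b = sum(1 for h in haplotypes if h[1] == b) / n   # allele frequency, locus 2
    p_ab = counts[a + b] / n                            # observed haplotype frequency
    d = p_ab - p_a * p_b                                # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2

# Alleles always travel together: complete disequilibrium
linked = ["AB"] * 5 + ["ab"] * 5
# Every combination equally common: no association
unlinked = ["AB", "Ab", "aB", "ab"]
```

D is zero when the alleles at the two loci are inherited independently, and r² approaches 1 when one variant predicts the other.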
tag-SNPs
The few SNPs used to identify a haplotype
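A minimal sketch of the tag-SNP idea, assuming haplotypes are written as equal-length strings of SNP alleles (the haplotypes and the function name `find_tag_snps` are made up for illustration). It searches by brute force for the smallest set of SNP positions that still tells every haplotype apart; real tag-SNP selection usually works from linkage disequilibrium statistics instead.

```python
from itertools import combinations

def find_tag_snps(haplotypes):
    """Return the smallest tuple of SNP positions that distinguishes all haplotypes."""
    n_sites = len(haplotypes[0])
    n_distinct = len(set(haplotypes))
    for k in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), k):
            # Signature of each haplotype when only these positions are genotyped
            signatures = {tuple(h[i] for i in sites) for h in haplotypes}
            if len(signatures) == n_distinct:
                return sites
    return tuple(range(n_sites))

# Hypothetical haplotypes: SNPs 0-3 are inherited together, as are SNPs 4-5
haplotypes = ["ACGTAC", "ACGTGT", "GTCAAC", "GTCAGT"]
tags = find_tag_snps(haplotypes)
```

Here two SNPs (one from each correlated block) are enough to identify all four haplotypes, so the other four SNPs need not be genotyped.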
Genome-wide association studies use
numerous SNPs scattered across the genome to find genes of interest
annotating a gene means
linking its sequence information to other information about its function and expression, the protein it encodes, and similar genes in other species.
Metagenomics is an emerging field in which
the genome sequences of an entire group of organisms that inhabit a common environment are sampled and determined (environmental DNA, eDNA)
Synthetic biology seeks to
design organisms that might provide useful functions
Functional genomics
characterizes what sequences do—their function
Genome content consists of
much more than just protein-coding genes
Intergenic sequences → “non-coding” DNA
Repetitive sequences → short and long sequences that repeat in tandem or are interspersed throughout the genome
prokaryotic and eukaryotic genomes differ drastically in
size & organization
prokaryote - genome lies in the cytoplasm (no membrane-bound organelles, DNA not in a nucleus)
eukaryote - genome in distinct chromosomes - tightly bound to proteins
Anatomy of a prokaryotic genome
1) single, circular chromosome
2) Single origin of replication (required by the DNA replication machinery)
3) Genomes are compact
→ ~1-10 million bases (Mb)
4) Most content is genic
→ Minimal intergenic DNA (non-coding)
→ few repetitive sequences
→ No introns
5) Genome size is directly related to gene content
→ larger genomes encode more proteins
regulatory consequences of organization of prokaryotic genome
Genes in biochemical or signaling pathways often clustered and controlled as operons
Chromosome not sequestered in nucleus
Chromosome not bound by histone proteins
→ No chromatin
Eukaryotic genomes
1) Genomes divided into multiple linear chromosomes, with telomeres & centromeres
2) DNA complexed with histone proteins (=chromatin) in a nucleus
3) Genome size tends to be much larger, and varies widely, even within a taxonomic group
→ Genes interrupted by introns
→ Copious intergenic DNA
→ Copious repetitive DNA
4) Genomes don’t tend to be compact
5) With rare exceptions, genes not clustered into operons
6) Many genes (most human genes) are interrupted by introns; genes are far apart
C-value is the
DNA content per haploid cell
→ think of this as genome size (how many bp)
G-value is the
protein-coding gene number
(how many genes in the genome code for proteins)
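The C-value and G-value can be combined into a gene density (protein-coding genes per Mb), which makes the compact vs. non-compact contrast concrete. A small sketch; the function name and the round numbers are hypothetical placeholders, not measurements of any real species.

```python
def gene_density(g_value, c_value_bp):
    """Protein-coding genes per megabase (Mb) of haploid genome."""
    return g_value / (c_value_bp / 1_000_000)

# Hypothetical compact genome: 4,000 genes in 4 Mb
compact = gene_density(4_000, 4_000_000)
# Hypothetical large genome: 20,000 genes in 3,000 Mb
sparse = gene_density(20_000, 3_000_000_000)
```

With these placeholder values the compact genome carries on the order of a thousand genes per Mb, while the large genome carries fewer than ten.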
G-value paradox
Gene number does not fully correlate with organismal complexity
G-value paradox explained by
(1) alternative splicing
(2) gene expansion/contraction
alternative splicing explanation for G-value paradox
Multiple exons from one gene can be spliced in different ways (=alternative splicing) to form distinct mRNAs and proteins
No. of proteins >> no. of protein-coding genes
Explains smaller-than-expected gene count in multicellular spp.
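A rough sketch of why one gene can yield many proteins, assuming a simplified model in which some exons are always included (constitutive) and others can each be skipped or kept independently (cassette exons). The exon numbers and the function name are hypothetical; real splicing is more constrained than this.

```python
from itertools import combinations

def splice_isoforms(constitutive, cassette):
    """Enumerate possible mRNAs as tuples of exon numbers in genomic order."""
    isoforms = []
    for k in range(len(cassette) + 1):
        for chosen in combinations(cassette, k):
            # Constitutive exons are always present; chosen cassette exons are added
            exons = sorted(set(constitutive) | set(chosen))
            isoforms.append(tuple(exons))
    return isoforms

# Hypothetical gene: exons 1 and 5 always included, exons 2-4 optional
isoforms = splice_isoforms([1, 5], [2, 3, 4])
```

Under this model a single gene with three optional exons gives 2³ = 8 distinct mRNAs, so protein number can far exceed gene number.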
expansion/contraction explanation for G-value paradox
Gene expansion & contraction is frequent, even among closely related spp.
gene duplication
family duplication
entire genome duplicated
C-value paradox
Genome size doesn’t fully correlate with organismal complexity