chapter 12 - genomes Flashcards
define forward genetics
starting with the phenotype and identifying the underlying genes
define reverse genetics
starting with the DNA sequence of a gene and identifying the encoded phenotype (function)
what is reverse genetics used for?
predicting amino acid sequence of a protein, predicting protein structure and function, mutating genes to see the effect on the organism
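A minimal Python sketch of the first use, predicting an amino acid sequence from a DNA sequence. It assumes the standard genetic code and a coding-strand sequence that is already in frame; the function name `translate` and the example sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# (first base changes slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an in-frame DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(coding_sequence) - 2, 3):
        amino_acid = CODON_TABLE[coding_sequence[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))  # MAK (Met-Ala-Lys, then a stop codon)
```

Predicting structure and function from that sequence is a much harder problem and is not attempted here.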
what is the Sanger method of DNA sequencing?
uses chemically modified ddNTPs along with normal dNTP nucleotides. synthesis stops when DNA polymerase adds a ddNTP to the growing polynucleotide chain, because a ddNTP has no 3’ hydroxyl (OH) group for the next nucleotide to attach to. the sequence is determined by detecting the labeled 3’ base of the fragments that end at each position in the DNA
what are the necessary components of sequencing rxns?
DNA polymerase
short (18-22 bp) primers complementary to the template strand
all 2’ dNTPs (dATP, dCTP, etc.)
all 2’, 3’ ddNTPs (ddATP, ddCTP, etc.) - labeled uniquely by fluorescence so that each can be detected
template DNA (sequencing rxn only reliably reads 100-700 bps of a template)
why is the ratio of dNTPs to ddNTPs important?
the ratio determines the distribution of DNA fragment lengths that are produced - there must be more dNTPs than ddNTPs so that some chains keep growing and fragments ending at every position are made
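A rough sketch of why the ratio matters. It assumes a deliberately simple model that is not from the flashcards: at every position the polymerase adds a ddNTP with the same fixed probability `dd_fraction` (real polymerases do not treat dNTPs and ddNTPs equally). The function names are hypothetical.

```python
def fraction_reaching(position, dd_fraction):
    """Fraction of chains that are still growing when they reach `position`."""
    return (1.0 - dd_fraction) ** (position - 1)


def fraction_terminating_at(position, dd_fraction):
    """Fraction of chains that end exactly at `position` (one band or peak)."""
    return fraction_reaching(position, dd_fraction) * dd_fraction


# Compare a reaction with far too many ddNTPs to one where dNTPs are in excess.
for dd_fraction in (0.5, 0.01):
    print(
        f"ddNTP fraction {dd_fraction}: "
        f"{fraction_terminating_at(10, dd_fraction):.4f} of chains end at base 10, "
        f"{fraction_terminating_at(300, dd_fraction):.4f} end at base 300"
    )
```

Under this model, too many ddNTPs means almost every chain stops within the first few bases, so positions far from the primer are never read.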
what are two methods for amplifying template DNA?
PCR
isolating a recombinant DNA clone from bacteria
each primer for PCR amplification can be used to sequence how many strands of PCR product?
one
what is high-throughput sequencing used for?
to simultaneously sequence many different template molecules
what does high-throughput sequencing involve?
physical binding of template DNA to a solid surface/microbeads & amplification of the templates by PCR
how many high-throughput sequencing rxns can be completed at once? what is this level of sequencing called?
thousands - millions
massively parallel DNA sequencing
what is the key to sequencing a genome?
generating many short fragments of DNA and identifying where the fragments overlap
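The overlap idea can be sketched in a few lines of Python. This is a toy, assuming error-free reads from one strand that are already listed in left-to-right order; real assemblers handle sequencing errors, repeats, both strands, and unordered reads. The names `overlap_length` and `assemble_in_order` and the reads themselves are made up.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble_in_order(reads, min_overlap=3):
    """Join reads one after another using their overlaps."""
    assembled = reads[0]
    for read in reads[1:]:
        size = overlap_length(assembled, read, min_overlap)
        if size == 0:
            raise ValueError(f"no overlap found for read {read!r}")
        assembled += read[size:]  # add only the part that is new
    return assembled


reads = ["ATGCGTAC", "GTACCTTA", "CTTAGGA"]
print(assemble_in_order(reads))  # ATGCGTACCTTAGGA
```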
define functional genomics
identifying and annotating function of various parts of the genome
define open reading frames
series of codons that is not interrupted by a stop codon - the longest stretches of sequence that go uninterrupted
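A small sketch of an ORF search, under assumptions that go beyond the definition above: an ORF is taken to start at ATG and end at the first in-frame stop codon, and only the three reading frames of the given strand are scanned (a full search would also check the reverse complement). The function name and test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(sequence):
    """Return the longest ATG...stop stretch in the three forward reading frames."""
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                candidate = sequence[start:i + 3]
                if len(candidate) > len(best):
                    best = candidate
                start = None  # look for the next ORF in this frame
    return best


print(longest_orf("CCATGAAATTTGGGTAACC"))  # ATGAAATTTGGGTAA
```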
define comparative genomics
comparing differences in genomes of species - both across species and within them - to answer questions about the species
define a transcriptome
the complete set of RNA molecules transcribed from the genome in a cell, tissue, or organism at a given time
define proteomics
the study of the proteome - complete complement of proteins produced by an organism
what allows for the diversity of proteins?
a single gene can encode more than one protein (e.g. through alternative splicing and posttranslational modification) - the # of proteins in a proteome is larger than the # of protein-coding genes in the genome
what are three methods of measuring a proteome?
gel electrophoresis
mass spectrometry
using antibodies
define metabolomics
a complete analysis of metabolites (substances produced during metabolism) in a biological system
define metabolomes
the complete set of small molecules in a cell, tissue, or organism
explain primary vs secondary metabolites
primary: normal cell processes (hormones, signaling, etc)
secondary: unique to an organism (antibiotics for defense, etc)
what are the key features of a prokaryotic genome? (8 features)
1 circular chromosome, small, compact & efficient, mostly protein-coding regions, no introns, some carry plasmids, great diversity, pan genome > core genome
what accounts for the diversity of prokaryotic genomes? (2 reasons)
varying number of genes and the occupancy of a variety of environments
what is the core genome?
all of the genes that the strains of a species have in common
what is the pan genome?
all of the genes that are present in at least one of the strains
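In set terms, the core genome is the intersection of the strains' gene sets and the pan genome is their union, so the core genome is always a subset of the pan genome. A short Python illustration with made-up strain and gene names:

```python
# Hypothetical gene content of three strains of one species.
strain_genes = {
    "strain 1": {"gene_a", "gene_b", "gene_c"},
    "strain 2": {"gene_a", "gene_b", "gene_d"},
    "strain 3": {"gene_a", "gene_b", "gene_c", "gene_e"},
}

core_genome = set.intersection(*strain_genes.values())  # genes in every strain
pan_genome = set.union(*strain_genes.values())          # genes in at least one strain

print(sorted(core_genome))  # ['gene_a', 'gene_b']
print(sorted(pan_genome))   # ['gene_a', 'gene_b', 'gene_c', 'gene_d', 'gene_e']
```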
what is the minimal genome?
the minimum number of genes that are necessary for an organism to survive
what does it mean to “knock-out” a gene?
to remove a specific gene from an organism (like bacteria) to see if the organism thrives without it or not - if it still thrives, the gene is not a requirement for life; if not, the gene is part of the minimal genome
what are the key features of eukaryotic genomes? (7 features)
linear chromosomes, many introns, no plasmids, core genome = pan genome (roughly), more regulatory sequences, larger and less efficient, much of the DNA does not encode proteins
what is a gene family?
a group of copies of a gene resulting from duplication events
what is a pseudogene?
a copy of a gene that has mutated so that it no longer makes a functional product
what is a paralog?
genes that arise from duplication
ex. globin family - different globin genes are expressed at different times in the human life cycle
how many paralogs make up a gene family?
2+
what is a transposon?
genetic “parasites” with repeating sequences that can move around
what are the 2 main types of transposons?
DNA transposons - “cut and paste” - removed and put somewhere else
retrotransposons - “copy and paste” - copied so that one copy stays in the same place and the other is put somewhere else
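The two mechanisms can be pictured with a toy model in which a chromosome is just an ordered list of labeled elements. This is only an illustration of "cut" versus "copy"; it ignores the enzymes involved and the RNA intermediate that retrotransposons use, and all names are hypothetical.

```python
def cut_and_paste(chromosome, element, target_index):
    """DNA transposon: remove the element, then insert it at a new position."""
    moved = list(chromosome)
    moved.remove(element)  # the original copy leaves its old site
    moved.insert(target_index, element)
    return moved


def copy_and_paste(chromosome, element, target_index):
    """Retrotransposon: keep the original and insert a new copy elsewhere."""
    if element not in chromosome:
        raise ValueError(f"{element!r} is not on this chromosome")
    copied = list(chromosome)
    copied.insert(target_index, element)
    return copied


chromosome = ["gene1", "TE", "gene2", "gene3"]
print(cut_and_paste(chromosome, "TE", 2))   # ['gene1', 'gene2', 'TE', 'gene3']
print(copy_and_paste(chromosome, "TE", 3))  # ['gene1', 'TE', 'gene2', 'TE', 'gene3']
```

Copy and paste increases the number of copies each time, which is one way repeated sequences can accumulate in a genome.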
how much of the human genome do protein-coding regions make up?
<2%
how many base pairs does the average gene in the human genome have?
~27,000 bp (some have function and some don’t)
which chromosome has the most genes and which has the fewest?
chromosome 1 has most - 2,968
Y chromosome has least - 231
genes are not evenly distributed over the genome
how much of the human genome is the same in all people?
97%
what is a polymorphism?
variation in DNA sequence - two or more alleles of a gene or locus exist in the population
what is a single nucleotide polymorphism (SNP)?
an inherited DNA sequence variation that occurs when a single nucleotide (A, C, G, T) is altered
what is a short tandem repeat (STR)?
short repetitive sequences that occur side by side on chromosomes, usually in noncoding regions
ex. GTGAATC GTGTGTGT ACCGTCA
4 repeats
can be amplified by PCR
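Counting the repeats in the example above can be done with a short Python function. The function name is made up, and the repeat unit ("GT") is assumed to be known in advance.

```python
import re


def longest_tandem_run(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# The flashcard example with the spaces removed.
sequence = "GTGAATC" + "GTGTGTGT" + "ACCGTCA"
print(longest_tandem_run(sequence, "GT"))  # 4
```

Alleles with different repeat counts give PCR products of different lengths, which is why amplifying an STR with primers in the flanking sequence can tell the alleles apart.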
how do restriction enzymes work & why are they used?
they recognize specific DNA sequences (restriction sites) and cleave the DNA there, producing smaller fragments. they are used to detect SNPs, insertions, and deletions that affect restriction sites
how does gel electrophoresis work?
separates DNA fragments by size:
smaller fragments move more easily through the gel & travel farther toward the positive end (the far end)
what are restriction fragment length polymorphisms?
differences between individuals in the lengths of restriction fragments, caused by mutations at restriction sites - can identify single nucleotide differences
are observed as bands on electrophoresis gel
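A sketch of how a single nucleotide change at a restriction site shows up as a different band pattern. EcoRI (recognition site GAATTC, cut between G and A) is used as the example enzyme; the two allele sequences and the function name `digest` are invented, and only one DNA strand is modeled.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut `sequence` at every occurrence of `site` and return the fragments."""
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset  # EcoRI cuts after the first G of the site
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


allele_1 = "ATCGGAATTCAGGTTACGTA"  # contains an intact GAATTC site
allele_2 = "ATCGGAGTTCAGGTTACGTA"  # one base changed (A to G), site destroyed

for name, allele in (("allele 1", allele_1), ("allele 2", allele_2)):
    # Sorted shortest first: the shortest fragment travels farthest on the gel.
    band_sizes = sorted(len(fragment) for fragment in digest(allele))
    print(name, band_sizes)
# allele 1 [5, 15]
# allele 2 [20]
```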
what is allele-specific oligonucleotide hybridization?
occurs when DNA probes are hybridized to PCR products
only hybridize to one version of allele
detection through fluorescently labeled probes
how do DNA probes show hybridization?
if the sample is heterozygous, both probes will bind to the sample DNA and fluoresce because the sample carries both alleles
if the sample is homozygous, only one probe will bind to the DNA and fluoresce because both copies carry the same allele
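The heterozygous / homozygous logic can be written out as a small Python sketch. It assumes exactly two probes, that a probe "binds" only on a perfect match, and that each probe is written as the sequence it detects rather than its complement. The probe and sample sequences are invented.

```python
def genotype_from_probes(sample_alleles, probes):
    """Call a genotype from which allele-specific probes find a perfect match."""
    bound = sorted(
        name
        for name, target in probes.items()
        if any(target in allele for allele in sample_alleles)
    )
    if len(bound) == 2:
        return "heterozygous"
    if len(bound) == 1:
        return f"homozygous ({bound[0]})"
    return "no call"


# Hypothetical probes that differ at a single central base (A vs G).
probes = {"allele 1": "CCTGAGGAG", "allele 2": "CCTGGGGAG"}

heterozygous_sample = ["TTACCTGAGGAGAAG", "TTACCTGGGGAGAAG"]
homozygous_sample = ["TTACCTGAGGAGAAG", "TTACCTGAGGAGAAG"]

print(genotype_from_probes(heterozygous_sample, probes))  # heterozygous
print(genotype_from_probes(homozygous_sample, probes))    # homozygous (allele 1)
```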
what is a mendelian trait?
a trait affected by a single gene with 1 dominant and 1 recessive allele
what is quantitative variation?
traits that are continuous over a range
ex. height, weight, human skin color
ex. hypertension, asthma, type II diabetes
what causes phenotypic variation?
action of multiple different genes that can also be influenced by the environment
what does a transcriptome do?
shows which genes are expressed and in what amounts (measured by RNA sequencing)