Exam 1 Flashcards
genome
- the complete set of genetic material in the organism
- hereditary material of the organism
- composed of DNA
- includes DNA of chromosomes and any DNA in organelles (eukaryotes) or plasmids (prokaryotes)
chromosome
- a discrete unit of the genome carrying many genes
- each chromosome consists of a very long molecule of duplex DNA
- plus approximately equal mass of proteins
Number of chromosomes in different organisms
- humans - 46
- drosophila - 8
- corn - 20
- bacteria - 1 (circular)
- male jack jumper ant - 1
gene
a section of DNA on a chromosome that encodes genetic information
structural gene
a gene that encodes any RNA or polypeptide product other than a regulator
allele
one of the several alternative forms of a gene
- slightly different DNA sequence
- hair color; height
- may have different alleles from mother and father
locus
- the position on a chromosome at which the gene for a particular trait resides
- it may be occupied by any one of the alleles for a gene
genetic recombination
- the rearrangement of DNA sequences by the breakage and rejoining of chromosomes
- due to such processes as crossing over in meiosis or transposition
- the consequence of such rearrangements is novel combinations of alleles in the offspring that carry recombinant chromosomes
nucleotide
makes up DNA and RNA
- 5-carbon sugar
- phosphate attached to 5’ carbon of sugar
- nitrogenous base attached to 1’ carbon
DNA structure
deoxyribose sugar (2’-H)
RNA structure
ribose sugar (2’-OH)
nucleoside
contains
- a nitrogenous base linked to the 1’ carbon of a pentose sugar
- no phosphate attached
purines
- nine atoms - guanine and adenine
- larger than pyrimidines
pyrimidines
- cytosine and thymine in DNA
- cytosine and uracil in RNA
- smaller than purines
DNA is a double helix
- a double helix consisting of two polynucleotide chains
- chains run antiparallel
nitrogenous base pairing
- the nitrogenous bases of each chain are flat purine or pyrimidine rings
- they face inward with the sugar-phosphate forming the external backbone
- the bases pair with one another by hydrogen bonding to form only A-T or G-C pairs
how many hydrogen bonds form between A and T?
2
how many hydrogen bonds form between G and C?
3
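The pairing rules on the cards above (A-T with 2 hydrogen bonds, G-C with 3, chains antiparallel) can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this example.

```python
# Watson-Crick partners and hydrogen bonds per base pair (values from the cards above)
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def reverse_complement(seq):
    """Return the antiparallel partner strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))

def hydrogen_bonds(seq):
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in seq.upper())

print(reverse_complement("ATGC"))  # GCAT
print(hydrogen_bonds("ATGC"))      # 2 + 2 + 3 + 3 = 10
```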
the phosphates provide a strong ______ charge
negative (in solution)
charge is neutralized by:
- sodium ions (in vitro)
- positively charged proteins (in vivo)
physical structure of DNA
- diameter of the helix = 20 Å
- one complete turn = 34 Å
- 10 bp per turn (about 10.4 in solution)
- 1 Å (Angstrom) = 0.1 nm
- major and minor groove
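The numbers on this card give a rise of 34 Å / 10 bp = 3.4 Å per base pair, which is enough to estimate the length of any B-form duplex. A minimal sketch, assuming the idealized 10 bp per turn; the function names are hypothetical.

```python
ANGSTROM_PER_TURN = 34.0   # one complete turn of the helix
BP_PER_TURN = 10.0         # idealized value from the card (about 10.4 in solution)
NM_PER_ANGSTROM = 0.1

def helix_length_nm(n_bp):
    """Approximate length of a duplex of n_bp base pairs, in nanometres."""
    rise_per_bp = ANGSTROM_PER_TURN / BP_PER_TURN  # 3.4 Angstrom per base pair
    return n_bp * rise_per_bp * NM_PER_ANGSTROM

def helix_turns(n_bp):
    """Number of complete helical turns in n_bp base pairs."""
    return n_bp / BP_PER_TURN

print(f"{helix_length_nm(1000):.1f} nm")  # 340.0 nm
print(helix_turns(1000))                  # 100.0
```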
forms of DNA
A-form
- dehydrated DNA
- shorter and thicker
B-form
- average structure
- right-handed helix turns clockwise along the axis
- found in aqueous conditions
Z-form
- left-handed helix
- long and narrow
RNA
- single stranded
- has ribose as the sugar (2’ OH)
- purines (A and G)
- pyrimidines (C and uracil)
- not as stable as DNA
- no base pair hydrogen bonds
- ribose -OH is more reactive
prion
- a proteinaceous infectious agent
- behaves as an inheritable trait even though it contains no nucleic acid
- one example is PrP^Sc, the agent of scrapie in sheep and bovine spongiform encephalopathy (mad cow disease)
central dogma
- information cannot be transferred from protein to protein, or protein to nucleic acid
- translation is unidirectional
- RNA may be converted into DNA by reverse transcription
DNA polymerase
an enzyme that synthesizes DNA from a DNA template
RNA polymerase
an enzyme that synthesizes RNA using a DNA template
reverse transcriptase
an enzyme that synthesizes DNA using an RNA template
- used by some viruses
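The template relationships on the polymerase cards can be sketched as string operations. This is an illustrative sketch only: it assumes the input to transcribe() is the coding (non-template) strand, and both function names are invented here.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def transcribe(coding_strand):
    """RNA polymerase: the mRNA matches the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def reverse_transcribe(rna):
    """Reverse transcriptase: DNA complementary to the RNA template, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))

mrna = transcribe("ATGGCC")
print(mrna)                      # AUGGCC
print(reverse_transcribe(mrna))  # GGCCAT
```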
intermolecular base pairing
- complementary base pairing between two different strands of nucleic acids
- DNA to DNA
- DNA to RNA
- RNA to RNA
intramolecular base pairing
complementary base pairing between different sections of the same nucleic acid
- RNAs
- tRNA
nucleic acids anneal by complementary base pairing
- heating causes the two strands of a DNA duplex to separate or denature
- melting temperature (Tm)
- the midpoint of the temperature range for denaturation
- reduce temperature
- complementary single strands can renature or anneal (hybridize)
filter hybridization
- denature a known DNA and attach to a solid filter
- denature unknown DNA in solution
- mix - if DNAs have similar sequences, they will anneal
- the ability of two single stranded nucleic acids to hybridize is a measure of their complementarity
hybridization can occur with
- DNA-DNA
- DNA-RNA
- RNA-RNA
it can be intermolecular or intramolecular
mutations
changes in the sequence of DNA
may occur
- spontaneously
- or induced by mutagens
mutagens
- chemicals that can cause mutations
- radiation (UV, gamma)
point mutation
changes a single base pair
may be due to
- chemical conversion of one base into another
- or errors that occur during replication
transition
a type of point mutation
- replaces a G-C base pair with an A-T base pair or vice versa
transversion
type of point mutation
- replaces a purine with a pyrimidine or vice versa, such as changing A-T to T-A
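The transition/transversion distinction comes down to whether the base stays in the same chemical class. A small sketch for a single-base change on one DNA strand; the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_point_mutation(original, mutant):
    """Classify a single-base substitution as a transition or a transversion."""
    original, mutant = original.upper(), mutant.upper()
    if original == mutant:
        raise ValueError("bases are identical, so there is no mutation")
    pair = {original, mutant}
    # Same class (purine<->purine or pyrimidine<->pyrimidine) is a transition
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_point_mutation("G", "A"))  # transition
print(classify_point_mutation("A", "T"))  # transversion
```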
insertions/deletions of larger DNA segments result from:
the movement of transposable elements (DNA segments that can be inserted into chromosomes)
forward mutations
alter the function of a gene
back mutations (revertants)
reverse their effects
insertions can revert by ____
deletion of the inserted material
can deletions revert?
no
one gene one enzyme hypothesis
- suggested by Beadle and Tatum in 1940s
- a gene is a stretch of DNA encoding one or more isoforms of a single polypeptide chain
heteromultimer
a molecular complex (such as a protein) composed of different subunits
homomultimer
a molecular complex (such as a protein) composed of identical subunits
one gene one polypeptide hypothesis
a modified version
- a gene is responsible for the production of a single polypeptide
- then polypeptides are put together to form the enzyme
but: some genes do not encode polypeptides, but encode structural or regulatory RNAs
a locus can have many different mutant alleles
- multiple alleles
- wild type = w+ (red eye)
- various mutants, e.g. w^h (honey eye)
- since each individual has 2 homologous chromosomes
- heterozygotes can carry any pair of alleles
a locus can have more than one wild-type allele
- a locus may be polymorphic in alleles
- no individual allele is considered to be the only wild type
the genetic code is triplet
- it’s the code on mRNA (DNA>mRNA>protein)
- the genetic code is read in triplet nucleotides called codons
- the triplets are non-overlapping and are read from a fixed starting point
each codon triplet codes for:
- a specific amino acid
- or a stop codon
can codons code for the same amino acid?
yes
ex: UUU and UUC code for Phenylalanine
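Reading an mRNA in non-overlapping triplets from a fixed start can be sketched as below. The codon table is deliberately partial (only a few standard codons needed for the demo), and the function names are made up for illustration.

```python
# Partial codon table: just enough standard codons for this example
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def codons(mrna, start=0):
    """Split an mRNA into non-overlapping triplets from a fixed starting point."""
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]

def translate(mrna):
    """Translate until a stop codon; '???' marks codons missing from the partial table."""
    peptide = []
    for codon in codons(mrna):
        residue = CODON_TABLE.get(codon, "???")
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide

print(translate("AUGUUUUUCGGAUAA"))  # ['Met', 'Phe', 'Phe', 'Gly']
```

UUU and UUC both give Phe, which is the redundancy the card describes.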
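effects of mutations
- insertion or deletion of bases
- causes a shift in the triplet sets after the site of mutation (unless the number of bases is a multiple of three)
frameshift mutations
- insertion or deletion of a number of bases that is not a multiple of three (e.g. deletion of 4 bases)
- changes the reading frame, and so the amino acid sequence, after the site of mutation
- insertion or deletion of three bases (or multiples of three) is not a frameshift
- it inserts or deletes amino acids
- but the reading frame (and AA sequence) remains the same after the third inserted/deleted base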
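Whether an insertion or deletion shifts the reading frame depends only on whether its length is a multiple of three. A minimal sketch with a made-up 15-base sequence; the helper names are hypothetical.

```python
def codons(seq):
    """Split a sequence into non-overlapping triplets from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def delete(seq, position, length):
    """Remove `length` bases starting at `position`."""
    return seq[:position] + seq[position + length:]

def is_frameshift(indel_length):
    """An indel shifts the frame unless its length is a multiple of three."""
    return indel_length % 3 != 0

original = "ATGGCACTGAAATTT"
print(codons(original))                 # ['ATG', 'GCA', 'CTG', 'AAA', 'TTT']
print(codons(delete(original, 3, 1)))   # ['ATG', 'CAC', 'TGA', 'AAT']  frame shifted
print(codons(delete(original, 3, 3)))   # ['ATG', 'CTG', 'AAA', 'TTT']  one codon lost, frame kept
```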
open reading frame (ORF)
- a sequence of DNA consisting of triplet codons that can be translated into a string of amino acids
- starts with an initiation codon and ends with a termination (stop) codon
every coding sequence has ___ possible reading frames
three
- usually only one of the three possible reading frames is translated
- the other two are closed by frequent termination signals (stop codons)
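Scanning the three reading frames of one strand for start-to-stop runs can be sketched as follows. This is a simplified illustration (DNA alphabet, one strand only, no minimum length), and find_orfs is a made-up name.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"

def find_orfs(dna):
    """Return (frame, start, end) for each ATG...stop run on the given strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i                          # open a reading frame
            elif start is not None and codon in STOP_CODONS:
                orfs.append((frame, start, i + 3))  # close it at the stop codon
                start = None
    return orfs

dna = "CCATGAAATTTTAGCC"
print(find_orfs(dna))  # [(2, 2, 14)]
print(dna[2:14])       # ATGAAATTTTAG
```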
requirements for protein synthesis
- functional mRNA
- ribosome - a large complex of ribosomal RNA and proteins that synthesizes polypeptides using an mRNA template
- tRNAs
tRNA
- a tRNA has an anticodon sequence that is complementary to the codon representing an amino acid
- each tRNA molecule is linked to that amino acid
what really matters in molecular genetics?
- protein coding genes
- regulatory sequences
- epigenetics
differences between humans and chimpanzees are mostly caused by ___
gene regulation
genes have DNA control sites
- proteins that regulate gene transcription bind to control sites next to the coding regions
cis
sites located on the same DNA molecule (the same chromosome copy)
trans
sites located on different DNA molecules
proteins are trans-acting but sites on DNA are cis-acting
- all gene products (RNA or polypeptides) are trans-acting
- they can act on any copy of a gene in the cell
- regulatory proteins are trans-acting
- they act on any gene regulatory region
- copies of the same protein can act on both homologous alleles
- a cis-acting DNA site controls expression of the adjacent DNA
- but does not influence the homologous allele on the other chromosome
- a mutation in the control site of a gene is cis-acting
- affects the adjacent gene
- does not affect the homologous allele
- a trans-acting mutation in a gene for a regulatory protein affects both alleles of a gene that it controls
original meaning of genetic engineering
- cloning genes by placing a gene DNA from one organism into another DNA or organism to allow it to be replicated
-ex: placing a mouse enzyme gene into a bacterial plasmid
- creates recombinant DNA
-a DNA molecule composed of sequences from two (or more) different sources
genetic engineering now
- direct manipulation of an organism’s genome through the use of biotechnology to insert or delete genes
- often involves the production and use of recombinant DNA to transfer genes between organisms
restriction endonuclease
- enzyme that recognizes short specific sequences of DNA and cleaves the duplex
- it cleaves sometimes at the target site, sometimes elsewhere, depending on type of enzyme
nucleases
- hydrolyze phosphodiester bonds
- separates the nucleotides
endonuclease
- nuclease that cleaves phosphodiester bonds within a nucleic acid chain
- breaks the chain
- it may be specific for RNA or for single-stranded or double-stranded DNA
- cleave within the strand
exonuclease
- nuclease that cleaves phosphodiester bonds one at a time from the end of a polynucleotide chain
- chews off nucleotides from the end
- it may be specific for either the 5’ or 3’ end of DNA or RNA
- cleave at the terminal nucleotide
nucleases can be:
- broad specificity
-e.g. exonuclease that cleaves any nucleotide from the end of DNA
-e.g. pancreatic RNase = cleaves RNA after any pyrimidine
- sequence specific
-restriction endonucleases
-Type I, II, and III
Type II restriction endonucleases
- most common
- many derived from bacteria
- EcoRI from E.coli
- recognition sites are 4-8 bp
- sites are typically palindromic (inverted repeats)
- the sequence reads the same 5’ to 3’ on both strands
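For DNA, "palindromic" means the site equals its own reverse complement, which is easy to check. A small sketch using the EcoRI site GAATTC; the function names are invented for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Opposite strand of a DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_palindromic_site(site):
    """True if both strands read the same 5'->3'."""
    return site.upper() == reverse_complement(site)

print(is_palindromic_site("GAATTC"))  # True  (EcoRI site)
print(is_palindromic_site("GAATTT"))  # False
```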
restriction enzymes
cut DNA in two different ways:
- staggered cut
-leaves “sticky ends” of complementary bases
- blunt ds cut
-no sticky ends
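A digest can be sketched by cutting the top strand a fixed number of bases into each recognition site. This is a simplified model: it assumes a palindromic site cut at the same offset on each strand, with EcoRI (G^AATTC) as the staggered example and SmaI (CCC^GGG) as the blunt one. The function names are hypothetical.

```python
def digest(dna, site, cut_offset):
    """Cut the top strand `cut_offset` bases into every occurrence of `site`."""
    fragments, last = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[last:pos + cut_offset])
        last = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[last:])
    return fragments

def end_type(site, cut_offset):
    """A palindromic site cut at its exact centre gives blunt ends; otherwise sticky."""
    return "blunt" if 2 * cut_offset == len(site) else "sticky"

print(digest("AAGAATTCTTGAATTCGG", "GAATTC", 1))  # ['AAG', 'AATTCTTG', 'AATTCGG']
print(end_type("GAATTC", 1))  # sticky
print(end_type("CCCGGG", 3))  # blunt
```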
restriction mapping
- a map can be generated by using the overlaps between the fragments generated by different restriction enzymes
- used to find sites of restriction enzymes in your DNA
cloning DNA
cloning
- to make an identical copy of something
- DNA, Dolly the sheep etc.
- cloning DNA uses recombinant DNA
what you need to clone DNA
- an insert : the gene or DNA fragment you want to clone
- a cloning vector
bacterial plasmids
- bacteria have two different types of DNA
1) circular chromosome: genes for the bacteria to function
2) plasmids - small circular dsDNA
- often contain survival genes
- antibiotic resistance - ARGs
- self replicating
cloning vector
- a genetically engineered modified plasmid
- DNA that can be used to propagate an incorporated DNA sequence in a host cell
- often derived from a plasmid or a bacteriophage (virus that infects bacteria)
engineered plasmid vectors contain:
- replication origins
- selectable markers
- known restriction enzyme sites
replication origins
engineered plasmid vectors
- ORI (origin of replication initiation)
- where DNA replication can start
- allows replication of the plasmid
selectable markers
engineered plasmid vectors
- allows you to identify cells that contain the plasmid
- Ampicillin Antibiotic resistance gene
- lacZ gene
known restriction enzyme sites
- multiple cloning site (MCS)
-e.g. EcoRI
multiple cloning site (MCS)
- a plasmid section containing a series of tandem restriction endonuclease sites used in cloning vectors for creating recombinant molecules
the DNA fragment insert
- fragments of genomic DNA that have been treated with restriction enzymes
- size selected from an agarose gel
- a PCR fragment
- specific DNA selected and amplified by PCR
- DNA fragments synthesized in the lab
cloning steps
- cut the insert DNA and the plasmid DNA with the SAME restriction enzyme
-easiest if you use an enzyme that makes sticky ends
- mix the insert and plasmid
-allows sticky ends to hybridize
-insert sticky ends anneal to the sticky ends of the plasmid
- add DNA ligase enzyme
-seals the phosphodiester bonds
- the plasmid circle is now
-sealed
-may contain the insert
-may be larger than before
Note: the insert has disrupted the lacZ gene
transformation of bacteria
- use the bacteria to replicate the plasmid
transformation - the process of introducing the DNA into a cell
competence
method for transformation
competent cell = capable of taking in DNA
- high salt wash of calcium chloride (CaCl2)
-creates small holes or pores in the bacteria cell wall
-allows the DNA to bind to the bacteria to be taken into the cell
- electroporation
-add the bacteria and the plasmid to a small chamber
-apply an electrical current to the liquid
-creates small holes or pores in the cell wall
-allows the DNA to be taken into the cell
note: both methods will transform only a fraction of the cells
grow the bacteria on agar with Ampicillin (only bacteria with plasmid will grow)
blue-white cloning vector
- plasmid has lacZ gene
-codes for the bacterial beta-galactosidase enzyme
-cleaves lactose sugar
-also will cleave X-gal
- multiple cloning site is in the middle of the lacZ gene
-cloning a DNA insert into the MCS will disrupt the lacZ gene
result of blue-white cloning vector
- bacteria without the plasmid will not grow
-why? agar has ampicillin and only bacteria with plasmid will survive
- bacteria with the empty plasmid
-colonies will be blue
- bacteria with the insert, has disrupted lacZ
-colonies will be white
- so white colonies are the ones we want!
possible problems with transformation
- plasmid inserted into the plasmid
-disrupts the lacZ = white colonies
-but no insert
- insertion of random, non-target DNA
-disrupts the lacZ = white colonies
solution:
- collect the plasmids
- restriction enzymes to remove the insert
- agarose gel to confirm size of the insert
- sequence the insert
other cloning vectors exist
if need to clone larger DNA segments
- bacteriophage vector
-infects the bacteria and inserts the DNA into the bacterial chromosome
-package larger DNA segment
-can get the bacteria to express or produce the gene product
- cosmid
-plasmid with insertion sequences of a bacteriophage
-combines best of both
- YAC (yeast artificial chromosome)
-use yeast as the host
expression vectors
- control regions of genes contain promoters
-the region of DNA where the RNA polymerase binds to start transcription
- for genes in plasmids/vectors:
-to be replicated = need ORI
-to be expressed (the protein produced) = need a promoter
- expression vectors contain promoters that allow transcription of any cloned gene
two types of expression vectors
- continuously active (constitutive) promoters
- inducible promoters
continuously active (constitutive) promoters
- if it is in the vector, the inserted DNA will be transcribed and the protein produced ALL THE TIME
- use to have the bacteria/yeast/eukaryotic cell produce your protein
- recombinant protein: protein produced from recombinant DNA
- e.g. recombinant insulin produced by biotech company
inducible promoters
- the inserted DNA will only be transcribed when the promoter is turned on
- so, transform the bacteria/yeast/eukaryotic cell
- then you determine when to turn on the gene expression
- provide a chemical/hormone/etc. that turns on the promoter
reporter genes
- a gene attached to a promoter and/or your gene
- its product is an easily identified protein
- can show when and where the gene or promoter is activated
- used to measure promoter activity or tissue-specific expression
lacZ reporter
- insert your gene between the plasmid expression promoter and the lacZ gene
-when your protein is produced, it will be linked to beta-galactosidase enzyme
- transform your bacteria/yeast/mammalian cell
- add X-gal
- wherever your protein is found, blue stain will be seen
your protein attached to B-galactosidase
localized expression of a protein
- insert your promoter and gene in front of the lacZ gene
-your gene and the lacZ gene are under control of your promoter
-when your promoter is activated, your protein will be linked to beta-galactosidase enzyme
- transform
- add X-gal
- wherever your protein is produced, blue stain will be seen
other reporter proteins
- green fluorescent protein (GFP)
-protein from the Aequorea victoria jellyfish
-is a green fluorescing protein
- genetically mutated variations
-YFP - yellow
-CFP - cyan
fluorescent microscopy
- fluorescent molecules absorb light at excitation wavelength
- give off light at emission wavelength
-lower energy
-longer wavelength
common fluorochromes
- DAPI - stains DNA
- CFP
- GFP
- FITC - replaced by Cy2/Alexa488
detecting nucleic acids
fluorescent DNA and RNA stains
- ethidium bromide
- DAPI
- propidium iodide
- bind to the nucleic acid and fluoresce
- stain ALL nucleic acid
- cannot distinguish your gene from all other DNA
detecting specific DNAs
- based on the specific DNA sequence
- use nucleic acids hybridization and complementary base pairing
- probe
-short DNA or RNA segment of known sequence of the gene you are looking for
-radioactively labeled short DNA used to identify a complementary binding DNA or RNA
specific hybridization with a probe
- synthesize a short DNA sequence that matches your gene
- label the probe
labeling probes
- radioactive phosphorus (32P)
- can use radioactive tritium
- end labeling
-use 32P phosphate and a kinase to add the phosphate to the end
- labeling by incorporating a 32P labeled nucleotide
-DNA synthesis using DNA polymerase
-polymerase chain reaction (PCR)
- fluorescent labeled nucleotides
denaturing DNA
separating the DNA duplex strands
- must overcome the stable hydrogen bonds of base pairing
- more G-C content = more stable
- high temperature “melts” the DNA
- low salt concentrations
-high salt = Na+ stabilizes the phosphate backbone
-low salt = negative charged phosphates in backbone repel each other
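The "more G-C content = more stable" point can be made concrete by computing GC fraction. The Tm estimate below uses the Wallace rule of thumb (2 degrees per A/T, 4 degrees per G/C), which is not from these cards and only applies roughly to short oligos, so treat it as an assumption. The function names are made up for illustration.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough Tm in degrees C for a short oligo: 2*(A+T) + 4*(G+C). Rule of thumb only."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

print(gc_fraction("ATGC"))                       # 0.5
print(wallace_tm("AATT"), wallace_tm("GGCC"))    # 8 16
```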
nucleic acid detection
autoradiography
a method of capturing an image of radioactive materials on a photographic film
- ethidium bromide stains all DNA
- autoradiograph shows only the radioactive probe labeled DNA binding to the complementary DNA
DNA separation techniques
agarose gel electrophoresis
- uses agarose as a matrix gel
-can use polyacrylamide
- uses an electric current to cause the DNA to migrate toward a positive charge
- remember - DNA has a negative charge due to phosphate backbone
- separates DNA fragments by size (smaller travel faster = end up further)
- compare to migration of known DNA size standards
difficult to probe agarose gels:
- transfer the separated DNAs to:
-a stronger matrix
-an easier to manipulate matrix
- nitrocellulose membrane
-good but easily cracks
- nylon membrane
-strong
Southern blotting
- invented by Dr. Edwin Southern
- transfer of DNA from a gel to a membrane (in alkaline solution to denature the DNA)
- followed by detection of specific sequences by hybridization with a labeled probe
northern blotting
- involves the transfer of separated RNA from a gel to a membrane
western blotting
- separation of proteins on a sodium dodecyl sulfate (SDS) gel
- transfer to a nitrocellulose membrane
- detection of proteins of interest using antibodies
- proteins are separated by size
isolating mRNA
- many types of RNA
1. primary RNA transcripts
2. mRNA
3. tRNA
4. rRNA
5. small heterogeneous RNAs
- eukaryotic mRNAs almost always have a poly-adenylated tail
- string of adenines (adenosines) added to the 3’ end
- can isolate mRNA from everything else by using Oligo (dT) bound to beads
- short strands of deoxy Thymidine
oligo (dT) column
- oligo (dT) bound to Sepharose beads
- packed into a small column
- add RNA mixture to the column
- mRNA binds to the Oligo(dT)
- wash the column so the other RNAs pass through
- then pass through denaturing solution to wash out (elute) the mRNA
polymerase chain reaction (PCR)
- developed by Kary Mullis - Nobel prize in 1993
- continuous cycling of
- denature DNA
- hybridize primers
- allow polymerase to copy the DNA
- repeat 20-40 times
- exponential amplification of a desired sequence
PCR requires:
- DNA primers that flank the gene you want
- single stranded primers, each complementary to the 3’ end of the target region on one of the two DNA strands
- primers run 5’–>3’
- 18 to 25 bases long
- easily synthesized (purchased)
polymerases require:
- a DNA or RNA template
- a free 3’ end to add nucleotides to
-satisfy this using primers
- must know sequence to make the primers
- target DNA
-genomic DNA containing a gene you want
-pieces of DNA of a gene you want
- Taq polymerase
-from Thermus aquaticus bacterium
-lives in very hot springs
-very temp resistant
-works best at 72 degrees
- excess of the four deoxynucleotides (dNTPs) to make DNA
PCR steps
A. mix template, primers, Taq, and dNTPs
B. heat to 95-100 degrees
- melts (denatures) the DNA template
- 15 seconds
C. rapidly cool to temperature optimal for primers to anneal to template
- too low = primers anneal anywhere
- too high = primers won’t anneal
- is different for each set of primers
- 30 seconds
D. heat to optimal temperature for Taq polymerase to function
- 72 degrees
- 30 seconds
Result:
- one new copy of the target gene
- plus the original copy
repeat 20-40 cycles
- 2nd cycle - 4 copies of the gene
- 3rd cycle - 8 copies of the gene, now see copies that are only the gene plus the primers
- by 20 cycles - almost all are the gene+primers only, up to a million copies
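The doubling described above is 2^n copies after n cycles. A minimal sketch; the efficiency parameter is an added assumption (1.0 means perfect doubling every cycle, which real reactions only approximate), and pcr_copies is a made-up name.

```python
def pcr_copies(cycles, starting_copies=1, efficiency=1.0):
    """Copies after `cycles` rounds; efficiency=1.0 means every template is copied each cycle."""
    return starting_copies * (1 + efficiency) ** cycles

for n in (1, 2, 3, 20):
    print(n, int(pcr_copies(n)))
# 1 2
# 2 4
# 3 8
# 20 1048576
```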
PCR primers
- must be long enough to be specific for your DNA only
- usually ~20 nucleotide sequence is specific
- for any 20 nucleotide sequence
-expected to occur by chance ONCE in 4^20 (about 1.1x10^12) nucleotides
-human genome is 3.2x10^9 base pairs
-so will likely occur only once in the genome
-many use 23 to 25 nucleotides for primers
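The primer-length argument is simple arithmetic: a given n-mer is expected about once per 4^n positions. The sketch below assumes random sequence with equal base frequencies and counts one strand only, which real genomes do not strictly satisfy. The function names are hypothetical.

```python
GENOME_SIZE_BP = 3.2e9  # human genome size from the card

def expected_chance_matches(primer_length, genome_size=GENOME_SIZE_BP):
    """Expected number of chance occurrences of one specific n-mer in the genome."""
    return genome_size / 4 ** primer_length

def shortest_unique_length(genome_size=GENOME_SIZE_BP):
    """Smallest n for which 4^n exceeds the genome size."""
    n = 1
    while 4 ** n < genome_size:
        n += 1
    return n

print(4 ** 20)                        # 1099511627776
print(expected_chance_matches(20))    # well below 1
print(shortest_unique_length())       # 16
```

A 20-mer leaves a comfortable margin over the 16-base minimum this estimate gives.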
uses for PCR
- MUST know nucleotide sequences at the ends of the DNA to be amplified
- amplify short DNAs without having to use plasmids and cloning
-works great for single genes or gene sequences
-does not work for large DNAs or genomic DNA
-generally best with 100 to 500 bp of DNA
- isolate DNA segments from mutants
- excellent for producing DNAs for cloning in E.coli
- use to label DNA for probes
-add fluorescent dNTP
- identify bacteria/virus/infection
reverse transcription
- transcribes single-strand RNA into single-strand complementary DNA (cDNA)
- from retroviruses
- transcribe viral RNA into cDNA
- integrate viral DNA into the host genome
- requires a DNA primer
-short strand of DNA complementary to the RNA to be copied
reverse transcriptase primers
- oligo(dT) primer for mRNA
- random primers for any RNA
- mixture of short random sequences (usually hexamers)
- gene-specific primers for specific RNAs
- determined from the gene sequence
reverse transcription-PCR (RT-PCR)
- isolate RNA
- reverse transcribe to cDNA
-use oligo(dT) for mRNA
-makes cDNA from all mRNAs, and only mRNAs
- add specific PCR primers, dNTPs, and Taq polymerase
-first round copies ONLY the specific cDNA, making it dsDNA
-subsequent rounds amplify the specific cDNA
real-time PCR, or quantitative (qPCR)
- detects the PCR products during PCR amplification
- fluorescent dye to label dsDNA
- allows monitoring the increase in fluorescent labeled PCR products
- is more sensitive and quantitative than conventional PCR
- can now make copies of just about any gene by cloning and expanding in E. coli or by PCR
can we study more than one gene at a time?
Yes! DNA microarrays
- the reverse of Southern/northern blots
- immobilize the known DNAs (different genes) on a membrane
- then add the labeled unknown nucleic acid sample
- see where the labeled nucleic acids bind (hybridize)
DNA arrays
- initially, 30-100 different DNAs “spotted” on membranes and dried
- 1000 DNAs spotted onto glass slides
- thousands to millions of DNAs spotted onto silicon chips
- use fluorescent labels instead of autoradiography
-allows better resolution
-automated optical microscopy for detection
- allows comparisons
-experimental to control
-disease versus normal
DNA microarrays
- label control mRNA/cDNA with green during reverse transcription
- label experimental mRNA/cDNA with red
- mix together and add to the microarray
- ssDNAs bind to specific gene DNAs on the DNA chip
- wash and scan
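After scanning, each spot is usually summarized as a red/green ratio on a log2 scale. A minimal sketch, assuming red = experimental and green = control as on this card; the 2-fold threshold and the function names are illustrative assumptions, not from the cards.

```python
import math

def log2_ratio(red, green):
    """log2 of experimental (red) over control (green) signal for one spot."""
    return math.log2(red / green)

def call_spot(red, green, threshold=1.0):
    """Label a spot; threshold=1.0 means a 2-fold change (an arbitrary cutoff)."""
    ratio = log2_ratio(red, green)
    if ratio >= threshold:
        return "up in experimental (red)"
    if ratio <= -threshold:
        return "down in experimental (green)"
    return "unchanged (yellow)"

print(log2_ratio(400, 100))  # 2.0
print(call_spot(100, 400))   # down in experimental (green)
print(call_spot(100, 110))   # unchanged (yellow)
```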
Bacterial genes
- usually continuous sequences of nucleotides
- encode the amino acids to create a polypeptide
NOT TRUE FOR EUKARYOTES
eukaryotic genes are interrupted
- the coding sequences are not continuous
- coding regions of eukaryotic genes are interrupted by segments of non coding regions
note: not necessarily true for yeasts
exons
the DNA sequences of the gene that code for the gene product (polypeptide or RNA)
intron
- a segment of DNA that does not code for the product
- lie between exons
primary (RNA) transcript
- the original unmodified RNA product that is transcribed
- contains the exons and the introns
- is only a precursor RNA
- it is NOT the final mRNA
making the mature transcript
- a mature mRNA is made by RNA splicing of the primary RNA transcript
- the process of
-excising introns from RNA
-then connecting the exons into a continuous mRNA
primary RNA transcript is equal to the ____ ____ minus ____
primary RNA transcript is equal to the entire gene minus the control DNA sequences
mRNA is just the
exons
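Splicing can be sketched as cutting out the intron segments and joining the exons in their original order. The sequence and coordinates below are invented for illustration (intron shown in lowercase), and splice is a hypothetical name.

```python
def splice(primary_transcript, exons):
    """Join exon segments, given as (start, end) half-open coordinates, in gene order."""
    return "".join(primary_transcript[start:end] for start, end in sorted(exons))

# exon 1 (0-6), intron in lowercase (6-18), exon 2 (18-24)
primary = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
mrna = splice(primary, [(0, 6), (18, 24)])
print(mrna)  # AUGGCCUUUUAA
```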
removal of introns by RNA splicing occurs in __ in individual RNA molecules
cis
- affects only within the same RNA
- does not affect other RNAs
- exons remain in the same order in mRNAs as in DNA, but distances along the gene are not the same as those of mRNA or polypeptide products
effect of mutations
- mutations in exons can affect polypeptide sequence
- mutations in introns do not directly affect the polypeptide sequence
- but may affect RNA processing
- mutations at the exon-intron junctions may affect the splicing event
- point mutations may yield stop codons affecting the final protein
how are introns detected in genes?
- compare the gene to the mRNA
- usually compare DNA gene to cDNA of the mRNA
restriction mapping
- treat gene DNA and cDNA with the same restriction enzyme(s)
- restriction sites in the gene and mRNA/cDNA are the same in exons
- restriction sites in introns are missing in the mRNA/cDNA
- introns can also be detected by sequencing the gene and the cDNA, then matching the two together
does an intron have an open reading frame?
no
an intact open reading frame is created in the mRNA sequence by removing the introns
gene structure of other DNAs
- genes coding for polypeptides, rRNA, and tRNA can all have introns
- introns have been found in every class of eukaryote
- introns are rare in prokaryotic genes
- animal mitochondria do not have introns
- but some plants, fungi, and protists do
members of gene families have a common gene organization
e.g. mammalian genes for dihydrofolate reductase
- the number of exons and introns is maintained
- the relative position of the introns to the exons is maintained
- but the length of the exons and especially the introns can vary
exon versus intron sequences
- normally, exon sequences do not vary much from species to species for the same gene
- sequences of introns show differences from species to species for the same genes
why?
- introns lack the selective pressure to produce a polypeptide with a useful sequence
- while exons must produce a useful product
(not always true for exons producing beneficial mutations)
some DNA sequences encode more than one polypeptide
1. alternative start codons in the same reading frame
- alternative initiation or termination codons allow multiple variants of a polypeptide chain
- produces a short form and full-length form of the polypeptide
2. overlapping gene - a gene in which part of the sequence is found within part of the sequence of another gene
- different polypeptides can be produced from the same sequence of DNA
- the mRNA is read in different reading frames (as two overlapping genes) so sequence is different
- found in some viral and mitochondrial genes
3. alternative splicing of the primary transcript
- one gene may be alternatively spliced to exclude exons or choose between alternative exons
- yields otherwise identical polypeptides, differing by the presence or absence of certain regions
some exons correspond to protein functional domains
- a discrete part of an amino acid sequence that has a particular function (e.g. the immunoglobulin domain)
- exons may be functional building blocks of genes
- genes that share related exons may code for proteins with similar functions
- possibly suggests a common exon ancestry
- low-density lipoprotein (LDL) receptor - gene has 18 exons
- gene has exons similar to those of
- complement 9
- EGF
gene family
- a set of genes within a genome that encodes identical or related proteins or RNAs
- the members were derived by duplication of an ancestral gene
- followed by accumulation of changes in sequence between the copies
- most often the members are related but not identical
superfamily
- a set of genes all related by presumed descent from a common ancestor, but now showing considerable variation
- myoglobin = oxygen binding protein in animals
- similar amino acid sequence to alpha globin and beta globin
- leghemoglobins = oxygen binding protein in legume plants
together make up the Globin Superfamily
members of a gene family have a common gene organization
- similar exons
- Leghemoglobin gene has an extra intron
- suggests they are descended from a single ancestral gene
orthologous genes (orthologs)
- related genes in different species
- should share common features that preceded their evolutionary separation
- rat has two different genes for insulin
- one similar to chicken insulin gene
- one missing an intron
- suggests the 1-intron gene evolved from the other
genome
- the complete set of DNA sequences in the genetic material of an organism
- it includes the sequence of each chromosome plus any DNA in organelles
transcriptome
- the complete set of RNAs present in a cell, tissue, or organism
- its complexity is due mostly to mRNAs, but it also includes noncoding RNAs
- mRNAs, tRNAs, rRNAs, microRNAs, etc
proteome
- the complete set of proteins expressed by the entire genome
- also could be the proteins expressed by a cell at any one time
interactome
- the complete set of protein complexes and protein-protein interactions present in a cell, tissue, or organism
- multiproteins or complexes
- DNA and RNA polymerase holoenzymes
- enzymes clustered into metabolic pathways
linkage maps
- based on the frequency of recombination between genetic markers
- monitoring genetic cross-over occurrence and the resulting phenotype
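Linkage map distances come from the percentage of recombinant offspring, with 1% recombination counted as 1 map unit (centimorgan). A minimal sketch with made-up counts; the function name is hypothetical.

```python
def recombination_frequency(recombinant_offspring, total_offspring):
    """Percent recombinants, which is read as map units (cM) for closely linked markers."""
    return 100.0 * recombinant_offspring / total_offspring

print(recombination_frequency(17, 100))  # 17.0
```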
restriction maps
- based on the physical distances between markers
- using restriction enzymes
sequencing genomes
DNA is sequenced to identify the position of functional genes, introns, exons, etc.
polymorphism
variation in sequence between individuals can be seen:
- at the phenotypic level when a sequence affects gene function or a characteristic (variations in eye color)
- at the restriction fragment level when it affects a restriction enzyme target site
- at the sequence level by direct analysis of DNA
- however, a difference in gene sequence may not result in a difference in phenotype
changes in sequence at a single locus…
may change the DNA sequence but:
- NOT change the polypeptide sequence
-redundant codons
-changes in introns
- change the polypeptide sequence but NOT change the polypeptide function
-amino acids with similar characteristics
- may change the polypeptide function
- result in altered polypeptides that are not functional
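The outcomes above can be sketched with a few codons from the standard genetic code. The table holds only the four codons used here, and the labels "synonymous", "missense", and "nonsense" are standard terms that the cards themselves do not use.

```python
# Partial codon table (DNA coding-strand codons); only the entries used below.
CODON_TABLE = {
    "GAA": "Glu",
    "GAG": "Glu",   # redundant codon: same amino acid as GAA
    "GAT": "Asp",   # Asp is acidic like Glu, so the change is conservative
    "TAA": "Stop",
}

def classify_change(original_codon, mutant_codon):
    """Label a single-codon change by its effect on the polypeptide sequence."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "synonymous"   # DNA changes, polypeptide sequence does not
    if after == "Stop":
        return "nonsense"     # truncated, usually nonfunctional polypeptide
    return "missense"         # polypeptide sequence changes; function may or may not

for mutant in ("GAG", "GAT", "TAA"):
    print("GAA ->", mutant, classify_change("GAA", mutant))
```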
single nucleotide polymorphism (SNP)
- a polymorphism caused by a change in a single nucleotide
- responsible for most of the genetic variation between individuals
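As a small illustration, SNPs between two aligned sequences of equal length can be listed by comparing them position by position. The two 12-base sequences are invented, and real comparisons would first need an alignment step that is not shown.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_in_a, base_in_b) for each single-base difference.

    Positions are 1-based. Assumes the sequences are already aligned.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

individual_1 = "ATGGCCTTAGCA"
individual_2 = "ATGGCATTAGCA"
print(find_snps(individual_1, individual_2))
```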
haplotype
the particular combination of alleles in a defined region of some chromosome; in effect, the genotype in miniature
nonrepetitive DNA
generally encodes for polypeptides
repetitive DNA
- sequences present at more than one copy in the haploid genome
- larger genomes within a taxonomic group do not contain more genes but have large amounts of repetitive DNA
- most bacteria have all nonrepetitive DNA
- larger animals and plants have large amounts of repetitive DNA
why have repetitive (junk) DNA?
- sequences without any apparent function
- may have functions we do not understand??
-gene control regions (promoters)
-code for microRNA
- a large part of moderately repetitive DNA can be made up of transposons
-short sequences of DNA (up to ~5000 bp)
-have the ability to move to new locations in the genome
-can make more copies of themselves
which eukaryotic organelles have DNA?
mitochondria and chloroplasts
extranuclear genes
- genes that reside outside the nucleus, in organelles such as mitochondria and chloroplasts
- organelle genomes are usually (but not always) circular molecules of DNA
- mitochondrial DNA
- chloroplast DNA (cpDNA or ctDNA)
animal cell mtDNA
- typically encodes 13 proteins, 2 rRNAs, and 22 tRNAs
- proteins are ones involved in respiration/electron transport Complexes I to IV
mitochondria and chloroplasts evolved by endosymbiosis
- both mitochondria and chloroplasts are descended from bacterial ancestors
- most of the mitochondrial and chloroplast genes have been transferred to the nucleus during the organelles' evolution
- mitochondria originated by an endosymbiotic event when a bacterium was captured by a eukaryotic cell
where did introns come from?
- not found in bacterial genomes
- hypothesis = the earliest genes did not contain introns
-introns were subsequently added to some genes
- how? possibly they have always been an important part of the gene
-splice sites, providing the correct reading frame, etc.
the human genome has fewer genes than originally expected
- originally estimated that the human genome contained 30,000-40,000 genes
- sequencing revealed the actual number is ~20,000
- only 1% of the human genome is exons
- 24% are introns
- so genes are only about 25% of the genome
- most is repetitive DNA
exon shuffling
- the hypothesis that genes have evolved by the recombination of various exons encoding functional protein domains
-creating proteins with greater function and value
- most successful shufflings were ones where the exon was flanked by intron sequences
-containing 5’ and 3’ splicing sites on either side of the exon
- exons were inserted into large introns
- however, only 1 in 3 chance of a correct reading frame
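One simplified way to read the "1 in 3" figure, assuming the only thing that matters is whether the inserted exon's length is a multiple of 3 (so downstream codons stay in frame). This ignores intron phase and everything else about splicing, and the exon lengths are hypothetical.

```python
def preserves_frame(inserted_exon_length):
    """True if inserting an exon of this length keeps downstream codons in frame."""
    return inserted_exon_length % 3 == 0

# 30 hypothetical exon lengths, 90 to 119 bp.
lengths = range(90, 120)
in_frame = [n for n in lengths if preserves_frame(n)]
fraction = len(in_frame) / len(lengths)
print(f"{len(in_frame)} of {len(lengths)} lengths keep the reading frame ({fraction:.2f})")
```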
repeated sequences account for more than ___% of the human genome
50%
- the majority of repeated sequences are copies of nonfunctional transposons
- pseudogenes
- tandem repeats at the centromere and telomeres
mouse genome has ~22,000 genes
- 12% (~3000) code for RNAs
-rRNAs, tRNAs, regulatory RNAs, etc.
- 4.8% (1200) are pseudogenes
pseudogene
- stable but inactive genes derived by mutation of an ancestral active gene
- often inactive due to mutations that stop transcription or translation (or both)
what percent of proteins are essential for ALL life and what are they called
21%
housekeeping genes
housekeeping genes
- transcription and translation
- metabolism
- transport
- DNA replication and modification
- protein folding and degradation
morphological complexity evolves by adding new gene functions
- as morphological complexity increases, additional genes are needed
- most of the genes that are unique to vertebrates/animals are concerned with the immune or nervous systems
striking feature of the human genome
- there are many more unique proteins compared to other eukaryotes
- but relatively few unique protein domains
- most protein domains are common to animals
- the greatest proportion of unique proteins are the transmembrane and extracellular proteins
- cell-cell communication: secreted proteins and receptors
gene duplication contributes to genome evolution
- exons can be modules for building new genes
- an exon could be copied and used in another gene
- provides new function
- enzyme function
- structural function
- entire gene (exons + introns) is duplicated
- duplicated genes can then allow mutations to collect and evolve
- becomes a new gene/function
- or becomes a pseudogene
- as long as the original gene stays functional
pseudogenes have lost their original functions
- copies of functional genes with altered or missing regions
- produce polypeptides that are nonfunctional or have altered functions
gene clusters
- a group of adjacent genes that are identical or related
- may range from simply two adjacent identical genes to hundreds of identical genes in a tandem array
- e.g. Immunoglobulin genes have:
- ~300 variable region gene segments
- 20 D region segments
- 6 J region segments
- 9 heavy chain region segments
why do some genes cluster?
- tandem repeats due to a need for large amount of the product
- rRNA for protein synthesis
- histone proteins for replicated DNA
heterochromatin
- regions of chromosomes that are permanently tightly coiled
- DNA is not active
euchromatin
- parts of chromosomes that are less tightly coiled
- contain most of the active or potentially active genes
nucleolus
the region in the nucleus where:
- the rRNAs are produced
- the rRNA and proteins are put together to form ribosomes
parts of the nucleolus
Fibrillar core
- where the rRNA is transcribed from the DNA template (rDNA)
Granular cortex
- area around the fibrillar core
- where the rRNAs and proteins are actually put together to form the ribosome
nucleolus is NOT an organelle but rather ____
a region where chromosomes “stick” their DNA to pre-make ribosomes
each nucleus can have many nucleoli, but usually has only one or two
ribosome
- composed of rRNA and protein (ribonucleoprotein)
- two subunits:
1. large subunit (60S)
- 5S rRNA
- 28S rRNA
2. small subunit (40S)
- 18S rRNA
genes for rRNA form ____ ____ of the same transcription unit
tandem repeats
- rRNA is encoded by a large number of identical genes that are tandemly repeated to form one or more clusters
ribosomal DNA (rDNA)
- DNA where the ribosomal genes are located
- the genes in an rDNA cluster all have an identical sequence
- each rDNA cluster organization
- transcription units for a precursor containing both the major rRNAs alternating with nontranscribed spacers
nontranscribed spacers
- shorter repeating units whose number varies so that the lengths of individual spacers are different
- allow several RNA polymerases to attach and transcribe rRNAs at the same time
the genome contains highly repetitive DNA
very short sequences of DNA repeated many times in tandem and in large clusters
- also called satellite DNA
- no coding function
- usually less than 10% of the genome
satellite DNAs
- DNA that consists of many tandem repeats (identical or related) of a short basic repeating unit
- repeating unit is ~100bp or more
- usually located in heterochromatin
- commonly found at the centromeres of chromosomes
- suggests it has a structural role in chromosome segregation in mitosis and meiosis
mini and microsatellite DNA
- DNAs of tandemly repeated copies of a short repeating sequence
- minisatellite DNA - length of repeating unit is ~10 to 100 bp
- microsatellite DNA - length of repeating unit is usually less than 10 bp
- the number of repeats varies between individual genomes
the length of the mini- and microsatellites:
- varies quite a bit between individuals
- very consistent for a single individual
- therefore mini- and microsatellite lengths are unique for an individual
- one set of sizes inherited from the mother
- a different set of sizes inherited from father
- we can use these unique mini- and microsatellites to identify individuals
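To make the idea of a variable repeat number concrete, here is a small sketch that counts back-to-back copies of a repeat unit. The "CA" unit, the flanking sequences, and the repeat counts of 12 and 17 are all invented for illustration.

```python
import re

def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Two hypothetical alleles of one microsatellite locus: same flanks, different repeat number.
allele_from_mother = "GGTACC" + "CA" * 12 + "TTGACG"
allele_from_father = "GGTACC" + "CA" * 17 + "TTGACG"

print(longest_tandem_run(allele_from_mother, "CA"))
print(longest_tandem_run(allele_from_father, "CA"))
```

The pair of repeat counts at one locus is the kind of value a DNA profile records.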
DNA profiling
- PCR of a person’s DNA with a mixture of primers for unique sequences flanking the microsatellite DNAs
- PCR amplify these microsatellite DNAs
- creates a DNA fingerprint of the microsatellites for that person
- these DNA fingerprints are unique for that person
forensic DNA profiling
- collect DNA from crime scene
- collect DNA from suspects
- analyze microsatellite DNA fingerprints
- compare for matches
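The comparison step can be sketched as follows, assuming each profile is stored as a pair of repeat counts per locus (one inherited from each parent). The locus names, suspect names, and numbers are hypothetical, and real casework also involves match statistics that are not modeled here.

```python
def profiles_match(profile_a, profile_b):
    """True when both profiles cover the same loci with the same pair of allele sizes."""
    if profile_a.keys() != profile_b.keys():
        return False
    # Sort each pair so that (12, 17) and (17, 12) count as the same genotype.
    return all(sorted(profile_a[locus]) == sorted(profile_b[locus]) for locus in profile_a)

crime_scene = {"locus_1": (12, 17), "locus_2": (8, 8), "locus_3": (21, 25)}
suspects = {
    "suspect_A": {"locus_1": (12, 15), "locus_2": (8, 9), "locus_3": (21, 25)},
    "suspect_B": {"locus_1": (17, 12), "locus_2": (8, 8), "locus_3": (25, 21)},
}

matches = [name for name, profile in suspects.items() if profiles_match(crime_scene, profile)]
print(matches)
```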
DNA profiling used to establish paternity
- collect DNA from mother, child, and potential fathers
- child’s DNA fingerprint is inherited from mother and father
- child’s DNA fingerprint will match with both
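A minimal sketch of the paternity logic, assuming that at every locus one of the child's alleles must be present in the mother and the other in the father. The locus names, the men's labels, and the repeat counts are made up, and rare events such as new mutations are ignored.

```python
def consistent_with_parents(child, mother, father):
    """Check that each locus in the child can be explained by one maternal and one paternal allele."""
    for locus, (a, b) in child.items():
        explained = (a in mother[locus] and b in father[locus]) or (
            b in mother[locus] and a in father[locus]
        )
        if not explained:
            return False
    return True

mother = {"locus_1": (12, 14), "locus_2": (8, 10)}
child = {"locus_1": (12, 17), "locus_2": (10, 11)}
candidate_fathers = {
    "man_A": {"locus_1": (15, 17), "locus_2": (9, 11)},
    "man_B": {"locus_1": (13, 17), "locus_2": (8, 9)},
}

possible = [
    name for name, profile in candidate_fathers.items()
    if consistent_with_parents(child, mother, profile)
]
print(possible)
```

Here man_B is excluded because neither of his locus_2 alleles can account for the child's allele that did not come from the mother.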