Lecture 18 (RR5): Proteins that regulate transcription --> activators Flashcards
Promoter-Proximal Elements that help regulate eukaryotic genes.
What are the two strategies that can be used to identify the critical regions within a control region upstream of a gene?
(1 - identifying the regulatory elements)
Cis-acting regulatory sites can be found through linker scanning mutations, which reveal the subregions of the promoter required for activation of transcription. Linker scanning mutagenesis, for example, can pinpoint the sequences within a regulatory region that function to control transcription.
Two strategies to identify the critical regions within a control region upstream of a gene:
1) Using 5’ deletions to set up a series of variants of a given control region to identify regions that might be important for expression of the downstream genes.
2) Linker scanning analysis → identify specific regions within a LARGER control region important for regulation of transcription. A set of constructs with contiguous overlapping mutations are assayed for their effect on expression of a reporter gene or production of a specific mRNA.
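The two strategies can be sketched in a few lines of Python. This is only an illustration, not a lab protocol: the 12-bp region, the `NNNN` linker and both function names are made up, and for simplicity the linker windows here do not overlap, whereas real linker scanning sets often do.

```python
def five_prime_deletions(region, step):
    """Trim the control region from its 5' end in increments of `step` bases."""
    return [(cut, region[cut:]) for cut in range(0, len(region), step)]


def linker_scan(region, linker):
    """Replace successive, non-overlapping windows of `region` with `linker`.

    Each variant keeps the original length, so the spacing between the
    remaining elements is preserved.
    """
    width = len(linker)
    variants = []
    for start in range(0, len(region) - width + 1, width):
        mutant = region[:start] + linker + region[start + width:]
        variants.append((start, mutant))
    return variants


# Toy 12-bp "control region" and a made-up 4-bp linker, for illustration only.
region = "ACGTTGCAGGCC"
for cut, seq in five_prime_deletions(region, 4):
    print("5' deletion up to", cut, seq)
for start, seq in linker_scan(region, "NNNN"):
    print("linker at", start, seq)
```

Each variant would then be placed upstream of a reporter gene; a drop in expression points to the deleted or replaced window as a likely regulatory element.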
Electrophoretic mobility shift assays (EMSA) - identifies, logic, pros and cons?
EMSA (Electrophoretic Mobility Shift Assay):
- Identifies: DNA bound to a protein (in this case, the protein is a Transcription Factor)
- Logic: DNA bound to protein will be bigger and move slower than DNA that is not bound to a protein.
- Pro: EMSA is a relatively easy assay, and better than DNase I footprinting for quantitative analysis of DNA-binding proteins
- Con: does not reveal the specific sequence of DNA that the transcription factor binds to
EMSA Procedure
(2- identify the proteins that bind to regulatory element)
- EMSA or gel/band/mobility shift is used to determine DNA binding activity. Used to identify proteins and eventually purify proteins that interact specifically with those DNA segments.
- A radiolabeled dsDNA segment is used as a probe. Use the probe to ask yourself the question: “is there a protein complex or single polypeptide, in a given sample, that will interact specifically with the DNA and form a DNA protein complex?”
→ To answer this question, we run the sample on a gel in an electric field. If a DNA-protein complex forms, it migrates more slowly than the free probe.
Remember:
* Protein:DNA complex mobility is altered in the non-denaturing polyacrylamide gel
* EMSA/Gel Shifts cannot reveal the precise sequence that is bound by the protein!!!
**In this example:**
- protein fractions separated by column chromatography were assayed for their ability to bind to a radiolabeled DNA-fragment probe containing a known regulatory element.
- An aliquot of the protein sample that had been loaded onto the column (ON = the control) and successive column fractions (numbers) were incubated with the labeled probe.
- The samples were then electrophoresed under conditions that do not disrupt protein-DNA interactions.
→ The free probe not bound to protein migrated to the bottom of the gel.
→ A protein in fractions 7 and 8 bound to the probe (as did protein in the unfractionated sample in lane ON), forming a DNA-protein complex that migrated more slowly than the free probe. These fractions are therefore likely to contain the regulatory protein being sought.
→ If you want to purify the protein, you discard the rest of the fractions and keep fractions 7 and 8, then continue purifying these until you get a protein fraction that is significantly enriched. It might come down to a single polypeptide.
DNase I footprinting
DNase I footprinting:
- Identifies: specific sequence of DNA bound by a transcription factor
- Logic: The transcription factor will protect the sequence of DNA it is bound to from DNase digestion
- Pro: identifies the precise binding site of a transcription factor
- Con: more technically challenging than EMSA
- DNase I footprinting takes advantage of the fact that when a protein is bound to a region of DNA, it protects that DNA sequence from digestion by nucleases.
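The footprinting logic can be illustrated with a toy calculation. This is a minimal sketch, assuming we already have per-position cleavage intensities for the probe with and without protein; the function name, the 0.5 threshold and the numbers are all hypothetical, and real footprints are read from band patterns on a sequencing gel.

```python
def protected_region(free, bound, ratio=0.5):
    """Return the longest run (start, end) where cleavage with protein is
    below `ratio` times cleavage without protein. `end` is exclusive.
    Returns None if no position looks protected."""
    n = min(len(free), len(bound))
    runs = []
    start = None
    for i in range(n + 1):
        is_protected = i < n and free[i] > 0 and bound[i] < ratio * free[i]
        if is_protected and start is None:
            start = i                      # a protected run begins
        elif not is_protected and start is not None:
            runs.append((start, i))        # the run ended just before i
            start = None
    return max(runs, key=lambda r: r[1] - r[0]) if runs else None


# Made-up cleavage intensities at 10 positions of the probe.
free_dna = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
with_protein = [10, 9, 10, 1, 0, 2, 1, 10, 9, 10]
print(protected_region(free_dna, with_protein))
```

The gap in cleavage (the "footprint") marks the positions covered by the bound protein, which is why this assay gives the binding site itself rather than only showing that binding occurred.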
How do you isolate transcription factors?
In the biochemical isolation of a transcription factor, an extract of cell nuclei is commonly subjected sequentially to several types of liquid chromatography.
- Fractions eluted from the columns are assayed by DNase I footprinting or EMSA using DNA fragments containing an identified regulatory element.
- Fractions containing a protein that binds to the regulatory element in these assays contain a transcription factor.
- A powerful technique that is commonly used for the final step in purifying transcription factors is sequence-specific DNA affinity chromatography: long DNA strands containing multiple copies of the transcription-factor-binding site are coupled to a column matrix.
What assay is used to test DNA binding transcription factors?
(3- how do these proteins interact with regulatory elements).
Once a transcription factor has been isolated and purified, its partial amino acid sequence can be determined and used to clone the gene or cDNA encoding it. The isolated gene can then be used to test the ability of the encoded protein to activate or repress transcription in an in vivo transfection assay.
1) The assay system requires two plasmids:
→ You put the cDNA that encodes the TF (protein X) into one plasmid.
→ The second plasmid contains a reporter gene (e.g., luciferase, GFP) and one or more binding sites for protein X.
2) Both plasmids are simultaneously introduced into cells that lack the gene encoding protein X. The production of reporter-gene RNA transcripts is measured; alternatively, the activity of the encoded protein can be assayed.
3) If reporter-gene transcription is greater in the presence of the X-encoding plasmid than in its absence, then the protein is an activator; if transcription is less, then it is a repressor.
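The decision rule in step 3 can be written as a small function. This is only a sketch: the function name and the 20% tolerance band (used so that tiny differences are not over-interpreted) are assumptions added for illustration, not part of the assay as described.

```python
def classify_factor(with_x, without_x, tolerance=0.2):
    """Classify protein X from reporter signal measured with and without
    the X-encoding plasmid.

    tolerance: hypothetical fractional change treated as "no clear effect".
    Returns (label, fold_change).
    """
    if without_x <= 0:
        raise ValueError("baseline reporter signal must be positive")
    fold = with_x / without_x
    if fold > 1 + tolerance:
        return "activator", fold
    if fold < 1 - tolerance:
        return "repressor", fold
    return "no clear effect", fold


# Made-up luciferase readings (arbitrary units).
print(classify_factor(500, 100))
print(classify_factor(20, 100))
```

The comparison is always against the same cells transfected with the reporter plasmid alone, which is why the cells must lack their own gene for protein X.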
Transcription factors recognize specific DNA sequence motifs.
1) Transcription factors recognize specific DNA sequence motifs
- Alpha-helical domain, the so-called RECOGNITION HELIX.
- Binding occurs through non-covalent interactions with atoms in the bases.
- Interaction with the major groove of DNA.
Specific protein-DNA interactions: helix-turn-helix motif
- Bacteria very often use DNA-binding transcription factors to control the transcription of genes needed to respond to rapid changes in their environment.
- Many bacterial repressors are dimeric proteins in which an α helix (enriched in positively charged basic residues) from each monomer inserts into the major groove in the DNA helix and makes multiple, specific interactions with the atoms there.
- This α helix is referred to as the recognition helix or sequence-reading helix because most of the amino acid side chains that contact bases in the DNA extend from this helix.
- The recognition helix, which protrudes from the surface of a bacterial repressor, is usually supported in the protein structure in part by hydrophobic interactions with a second α helix located just to the amino-terminal side of it.
- This entire structural element, which is present in many bacterial repressors, is called a helix-turn-helix motif.
Transcription factors are modular
- Transcription factors are modular. This means that they have a DNA-binding domain, one or more transcriptional activation (or repression) domains, and often a dimerization domain.
- Many transcription factors contain intrinsically disordered regions.
the GAL4 transcription factor from yeast
An example of an analysis done in yeast
* Gal 4: important transcription factor in yeast that is required when yeast is placed into an environment where it has galactose as a carbon source rather than glucose. Transcription repertoire has to quickly adapt.
→ it recognizes an upstream activating sequence (UAS).
→ if Gal 4 binds to the UAS Gal, which is required to activate the transcription of these galactose specific genes downstream, we can use that same kind of promoter structure to test how Gal 4 actually does this.
→ Construct a reporter-gene construct (a): minimal promoter, TATA box transcription site, UAS Gal upstream and downstream gene (any reporter). If Gal 4 interacts with UAS, it will activate transcription of the reporter.
→ Carry out gel shift analysis to see whether the GAL4 protein interacts with UAS.
Key: + means it can bind to the DNA; - means it can't bind to the DNA; +++ means very good transcription.
Conclusion:
- Small numbers refer to positions in the wild-type sequence.
- Deletion of 50 amino acids from the N-terminal end destroyed the ability of Gal4 to bind to UAS and to stimulate expression of β-galactosidase from the reporter gene. The first 50 residues are crucial for DNA binding, which in turn is needed for transcriptional activity (DNA-binding domain).
- Proteins with extensive deletions from the C-terminal end still bound to UAS. These results localize the DNA-binding domain to the N-terminal end of Gal4.
- The ability to activate β-galactosidase expression was not entirely eliminated unless somewhere between 126 and 189 or more amino acids were deleted from the C-terminal end.
- Thus the activation domain lies in the C-terminal region of Gal4. Proteins with internal deletions (bottom) were also able to stimulate expression of β-galactosidase, indicating that the central region of Gal4 is not crucial for its function in this assay.
**The domains needed are the DNA-binding domain and the transcriptional activation domain.**
Homeodomain proteins
- The homeodomain, a 60-residue DNA-binding motif, was named for its presence in several transcription factors that give rise to homeotic transformations when mutated at particular residues.
- Homeobox genes or homeotic genes were initially described in Drosophila. By introducing mutations randomly throughout a large population of animals, you can isolate mutants that incorrectly set up their body plan (where they put their body parts).
Ex: Antennapedia → Drosophila grows legs where the antennae should be. This is caused by a mutation in a DNA-binding transcription factor that is critical for expressing the genes that put the correct body parts in the correct place.
These factors are very important for specifying a number of developmental positions during the formation of the body plan, yet the DNA-binding sites that they recognize are very similar (they recognize more or less the same cis-regulatory element). So how do you get the specificity?
Zinc Finger Protein or Zinc finger DNA binding transcription factors
The zinc fingers recognize specific trinucleotide DNA sequences by insertion of several α-helices in the major groove of the DNA.
Three different types:
1) C2H2 types usually contain three or more finger units and bind to DNA as monomers. Each finger has two cysteines and two histidines that coordinate a zinc ion. This gives rise to finger-like projections (in 2D perspective).
- Most common DNA-binding motif in the human genome (or vertebrates)
2) C4 types usually contain only two finger units and bind to DNA as homo- or heterodimers. They have four cysteines that coordinate a zinc ion, e.g., steroid hormone receptors (nuclear receptors).
3) The C6 Zinc Finger transcription factor is a variation wherein six cysteine metal ligands coordinately bind two Zn2+ ions.
The zinc fingers are important for interacting with the DNA sequences they recognize.
Leucine-Zipper Proteins
- Leucine zipper proteins bind DNA exclusively as homo- or heterodimers with their extended alpha-helices, which bind the major groove of the DNA.
- They contain a leucine or a different hydrophobic amino acid at every seventh position in the C-terminal region of the DNA-binding domain (bZIP proteins). When looking at 2 monomers, you can see that the leucine residues always line up so that they can interact together → forming a hydrophobic interface.
- These hydrophobic residues form a coiled-coil domain, which is required for dimerization. By bringing those 2 monomers together to form a dimer, you position the associated basic domains such that they can recognize their DNA targets in the major groove.
- Therefore, this depends on hydrophobic residues that line up along the interface and allow these proteins to come together → they dimerize through hydrophobic interactions in order to position the DNA-binding domains present in these regions.
**It turns out they do not have to be leucines. They can be any hydrophobic residue (which must still be present at that position).**
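The "every seventh position" rule is easy to check on a sequence. The sketch below is only illustrative: the residue set treated as hydrophobic, the function name and the 28-residue toy peptide are all assumptions, and real coiled-coil prediction uses more than this single position.

```python
# Assumption: a simple set of residues treated as hydrophobic for this sketch.
HYDROPHOBIC = set("LIVMFW")


def has_zipper_heptad(seq, start, repeats=4):
    """True if a hydrophobic residue sits at `start` and at every 7th
    position after it, for `repeats` consecutive heptads."""
    positions = [start + 7 * i for i in range(repeats)]
    if positions[-1] >= len(seq):
        return False  # sequence too short to hold that many heptads
    return all(seq[p] in HYDROPHOBIC for p in positions)


# Toy peptide: a leucine followed by six polar residues, repeated four times.
peptide = "LEKQARE" * 4
print(has_zipper_heptad(peptide, 0))    # leucines at 0, 7, 14, 21
variant = peptide[:7] + "V" + peptide[8:]
print(has_zipper_heptad(variant, 0))    # valine swapped in at position 7
```

Swapping one leucine for valine still passes the check, which mirrors the point that the position matters more than the identity of the hydrophobic residue.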
Helix-Loop-Helix
- helix-loop-helix (HLH) proteins are very similar to leucine zipper proteins, however instead of an extended alpha-helix they are characterized by two alpha-helices, which are connected by a short loop.
- You still have a dimerized interface that is important for positioning these recognition helices.
- Only difference is the little loop between the two domains
- HLH proteins contain hydrophobic amino acids spaced at intervals characteristic of an amphipathic alpha-helix in the C-terminal region of the DNA binding domain.
Transcription factors of unrelated classes
Transcription factors of unrelated classes can also bind COOPERATIVELY
* Sometimes transcription factors don't work alone, but in fact work much better in a cooperative context.
* Protein-protein interaction favours formation and stability of the ternary complex (cooperative DNA binding; here the ternary complex is the two transcription factors plus the DNA).
An example of this: By themselves, both monomeric NFAT and heterodimeric AP1 transcription factors have low affinity for their respective binding sites in the IL-2 promoter-proximal region. Protein-protein interactions between NFAT and AP1 add to the overall stability of the NFAT-AP1-DNA complex, so that the two proteins bind to the composite site cooperatively.
Transcription factor interactions
Transcription Factor Interactions Increase Gene-Control Options
* Combinatorial possibilities greatly extend the potential for diversified gene regulation.
* The combination of transcription factor binding sites in promoters leads to a **diversity of transcriptional responses.**
* Homo- and heterodimer formation is common among transcription factors (formed of monomers).
* All of these interact with specific sites, so you greatly increase the complexity of your transcriptional output.
e.g., three transcription factors that can homodimerize or heterodimerize => 6 different possible combinations (3 homodimers + 3 heterodimers)
As seen in the image below, you get these interesting combinations of transcription factors binding to different sites, which is critical for gene expression. It is very unlikely that one transcription factor and one binding site alone will affect expression of a downstream gene.
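The count of six dimers from three factors can be checked by listing every unordered pair, including each factor paired with itself. A minimal sketch, assuming every factor can pair with every other (in reality not all pairs form); the factor names A, B, C are placeholders. In general, n factors give n(n+1)/2 possible dimers.

```python
from itertools import combinations_with_replacement


def dimer_combinations(factors):
    """List every unordered pair of factors, including homodimers."""
    return list(combinations_with_replacement(factors, 2))


# Hypothetical factor names, for illustration only.
dimers = dimer_combinations(["A", "B", "C"])
for first, second in dimers:
    kind = "homodimer" if first == second else "heterodimer"
    print(first + second, kind)
print(len(dimers), "possible dimers")
```

With four such factors the same function gives ten dimers, which shows how quickly the regulatory options grow as factors are added.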