ROJW - gene-specific transcription factor Flashcards
Gene-specific transcription factors.
Many genes are only transcribed in particular cell types and/or at certain times during development. This requires an added level of regulation, separate from the basal transcription machinery, which is provided by gene-specific transcription factors.
Gene-specific transcription factors are only expressed in particular cell types, allowing for cell-specific transcription patterns
Regulatory regions
- In addition to transcription start sites (promoters, promoter-proximal elements), eukaryotic genes contain large regions that act as binding sites for gene-specific transcription factors (both proximal & distal)
- The sequence-specific binding of gene-specific transcription factors allows genes to achieve and maintain controlled levels of tissue-specific expression patterns
Eukaryotic gene-specific transcription factors are usually transcriptional activators
- repressors are frequently used in bacterial systems, but much rarer in eukaryotes (whose chromatin-packaged genome is ‘naturally’ in a repressed state). In prokaryotes, DNA is not packaged in chromatin.
Functional Elements of Gene-Specific Transcription Factors
Gene-specific transcription factors must be able to:
* specifically bind to a small subset of genes through sequence-specific DNA-binding domains
* after binding, use their activation domains to interact with the basal transcriptional machinery (RNA polymerase and basal factors) and stimulate its activity
Modular Structures of Gene-Specific Transcription Factors
Contain:
* DNA binding domain - necessary for the recruitment of the gene-specific transcription factors to a sequence-specific subset of promoters (TF activity)
* Activation domain: required for stimulation of transcription (functioning activity)
o These activities were identified by using genetic engineering to generate deletion variants
DNA footprinting:
useful for mapping the binding site of a TF within a DNA fragment
* Expose labelled DNA to DNase I, which cuts it at random positions, producing labelled DNA fragments of varying lengths
* DNase I cannot cut the DNA where a TF is bound, so there is a gap or ‘footprint’ of missing fragments (absence of cleavage products) at the locations where the TF is bound
* Can start the experiment by identifying DNA fragments that are bound by transcription factors using an Electrophoretic Mobility Shift Assay (EMSA) - (deduce binding partners via EMSA)
* Once the fragment is identified, use footprinting to map exactly which region of the fragment contains the binding site
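The footprinting logic above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: each DNase I event cuts once at a uniformly random unprotected position, and the fragment length, the protected interval and the function names `simulate_cleavage` and `find_footprint` are hypothetical choices made for illustration.

```python
import random

def simulate_cleavage(length, protected, n_molecules=2000, seed=0):
    """Count cleavage products per position of a labelled fragment.

    `protected` is a half-open (start, end) interval covered by the bound TF.
    Assumption: DNase I cuts every unprotected position equally often.
    """
    rng = random.Random(seed)
    start, end = protected
    cuttable = [p for p in range(length) if not (start <= p < end)]
    counts = [0] * length
    for _ in range(n_molecules):
        counts[rng.choice(cuttable)] += 1
    return counts

def find_footprint(counts):
    """Return the longest run of positions without cleavage products as (start, end)."""
    best_start, best_len = 0, 0
    run_start = None
    for pos, count in enumerate(counts + [1]):  # sentinel closes a trailing run
        if count == 0:
            if run_start is None:
                run_start = pos
        else:
            if run_start is not None and pos - run_start > best_len:
                best_start, best_len = run_start, pos - run_start
            run_start = None
    return best_start, best_start + best_len

counts = simulate_cleavage(length=60, protected=(25, 35))
footprint = find_footprint(counts)
print("footprint spans positions", footprint)
```

On a real gel the same idea shows up as a run of missing bands in the lane with TF, compared with a control lane without TF.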
Physical Chemistry involved in DNA binding
- Electrostatic bonds (attract DNA and protein over long distances)
- Short distance interactions: come to play once DNA binding domain of protein has bound to DNA target site
o Hydrogen bonds
o Van der Waals forces
o Hydrophobic interactions
- There is also a high degree of structural complementarity between the DNA binding motif of the TF and the DNA, to maximize interaction surfaces & the short-range interactions
Electrostatic Interactions
- TFs contain many positively charged residues in the DNA binding region, which make electrostatic interactions with the negatively charged phosphodiester backbone of DNA
TBP binds DNA sideways and DNA is kinked upon binding to TBP
- Electrostatic interactions provide stabilizing energy so that DNA and protein can initially interact over long distances (they bring DNA & protein together), but they do not provide sequence specificity: they only allow the TF to bind DNA in a non-sequence-specific manner
- The TF then forms more extensive contacts as it binds along the DNA axis, which allows for sequence-specific interactions of the TF with DNA
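As a rough illustration of the electrostatic point above, the net charge of a peptide can be estimated by counting charged residues. This is a minimal sketch: it ignores histidine, the termini and real pKa values, and both example sequences are made up for illustration rather than taken from any real protein.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate

def net_charge(peptide):
    """Approximate net charge at neutral pH (ignores histidine and termini)."""
    peptide = peptide.upper()
    positives = sum(residue in POSITIVE for residue in peptide)
    negatives = sum(residue in NEGATIVE for residue in peptide)
    return positives - negatives

# Hypothetical sequences, invented for this example only.
basic_region_like = "MKRRAKSGRKLKA"
acidic_region_like = "MDEELSDAEDLG"

print(net_charge(basic_region_like))   # positive: attracted to the DNA backbone
print(net_charge(acidic_region_like))  # negative: repelled by the DNA backbone
```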
Sequence-Specific Interactions of TF with DNA
- Binding of transcription factors does not lead to unravelling of the DNA
- Transcription factors have to ‘read’ the nucleotide sequence from the outside of DNA while DNA remains in double helical form
Sequence-specific interactions of TFs with dsDNA are possible because the DNA is in the B form
B form DNA
(two polynucleotide chains winding into an antiparallel right-handed double helix)
* The sugar-phosphate backbone is on the outside, while the bases project into the interior in an asymmetric manner
* This results in B-DNA having two kinds of grooves:
o minor groove (6 Å wide)
o major groove (12 Å wide) – more exposed
- DNA is also a flexible molecule that can bend, twist, loop, etc.
- Therefore, TFs can easily access the DNA bp via either the major or the minor groove
Base pair geometry
- Each base/ base pair possesses a specific number of hydrogen bond donor and acceptor groups that can be recognized by TF
In major groove:
In addition to differentiating between AT & CG bp, absolute recognition of the four different bases is also possible because of
* Asymmetric HB donor/ acceptor between base pairs
o A has both donor and acceptor group while T has only acceptor group
o G has only acceptor groups while C has only donor group
In minor groove,
HB donor/ acceptor pattern between base pairs is more symmetrical
* Both A and T have an acceptor group
* Both C and G have an acceptor group, along with a donor group on G that sits in the middle of the base pair, so a TF cannot tell from its position whether the donor group is on G or on C
Therefore, the few TFs that bind via the minor groove can distinguish AT bp from GC bp BUT cannot distinguish A from T or C from G (no absolute recognition of bases)
Ex. protein side chain forming H bond with DNA base – how it’s able to distinguish each bp
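The groove comparison above can be written out as a lookup table. This sketch uses the commonly taught simplified patterns, read across each groove from the first-named base to its partner. The methyl group of T and the nonpolar hydrogens are included as assumptions beyond the donor/acceptor groups listed in these notes, and the function name `group_by_pattern` is hypothetical.

```python
from collections import defaultdict

# A = H-bond acceptor, D = H-bond donor, M = methyl group, H = nonpolar hydrogen
MAJOR_GROOVE = {
    "A:T": "ADAM",
    "T:A": "MADA",
    "G:C": "AADH",
    "C:G": "HDAA",
}
MINOR_GROOVE = {
    "A:T": "AHA",
    "T:A": "AHA",
    "G:C": "ADA",
    "C:G": "ADA",
}

def group_by_pattern(groove):
    """Group base pairs that present the same pattern and so look alike to a TF."""
    groups = defaultdict(list)
    for pair, pattern in groove.items():
        groups[pattern].append(pair)
    return sorted(sorted(pairs) for pairs in groups.values())

print("major groove:", group_by_pattern(MAJOR_GROOVE))  # four separate groups
print("minor groove:", group_by_pattern(MINOR_GROOVE))  # AT-type vs GC-type only
```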
Structures of Gene-specific Transcription Factors:
- DNA-Binding Domains
Helix-turn-helix motif
Leucine-Zipper Domains
Zinc Finger Domains
- DNA-Binding Domains
- DNA binding motifs are shared by many TF in the human genome; therefore, there are only a few types of DNA binding motifs
o Helix-turn-helix motif
o Helix-loop-helix motif
o Zinc-finger motifs
- All of these motifs use the ⍺-helix-in-major-groove method: one of the ⍺-helices in the motif is positioned to bind the major groove of DNA
Helix-turn-helix motif
- used in many bacterial transcription factors (ex. lac repressor, CAP protein, lambda repressor (bacteriophage))
- also found in important eukaryotic transcription factors (less frequent in eukaryotes) (ex. TF of Homeotic genes – specifically expressed during embryogenesis and determine regional differentiation along the body axis)
Helix-turn-helix motif structure
2 alpha helices separated by a turn (short irregular structure of amino acid sequence)
* One of the helices is used as sequence specific recognition helix
o is fitted into the major groove of DNA along the helical axis
o Side chains protruding from the helix make base-specific contacts with the target DNA sequence
* Remainder of the motif is for structural functions – positions the sequence specific recognition helix & sometimes used to make electrostatic contacts along nearby regions to help stabilize DNA binding to the sequence recognition motif
Helix-turn-helix motifs usually form dimers & recognize palindromic DNA sequences
Palindromic sequences (reverse-complement sequences / inverted repeats) are an indication that the target site is recognized by a dimerized DNA binding motif
* The protein dimerizes & binds DNA in a way that the sequence recognition motif of each subunit is inserted in the major groove & is positioned to recognize the specific DNA sequence
* Dimerization gives more specificity because sequence-specific recognition is required at 2 locations
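A short sketch of what "palindromic" means for DNA, and of why recognizing two half-sites is more specific than recognizing one. The function names are hypothetical, and the spacing estimate assumes random sequence with equal base frequencies, which real genomes only approximate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """A DNA palindrome reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)

def expected_spacing(site_length):
    """Average distance in bp between chance matches in random sequence."""
    return 4 ** site_length

print(is_palindromic("GAATTC"))   # True: an inverted repeat
print(is_palindromic("GATTACA"))  # False
print(expected_spacing(6))        # one 6-bp half-site
print(expected_spacing(12))       # two 6-bp half-sites read by a dimer
```

Doubling the recognized length from 6 to 12 bp takes the chance match rate from once every 4,096 bp to once every 16,777,216 bp.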
Leucine-Zipper Domains
- From a structural perspective, the leucine-zipper motif = the simplest DNA-binding motif
- Y-shaped DNA binding domain with 2 ⍺-helices held together by regularly-spaced leucine residues forming the stem (the leucine zipper).
- The diverging terminal part of the ⍺-helices act as the DNA binding domain since they can insert themselves into different major grooves of DNA molecule in a sequence specific manner (leucine zipper region isn’t involved in DNA binding)
- The DNA binding region is stabilized by positively charged arginine and lysine side chains (to form electrostatic interactions)
Several important transcription factors controlling cell proliferation contain a ‘leucine-zipper’ DNA-binding domain
o Ex. c-JUN/c-FOS heterodimer
o c-JUN and c-FOS are encoded by separate genes that have been identified as oncogenes (can convert a normal cell into a cancer cell if mutated)
o both contain an ⍺-helical region; the two regions come together as a leucine zipper motif (the 2 ⍺-helices interact via regularly spaced leucine residues)
o diverging ends of ⍺-helices bind the DNA major groove
Helix-Loop-Helix Domains
- a slightly more complex version of the ‘leucine zipper’ motif
- consist of non-continuous, interrupted helices forming a Y shape - the leucine zipper portion is separated from the DNA binding portion by a loop
- the loop switches the DNA binding portion from one side to the other
- Other oncoproteins controlling cell proliferation contain a helix-loop-helix (‘H-L-H’) DNA-binding domain
o this includes important oncoproteins, such as MYC and MAX
Zinc Finger Domains
- made from a β-turn and an ⍺-helix held together by a zinc atom that is coordinated by cysteine and/or histidine residues
- Found mainly in eukaryotes; rare in bacteria
- One of the most widely used DNA-binding motifs
o BUT: some zinc fingers-containing proteins also use them to bind RNA or protein – presence of Zinc finger motif makes it likely to be TF but may also be protein involved in other cellular processes
- Structure of Zinc Finger motif
creates a very compact, small DNA binding domain which is usually difficult to achieve
- The ⍺-helix makes sequence-specific interactions with the major groove of DNA
- The small size of the ⍺-helix allows for recognition of only 3 nucleotides
- Therefore, zinc finger domains are usually organized in tandem - this allows more complex DNA sequences to be recognized by arranging zinc fingers with different sequence recognition abilities in tandem
o Utilized in biotechnology to design artificial TFs that bind to a predetermined target site (by selecting and ordering tandem zinc fingers with different sequence recognition so that together they recognize the specific target site)
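The tandem design idea can be sketched as splitting a target site into 3-nucleotide units and picking one finger per unit. The finger library and its labels are placeholders invented for this example, and the sketch ignores the orientation of the protein on the DNA and any influence of neighbouring fingers on each other, both of which matter in real designs.

```python
# Hypothetical library: triplet -> label of a finger assumed to recognize it.
FINGER_LIBRARY = {
    "GCG": "finger_GCG",
    "TGG": "finger_TGG",
    "GAT": "finger_GAT",
}

def design_array(target, library=FINGER_LIBRARY):
    """Return one finger label per 3-nt unit of the target, in target order."""
    target = target.upper()
    if len(target) % 3 != 0:
        raise ValueError("target length must be a multiple of 3")
    triplets = [target[i:i + 3] for i in range(0, len(target), 3)]
    missing = [t for t in triplets if t not in library]
    if missing:
        raise KeyError(f"no finger available for: {missing}")
    return [library[t] for t in triplets]

array = design_array("GCGTGGGCG")
print(array)  # three fingers for a 9-bp site
```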
DNA binding domains & Cancer
p53 DNA binding domain and Carcinogenic Mutations
* p53 is a tumor suppressor which, in response to DNA damage, slows progression through the cell cycle and initiates apoptosis if the damage is severe
* p53 acts as a transcription factor and has a complex 3D central domain which binds DNA in a sequence specific manner. This domain has no structural similarity to any other known DNA binding motif
o contains beta sheets at the back and alpha helices that make contact with the major groove of DNA
o Mutations of p53 can cause cancer (50% of cancers contain a p53 mutation). Most of the p53 mutations that cause cancer are found in the DNA-binding domain
Problem with complex DNA binding motif
- Any small mutation in the sequence can easily disrupt the motif
o In the case of p53, a mutation in the DNA binding motif stops p53 from binding to DNA, so it can no longer act as a tumor suppressor
- Activation Domains
- Carry out function to stimulate the activity of the basal transcriptional machinery
- Most GSTF contain multiple Activation domains
activation domains structures
The activation domains have unusual properties at the primary amino acid level - have unusually high frequency of particular amino acid residues (indicative characteristic of intrinsically disordered proteins)
o like the CTD of RNA pol II – 7-amino-acid repeats rich in serine, threonine, and proline = highly repeated = intrinsically disordered – cannot adopt a defined 3D structure
activation domains structures (different organisms)
- Yeast: high frequency of E & D – acidic (negatively charged) amino acid residues – repel each other & cannot fold into a compact structure
o BUT these negatively charged residues are NOT the ones important for the activating function of the GSTF activation domain; it is the bulky hydrophobic residues (W & F) that play the important role - mutations in those residues = loss of activating function
o The acidic residues are there just to make sure the region doesn’t fold into regular secondary structures
- Human: high frequency of Q, regularly spaced & in clusters - associated with disordered proteins
- Drosophila: high frequency of I – a hydrophobic amino acid
The highly repeated aa are there just to ensure flexibility of the activation domain of the eukaryotic GSTF – so that it stays intrinsically disordered, without a defined 3D structure
* enable binding to a number of transcription factors (basal TF or other GSTF) without being restricted by structural complementarity – give versatility
* allow for complex regulation by facilitating non-structured binding interactions and thus integrating diverse chemical processes
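The biased amino acid composition described above can be checked with a simple residue count. This is a sketch only: the 20% threshold is an arbitrary assumption, the function names are hypothetical, and both sequences are invented to mimic an acidic and a glutamine-rich region rather than taken from real activation domains.

```python
from collections import Counter

def residue_fractions(seq):
    """Fraction of the sequence made up by each amino acid (one-letter codes)."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}

def enriched_residues(seq, threshold=0.2):
    """Residues at or above the threshold fraction (threshold is an assumption)."""
    return sorted(aa for aa, frac in residue_fractions(seq).items() if frac >= threshold)

# Hypothetical sequences, invented for this example only.
acidic_like = "DEDEFDEWDELDEE"
glutamine_like = "QQAQQSQQTQQPQQ"

print(enriched_residues(acidic_like))     # acidic residues dominate
print(enriched_residues(glutamine_like))  # glutamine dominates
```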
Functions of activating domain
- aid the recruitment of various basal factors
o by binding to basal TFs & stabilizing the basal transcription machinery during initiation
- stimulate various enzymatic activities in the basal machinery
o ex. stimulating TFIIH helicase activity to create the transcription bubble (to get higher transcription initiation) & stimulating TFIIH kinase activity to create elongation-competent RNAPII
- help to create ‘open’ chromatin domains
Overall model of transcription factors (basal & gene-specific) involved in initiation & stimulation
- Basal transcription factors bind to TATA box/ promoter-proximal regions
- GSTF bind to proximal & distal regions via DNA binding domain
- Activation domains of GSTFs & basal TFs interact with each other, directly or via mediators, by looping of DNA, to stimulate the basal transcription machinery (both initiation & elongation properties of RNA pol II)