Transcription Mechanisms Flashcards
What is the structure of the bacterial RNA polymerase?
Single RNA polymerase in bacteria that makes all 3 types of RNA
Four kinds of subunits: α, β, β′, ω
Subunit composition of the enzyme is α2ββ′ω
What is the structure of the eukaryotic RNA polymerase?
3 different types of polymerases produce all 3 types of RNA
RNAPI transcribes 5.8S, 18S and 28S rRNAs
RNAPII transcribes all mRNAs and some snRNAs
RNAPIII transcribes 5S rRNA, all tRNAs and other small RNAs
All 3 work in the nucleus
Division of labour by these polymerases
Mitochondria have an additional polymerase (mtRNAP)
Due to the origin of mitochondria: they have their own genome
Compare bacterial core RNAPs and eukaryotic RNAPIIs
Bacterial RNAP is simple: 5 subunits
Eukaryotic RNAPII is more complex: 12 different subunits
Similar core structure
Homologous subunits
RPB1 and RPB2 are similar to β′ and β, respectively
RPB3, RPB10, RPB11, RPB12 are similar to the α2 homodimer
RPB6 is similar to the ω subunit
Additional subunits in eukaryotes are at the periphery (extensions/small peptides)
Magnesium marks where the active site is
Key catalysis in bond formation
In same place in both polymerases
What molecule marks where the active site is in polymerases?
Magnesium
How do bacterial RNA polymerases find the transcription start site?
Bacterial RNA polymerases have a sigma (σ) factor that can recognize promoters without the help of any other transcription factors
RNAP + σ factor (α2ββ′ωσ) is called the holoenzyme (initiation form), which is responsible for initiation of transcription
Can melt the promoter in the absence of ATP and can maintain an open promoter complex for days without transcript initiation
-10 motif is found in bacterial promoters
-10 motif is AT rich but NOT functionally equivalent to eukaryotic TATA box
-10 responsible for formation of transcription bubble (but TATA box stays double stranded)
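A minimal sketch of how a -10-like element could be located in a sequence. The consensus hexamers used here (TATAAT for -10, TTGACA for -35, as commonly quoted for E. coli σ70 promoters) are not given in these cards and are assumptions, as are the function name and the toy sequence.

```python
def find_minus10_candidates(seq, consensus="TATAAT", max_mismatches=1):
    """Return (index, hexamer, mismatches) for windows close to the consensus."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(consensus) + 1):
        window = seq[i:i + len(consensus)]
        mismatches = sum(a != b for a, b in zip(window, consensus))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Hypothetical toy promoter: -35-like hexamer, 17 bp GC-rich spacer, -10-like hexamer
toy_promoter = "GC" + "TTGACA" + "GCCGGCGCCGGCGCCGG" + "TATAAT" + "GCGCCGA"

print(find_minus10_candidates(toy_promoter))
```

Allowing one mismatch reflects that real promoters rarely match the consensus exactly; the threshold is an arbitrary choice for illustration.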
How do eukaryotic RNA polymerases find the transcription start site?
Eukaryotic RNA polymerases don’t have any sequence-specific binding activity even though they have more subunits (NO σ factor)
Require additional factors (basal factors)
They recognise promoter elements (TFIIA, TFIIB, TFIID)
And are responsible for formation of transcription bubble around transcription start site (TFIIE, TFIIH)
Explain the roles of the +1, -10 and -35 positions
+1: position where initiation occurs, the first nucleotide of the RNA molecule (lies within the initiator element, Inr)
There is no position 0
-10 and -35 (only in prokaryotes): where the holoenzyme binds, conserved sequences
Transcription occurs left to right
Downstream: in transcribed region (+ve)
Upstream: in untranscribed region (-ve)
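The numbering convention (no position 0, negative upstream, positive downstream) can be sketched as a small conversion from a 0-based sequence index. The function names and the example indices are hypothetical.

```python
def to_promoter_coordinate(index, tss_index):
    """Convert a 0-based sequence index to a position relative to the TSS.

    The TSS itself is +1 and the base immediately upstream is -1,
    so position 0 is never returned.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def label(position):
    """Format a position the way it is written on the cards (+1, -10, -35)."""
    return f"+{position}" if position > 0 else str(position)


tss = 40  # hypothetical index of the transcription start site
for idx in (5, 30, 39, 40, 41):
    print(idx, label(to_promoter_coordinate(idx, tss)))
```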
What is the initiation complex in eukaryotes and how is it regulated by TFs?
Basal factors and RNAPII are required to initiate transcription in eukaryotes
Initiation complex is the same for every RNAPII-transcribed gene and is required at every promoter
Initiation complex gives basal / low levels of unregulated transcription
Initiation complex is programmed by gene-specific transcription factors to give regulated levels of gene expression (higher levels of expression)
Basal initiation complex: at transcription start site, RNA polymerase is recruited by basal factors
Proximal gene-specific transcription factors: bind promoter elements within 1 kb of the TSS
Distal gene specific transcription factors: enhancers (may be very far away so difficult to tell if a gene is regulated by this enhancer)
What is the TATA-box?
Is found in the promoter region
Upstream of transcription start site at -25 to -30
AT rich sequence that is surrounded by GC rich sequences
Conserved motif
Is recognised by TFIID
What is TFIID?
Multiprotein complex
Consists of TATA binding protein (TBP) and many other subunits called TBP associated factors (TAFs)
TBP is autonomous
It can bind sequence specifically to TATA box on its own without TAFs
Is NOT found in bacteria
One protein that consists of 2 halves that are structurally similar and evolved by duplication
TBP binds TATA box DNA and bends it by 60-90 degrees
What are TFIIA and TFIIB?
TFIIA and TFIIB stabilize TFIID binding
TFIIA and TFIIB bind on opposite sides of TBP at the same time (they don’t compete for the same binding site)
TFIIB Recognition Element (BRE): TFIIB has sequence specific contact points on either side of the TATA box (predominantly on upstream side)
Occurs because DNA is bent
Increases sequence specificity of complex
What is an electrophoretic mobility shift assay (gel shift assay)?
Method of detecting transcription factor complexes
Binding of a TF causes tagged DNA to shift in mobility
Native gel electrophoresis
Only DNA: moves to bottom of gel
TBP/DNA complex: faint band because only small amount is bound. Energetically unfavourable due to bending of DNA
TBP/TFIIB/DNA complex: stronger band because TFIIB stabilizes DNA/TBP complex and increases sequence specificity
RNAP/TBP/TFIIB/DNA complex moves the least in the gel as it is the heaviest complex
What other motifs can be present other than the TATA-box?
Not all RNAPII transcribed genes contain TATA-boxes (80% don’t have a TATA box)
Two other motifs that may also be present
Initiator element (IE, also written Inr): found at the TSS at +1, recruits the initiation complex to the promoter
Downstream promoter element (DPE): downstream of the TSS (+28 to +34), very variable sequence
TBP associated factors (TAFs) of TFIID recognize these additional sequence elements
TFIID can bind even if TATA box isn’t present
Proximal promoter elements: TATA, IE (Inr), DPE
What is TFIIF?
TFIIF binds to RNAPII and facilitates delivery of the polymerase to the TFIID-TFIIB-DNA complex on the promoter
What are TFIIE and TFIIH?
TFIIE and TFIIH are responsible for 3 critical functions in transcription:
Phosphorylation of RNAPII (contain a kinase) to make RNAPII elongation competent
Promoter melting via a DNA helicase mechanism (contain a helicase)
Promoter clearance
Why is phosphorylation of RNAPII CTD important?
C-terminal domain of RNAPII consists of repeats of a 7-residue sequence YSPTSPS (26 in yeast, 52 in human)
CTD is phosphorylated by TFIIH kinase
Proline = destroys secondary structure = CTD is an unstructured protein
Threonine, serine, tyrosine = have an OH group in side chain, where phosphorylation occurs
Hypophosphorylated (low levels of phosphorylation) = CTD is associated with the initiation complex
Hyperphosphorylated (high levels) = CTD is associated with elongation-competent RNAPII
Allows 5’ capping, assembly of spliceosomes, binding of cleavage/polyadenylation complex
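A rough sketch of what the repeat structure implies for the number of potential phosphorylation sites. It assumes an idealized CTD made only of perfect YSPTSPS repeats (real CTDs also contain degenerate repeats, which this ignores); the function name is hypothetical.

```python
HEPTAD = "YSPTSPS"  # consensus repeat from the card


def ctd_summary(n_repeats):
    """Summarize an idealized CTD built from perfect heptad repeats."""
    ctd = HEPTAD * n_repeats
    return {
        "length": len(ctd),
        # Tyr, Ser and Thr carry the side-chain OH that can be phosphorylated
        "hydroxyl_residues": sum(ctd.count(residue) for residue in "YST"),
        # Pro disfavors regular secondary structure
        "prolines": ctd.count("P"),
    }


print("yeast (26 repeats):", ctd_summary(26))
print("human (52 repeats):", ctd_summary(52))
```

Five of the seven residues in each repeat carry a hydroxyl group, so the idealized human CTD offers 260 candidate sites across 364 residues.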
What is promoter melting?
TFIIH is an ATP dependent helicase
Responsible for formation of transcription bubble by melting the double stranded DNA around the TSS
Melted promoter is intrinsically unstable: half-life of 45 seconds
If transcription initiation doesn’t occur within this time span, the melted promoter configuration can only be sustained by ATP hydrolysis
Control point for regulating the rate of transcription initiation: transcript initiation can only occur if the transcription bubble is present
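To get a feel for the 45-second half-life, the fraction of promoters still melted can be estimated under the assumption of simple first-order (exponential) decay with no ATP-driven re-melting. That kinetic model is an assumption, not something stated on the cards, and the function name is hypothetical.

```python
def fraction_open(t_seconds, half_life=45.0):
    """Fraction of melted promoters remaining after t seconds,
    assuming first-order decay with the given half-life."""
    return 0.5 ** (t_seconds / half_life)


for t in (0, 45, 90, 180):
    print(f"{t:>3} s: {fraction_open(t):.3f}")
```

Under this model only about 6% of bubbles would survive three minutes, which fits the card's point that sustained melting needs ATP hydrolysis.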
What types of TFs are more common in Bacteria/Eukaryotes?
Bacteria: repressors are more common because the DNA is mostly in an open, accessible state (e.g. Lac repressor)
Eukaryotes: repressors are less common/activators more common because chromatin is naturally in the repressed state
Need 100s of TFs to regulate one gene: more complex gene regulation in eukaryotes than in bacteria
What is DNA footprinting?
Gene-specific TFs recognize a specific target sequence and bind to promoters
Binding regions can be mapped using DNA footprinting
DNase I without TF: DNA + limited amount of DNase I (nuclease) → creates random breaks in the DNA fragment, averaging about one per strand → run on a gel to create a ladder of fragments
DNase I with TF bound: No cuts are made where TF has bound, protects DNA from being degraded, creates a footprint in the ladder
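The logic of the footprint can be sketched with an idealized model: the DNA is assumed to be labelled at one end, each molecule is cut once, and every unprotected position is cut in some molecule. The function name, fragment length and protected range are hypothetical.

```python
def footprint_ladder(fragment_length, protected=None):
    """Return the labelled fragment lengths expected on the gel.

    A cut after position i (1-based) yields a labelled fragment of length i.
    `protected` is an inclusive (start, end) range covered by the bound TF,
    where no cuts are made.
    """
    if protected is None:
        blocked = set()
    else:
        blocked = set(range(protected[0], protected[1] + 1))
    return [i for i in range(1, fragment_length) if i not in blocked]


print("no TF:  ", footprint_ladder(12))
print("with TF:", footprint_ladder(12, protected=(5, 8)))
```

The lengths missing from the second ladder (5 to 8 here) are the footprint and give the position of the binding site.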
What is the function and structure of TFs?
Gene specific TFs must be able to:
Specifically bind DNA through sequence-specific DNA-binding domains
Modulate activity of the promoter-bound transcriptional machinery (RNA polymerase and basal factors)
Structure: Activation domains + a DNA binding domain
Explain the interactions of the DNA binding regions and TFs
DNA and proteins interact specifically via a range of interactions (non-covalent)
Electrostatic bonds
Hydrogen bonds
Van der Waals forces
Hydrophobic interactions
High structural complementarity to maximize interactions
DNA binding domains are folded, 3D structure is complementary to DNA
Explain the electrostatic interactions and sequence specificity between DNA and protein/TFs
Electrostatic interactions between DNA and protein
DNA has a negative phosphodiester backbone
Surface of protein is positive
Electrostatic interactions between proteins and DNA backbone provide stabilizing energy (charges will always be the same)
But does not provide sequence specificity
Binding of TFs does not lead to unravelling of DNA (but may bend it)
Explain the B-form of DNA
DNA consists of 2 polynucleotide chains, antiparallel, right handed double helix
Sugar phosphate backbone is on the outside, bases are on the inside
B-DNA has 2 grooves: Major and minor groove
Most TFs bind in the major groove
A particular sequence of base pairs can be accessed either via the major or the minor groove
Groove depends on the rotation of DNA
Why is TF binding to the major groove essential in terms of base pair geometry?
In an A:T base pair the minor groove has 2 H-bond acceptor sites and the major groove has 2 H-bond acceptor sites and 1 H-bond donor site
TFs have limited specificity when reading DNA from the minor groove
Because they can’t distinguish A:T from T:A (both present 2 H-bond acceptor sites)
But they can distinguish A:T from G:C because the G:C base pair has a different pattern of donor and acceptor groups than A:T
Major groove has full specificity
T is the only base that has a methyl group which is easily identified by TFs
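The difference between the grooves can be made concrete by writing out the pattern of chemical groups each base pair exposes. The encoding below (A = H-bond acceptor, D = H-bond donor, H = nonpolar hydrogen, M = methyl) follows the standard textbook description and is not spelled out on the cards; the dictionary and function names are hypothetical.

```python
# Groups exposed across each groove, read from the first base to the second
MAJOR_GROOVE = {"A:T": "ADAM", "T:A": "MADA", "G:C": "AADH", "C:G": "HDAA"}
MINOR_GROOVE = {"A:T": "AHA", "T:A": "AHA", "G:C": "ADA", "C:G": "ADA"}


def distinguishable(groove, pair1, pair2):
    """True if the two base pairs present different patterns in this groove."""
    return groove[pair1] != groove[pair2]


print("minor, A:T vs T:A:", distinguishable(MINOR_GROOVE, "A:T", "T:A"))
print("minor, A:T vs G:C:", distinguishable(MINOR_GROOVE, "A:T", "G:C"))
print("major, A:T vs T:A:", distinguishable(MAJOR_GROOVE, "A:T", "T:A"))
print("unique major-groove patterns:", len(set(MAJOR_GROOVE.values())))
```

All four base-pair orientations give a unique pattern in the major groove, while the minor groove only separates A/T pairs from G/C pairs.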
Which groove of the TATA box does the TATA binding protein bind to?
TATA binding protein binds via the minor groove so it is not highly sequence specific, only looks for an AT rich sequence
TATA box sequence is not highly conserved
What are common DNA binding motifs?
Helix-turn-helix motif (+ homeodomain variants), leucine zipper motif, helix-loop-helix motif, zinc-finger motif
Helix-turn-helix domains / homeodomains structure
Two alpha-helices separated by a loop
C-terminal helix called the recognition helix binds to the major groove, responsible for specificity
N-terminal helix stabilises the structure
Other helices in the protein make no contacts with the DNA (except some electrostatic interactions)
Helix turn helix domains function
H-T-H motifs are used in many bacterial TFs (e.g. Lac repressor, CAP protein, λ repressor)
Also found in eukaryotic TFs, called homeodomains
Eukaryotic H-T-H domains / homeodomains
Discovered in Drosophila
Homeotic genes/homeoproteins are expressed during embryogenesis and determine regional differentiation along the body axis
Regional expression patterns define the specific regional identity of different parts of the embryo
Exist in humans specifically along the spinal cord
Leucine-zipper domains structure
Leucine zipper is the simplest motif in terms of structure
c-JUN/c-FOS heterodimer is formed by two long intertwined alpha helices arranged in a Y shape
Alpha helices are held together by hydrophobic interactions between regularly spaced leucine residues on both helices (leucine is very hydrophobic)
Binding of c-JUN/c-FOS to DNA is stabilized by positively charged arginine and lysine side chains
There are intrinsically disordered regions of the motif that can’t be identified by X-ray crystallography
Motif binds to major groove of DNA via sequence specific interactions
Adjacent TFs are also important for stability
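"Regularly spaced" leucines in a zipper typically means a leucine at every seventh residue. A minimal sketch of how such a spacing could be detected follows; the heptad spacing is textbook knowledge rather than something stated on the card, and the toy sequence and function name are hypothetical (this is not the real c-JUN or c-FOS sequence).

```python
def leucine_heptad_runs(seq, min_run=4):
    """Return (start_index, run_length) for runs of leucines spaced 7 apart."""
    seq = seq.upper()
    runs = []
    for start, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run that started earlier
        if start >= 7 and seq[start - 7] == "L":
            continue
        run, pos = 0, start
        while pos < len(seq) and seq[pos] == "L":
            run += 1
            pos += 7
        if run >= min_run:
            runs.append((start, run))
    return runs


# Hypothetical toy helix: one leucine followed by six other residues, five times
toy_helix = "LAEKQRA" * 5
print(leucine_heptad_runs(toy_helix))
```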
Where are leucine zipper domains found?
TFs controlling cell proliferation contain a leucine zipper DNA binding domain
c-JUN/c-FOS heterodimer
c-JUN and c-FOS are encoded by separate genes that have been identified as proto-oncogenes (normal genes that can be converted into cancer-causing oncogenes)
Helix-loop-helix domains function and structure
Helix-loop-helix domains are found in other oncoproteins controlling cell proliferation (ex. MYC and MAX)
MAX forms homodimeric complex (MAX/MAX) that contains a helix-loop-helix motif
H-L-H is a more complex version of leucine zipper motif
Includes a loop interrupting the alpha helices
Binding of alpha helices to form the Y-shaped domain still involves hydrophobic interactions between regularly spaced leucine residues