Chapter 11: Assembly of the Transcription Initiation Complex Flashcards
Module 8
The two stages in the process that leads from genome to transcriptome.
- Initiation of transcription
- results in assembly upstream of the gene of the complex of proteins
- includes RNA polymerase enzyme and its various accessory proteins
- Synthesis and processing of RNA
- begins when the RNA polymerase leaves the initiation region and starts to make an RNA copy of the gene
- ends after completion of the processing and modification events that convert the initial transcript into a mature RNA molecule
Module 8
The central players in transcription are ___-______ ______ that attach to the genome in order to perform their biochemical functions. Many recognize specific nucleotide sequences and bind predominantly to these target sites, whereas others bind nonspecifically
DNA-binding proteins
Module 8
Sequence-specific binding requires that part of the protein penetrates into the major and/or minor grooves of the helix in order to achieve _____ _____ of the sequence. This is usually accompanied by more general interactions with the surface of the DNA molecule, which may simply stabilize the DNA–protein complex or may access the indirect information on nucleotide sequence.
- direct readout
Module 8
DNA-binding motifs
- When the structures of sequence-specific DNA-binding proteins are compared, the family as a whole can be divided into a limited number of different groups on the basis of the structure of the segment of the protein that interacts with the DNA molecule
- present in a range of proteins, often from very different organisms, and at least some of them probably evolved more than once
Module 8
helix–turn–helix (HTH) motif
- first DNA-binding structure to be identified
- made up of two α-helices separated by a β-turn
- the β-turn is not a random conformation but a specific structure
- made up of four amino acids
- the second one is usually glycine
- This turn, in conjunction with the first α-helix, positions the second α-helix on the surface of the protein in an orientation that enables it to fit inside the major groove of a DNA molecule
- This second α-helix is therefore the recognition helix
- makes the vital contacts which enable the DNA sequence to be read
- the motif is usually 20 or so amino acids in length, so it is a small part of the protein as a whole
- Some of the other parts of the protein form attachments with the surface of the DNA molecule to help position the recognition helix within the major groove
Module 8
helix–turn–helix (HTH) motif
examples
- lactose repressor: in bacteria
- homeodomain:
- in eukaryotic proteins
- made up of 60 amino acids which form four α-helices, numbers 2 and 3 separated by a β-turn, with number 3 acting as the recognition helix and number 1 making contacts within the minor groove
Module 8
Zinc fingers
- rare in prokaryotic proteins but very common in eukaryotes
- 1% of all mammalian genes code for zinc-finger proteins
- at least six different versions of the zinc finger
- versions of the zinc finger differ in the structure of the finger
- some lack the β-sheet component and consist simply of one or more α-helices
- some differ on the precise way in which the zinc atom is held in place
- multiple copies of the finger are sometimes found on a single protein
- the individual zinc fingers are thought to make independent contacts with the DNA molecule, but in some cases the relationship between different fingers is more complex
Module 8
Zinc fingers
Cys2His2 finger
- first to be studied in detail
- comprises a series of 12 or so amino acids
- includes two cysteines and two histidines, which form a segment of β-sheet followed by an α-helix
- These two structures form the “finger” projecting from the surface of the protein, holding a zinc atom bound between the two cysteines and two histidines
- α-helix is the part of the motif that makes the critical contacts within the major groove
- β-sheet determines α-helix positioning within the groove and interacts with the sugar–phosphate backbone of the DNA
- zinc atom holds the β-sheet and α-helix in the appropriate positions relative to one another
Module 8
Often the first thing that is discovered about a DNA-binding protein is not the identity of the protein itself but the features of the _____ ______ that the protein recognizes, because many of the proteins that are involved in genome expression bind to _____ DNA sequences immediately _____ of the genes on which they act. Because of this, a number of methods have been developed for locating protein-binding sites, and these work perfectly well even if the relevant DNA-binding proteins have not been identified.
- DNA sequence
- short
- upstream
Module 8
Gel retardation / gel shift analysis
- technique is carried out with a collection of restriction fragments that span the region thought to contain a protein-binding site
- two samples of a DNA restriction digest are prepared, and a nuclear extract containing DNA-binding proteins is added to one of them
- DNA-binding protein in the extract attaches to one of the restriction fragments
- this results in a DNA–protein complex
- has a larger molecular mass than the “naked” DNA
- runs more slowly during gel electrophoresis.
- the band for this fragment is retarded
- naked DNA and DNA–protein are run in separate wells
- DNA–protein sample is recognized by comparison with the banding pattern produced by naked DNA which runs faster
- A nuclear extract is used because at this stage of the project the DNA-binding protein has not usually been purified. If, however, the protein is available then the experiment can be carried out just as easily with the pure protein as with a mixed extract
Module 8
Gel retardation / gel shift analysis
drawbacks
- gives a general indication of the location of a protein-binding site in a DNA sequence, but does not pinpoint the site with great accuracy
- no indication of where in the retarded fragment the binding site lies
- if retarded fragment is long then it might contain separate binding sites for several proteins
- if it is quite small, the binding site may also include nucleotides on adjacent fragments; such fragments do not form a stable complex with the protein on their own and so do not lead to gel retardation
- Gel retardation studies are therefore a starting point but other techniques are needed to provide more accurate information.
Module 8
Modification protection assays
- pinpoint binding sites with greater accuracy
- basis of these techniques is that if a DNA molecule carries a bound protein then part of its nucleotide sequence will be protected from modification
- two ways of carrying out the modification
- treatment with a nuclease
- cleaves all phosphodiester bonds except those protected by the bound protein.
- exposure to a methylating agent
- such as dimethyl sulfate which adds methyl groups to G nucleotides
- Any Gs protected by the bound protein will not be methylated
- DNA footprinting used with these assays
Module 8
DNA footprinting / Modification protection
- DNA fragment is labeled at one end
- achieved by treating a set of longer restriction fragments with an enzyme that attaches labels at both ends
- cutting these labeled molecules with a second restriction enzyme
- purifying one of the sets of end fragments.
- nuclease treatment is carried out under limiting conditions
- low temperature and/or very little enzyme
- so each copy of the DNA fragment is cleaved at just one position along its length
- across the whole population of fragments, every bond is cleaved in some copy except those protected by the bound protein
- carried out in the presence of a manganese salt, which induces the enzyme to make random, double-stranded cuts in the target molecules, leaving blunt-ended fragments
- protein is now removed, the mixture electrophoresed, and the labeled fragments visualized
- all fragments have labels at one end and a cleavage site at the other
- results in a ladder of bands corresponding to fragments that differ in length by one nucleotide, with the ladder broken by a blank area in which no labeled bands occur
- the blank area, or “footprint,” corresponds to the positions of the protected phosphodiester bonds, and therefore of the bound protein, in the starting DNA
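The footprint logic on this card can be sketched in Python. This is a toy model under stated assumptions, not a lab protocol: the fragment length, the protected interval, and the function names `ladder_lengths` and `find_footprint` are all hypothetical and chosen only for illustration.

```python
def ladder_lengths(fragment_length, protected):
    """Return the labeled fragment lengths expected on the gel.

    Bond i joins nucleotides i and i + 1, counted from the labeled end.
    Cutting bond i leaves a labeled piece i nucleotides long. Bonds covered
    by the bound protein are never cut, so those lengths are missing.
    """
    start, end = protected  # inclusive range of protected bond positions
    return [i for i in range(1, fragment_length) if not (start <= i <= end)]


def find_footprint(lengths, fragment_length):
    """Return the lengths absent from the ladder (the blank area)."""
    present = set(lengths)
    return [i for i in range(1, fragment_length) if i not in present]


# Hypothetical 30-nucleotide fragment with bonds 12 to 18 protected.
bands = ladder_lengths(30, (12, 18))
footprint = find_footprint(bands, 30)
print(footprint)
```

The gap in the list of band lengths plays the role of the blank area in the ladder: reading off the missing lengths gives the protected bond positions.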
Module 8
Modification interference
- identifies nucleotides central to protein binding
- not modification protection
- provides an extra dimension to the study of protein binding
- works on the basis that if a nucleotide critical for protein binding is altered, for example by addition of a methyl group, then binding may be prevented
Module 8
The dimethyl sulfate (DMS) modification protection assay
- similar to DNase I footprinting
- Instead of DNase I digestion, the fragments are treated with limited amounts of DMS so that a single guanine base is methylated in each fragment
- Guanines that are protected by the bound protein cannot be modified
- In the modification interference version, the fragments are methylated first; the binding protein or nuclear extract is then added, and the fragments are electrophoresed.
- Two bands are seen, one corresponding to the DNA–protein complex and one containing DNA without bound protein
- The latter contains molecules that have been prevented from attaching to the protein because the methylation treatment has modified one or more Gs that are crucial for the binding
- To identify which Gs are modified, the fragment is purified from the gel and treated with piperidine,
- compound cleaves DNA at methylguanine nucleotides
- The result of this treatment is that each fragment is cut into two segments, one of which carries the label
- The length(s) of the labeled segment(s), determined by a second round of electrophoresis, tells us which nucleotide(s) in the original fragment were methylated and hence identifies the positions in the DNA sequence of Gs that participate in the binding reaction
- Equivalent techniques can be used to identify the A, C, and T nucleotides involved in binding
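The last step, reading G positions off the labeled segment lengths, can be sketched as follows. This is an illustrative model only. It assumes the fragment is labeled at its 5' end and that a labeled segment of length n means the methylated G sat at position n + 1 from that end; the example sequence and the function name `critical_g_positions` are hypothetical.

```python
def critical_g_positions(sequence, labeled_lengths):
    """Map labeled segment lengths to 1-based positions of methylated Gs.

    Assumption (not from the cards): cleavage removes the modified
    nucleotide, so a labeled piece of length n points to position n + 1.
    """
    positions = []
    for n in sorted(set(labeled_lengths)):
        pos = n + 1
        if pos > len(sequence) or sequence[pos - 1] != "G":
            raise ValueError(f"length {n} does not point to a G")
        positions.append(pos)
    return positions


# Hypothetical fragment; segments of 5 and 8 nucleotides were recovered
# from the band containing DNA without bound protein.
seq = "ATGCCGTAGGCT"
print(critical_g_positions(seq, [5, 8]))
```

Raising an error when a length does not land on a G is a deliberate sanity check: DMS treatment in this assay modifies guanines, so any other result suggests a mislabeled fragment or a misread gel.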
Module 8
It is now recognized that the nucleotide sequence also influences the precise _____ of each region of the helix, and that these conformational features represent a second, less direct way in which the DNA sequence can influence _____ _____
- conformation
- protein binding
Module 8
Direct readout of the nucleotide sequence
- although the nucleotide bases are on the inside of the DNA molecule, they are not entirely buried
- some of the chemical groups attached to the purine and pyrimidine bases are accessible from outside the helix
- Direct readout of the nucleotide sequence is possible without breaking the base pairs and opening up the molecule
Module 8
Direct readout
in all three DNA forms
- B-form of DNA
- identity and orientation of the exposed parts of the bases within the major groove is such that most sequences can be read unambiguously
- whereas within the minor groove it is possible to identify if each base pair is A–T or G–C but difficult to know which nucleotide of the pair is in which strand of the helix
- A-form
- major groove is deep and narrow and less easily penetrated by any part of a protein molecule
- shallower minor groove is therefore likely to play the main part in direct readout
- Z-DNA
- the major groove is virtually nonexistent and direct readout is possible to a certain extent without moving beyond the surface of the helix.
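The B-form claim above (major groove unambiguous, minor groove only distinguishing A–T from G–C) can be checked with a small sketch. The patterns below are the commonly cited textbook summaries of the chemical groups exposed in each groove, not something stated on these cards, so treat them as an assumption: A = hydrogen-bond acceptor, D = hydrogen-bond donor, H = nonpolar hydrogen, M = methyl group. The function name is hypothetical.

```python
# Groups exposed along each groove for the four base-pair orientations.
# Keys give the base on one strand followed by its partner.
MAJOR = {"AT": "ADAM", "TA": "MADA", "GC": "AADH", "CG": "HDAA"}
MINOR = {"AT": "AHA", "TA": "AHA", "GC": "ADA", "CG": "ADA"}


def distinguishable(patterns):
    """Group base pairs that present the same pattern.

    Base pairs in the same group look identical to a protein reading
    that groove.
    """
    groups = {}
    for base_pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(base_pair)
    return sorted(sorted(group) for group in groups.values())


print(distinguishable(MAJOR))  # four separate groups
print(distinguishable(MINOR))  # AT grouped with TA, GC grouped with CG
```

In this model the major groove gives four distinct patterns, while the minor groove patterns are symmetric, which is why the strand orientation of each pair is hard to read there.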
Module 9
DNA-dependent RNA polymerases
- enzymes responsible for transcription of DNA into RNA
- eukaryotes have three different nuclear RNA polymerases:
- RNA polymerase I,
- RNA polymerase II
- RNA polymerase III
- Each is a multisubunit protein (8–12 subunits) with a molecular mass in excess of 500 kDa
- all are structurally quite similar, but functionally quite distinct
- Each works on a different set of genes, with no interchangeability
- Each of the three eukaryotic RNA polymerases recognizes a different type of promoter sequence
Module 9
RNA polymerase I
transcribes the multicopy repeat units containing the 28S, 5.8S, and 18S rRNA genes
Module 9
RNA polymerase II
- most researched
- transcribes
- genes that code for proteins
- snRNAs that are involved in RNA processing
- genes for the miRNAs
Module 9
RNA polymerase III
transcribes other genes for small RNAs, including those for tRNAs
Archaea possess a single RNA polymerase that is structurally very similar to the ______ enzymes
eukaryotic
Module 9
bacterial RNA polymerase
- has five subunits: α2ββ’σ
- 2α subunits
- one each of β and β’
- one of σ
- α2ββ’ are equivalent to the three largest subunits of the eukaryotic RNA polymerases
- σ has its own special properties
mitochondrial RNA polymerase
- consists of a single subunit with a molecular mass of 140 kDa
- more closely related to the RNA polymerases of certain bacteriophages
Module 9
Two ways in which RNA polymerases bind to their promoters
- Bacteria: direct recognition of the promoter by the RNA polymerase
- eukaryotic and archaeal:
- DNA binding protein binds to promoter forming a platform
- RNA polymerase binds to DNA binding protein
Module 9
prokaryotic promoter
- in bacteria, the target sequence for RNA polymerase
- immediately upstream of gene
- binding site for the RNA polymerase
- consists of two consensus sequences
- 6 nucleotides long
- names indicate their positions relative to the point at which transcription begins, starting point is labeled +1
- –35 box: 5’–TTGACA– 3’
- changes to sequence affect ability of RNA polymerase to bind
- –10 box: 5’–TATAAT– 3’
- changes to sequence affect the conversion of the closed promoter complex into the open form
- most –10 boxes are composed mainly or entirely of A–T base pairs
- spacing between the two boxes is very important
- between 20 and 600 nucleotides upstream of the start of the coding region of the gene
- both are on the same face of the double helix
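A minimal Python sketch of how the two consensus boxes might be located in a sequence. The consensus sequences come from the card; the allowed spacer of 15 to 19 nucleotides, the mismatch tolerance, the test sequence, and the function names are assumptions made for illustration, and real promoter prediction uses more elaborate scoring.

```python
def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=1, spacers=range(15, 20)):
    """Return (start_35, start_10, spacer) for candidate promoters.

    Indices are 0-based offsets into seq. A hit needs a -35 box and a
    -10 box, each within max_mismatch of its consensus, separated by an
    allowed spacer length.
    """
    box35, box10 = "TTGACA", "TATAAT"
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], box35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap
            window = seq[j:j + 6]
            if len(window) == 6 and mismatches(window, box10) <= max_mismatch:
                hits.append((i, j, gap))
    return hits


# Hypothetical sequence: perfect boxes separated by a 17-nucleotide spacer.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGGGGG"
print(find_promoters(example))
```

Checking the spacer explicitly reflects the point on the card that the distance between the boxes matters, not just their sequences.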
Module 9
Eukaryotic promoters
- more complex than prokaryotic promoters
- the term describes all the sequences that are important in initiation of transcription of a gene
- numerous and diverse in their functions
- core promoter / basal promoter
- where the initiation complex is assembled
- upstream promoter elements
- lie upstream of the core promoter
- Assembly of the initiation complex on the core promoter can usually occur in the absence of the upstream elements
Eukaryotic promoters
RNA polymerase I promoters
- core promoter spanning the transcription start point
- between nucleotides –45 and +20
- has an upstream control element (UCE) about 100 bp further upstream
Module 9
Eukaryotic promoters
RNA polymerase II promoters
- variable and can stretch for several kilobases upstream of the transcription start site
- consists of two main consensus segments
- –25 or TATA box: 5’–TATAWAAR–3’
- W is A or T
- R is A or G
- initiator (Inr) sequence
- mammalian: 5’–YCANTYY–3’
- Y is C or T, and N is any nucleotide
- located around nucleotide +1
- Some genes have only one of these two components and some have neither, called null genes
- still transcribed
- start position for transcription is more variable
- A few genes have additional sequences
- DPE, downstream promoter element
- located at positions +28 to +32
- has a variable sequence
- binds TFIID, a protein complex that plays a central role in the preinitiation complex
- BRE, TFIIB recognition element: a 7 bp GC-rich motif
- immediately upstream of the TATA box
- recognized by TFIIB, another component of the preinitiation complex
- PSE, proximal sequence element
- located between positions –45 and –60
- upstream of those snRNA genes that are transcribed by RNA polymerase II
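The degenerate letters in the TATA box and Inr consensus sequences (W, R, Y, N) can be expanded into regular expressions. The sketch below uses only Python's standard `re` module; the helper names and the test sequences are hypothetical, and `finditer` reports non-overlapping matches only, which is a simplification.

```python
import re

# Degenerate nucleotide codes used on the card, expanded to regex classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]",    # A or T
    "R": "[AG]",    # A or G
    "Y": "[CT]",    # C or T
    "N": "[ACGT]",  # any nucleotide
}


def iupac_to_regex(consensus):
    """Compile a consensus written with degenerate codes into a regex."""
    return re.compile("".join(IUPAC[code] for code in consensus))


TATA_BOX = iupac_to_regex("TATAWAAR")
INR = iupac_to_regex("YCANTYY")


def find_motif(pattern, seq):
    """Return 0-based start offsets of non-overlapping matches."""
    return [m.start() for m in pattern.finditer(seq)]


print(find_motif(TATA_BOX, "GGTATAAAAGCC"))
print(find_motif(INR, "TTCAGTCC"))
```

A match to the consensus alone does not establish a functional promoter element; position relative to the transcription start site also matters, as the cards note.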