Transcription in eukaryotes Flashcards
How does transcription termination occur in eukaryotes

Tissue specific control
- Take a human as an example of a eukaryote
- He has red blood cells, which make large amounts of beta globin
- He has muscle cells, which produce actin and myosin
- He also has neurons, which make neuropeptides
- Yet all of these cells contain exactly the same DNA, even though they are very different both structurally and functionally

Why are cells genetically the same but different structurally and functionally?
- Cells differentiate because they transcribe different genes, e.g. actin in muscle cells and neuropeptides in neurons
- Eukaryotes must make sure the right genes are switched on in the right cells at the right time
- This is known as tissue-specific control: the ability to switch on genes in the right cell type
Transcription control is exerted at 4 main levels, what are these?
- Binding of RNA polymerase: promoters and transcription factors
- Long range control: locus control regions
- Chromatin remodelling: histones and histone deacetylases
- DNA methylation: CpG islands and imprinting
When do promoters differ?
What does this help with?
Promoters for different genes are different
Each contains a combination of sites to which specific protein factors bind
All of these factors help RNA polymerase to bind in the correct place and to initiate transcription
The eukaryotic system is complex. What are the types of RNA polymerase and which genes do they transcribe?
Transcription still involves RNA polymerase
In eukaryotes, however, there is not just one; there are three:
- RNA polymerase I transcribes the ribosomal RNAs
- RNA polymerase II transcribes the mRNAs
- RNA polymerase III transcribes the tRNAs
All genes that are expressed via mRNA are transcribed by RNA polymerase II. It has a Zn binding site and 12 subunits instead of the 5 in bacterial polymerase

Tell me about the structure of eukaryotic RNA polymerase II
- Similar structure to Bacterial Polymerase
- Larger - 12 subunits instead of 5
- Unlike bacterial polymerase, it has no sigma factor and cannot initiate transcription on its own
- Requires many transcription factors
- Has to deal with DNA packed in nucleosomes

What do transcriptional activators help attract?
RNA polymerase II to the promoter, which helps to regulate rate and tissue specificity of gene expression
Other proteins control unwinding of chromatin to allow access for transcription factors
How do proteins control gene expression?
- They do this by binding to DNA
- In the major groove of the double helix
- The same in both prokaryotes and eukaryotes

Structure of eukaryotic promoters (recognised by RNA pol II)
- A eukaryotic promoter can be divided into 3 parts
- The core promoter is followed by the upstream sequence elements and an enhancer
- Start with the core promoter region
- This is where transcription begins
- The start point is designated +1 and is usually an A, as in bacteria
- The region around the start site is highly conserved and usually pyrimidine rich
- The TATA Box located approximately 25 bp upstream of the start-point of transcription is found in many promoters. The consensus sequence of this element is TATAAAA (so it resembles the TATAAT sequence of the prokaryotic -10 region but please do not mix them up). The TATA box appears to be more important for selecting the start-point of transcription (i.e. positioning the enzyme) than for defining the promoter.
- The Initiator is a sequence that is found in many promoters and defines the start point of transcription.
- The GC box is a common element in eukaryotic class II promoters. Its consensus sequence is GGGCGG. It may be present in one or more copies which can be located between 40 and 100 bp upstream of the start point of transcription. The transcription factor Sp1 binds to the GC box.
- The CAAT box - consensus sequence CCAAT - is also often found between 40 and 100 bp upstream of the start point of transcription. The transcription factor CTF or NF1 binds to the CAAT box.
- In addition to the above elements, Enhancers may be required for full expression. These elements are not part of the promoter per se. They can be located upstream or downstream of the promoter and may be quite far away from it. The mechanism by which they work is not known. They may provide an entry point for RNA polymerase or they may bind other proteins that assist RNA polymerase to bind to the promoter region.
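The element positions described above can be sketched in code. This is a minimal illustration only: the toy promoter is made up (unspecified bases are written as N), the function names are hypothetical, and real promoters rarely match the consensus sequences exactly.

```python
import re

# Consensus sequences quoted in the notes above
ELEMENTS = {
    "TATA box": "TATAAAA",
    "GC box": "GGGCGG",
    "CAAT box": "CCAAT",
}

def find_elements(seq, tss_index):
    """Return {element name: [positions relative to the +1 start point]}."""
    hits = {}
    for name, consensus in ELEMENTS.items():
        positions = []
        # A lookahead is used so that overlapping matches are also reported
        for match in re.finditer(f"(?={consensus})", seq):
            offset = match.start() - tss_index
            # There is no position 0: +1 is the start point, -1 the base before it
            positions.append(offset + 1 if offset >= 0 else offset)
        hits[name] = positions
    return hits

def build_toy_promoter():
    """Build a made-up promoter: 100 bp upstream, an A at +1, 20 bp downstream."""
    upstream = ["N"] * 100
    for start, motif in ((-70, "GGGCGG"), (-50, "CCAAT"), (-25, "TATAAAA")):
        i = 100 + start
        upstream[i:i + len(motif)] = motif
    return "".join(upstream) + "A" + "N" * 20, 100

toy_seq, toy_tss = build_toy_promoter()
toy_hits = find_elements(toy_seq, toy_tss)
print(toy_hits)
```

Each reported position is that of the first base of the element, so the TATA box placed 25 bp upstream is reported at -25.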

Tell me about the core promoter TATA box
- General transcription factors for RNA Pol II (TFII)
- Position RNA Pol II, separate DNA - initiation
- Release RNA Pol II from promoter – elongation
- Needed for all genes
TFII: transcription factor for RNA pol II

What happens with the core promoter TATA box first?

What happens with the core promoter TATA box second/next?

Tell me about the structure of the PIC?
- Pre-initiation complex (PIC) is assembled
- Elongation
- TFIIH
- 9 subunits: ATPase, Helicase, Protein kinase

TFII and elongation

Tell me about elongation and the central role of TFIIH
- The C-terminal domain (CTD) of RNA Pol II is phosphorylated
- This causes a conformational change that tightens its grip
- The general TFs dissociate
- The polymerase acquires new proteins, including elongation factors that help process the RNA and increase the elongation rate

Formation of RNA polymerase II pre-initiation complex

What does a TATA-less core promoter have?
Where are these located?
- They have an INR (initiator) and a DPE
- The DPE is a downstream promoter element
- It is located at +28 to +32 (3’ relative to the start site)
- The DPE has the sequence AGAC
- Recognised by TFII I

Structure of eukaryotic promoters (recognised by RNA pol II)
- The binding of the PIC is, however, weak, just as polymerase binding is weak in bacteria without activators
- It needs other proteins to help it bind stably
- Proteins bound upstream interact with the PIC and stabilise the interaction, just as in bacteria
- These upstream sequences are known as upstream sequence elements (USEs); the GC box and the CAAT box are examples

How do Upstream sequence elements affect transcription?
Transcription can be enhanced by the binding of transcription factors to sites upstream of the PIC

Tell me about upstream sequence elements- growth hormone deficiency?
- Growth hormone (GH) is required for normal growth
- GH deficiency results in reduced growth - 1 in 5000 infants
- In 1990, deficiency was found to be due to a mutation in Pit-1 transcription factor

Name some upstream sequence motifs and explain what they do
1. Motifs bound by general transcription factors
e.g. the general TF, Sp1 binds to GGGCGG
Sp1 is found in all cell types
2. Motifs that confer tissue specific expression
e.g., MyoD binds to CANNTG (N=any base)
MyoD is a muscle-specific transcription factor
Note all cells have CANNTG but only tissue specific cells have the MyoD TF expressed.
3. Motifs that confer response to particular stimuli
e.g., Oestrogen receptor binds to AGGTCANNNTGACCT
We now know a lot about TFs and the sequences they bind to
There are sequences, or motifs, that are bound by general TFs
For example, Sp1 binds the sequence GGGCGG and is found in all cell types
A gene needs this sequence towards its 5′ end; Sp1 binds there to ensure a high level of transcription
MyoD is found only in muscle and in no other cell type
A gene with the oestrogen receptor binding sequence produces more RNA if oestrogen is present
Typically, an RNA Pol II promoter will have 1 or 2 of these sequences in the first 100 bp upstream of the transcription start site; they may be general, cell specific or responsive to stimuli
The sequence determines which TFs bind, and therefore when and in which cell types the gene is switched on, including in response to hormones etc.
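The three motif classes above can be sketched as pattern matching, with N standing for any base. This is a minimal sketch under stated assumptions: the function names and the example sequences are hypothetical, and real TF binding tolerates mismatches that an exact pattern does not.

```python
import re

# Motifs quoted in the notes above (N = any base)
MOTIFS = {
    "Sp1": "GGGCGG",
    "MyoD": "CANNTG",
    "Oestrogen receptor": "AGGTCANNNTGACCT",
}

def motif_to_regex(motif):
    """Turn a motif into a regular expression, with N matching any base."""
    return re.compile(motif.replace("N", "[ACGT]"))

def factors_with_sites(upstream):
    """Return the set of TFs that have a binding site in the upstream sequence."""
    return {tf for tf, motif in MOTIFS.items()
            if motif_to_regex(motif).search(upstream)}

def bound_factors(upstream, expressed_tfs):
    """A site is only occupied if the cell actually expresses that TF."""
    return factors_with_sites(upstream) & set(expressed_tfs)

# Made-up upstream region containing a GC box and a CANNTG site
example_upstream = "TTGGGCGGTTACAGCTGTT"
print(factors_with_sites(example_upstream))
print(bound_factors(example_upstream, {"Sp1"}))           # non-muscle cell
print(bound_factors(example_upstream, {"Sp1", "MyoD"}))   # muscle cell
```

The last two calls show the point made in the note on MyoD: every cell carries the CANNTG site, but it is only bound where MyoD is expressed.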
What are enhancers?
Regulatory sequences that act at a distance

How do enhancers work?
Where were they first discovered?
- Transcriptional activators bind
- Help RNA Pol II bind
Discovering the first enhancer:
- Simian Virus 40 (SV40) promoter
- Deletion of a 72 bp sequence led to a 100-fold decrease in expression
- The first enhancer was therefore found in a virus
- By comparison, taking away an Sp1 site gives only a 2-fold decrease in expression, so this was a major effect
- Hence, the sequence was called an enhancer

Properties of enhancer elements
- They can activate transcription when placed thousands of bp away from the TATA box
- They act in either orientation
- Can act when placed upstream or downstream of the TATA box, or when placed within an intron
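The orientation and position independence listed above can be sketched as a search for a sequence on either strand. This is an illustration only: the 9 bp "enhancer" is a made-up placeholder, the function names are hypothetical, and finding a sequence says nothing about whether it is functionally active.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence as read 5' to 3' on the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_enhancer(seq, enhancer, tss_index):
    """Return (orientation, side, distance in bp from the start point) or None."""
    candidates = (("forward", enhancer),
                  ("inverted", reverse_complement(enhancer)))
    for orientation, pattern in candidates:
        i = seq.find(pattern)
        if i != -1:
            side = "upstream" if i < tss_index else "downstream"
            return orientation, side, abs(i - tss_index)
    return None

TOY_ENHANCER = "GATTACAGG"  # hypothetical placeholder, not a real enhancer

# Inverted copy placed upstream of the start point (the A at index 59)
upstream_case = reverse_complement(TOY_ENHANCER) + "T" * 50 + "A" + "T" * 10
# Forward copy placed downstream of the start point (the A at index 20)
downstream_case = "T" * 20 + "A" + "T" * 30 + TOY_ENHANCER

print(find_enhancer(upstream_case, TOY_ENHANCER, 59))
print(find_enhancer(downstream_case, TOY_ENHANCER, 20))
```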

Mechanism of enhancer action
- Enhancers are sequences of DNA to which a large number of transcription factors bind
- 8 proteins and 6 different subunits bind to different regions in the 72 bp sequence

The structure of eukaryotic promoters (recognised by RNA Pol II)

What was the first enhancer discovered?
What is an enhancer?
The first enhancer discovered was in Simian Virus 40 (SV40)
An enhancer is a sequence to which a large number of transcription factors bind

Give an example of a cellular enhancer
Immunoglobulin enhancer which is a cell specific enhancer
Immunoglobulin enhancer:
- What does the gene encode for?
- Where was this enhancer found?
- How many binding sites does this enhancer have?
- What are the binding sites and what is their specificity?
- An example of a 200 bp enhancer region within an intron
- It lies within the immunoglobulin gene, which encodes antibody proteins
- Enhancers have been found in cellular genes
- The first cellular enhancer was found within the immunoglobulin gene
- This enhancer was found within an intron of the gene
- The enhancer above contains binding sites for 8 transcription factors
- E1-E4: B cell specific transcription factor
- C1-C3: General transcription factor
- Oct: B cell specific Transcription factor
- High level antibody production in B cells

What do transcription factors determine?
Whether transcription occurs
Also determine cell specificity
Why is little known about TFs?
They have a low abundance
Difficult to purify
When were TFs first isolated, and by whom?
How was it done?
In 1986 Tjian isolated Sp1 by DNA affinity chromatography
- Used the HeLa cervical cancer cell line
- Lysed the cells
- Ran the lysate through a column
- The column contained beads carrying the Sp1 binding sequence
- Sp1 binds this GC-rich sequence
- Other proteins were washed off
- Raising the salt concentration eluted Sp1
- The bead sequence can be adapted for any transcription factor

What are TF made up of?
Amino acids; hence their 3D structure is important to their function
Tell me about the modular structure of TF?
Provide an example
Modular structure
- one region binds DNA, another region binds to other components
e.g., Oct-2, an octamer transcription factor specific for B cells
The members of this subfamily (designated Oct-1, Oct-2, Oct-3, etc.) recognize an evolutionarily conserved octanucleotide sequence (5′-ATGCAAAT-3′) in vertebrate promoter and enhancer elements.
These include a DNA-binding POU homeodomain flanked by two transcriptional activation domains
All the amino acids for DNA binding are in one region

What are two domains that TF can have?
- DNA binding domain (all TFs have this)
- Activation domains or inhibitory domains

What are the 3 types of DNA binding domains?
- Zinc fingers
- Helix turn helix
- Basic binding domains

Tell me about zinc finger domain structure?
What are they involved in?
Structure
- Contains a loop of 23 aa
- Usually have multiple zinc fingers
- The linker between the fingers is 7-8 aa
- Zinc is tetrahedrally coordinated by 2 cysteines and 2 histidines
Forms
- Forms a loop whose tip protrudes into the DNA major groove
Involvement in
- Multiple zinc fingers involved in binding the specific DNA sequence.
- Zn2+ ion does not directly interact with the DNA but is essential for the folding of the finger.
Binds to
- Zinc fingers bind both to the major and minor grooves

Give 2 examples of zinc fingers containing TFs?
- Steroid hormone receptor- Cys2-Cys2 fingers
- Vitamin D receptor (VDR)
Steroid hormone receptors
- What are they synthesised in response to?
- What do they exert?
Steroid hormones are synthesised in response to a variety of neuroendocrine activities
They exert major effects on cell growth, tissue development and body homeostasis
Give some examples of steroid hormone receptors
- Glucocorticoid receptor (GR)
- Oestrogen/ estrogen receptor (ER)
- Androgen receptor (AR)
What two things make up an active transcription factor?
Substrate + Receptor –> Active transcription factor
What % homology do DNA binding domains have?
43-97%
What is zinc coordinated by?
2 cysteines on the left and 2 on the right
VDR

Tell me about the structure of helix turn helix DNA binding domains
- Homeodomains are 60 aa and contain 3 helices
- The C-terminal alpha helix (helix 3) is 17 aa and lies in the major groove
- Helices 1 & 2 point away from the DNA

What can TF with basic binding domains not bind to?
They cannot bind to DNA alone; they must dimerise

What do Leucine zipper proteins bind to?
They bind DNA exclusively as homo- or heterodimers with their extended alpha helices which bind the DNA’s major groove
What do Leucine zipper proteins contain?
A leucine or a different hydrophobic AA in every 7th position in the C-terminal region of the DNA binding domain
These hydrophobic residues form a coiled-coil domain which is required for dimerisation
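The "every 7th position" rule can be sketched as a simple scan of a protein sequence. This is a rough illustration under stated assumptions: the hydrophobic residue set, the minimum of 4 consecutive repeats and the test peptides are all assumptions made here, not values from the notes, and a real coiled coil depends on more than this spacing.

```python
# Assumption: which residues count as "leucine or a different hydrophobic AA"
HYDROPHOBIC = set("LIVMF")

def has_heptad_repeat(protein, min_repeats=4):
    """True if a hydrophobic residue recurs at every 7th position
    for at least min_repeats consecutive positions (assumed threshold)."""
    # Seven reading frames cover every possible starting position
    for frame_start in range(min(7, len(protein))):
        run = 0
        for i in range(frame_start, len(protein), 7):
            if protein[i] in HYDROPHOBIC:
                run += 1
                if run >= min_repeats:
                    return True
            else:
                run = 0
    return False

# Made-up peptides: leucine at positions 0, 7, 14 and 21 in the first one
zipper_like = "LKQENSR" * 4
not_zipper = "LKQENSRKKQENSRLKQENSRKKQENSR"

print(has_heptad_repeat(zipper_like))
print(has_heptad_repeat(not_zipper))
```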

How do transcription factors activate transcription?

What are activation domains?
A region of the TF protein involved in activating transcription

Domain swap experiment

What are the different types of activation domains?

How do TF activation domains work?

How do TF activation domains work?

What do TFs recruit to modify histones?
What types of histones are we referring to?
Histones: H2a, H2b, H3 and H4
Negatively charged DNA is wrapped around the positively charged histones
What is the role of histone acetyltransferase (HAT)?
- Acetylates N-terminal tail lysines of histone units
- Neutralizes the +ve charge of the histone
- TFs now have access to the core promoter because the histone-DNA interaction is weakened
- Opens up DNA
- Allows TFs and RNA Pol II to get to the DNA

What are the 2 domains that histones have?
What are they rich in?
How can they be modified?
Histones have 2 domains
- globular domain
- amino tail domain
The tails are very rich in lysines
They can be chemically modified, which changes their charge and therefore their interactions, opening up the DNA/histone complex.
Name a co-activator which TF can recruit in order to modify histones?
p300/CBP
How does p300/CBP work as a co-activator?
- Has histone acetyl transferase activity (HAT)
- it will acetylate: H3, H4, H2A, H2B
What are inhibitory domains?
A region of the transcription factor protein involved in repressing transcription

How do TF inhibitory domains work?
a) Bind to DNA and block TFs with activator domains from binding
b) Bind to the PIC and block transcription with the inhibitory domain
(a) and (b) act by getting in the way
c) Through the recruitment of co-repressors
i. Co-repressors work by interacting with the PIC
ii. Closing / tightening chromatin structure
What can co-repressors alter?
How does it do this?
Chromatin structure
- It does this by modifying histone
- Histone deacetylase (HDAC)
- Removes acetyl group of histone units
- restores +ve charge of histone
- Close down DNA
- Shutting off transcription
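The charge argument behind acetylation (HAT) and deacetylation (HDAC) can be sketched with a toy calculation. This is a simplified model with assumptions: each lysine and arginine counts as +1, each aspartate and glutamate as -1, an acetylated lysine as 0, and the tail sequence is made up for illustration rather than taken from a real histone.

```python
def tail_charge(tail, acetylated=frozenset()):
    """Approximate net charge of a histone tail (simplified integer charges)."""
    charge = 0
    for i, aa in enumerate(tail):
        if aa == "K":
            # An acetylated lysine no longer carries its +ve charge
            charge += 0 if i in acetylated else 1
        elif aa == "R":
            charge += 1
        elif aa in "DE":
            charge -= 1
    return charge

def acetylate(tail, acetylated, positions):
    """HAT step: mark the given positions as acetylated (lysines only)."""
    lysines = {i for i in positions if tail[i] == "K"}
    return frozenset(acetylated) | lysines

def deacetylate(acetylated, positions):
    """HDAC step: remove acetyl groups from the given positions."""
    return frozenset(acetylated) - set(positions)

TOY_TAIL = "GKGGKGLGKGGAKR"  # made-up lysine-rich tail, lysines at 1, 4, 8, 12

marks = acetylate(TOY_TAIL, frozenset(), [1, 4, 8, 12])
print(tail_charge(TOY_TAIL))                       # unmodified tail
print(tail_charge(TOY_TAIL, marks))                # after HAT
print(tail_charge(TOY_TAIL, deacetylate(marks, [1, 4, 8, 12])))  # after HDAC
```

In this model the lower the positive charge on the tail, the weaker its attraction to the negatively charged DNA, which is the sense in which HAT opens up the DNA and HDAC closes it down.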

Core promoter recognition and pre-initiation complex assembly

Target transcription for new cancer treatments

Remdesivir

Summary from lecture 12
RNA polymerase II
- Transcribes mRNAs for protein-coding genes (larger than bacterial polymerase)
Promoter
- Core promoter (TATA box)
- Upstream sequence elements
- Enhancers
Transcription factors
- DNA binding, activator, repressor domains
Controls
- Tissue type
- Developmental stage
- Response to environmental condition