HN series Flashcards
what is a gene?
a stretch of DNA that encodes diffusible products (RNA/proteins), which in turn carry out functions in the cell
how are human genes identified
by:
- similarity to known genes
- expressed sequence tags (ESTs)
- in silico prediction
what is in silico prediction
computational prediction using known features of genes
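A minimal sketch of one in silico approach: scanning for open reading frames (ORFs), one of the known gene features. The function name `find_orfs`, the `min_codons` cutoff and the forward-strand-only scan are illustrative assumptions; real gene predictors also use splice sites, promoters and similarity to known genes.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of ORFs on the forward strand.

    Indices are 0-based; `end` is exclusive and includes the stop codon.
    `min_codons` (an assumed cutoff) counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i  # first ATG in this frame opens an ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)

# Hypothetical test sequence: ATG AAA CCC GGG TAA starting at index 2
print(find_orfs("CCATGAAACCCGGGTAACC"))
```

In practice the cutoff would be far larger (hundreds of bp), because short ORFs arise by chance.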
what is a promoter
~100 bp, contains dispersed sequences that bind basal transcription apparatus
what is an enhancer
~100 bp, contains several closely arranged binding sites for transcription factors
e.g. of regulatory regions
promoters and enhancers.
what are alternative regulatory regions such as promoters and enhancers used for
alternative promoters/enhancers may be employed to achieve specific expression patterns.
what makes up the rest of the genome besides the genes
- regulatory regions,
- introns,
- centromeres, telomeres, origins of replication
- pseudogenes
- short repeat sequences
- long repetitive elements (LINEs, SINEs, transposons etc.)
- other intergenic DNA
what do introns contain
introns can contain regulatory regions, e.g. enhancers, alternative promoters
what do introns do
introns
- allow for exon splicing/protein isoforms (alternative splicing, therefore high protein encoding capacity)
- allow for non-deleterious integration of DNA (e.g. from viruses, transposons, etc)
what is the function of centromeres
required for chromosomal division
what is the function of telomeres
required for chromosomal stability
what is the function of origins of replication
required for replication
what are centromeres, telomeres and origins of replication characterised by
short, tandem repeats.
what are pseudogenes
copies of functional genes that have been altered such that they no longer have the function of the parent gene
what are the types of pseudogenes
- nonprocessed pseudogenes
- processed pseudogenes
what are nonprocessed pseudogenes
they are derived from gene duplication followed by an inactivating mutation, or from incomplete duplication
give an e.g. of a nonprocessed pseudogene
the beta-globin pseudogene.
Features of an active gene: promoter, splice junctions and an open reading frame
Changes in the pseudogene: promoter mutations, lost splice junctions, nonsense mutations and missense mutations
what are processed pseudogenes
reverse transcription of an mRNA + insertion of the DNA copy into the genome = processed pseudogene
they are a continuous stretch of exons (no introns), often ending in a poly-A tract, but non-functional because they lack a promoter
what are the short repeat sequences
- microsatellites
- minisatellites
what is common to the short repeat sequences
microsatellites and minisatellites are both unstable (the number of repeats changes readily)
what are microsatellites
repeating unit smaller than 10 bp
what are minisatellites
repeating unit of ~10 to 100 bp, greater number of repeats than microsatellites
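A small sketch of how the size cut-offs on these cards could be applied. `classify_repeat_unit` encodes the thresholds from the cards; `find_microsatellites`, its `max_unit` and `min_copies` parameters, and the test sequence are illustrative assumptions, using a backreference regex to find perfect tandem repeats only.

```python
import re

def classify_repeat_unit(unit):
    """Classify a repeat by the length of its repeating unit (card thresholds)."""
    n = len(unit)
    if n < 10:
        return "microsatellite"
    if n <= 100:
        return "minisatellite"
    return "other"

def find_microsatellites(seq, max_unit=6, min_copies=4):
    """Return (start, unit, copies) for perfect tandem repeats of short units."""
    # ([ACGT]{1,max_unit}?) captures the shortest unit, \1{n,} demands more copies
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical sequence containing (CA)5
print(find_microsatellites("GGCACACACACATT"))
print(classify_repeat_unit("CA"))
```

Real repeats are often imperfect, so dedicated tools tolerate mismatches; this sketch does not.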
what are e.g. of long repetitive elements
LINEs, SINEs, transposons
what do LTR retrotransposons code for
LTR retrotransposons code for REVERSE TRANSCRIPTASE AND INTEGRASE activities. they are able to make DNA copies of themselves and subsequently integrate into the genome
e.g. of non-LTR retrotransposon
LINEs
what are LINEs
long interspersed nuclear elements
what do LINEs encode
they encode a reverse transcriptase.
NO INTEGRASE activity
what are SINEs
short interspersed nuclear elements
where do SINEs originate
SINEs originate from small RNAs (e.g. rRNA, tRNA) that have been reverse transcribed
they DO NOT ENCODE THEIR OWN REVERSE TRANSCRIPTASE, and may use LINE machinery for replication
what are transposons
Transposons (jumping genes) are DNA elements that encode a transposase.
the transposase excises the transposon and inserts it into a new site in the genome.
what is euchromatin
light staining
accessible DNA, “active” genes
what is heterochromatin
dark staining
inaccessible DNA, “inactive” genes
what is chromatin composed of
Chromatin is composed of subunits called nucleosomes
what are the proteins in the nucleosome
histone proteins
what do histone proteins contain
many lysine and arginine residues (+ charged, non-specific interaction with the negatively charged DNA backbone)
what are the core histone proteins
Core proteins – H2A, H2B, H3, H4 (two copies of each form the octamer)
what protein is in the nucleosome besides the histone octamer
the linker protein H1; it is a histone protein but has a different function
what is the length for nucleosomal DNA (core and linker DNA length)
core DNA 146 bp
linker DNA 8bp-114 bp
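A quick arithmetic sketch using the lengths on this card: the nucleosome repeat length is core plus linker DNA, so it ranges from 154 bp to 260 bp. The function names and the 2000 bp example are illustrative assumptions.

```python
CORE_BP = 146                 # core DNA wrapped around the octamer (from the card)
LINKER_RANGE_BP = (8, 114)    # linker DNA range (from the card)

def repeat_length(linker_bp):
    """Nucleosome repeat length = core DNA + linker DNA."""
    lo, hi = LINKER_RANGE_BP
    if not lo <= linker_bp <= hi:
        raise ValueError("linker length outside the range given on the card")
    return CORE_BP + linker_bp

def nucleosome_count(dna_bp, linker_bp):
    """Whole nucleosome repeats that fit in a stretch of DNA."""
    return dna_bp // repeat_length(linker_bp)

print(repeat_length(8), repeat_length(114))   # shortest and longest repeat
print(nucleosome_count(2000, 54))             # hypothetical 2 kb stretch, 54 bp linker
```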
what do nucleosomes look like under EM
Electron microscopy: nucleosomes appear as a 10 nm fiber (“string of beads”)
what do nucleosomes do
they coil further to form a 30 nm solenoid
Further compaction is thought to give rise to heterochromatin
what are the characteristics of histone tails
- unstructured
- protrude from the nucleosome
- subject to modifications
what do acetylated histones mean
euchromatin = active genes
what do deacetylated histones mean
heterochromatin = inactive genes
what happens at CpG dinucleotides
Cytosines in CpGs can be methylated by DNA methyltransferases
CpG summary
• Cytosines in CpGs can be methylated
• Cytosine spontaneously deaminates to uracil, while 5meC deaminates to thymine
• Uracil is recognised as “foreign” and is replaced by cytosine
• Thymine can persist i.e. a mutation arises
• CpGs are underrepresented in the genome
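CpG under-representation is commonly expressed as an observed/expected ratio: the number of CpGs found, divided by the number expected from the C and G content alone. A minimal sketch, with the function name and test sequences as illustrative assumptions:

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both C and G
    return seq.count("CG") * len(seq) / (c * g)

print(cpg_obs_exp("CGCGCGCG"))   # CpG-rich toy sequence
print(cpg_obs_exp("CCCCGGGG"))   # same base composition, fewer CpGs
```

A ratio well below 1 means CpGs are depleted relative to base composition, which is what methylation followed by deamination to thymine would produce.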
what are CpG islands
- Areas of the genome where there is a relative abundance of CpGs
- Often lie in promoter regions of genes (particularly housekeeping genes)
• How have these CpGs survived?
– protected from methylation (important: some proteins protect CpGs from being methylated by preventing DNMT from gaining access)
• DNA methylation (at CpGs) is associated with inactive genes, while a lack of methylation is associated with active, transcribed genes
• How are CpG islands protected from methylation?
– not completely sure
– mediated by transcription factors/cofactors – e.g Sp1 binds 5’-CCG CGC CCG-3’
• CpG methylation as a defence mechanism (switch off)
chromatin is conserved between archaea and eukarya
- bacteria don't have nucleosome-based chromatin
archaeal DNA is digested with MNase (micrococcal nuclease, an endonuclease that cuts non-specifically in the linker region)
- when the MNase conc. is high, there is only 1 band at 50 bp
-> because the enzyme cuts between every nucleosome
- when the MNase conc. is lower, there are 2 bands at 50 bp and 150 bp
-> shows that the enzyme cuts only every 2 or 3 nucleosomes
the result shows that archaea have smaller nucleosomes, because the core histone proteins in archaea form tetramers.
Why are we interested in regulatory elements in DNA?
• identifying regulatory DNA elements allows us to predict/identify their cognate DNA-binding proteins
• regulatory DNA elements + DNA-binding proteins =
“molecular switches”
– turn genes on/off (or up/down)
• understanding these molecular switches can allow us to artificially regulate gene expression
– drugs
– gene therapy
– designer proteins
regulation in globin clusters
- the alpha cluster has 4 functional genes
- the beta cluster has 5 functional genes
- at different developmental stages, the globin clusters express different functional genes
-> this is because embryonic and fetal haemoglobin have a higher affinity for oxygen, which they need in order to take up oxygen from the mother’s blood.
Case study: haemoglobinopathies
• Mutated adult β-globin in the beta cluster -> sickle cell anaemia -> this mutation causes the protein to aggregate, so sickled red cells clog up blood vessels
what is the current research area for sickle cell anaemia
• Individuals with ↑ foetal γ-globin display less severe symptoms
• Can we reactivate γ-globin expression to reduce symptoms? – hydroxyurea
- one study used a zinc-finger transcriptional activator designed to interact with the gamma globin gene promoters, which enhances fetal hemoglobin production in primary human adult erythroblasts
- in adult mice, researchers are trying to correct sickle cell disease by interfering with fetal hemoglobin silencing
what are the regulatory elements in DNA
- Promoters
- Enhancers (and silencers)
- Locus Control Regions
- Insulators
what is a promoter
- Region of DNA around the transcriptional start site of a gene
- Comprises short motifs that are bound by transcription factors
- “Promotes” transcription by recruiting RNA polymerase (when appropriate)
what does promoter define?
• Define the start site and direction of transcription
-> the start site has an INITIATOR ELEMENT: an A at the start site surrounded by pyrimidines (Y)
-> RNA pol is recruited via the TATA box (an upstream element, roughly 25-30 bp before the start site); the TATA box interacts with the TATA-binding protein (TBP), which recruits RNA polymerase
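A minimal sketch of scanning a sequence for core promoter motifs. The consensus strings `TATAWAW` (TATA box) and `YYANWYY` (initiator) are commonly quoted forms and are an assumption here, not taken from the cards; the helper names and test sequences are hypothetical. Positions are 0-based.

```python
import re

# IUPAC nucleotide codes used by the two consensus motifs
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}

def iupac_to_regex(motif):
    """Translate an IUPAC consensus such as YYANWYY into a regex."""
    return "".join(IUPAC[base] for base in motif)

def find_motif(seq, motif):
    """Return every start position where the motif matches (overlaps allowed)."""
    # the lookahead (?=...) lets overlapping matches be reported
    pattern = re.compile("(?=(%s))" % iupac_to_regex(motif))
    return [m.start() for m in pattern.finditer(seq.upper())]

print(find_motif("GGTATAAAAGG", "TATAWAW"))   # TATA box search
print(find_motif("GGCCAGTCCGG", "YYANWYY"))   # initiator search
```

A match only suggests a candidate site; many promoters lack a TATA box, and short motifs also occur by chance.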
how is RNA pol II positioned
- Positioning factors are required for RNA pol to find promoters
- The RNA pol II positioning factor is TFIID, which is TBP associated with up to 14 other subunits called TAFs
what do enhancers contain
• contain motifs bound by transcription factors
what do enhancers do
- LOOP TO PROMOTERS to influence transcription rate
- somewhat position- and orientation-independent -> can be downstream or upstream, but they have to be on the same DNA molecule as the promoter
what are Locus Control Regions (LCRs)
• Enhancers that regulate gene clusters
e.g. the β-globin LCR
– allows developmentally appropriate expression of the different β-globin genes
- the LCR contains 5’ and 3’ hypersensitive sites, which are hypersensitive to digestion by DNase I
- the LCR loops to different promoters at different developmental stages -> allows different expression
what are insulators
• Block effects of enhancers, silencers and LCRs on unintended, neighbouring target genes
-> insulators prevent inappropriate activation of genes
what method is used to detect physical interactions between regions of chromatin in vivo? i.e. looping of DNA (chromosome conformation capture, 3C)
- formaldehyde is used for crosslinking, which joins 2 proteins or protein + DNA and holds them together, e.g. enhancer and promoter
- restriction enzyme digestion cuts the DNA between the crosslinked regions
- ligase allows intramolecular ligation to join the DNA ends held together around the enhancer and promoter
- reverse the crosslinks and PCR
- sequence the DNA to find which regions were joined, i.e. were in contact
How do we check the activity of putative promoters and enhancers?
-> to check that these promoters and enhancers actually work
• Clone the putative promoter
• Insert into an expression vector (upstream of a reporter gene)
• Why do we use reporter genes?
-> they are easy to visualise
How do we get the reporter construct into cells?
- Gene gun
- Lipofection
- Electroporation
- Calcium phosphate precipitation
- Microinjection (introduce DNA into organism)
e.g. of reporter genes
GFP (green fluorescent protein) -> green fluorescence
lacZ (β-galactosidase) -> blue colour with X-gal
firefly luciferase -> emits light when luciferin is added
how do you identify the minimal promoter and important motifs
monitor the effect of the putative promoter and enhancer on the expression of the reporter gene
- empty plasmid as control -> there should be no expression of the reporter gene
- the Bklf promoter is activated by Eklf: adding Eklf protein increases expression
- use deletion segments to identify which part of the Bklf promoter is important: if expression drops, the missing part may be important for expression
- identify the sequence needed for expression P23L3
How do we identify the proteins that bind to promoters?
• Largely by prediction (observing motifs that fit known consensus binding sites of transcription factors)
• Test by means of in vitro gel shift assays (a.k.a band retardation assays, electrophoretic mobility shift assays) or chromatin immunoprecipitation (in vivo)
BAND RETARDATION ASSAYS: if the protein binds to the radioactively labelled DNA of interest, the band is retarded and stays near the well. if the protein doesn’t bind, the free DNA migrates furthest
how do you know if a protein binding a regulatory sequence is important in disease
HAEMOPHILIA B - mutation in clotting factor IX (9)
this disease can be caused by a mutation in the coding sequence or by a mutation in the regulatory region, e.g. the -20 region
when the HNF4 (hepatocyte nuclear factor 4) protein binds to the -26 and -20 positions of the regulatory region, clotting factor IX is expressed
. if the mutation is at the -26 position, there is no recovery after puberty
. if the mutation is at the -20 position of the regulatory region, haemophilia B recovers after puberty
-> because another transcription factor, AR (androgen receptor; testosterone, its ligand, is produced during puberty), binds to the -26 position.
- as long as the -26 position is intact, haemophilia B can recover after puberty
The androgen receptor is activated at puberty and drives the clotting factor IX gene
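The logic of this card can be written as a small truth table. This is a deliberately simplified sketch of the model in these notes (HNF4 needs both the -20 and -26 positions, AR needs -26 plus puberty); the function and parameter names are hypothetical.

```python
def factor_ix_expressed(site_minus20_intact, site_minus26_intact, post_puberty):
    """Simplified model of factor IX promoter activity from the card."""
    # HNF4 binds across the -26 and -20 positions, so both must be intact
    hnf4_binds = site_minus20_intact and site_minus26_intact
    # AR binds at -26 and is only activated once testosterone rises at puberty
    ar_binds = site_minus26_intact and post_puberty
    return hnf4_binds or ar_binds

cases = [
    ("wild type, child", True, True, False),
    ("-20 mutation, child", False, True, False),
    ("-20 mutation, adult", False, True, True),
    ("-26 mutation, adult", True, False, True),
]
for label, s20, s26, adult in cases:
    print(label, "->", factor_ix_expressed(s20, s26, adult))
```

Only the -20 mutation switches from no expression to expression at puberty, which matches the recovery described above.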
how do DNA regulatory elements act
DNA regulatory elements act in cis
what do regulatory proteins bind, and in which way?
• regulatory proteins that bind DNA act in trans
– “transcription factors”
how are regulatory proteins produced?
- the regulator gene can be anywhere in the genome
- it is transcribed into mRNA
- the mRNA is translated into the regulatory protein, which is diffusible and can go anywhere to regulate a specific gene
- the regulatory protein binds to its target site at the structural gene in a sequence-specific manner
where are transcription factors normally found? give an e.g.
transcription factors normally bind near the promoter region, e.g. a particular TF called NF1 (nuclear factor 1) binds relatively close to the promoter
NF1 binds to 5’-CCAAT-3’ at around -100
Why are we interested in transcription factors?
• Each promoter is unique and is regulated by a combination of specific transcription factors
• Transcription factors represent “molecular switches” that can turn genes on/off (or up/down)
• Understanding these molecular switches can allow us to artificially regulate gene expression
– drugs
– gene therapy
– designer transcription factors
How were mammalian transcription factors discovered?
• In humans, are genes with similar functions coordinately regulated by specific transcription factors?
– if so, they might be expected to have similar promoters
“blood” erythroid genes
Globin (alpha and beta) and haem biosynthesis genes
• Alignment of promoters was difficult, but small motifs stood out e.g. GATA sites
• Perhaps there is a transcription factor that binds to these sites to coordinately regulate erythroid genes
• Check this by a band retardation assay
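A toy sketch of how a small motif can stand out even when whole promoters do not align: collect every k-mer present in all promoters. The function name, `k=4`, and the three sequences are made up for illustration and are not real promoter sequences.

```python
from collections import Counter

def shared_kmers(promoters, k=4):
    """Return the k-mers that occur at least once in every promoter."""
    counts = Counter()
    for seq in promoters:
        seq = seq.upper()
        # a set so each promoter contributes a given k-mer only once
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(kmers)
    return sorted(kmer for kmer, n in counts.items() if n == len(promoters))

# Hypothetical promoter fragments sharing only a GATA site
toy_promoters = ["CCGATAAC", "TTGATATT", "AGATAGCG"]
print(shared_kmers(toy_promoters))
```

With real promoters, short k-mers are often shared by chance, so a candidate like this still needs the band retardation test described above.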
The hunt for the GATA-binding protein
• Lyse erythroid cells to get the protein
• High salt to disrupt DNA-protein interactions
• Obtain nuclear extracts to get full nuclear protein
• Affinity purification/chromatography using IMMOBILISED DNA containing GATA sequences
– wash, wash, wash
– elute with higher and higher salt
• Test fractions for GATA-binding activity by band retardation assays
• Run positive fractions on SDS-PAGE and sequence
• This is how the GATA-binding protein GATA1 was discovered
– similarly, HNF4 (binds GTTAAT in liver genes, e.g. clotting factor IX)
Which part of GATA1 binds DNA?
• Test a GATA1 deletion series in band retardation assays
• A central portion of GATA1 binds DNA
- the central portion contains 2 zinc fingers that bind to DNA
- the majority of mammalian TFs use ZnFs to bind DNA
what are the other DNA-binding domains besides zinc fingers
• Other DNA-binding domains: basic region-leucine zippers, basic helix-loop-helix domains, homeodomains, ETS domains and MADS boxes
Once GATA1 has bound DNA, how might it regulate transcription?
• Which part of GATA1 is important for this?
– test truncated forms of GATA1 protein in reporter assays
1. use full length GATA1,
2. use C-terminal deletion GATA1
3. use N-terminal deletion GATA1
the reporter gene GFP is used.
it was found that the N-terminal deletion of GATA1 causes no expression of the reporter gene GFP; therefore the N-terminus of GATA1 contains an activation domain
what is an activation domain
they are not folded domains; they are short sequences of amino acids (short motifs) which are important in protein:protein interactions that influence the recruitment of RNA pol II
transcription factors like GATA1 are “modular”
– they generally have a DNA-binding domain and one or more activation or repression domains
- contain a DNA-binding domain, a connecting domain and an activating domain