Semester 1 Flashcards
Will the transposon DNA sequence be conserved, mutate to escape silencing, or degenerate in the human genome over many generations?
They will degenerate
What characteristic distinguishes an indel from an insertion or deletion SV?
The length: indels are 50 bp or shorter
If 40% of variation in blood pressure is explained by genetic factors how much is explained by environmental factors?
60%
G + E = 100
E= 100 - G
100 - 40 = 60
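The same sum as a minimal Python sketch. It assumes the simple model on this card (G + E = 100, no gene-environment interaction); the function name is made up for illustration.

```python
def environmental_percent(genetic_percent):
    """Percent of trait variation left for the environment when G + E = 100."""
    if not 0 <= genetic_percent <= 100:
        raise ValueError("genetic_percent must be between 0 and 100")
    return 100 - genetic_percent


# Blood pressure example from the card: G = 40
print(environmental_percent(40))  # prints 60
```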
Can monozygotic twins have different sexes?
NO
Impossible. Sex is determined by X and Y chromosomes.
MZ twins have identical DNA and therefore inherit the same pair of sex chromosomes
A substitution that causes a change in the amino acid sequence is termed…
Non-synonymous
Missense
Nonsense
Non coding SNPs have phenotypic effects because they…
Overlap non coding genes
Change the structure and state of chromatin
Overlap transcriptional regulatory regions
Overlap post transcriptional regulatory regions
(The ENCODE project identified several classes of biochemical function of non coding DNA. Some of them overlap SNPs)
What do you call a trait controlled by a single gene?
Monogenic or Mendelian
What is an allele?
A different version of a gene or variant
What is an SNP?
A single nucleotide substitution
What is the maximum dosage of the risk allele in a biallelic SNP?
2
The risk allele is the allele that causes disease or increases the trait value (eg height)
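Dosage is just a count of risk alleles in the genotype, so it can only be 0, 1 or 2. A small Python sketch, assuming genotypes are written as two letters (eg 'AG'); the function name and example alleles are made up.

```python
def risk_allele_dosage(genotype, risk_allele):
    """Count copies of the risk allele in a two-allele genotype such as 'AG'."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype has exactly two alleles")
    return genotype.upper().count(risk_allele.upper())


# Risk allele assumed to be A for this example
for genotype in ("GG", "AG", "AA"):
    print(genotype, risk_allele_dosage(genotype, "A"))
```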
How many SNPs have been identified in the human genome?
> 100 million
What is the minor allele frequency of a SNP?
The frequency of the least common allele
Rarest allele in the population
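A rough Python sketch of how the minor allele frequency could be worked out from a list of genotypes. The genotypes are invented and the function name is hypothetical; it assumes a biallelic SNP.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Frequency of the least common allele across all genotyped chromosomes."""
    counts = Counter("".join(genotypes))  # count every allele in every genotype
    if len(counts) != 2:
        raise ValueError("expected a biallelic SNP (exactly two alleles seen)")
    return min(counts.values()) / sum(counts.values())


sample = ["AA", "AG", "GG", "AA", "AG"]  # made-up genotypes: 6 x A, 4 x G
print(minor_allele_frequency(sample))    # G is the minor allele: 4 / 10
```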
What element covers the greatest part of the human genome?
Repetitive and transposable elements
About 50% of our genome is repetitive DNA, often called 'junk DNA'!
Many transposons no longer work: a mutation in their enzyme makes it lose function, so the transposon becomes silent.
Silencing RNAs can also silence them
Select all the structural variants
(Where a long length of DNA has been changed - KEY)
Deletion and Inversion
Deletion means the chromosome has changed because you now have less of a sequence.
Inversion means a large part of the chromosome is flipped, changing its structure.
Indels are similar changes but are not considered structural variants as they are shorter in length - up to 50 bp.
What is a variable number tandem repeat?
A variable number of end to end duplications of a sequence motif
Variable number because the number of motif copies varies between individuals
Tandem repeat because the copies sit directly next to each other, head to tail
A polygenic trait is a phenotype controlled by…
Multiple variants (loci)
A complex trait is controlled by…
Multiple variants and the environment
What is heritability?
The proportion of variation in a trait (eg disease risk) explained by genetic factors
In Eukaryotes what proteins control gene expression?
RNA pol 2 - generates mRNA always
General transcription factors
Transcription factors - increase transcription (co activators)
What do repressors do?
They act as inhibitory factors to transcription
What affects the binding of proteins to DNA?
Any co factors - a + charge is beneficial as it helps interaction with the - charged DNA backbone
Shape of the DNA / protein
DNA recognition sequence
Mediator contents
What is NOT part of Sanger Sequencing?
Multiple distinct DNA fragments
What can give different results according to sampling tissue?
RNA sequencing - it is tissue specific!
What is linkage disequilibrium?
It’s the correlation between genetic variants - how likely they are to be inherited together
It is population specific
It diminishes with physical distance - SNPs that are closer together in the genome are usually more closely linked, so more likely to be inherited together
It can be used to select SNPs for genotyping arrays
What is false about polygenic risk scores?
They can be used for diagnostic purposes
What would be the most appropriate genetic testing method to confirm the diagnosis of Cystic Fibrosis?
Essay based question!
Sanger sequencing
Because it is a single gene disorder; would NOT use next generation sequencing, as that is mainly done for many genes or the whole genome!
Justify and say why not
Introduction about the disease
What is a mediator complex?
A multi subunit assembly that appears to be required for regulating expression of most RNA pol 2 transcripts
What is a co activator?
A protein or protein complex that activates genetic transcription usually by binding to a transcription factor
What is a co repressor?
A molecule that is capable of combining with a specific repressor molecule and activating it - thereby blocking gene transcription
What is a dimer?
A chemical compound composed of 2 similar subunits or monomers
What is hetero dimerisation?
Formation of a complex composed of NON identical components
What is homo dimerisation?
Formation of a complex composed of identical components
What are transcription factors?
Proteins that move into nucleus to bind to specific sequences (response elements) and recruit additional proteins to stimulate transcription
What is a consensus sequence?
Sequence that comprises the most commonly encountered nucleotides found at specific locations in the DNA
The 2 main ones (in prokaryotic promoters) are the Pribnow box (10 bp upstream of the TSS) and the -35 sequence (35 bp upstream of the TSS)
What does non coding mean?
DNA or RNA that is NOT translated into protein
What does polycistronic mean?
a single mRNA that generates multiple proteins
mRNA is generated from multiple genes
What is a promoter?
DNA sequence where transcription apparatus binds to initiate transcription
It indicates direction and which strand of DNA elongation occurs on, also contains TSS
What is the transcription start site TSS?
The first DNA nucleotide that is transcribed into RNA
What is false about GWAS?
They have discovered ALL of the estimated heritability for most traits
What are examples of precision medicine?
Diagnostic tests based on genetics
Gene therapy
Providing the correct dose of a drug based on a person’s genetics
Stratifying patients into more similar groups for more targeted treatment and follow up
What is NOT true about Sanger sequencing?
It is used for sequencing multiple genes
The plot to describe the results from a GWAS is called…
Manhattan plot
Third generation sequencing uses…
Unamplified DNA
Linkage disequilibrium…
Is the correlation between genetic variants
Is population specific
Usually diminishes with physical distance
Is used to select SNPs for genotyping arrays
What is the length of the human genome?
3 billion base pairs - 3 000 000 000 bp
How many chromosomes make up the human karyotype?
46
What is the total physical length of human DNA?
About 2 m
What part of the gene codes for protein?
Exon
How many protein coding genes have been identified in the human genome?
20 000
How many SNPs have been identified in the human genome?
100 000 000 - 100 million
How many SNPs are different between 2 unrelated individuals?
4 000 000
What is an SNP with an impossible number of alleles?
Penta allelic (there are only 4 bases, so a SNP can have at most 4 alleles)
What is an indel?
An insertion or deletion with less than 50bp
What is NOT a long structural variant?
Indel
What is the least common structural variant?
Translocations
What type of VNTR is most frequent in the coding region of genes?
Tri nucleotides
What elements are not part of the 3D organisation of the chromosomes?
Nuclear lamina
What property of DNA facilitates storing, reading and copying biological information?
Complementarity
How many transcripts have been identified in the RNA of all cell types investigated so far?
200 000
What is the most prevalent family of genes lost in humans when compared with chimpanzee?
Olfactory receptors
What gene function is not enriched for ultra conserved elements?
Signal transduction
How is gene expression associated with copy number?
Directly proportional
What type of gene is most enriched for copy number variants?
Non coding genes
What is a detectable output of transcription?
What methods could you use to measure the outputs?
Look at DNA binding proteins
Look at RNA
Use PCR to detect the RNA
How do we know if we have the right products for PCR?
PCR may lead to several bands on your gel of different sizes etc
How do we know what size we’re looking for?
You’ve got to use your primer sequences!!
Do a BLAST search and it searches for where those primers bind. There will be one at the beginning and one at the end - so find the distance between the two primers!
Need to know how big the product is you’re looking for as BLAST will come up with more than one product
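Working out the expected band size from the two primer positions can be sketched in Python. This assumes 1-based, inclusive coordinates on the same template, with the product running from the first base of the forward primer to the last base covered by the reverse primer; the positions are invented and the function name is hypothetical.

```python
def expected_product_size(forward_start, reverse_end):
    """Size in bp of the PCR product spanning both primer binding sites."""
    if reverse_end < forward_start:
        raise ValueError("reverse primer must bind downstream of the forward primer")
    return reverse_end - forward_start + 1


# Made-up coordinates: forward primer starts at 101, reverse primer ends at 350
print(expected_product_size(101, 350))  # prints 250
```

If the search reports more than one possible product, run this for each pair of hits and compare the sizes with the bands on the gel.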
Why may you have a high band on PCR?
Sequences that are in the primers (if they’re inside the exons) will also exist in your genomic DNA.
When you harvest cells, you are extracting RNA but some DNA may get through! May end up with some genomic copy which will be much larger because it has all the introns inside it as well as the exons
Could try changing primers to overcome this
What methods could you use to test the theory that p53 increases transcription from the PUMA promoter?
p53 = a beta sheet containing transcription factor
Could you tell how much more transcription is happening?
Have 2 plates of cells: some with endogenous p53 and some where you’ve added more p53.
Do PCR with primers specific for PUMA mRNA and compare whether there is a more intense band where you’ve added p53 vs the non added one!
NO - looking at it on a gel is qualitative, to establish that you would need to use quantitative PCR
How is quantitative PCR beneficial over normal PCR / reverse transcriptase PCR?
It monitors fluorescence
Quantitative tells you how much brighter it might be as detector tells you how bright it is in comparison to others
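To put a number on 'how much brighter', qPCR reports the cycle at which fluorescence crosses a threshold (the Ct value, a term not used on this card). A minimal Python sketch, assuming the product doubles perfectly every cycle and ignoring normalisation to a reference gene; the Ct values are made up and the function name is hypothetical.

```python
def fold_change(ct_control, ct_treated, efficiency=2.0):
    """Relative amount of template in treated vs control cells.

    A lower Ct means more starting template; each cycle of difference is
    one doubling when efficiency is 2.0 (an idealised assumption).
    """
    return efficiency ** (ct_control - ct_treated)


# Made-up example: PUMA crosses the threshold 2 cycles earlier with added p53
print(fold_change(ct_control=25.0, ct_treated=23.0))  # prints 4.0
```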
Why would it be a good idea to shut off translation in a situation with low amino acids?
When levels of certain amino acids are low, the wrong amino acid may be used = wrong protein sequence. So want to avoid making proteins that are missing the correct amino acids!
Don’t want to waste energy when we don’t have all the right resources to make the proteins.
Why would it be a good idea to switch off translation when cells are stressed (eg when infected with a virus)?
All viruses that infect cells may come in with their own RNA or DNA.
They are totally dependent on the translation machinery of the cell that they’re infecting
So it’s a good line of defence from the infected cell, stops virus infection! If it knows it’s infected, it can shut off protein synthesis = stops production of viral proteins
What could be the consequences of translating a protein with a PTC (premature termination codon)?
Protein will be non- functional! Can’t do job properly
Waste of energy
What situation in the cell would we want to turn on GCN4?
When amino acids are low in the cell! So need to synthesise more
When amino acids are low, the eIF2 alpha kinase (GCN2) is activated and eIF2 alpha is phosphorylated = a reduction in the amount of available ternary complex for translation initiation
GCN4 helps synthesise amino acids to help get out of this cell stress situation!
What is part of a eukaryotic gene?
Exons
Introns
Promoter
Enhancer
5’ and 3’ untranslated region
What is correct about transcription?
Transcription can be regulated by regions of DNA far away (100s of kb from the transcriptional start site)
What are the molecular mechanisms for mRNA localisation?
Random diffusion and trapping, or generalised degradation in combination with local protection - eg Drosophila mRNAs enriched in the pole plasm at the posterior pole of the embryo
Directed transport on the cytoskeleton - eg neuronal RNPs along microtubules and ASH1 mRNA along actin filaments to the bud tip in yeast
What is required for ASH1 mRNA transport to the bud tip of dividing yeast cells?
She2, She3 and Myo4 (Myosin) proteins are required for ASH1 mRNA transport along actin ‘cables’
Khd1p is a translational repressor
Additional proteins are associated with the core (She2-She3-Myo4) RNP complex for …
RNA transport
RNA cargo is released via phosphorylation of a translational repressor (Khd1p) at the bud tip.
How does this happen?
Finally, the transport machinery is released and enables local translation of ASH1 at the bud tip.
ASH1 protein in daughter cells prevents mating type switching
Khd1 protein co localises with a subset of the bud localised …
mRNAs
Xist is a large (17kb) cis acting regulatory lncRNA.
What is X inactivation specific transcript?
XIST associates with the X chromosome that it was expressed from (cis regulation)
Initiates Histone modifications (methylation, deacetylation) which results in heterochromatin formation
XIST gene.
What diseases is the unbalanced X expression associated with?
Unbalanced X expression is involved in diseases associated with infertility and intellectual disability
Eg Turner, Rett, Klinefelter Syndrome
What are the recurring concepts in post transcriptional gene regulation?
And how is specificity given?
Cis acting elements in mRNAs
Trans acting factors
Specificity given through: RBP / complexes that bind to sequence / structural elements in the RNA
ncRNAs in association with RBPs anneal with sequences in the RNA target
What are the cis acting elements in the RNA?
Polyadenylation signals in pre mRNAs
Splice sites in pre mRNAs
Regulatory elements in mRNAs influencing translation / localisation
Trans acting factors are specific RNA binding proteins that often act as a scaffold for assembly of larger ribonucleoprotein complexes.
Examples of these?
She2 protein (RBPs) as an adaptor for localisation of mRNA to the bud tip in yeast
Iron response protein (IRP) binding to iron response element (IRE) in 5’ UTR of ferritin mRNAs
What are examples of the trans acting factors: Ribonucleoprotein complexes?
Cleavage and polyadenylation (CPSF, CstF, CF and PAP)
Spliceosome (U1, U2, U4, U5, U6 snRNPs)
Exosome (RNA degradation in nucleus and cytoplasm)
What are examples of target mRNA site selection by base pairing with non coding RNAs?
Small nuclear RNAs (snRNAs) base pair with sequences in the pre mRNAs for splicing
miRNAs partially hybridise with sequences in 3’ UTR of mRNA target to repress gene expression
Codon anticodon (tRNA) to read the genetic code
Other examples: siRNAs, snoRNAs, lncRNAs
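Codon anticodon pairing from the list above is the easiest one to show: the two RNAs pair antiparallel, so the anticodon is the reverse complement of the codon. A minimal Python sketch that ignores wobble pairing (a deliberate simplification); the function name is made up.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Anticodon (written 5'->3') that base pairs perfectly with an mRNA codon."""
    # Reverse first because the two strands pair in opposite directions
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


print(anticodon_for("AUG"))  # start codon -> CAU
```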
Long non coding RNAs (lncRNAs) are defined as?
Being at least 200nts long
Exons are generally ~10 times shorter and more uniform than introns.
What is the size distribution of exons?
Average length of human exons is 150 bases
On average 10 exons / gene
What is the size distribution of introns?
Average length of human introns 1.5 kb
Largest intron is 1.1 Mb (intron 5 in KCNIP4)
Since introns can be very long, additional strategies are required to improve splice site selection!
What is the average length of human exons?
150 nts
Different modes of alternative splicing have been observed:
Exon skipping
Intron inclusion
Alternative splice site selection
Alternative transcriptional start site selection
Mutually exclusive exons
What does cleavage and polyadenylation at the 3’ end of eukaryotic pre mRNAs require?
CPSF - cleavage and polyadenylation specificity factor
CstF - cleavage stimulatory factor
Two cleavage factors (CFI, CFII)
Poly(A) polymerase (PAP)
Which of the following factors is NOT involved in the 3’ end formation of eukaryotic pre mRNAs?
DICER
What is wrong regarding miRNA and siRNA?
Both repress translation by imperfect hybridisation with mRNA targets
What is true about both miRNA and siRNA?
Both are RNA molecules 20 to 22 nucleotides long
Both anneal with complementary sequences preferentially located at 3’ UTR of mRNAs (in complexes with several RBPs)
siRNA may have originated as a defense mechanism
(False option: 'miRNA and siRNA are NOT present in eukaryotes' - both are found in eukaryotes)
Small interfering RNAs (siRNAs) cleave mRNAs upon … hybridisation
Perfect
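The perfect vs imperfect distinction between siRNA and miRNA can be sketched as a mismatch count against the target site. This is a toy Python example: the sequences are invented and shorter than real 20 to 22 nt small RNAs, G-U wobble pairs are ignored, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_mismatches(small_rna, target_site):
    """Mismatches between a small RNA and its target, both written 5'->3'."""
    if len(small_rna) != len(target_site):
        raise ValueError("sequences must be the same length")
    # The perfectly matching partner is the reverse complement of the target
    perfect_partner = "".join(COMPLEMENT[base] for base in reversed(target_site))
    return sum(a != b for a, b in zip(small_rna, perfect_partner))


def hybridisation(small_rna, target_site):
    """'perfect' (siRNA-like) or 'imperfect' (miRNA-like) pairing."""
    return "perfect" if count_mismatches(small_rna, target_site) == 0 else "imperfect"


target = "AUGGCUAG"                       # toy target site in an mRNA
print(hybridisation("CUAGCCAU", target))  # perfect
print(hybridisation("CUAGACAU", target))  # imperfect (one mismatch)
```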
How do you break up the cells and separate the RNA from proteins and lipids and other contents of the cells we are not interested in?
(Practical)
Resuspend the cells in Lysis Buffer and vortex for 10s to mix thoroughly
Can you amplify genomic DNA using RT PCR (reverse transcriptase pcr) technique?
False
To extract RNA, we use a silica membrane. Why?
(Practical)
RNA binds to the membrane allowing all the impurities to be washed away and removed
What is the correct use of pipettes?
- Press the plunger to STOP1 outside the container to expel air
- Go into container at 90 degrees and slowly release plunger to draw liquid in
- Move the loaded pipette to the desired well and press the plunger to STOP2
- Remove the tip from the well and only then release the plunger to resting position
To draw up liquid with a pipette, the correct way is…
You hold the pipette perpendicular to the reservoir and touch the liquid as little as you can to avoid carryover, while confirming visually that it is loading correctly
At the beginning of the practical, we need to separate differentiated from undifferentiated MEL cells from culture solution. What is the correct procedure to do this?
Put the labelled tubes in the centrifuge, so we can pellet cells at the bottom of the tube and then remove the media
RT PCR stands for…
Reverse transcriptase PCR
What do we call a sample in which a known template (RNA or DNA) is added to check if reaction works properly?
Positive control
During PCR, what is the stage where we apply high temperatures to separate DNA strands?
Denaturation
Our PCR amplicons are designed to cross introns. Does this mean that PCR products generated from RT PCR will be larger or smaller than the ones from genomic PCR?
Smaller
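Why smaller: the RT PCR template is spliced mRNA (exons only), while genomic DNA still contains the intron between the primers. A quick Python sketch with invented lengths; the function name is hypothetical and it assumes the amplicon covers the listed exon parts and introns completely.

```python
def amplicon_sizes(exon_parts, introns):
    """Return (RT PCR product size, genomic PCR product size) in bp."""
    rt_pcr_size = sum(exon_parts)              # spliced mRNA: exon sequence only
    genomic_size = rt_pcr_size + sum(introns)  # genomic DNA keeps the introns
    return rt_pcr_size, genomic_size


# Made-up amplicon: 80 bp of one exon, a 1500 bp intron, 120 bp of the next exon
rt_size, genomic_size = amplicon_sizes(exon_parts=[80, 120], introns=[1500])
print(rt_size, genomic_size)  # prints 200 1700
```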
How many master mixes do you need to prepare for genomic PCR?
(Practical)
Two
What is true about tRNAs?
Deliver amino acids to the ribosome
Which factor binds to the 3’ end of eukaryotic cellular mRNAs?
PABP - poly A binding protein
Which protein escorts the tRNA initiator to the 40S subunits?
eIF2
Which protein unwinds RNA structure to allow 43S scanning?
eIF4A - protein with helicase activity that physically unwinds RNA and uses ATP to do it
Can’t do it without 4G and 4E
How can stress impact ternary complex availability? What are the steps?
What are the two outcomes?
Stress (amino acid insufficiency, unfolded proteins, viral infections)
eIF2a kinase activation
eIF2a phosphorylation - reduced eIF2B available
Reduced eIF2 GTP tRNA (ternary complex) availability
Two outcomes: Global translation downregulation and Selective translation up regulation
What does PERK detect?
Unfolded proteins in the ER
What does PKR detect?
Viral infections
What does HRI detect?
Low heme
What does GCN2 detect?
Low amino acids
What does PKR detect?
Double stranded RNA - dsRNA
What does PERK detect?
ER stress
Why is global translation down regulation helpful?
Don’t waste resources
Redirect energy to stress recovery
Don’t make faulty proteins when AAs are scarce
Stop viral protein production
Allow proper protein folding
Why is selective translation up regulation helpful?
Produce products to recover from stress eg GCN4 , ATF4
Turn translation back ON (GADD34)
GCN4 is a transcription factor activating more than 50 genes involved in amino acid synthesis.
Is it produced under normal conditions?
NO
Ternary complex is abundant under normal conditions
What happens under amino acid stress conditions?
uORF translation impaired by eIF2a P and GCN4 translation is promoted
Difference between ternary complex in normal vs amino acid stress conditions?
Normal - ternary complex abundant
Stress - ternary complex limiting, only downstream AUG recognised
RNA structure impairs cap binding and scanning. Translation on these structured mRNAs is highly eIF4E dependent.
Unstructured 5’ UTR =
Structured 5’ UTR =
Easy scanning
Tough scanning
eIF4E over expression increases efficiency of…
Cap binding
mRNAs with structured UTR include…
c myc, VEGF, cyclin D1, FGF
Therapeutic suppression of translation initiation factor eIF4E expression reduces tumour growth without …
Toxicity
ASO antisense oligonucleotide can reduce eIF4E expression in … cell lines
Human tumour
Which of the following statements about eIF4E is TRUE?
The mTOR signalling pathway can regulate eIF4E activity - doing that through the phosphorylation of 4E BP
What is correct about Transcription?
Transcription can be regulated by regions of DNA far away (100s of kb from the transcriptional start site)
What controls transcription in prokaryotes?
Operator regions - can recruit transcriptional repressors to promoter
Cis regulatory elements - can recruit transcriptional activators to the promoter
Transcriptional activators ONLY bind to cis elements
True or false?
True - in prokaryotes they can only activate transcription
Ligand dependent binding changes the conformation of a protein so that it can bind to DNA.
Are these types of proteins always acting as activators of transcription?
NO
Eg tryptophan is required for the trp repressor to bind to the operator (this represses transcription)
Eg cAMP was required for activator binding to the cis regulatory elements to initiate transcription
What are transcriptional activators?
Promote regulator binding
Recruit RNA polymerase 2
Release RNA polymerase 2 either to begin transcription OR from a paused state
What are broad promoters?
Require assembly of multiple independent protein complexes to form across Kbp of DNA
What are sharp promoters?
Controlled by the binding of fewer protein complexes located over a shorter span of non coding DNA
What can affect the strength of a promoter?
There can be more than one TSS
There does NOT have to be a TATA box
Chromatin structure can override all of this
What does the term Epigenetics mean?
Modification / changes of gene expression that aren’t caused by any changes in the base pair sequence within the genetic code
Includes: condensation and relaxation of chromatin / activators and repressors / chromatin modifiers
What are the components of chromatin?
Two molecules EACH of H2A, H2B, H3 and H4 histones. Makes an octamer
146nt of DNA winds 1.65 times around histone core to = nucleosome
Hydrogen bonds form between DNA and histone octamer
Each nucleosome separated from next by 80nt of linker DNA
Histone H1 works as a clamp!
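Using the two numbers above (146 nt wrapped + 80 nt linker), you can estimate how many nucleosomes fit on a stretch of DNA. A rough Python sketch: it treats spacing as perfectly regular, which real chromatin is not, and the function name is made up.

```python
CORE_DNA_NT = 146    # DNA wound around one histone octamer (from the card)
LINKER_DNA_NT = 80   # linker DNA to the next nucleosome (from the card)


def nucleosome_count(dna_length_nt):
    """Whole nucleosome repeats (core + linker) that fit in a DNA length."""
    repeat_length = CORE_DNA_NT + LINKER_DNA_NT  # 226 nt per repeat
    return dna_length_nt // repeat_length


print(nucleosome_count(10_000))  # a 10 kb stretch -> 44
```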
How does chromatin control gene expression?
Heterochromatin = short and tightly bound
Euchromatin = long and loose
More tightly wound structure = less access to bases and transcription factors
Folding protects DNA from being exposed for usage
Unfolding will expose DNA to transcription factors = allows promoters to be bound by GTFs and other TF to allow genes to be expressed
Nucleosomes prevent … access by general transcription factors and RNA pol …
Promoter
2
Transcriptional activators recruit coactivators including:
Histone modification enzymes
ATP dependent chromatin remodelling complexes
Histone chaperones
What are the 4 main mechanisms that change chromatin structure?
Nucleosome sliding
Nucleosome eviction
Histone variant exchange
Histone tail modification and DNA methylation
Sugar phosphate backbone of DNA is … charged so interactions between the histones and the DNA require positively charged … contacts
Negatively
Amino acids
Generally, regulators bind DNA in nucleosomes with lower affinity than to naked DNA because…
Cis regulatory sequence facing inwards
Changes to the shape of the binding site due to associated protein binding
Broad promoters contain multiple … elements and attract transcription factors that influence … in a variety of ways
Cis
Transcription
Other epigenetic mechanisms:
DNA methylation - direct modification of DNA bases
Interactions between DNA modification and protein modifications
Epigenetics is the study of how gene expression can be changed WITHOUT changing the DNA sequence.
What 2 mechanisms is it controlled by?
DNA methylation
Histone modification
Euchromatin is decondensed chromatin.
What causes the relaxed structure?
Replication
Transcription
DNA repair
Heterochromatin is condensed chromatin
What causes the condensed structure?
Inhibition of transcription
Cell division
What does nucleosome sliding require?
Histone chaperones - eg nucleosome assembly protein 1 (NAP1)
ATP dependent chromatin remodellers - eg SWI2/ SNF2
All are ATP dependent complexes which interact with histones directly
Nucleosome eviction
Removal of entire nucleosome
ATP dependent process
May occur in conjunction with Histone exchange
What is Histone variant exchange controlled by?
Histone chaperones - eg Asf1
ATP dependent chromatin remodellers - eg SWR complex
What are Histone tail modifications?
Changes in amino acid properties change the interaction interface with DNA
Histone tails help to ‘grip’ the DNA so changes here can affect the attraction between them
Overall charge is important for …
DNA: protein interactions
Arginine modification
Always retaining a + charge
Different enzymes catalyse the addition of the 1st methyl group and the addition of subsequent methyl groups
Added by Histone methyl transferases
Removed by lysine demethylases
Sumoylation and ubiquitinylation
Occurs on lysine residues
Neutralises the positive charge
Phosphorylation
Mostly at serine (but also threonine and tyrosine) residues
Generates a negative charge
Some modifications are in competition because:
Occur at the same residue - eg lysine Acetylation and methylation
Occur on consecutive amino acids, creating a steric hindrance
Occur on consecutive amino acids and binding of 1 site stimulates binding of further proteins
Modification by the components of the … or Coactivator determine the … of the changes
Mediator complex
Longevity
3D bending of the DNA and binding of proteins to the … may prevent transcriptional initiation or progression of the RNA pol … complex
Insulator
2
Hearing is the ability to perceive sounds.
What are sound waves?
Alternating high and low pressure regions travelling in the same direction
Originate from a vibrating object
Frequency of sound vibration = pitch
Amplitude = how loud (decibel, dB)
What is the audible range of the human ear?
Audible range = 20- 20 000 Hz
Hears most acutely between 500-5000Hz
What are the 3 main regions of the ear?
External ear - collects sound waves and channels them inward
Middle ear - conveys sound vibrations to the oval window
Internal ear - houses receptors for hearing and equilibrium
Physiology of hearing
- Auricle directs sound waves into the external auditory canal
- Tympanic membrane vibrates back and forth
- Vibrations transmit to malleus, incus and stapes
- Stapes vibrates in the oval window
- Fluid pressure waves form in the perilymph of the cochlea
- Pressure waves transmit from scala vestibuli to scala tympani to the round window so it bulges into middle ear
- Pressure waves deform walls of scala vestibuli and scala tympani
- Pressure waves cause basilar membrane to vibrate
- This moves hair cells of the spiral organ against the tectorial membrane
- Stereocilia bend and generate nerve impulses in 1st order neurons in cochlear nerve fibres
Sound transduction - what happens when hair cell is at rest?
Stereocilia point straight up
Cation channels are partially open
Weak depolarising receptor potential
Ca2+ ions enter
Low level NT release
Sound transduction - what happens when hair cells are stimulated?
Vibration of basilar membrane
Stereocilia bend and open cation channels
Larger numbers of K+ ions enter the cell
Strong depolarising potential
More Ca2+ channels open
Increased NT release
DNA methylation
DNA methylation is carried out by DNA methyltransferases (DNMTs)
De novo DNMTs add methyl groups to unmethylated DNA - eg DNMT3a and DNMT3b
Other DNMTs add methyl groups to the daughter strand during DNA replication - eg DNMT1
DNA methylation
In around 1% of nucleotides of the genome
Mostly occurs at CpG dinucleotides (CpG islands near promoters are usually unmethylated)
Causes different effects depending on the gene sequence
What does DNA methylation have a role in?
Regulating tissue specific gene expression
Genomic imprinting
X chromosome inactivation
How do CpG islands affect expression?
Regulation of chromatin structure AND transcription factor binding by:
Fewer nucleosomes
Often close to TSS
Usually encompass transcription factor binding sites
Methylation of exon 1 helps recruit TF
OPEN UP STRUCTURE AND RECRUITS ENZYMES
How does DNA methylation affect expression?
Regulation of chromatin structure AND transcription factor binding by:
Recruitment of inhibitory TF
Disrupting binding to TF binding sites
Methylation of promoter causes gene silencing
STABLE SILENCING
What are epigenetic readers?
Chromatin structure can be changed with the association of a reader complex which is similar in theory to the mediator complex
What are epigenetic writers?
The writers are the enzymes that change the modifications - eg Histone acetyl transferases, Histone methyl transferases and DNA methyl transferases
What are epigenetic erasers?
The erasers are enzymes that modify or remove these marks: Histone deacetylases and lysine demethylases
How is epigenetics spread?
- Regulatory proteins bind to specific cis sequence
- These recruit Histone modifiers (writers)
- A reader protein recognises the modification to the Histone / DNA
- A writer recognises the reader and binds to it, providing a platform from which the writer can make another modification on the next nucleosome (goes back to step 3!)
What happens in the epigenetic modification: expression?
Histone tails methylation (me3)
Inhibition of DNMT
What is the epigenetic modification: repression?
Methylation of DNA and Histone tail modification
Recruit methyl binding proteins (MBD)
Recruit Histone tail modifiers (eg HDAC and HMT)
Barrier sequences can stop the spread of changes to chromatin structure, creating CHROMATIN DOMAINS that are regulated separately.
Examples are:
Tethering a chromatin domain to a large fixed site, such as the nuclear pore complex
Strong binding of a group of nucleosomes to a barrier protein
Recruitment of a mediator complex containing chromatin modifying enzymes to erase modifications that will spread changes
Summary of controlling gene expression
A) competitive DNA binding
B) masking the activation surface
C) direct interaction with the general TF
D) recruitment of chromatin remodelling complexes
E) recruitment of Histone deacetylases
F) recruitment of Histone methyl transferase
Epigenetic case study: Amyotrophic lateral sclerosis (ALS)
Progressive disease that affects motor neurons
CAUSE = overall decrease in acetylated Histone levels within motor neurones by:
Reduction in histone Acetyl transferase (HAT) activity
Increased HDAC activity
Epigenetics case study: Fragile X
Caused by silencing of the FMR1 gene
CGG tri nucleotide repeats in the 5’ UTR of FMR1 associated with disease onset
Expansion of number of repeats (>200) - hyper methylation of promoter in an attempt to turn off expression
C turned into 5mC by DNMTs = interaction with Histone marks / compacted chromatin structure / harder for TF to be recruited and enhance expression
What is cancer?
Defined as the continuous uncontrolled growth of cells
What are tumours?
Any abnormal proliferation of cells
They are classified as to their cell type
Can arise from any cell type in the body
Benign - stay confined to original location
Malignant - capable of invading surrounding tissue and spreading to entire body
Epithelial cells can give rise to tumours called…
Carcinomas
Carcinomas cause 80% of cancer related deaths
Examples of carcinogens that can cause cancer
Radiation - X rays (ionising) and UV light
Chemicals - tar from cigarettes
Virus infection - papilloma virus can be responsible for cervical cancer. Hepatitis B and C viruses = liver cancer
Hereditary predisposition - some families are more susceptible to getting certain cancers
What 7 types of cancer can alcohol cause?
Mouth
Upper throat
Voice box (larynx)
Oesophagus
Breast
Liver
Bowel
4 ways that alcohol causes cancer
Damages cells
Increases damage from tobacco
Affects hormones linked to breast cancer
Breaks down into cancer causing chemicals
Tobacco can cause … types of cancer
14
Chemical carcinogens
Thousands have been identified
Broad range of chemical structure
Classified as direct acting and indirect acting
Carcinogens cause mutations:
Can cause DNA damage, change sequence of DNA bases, change codon and affect protein function
Most commonly occurring spontaneous change in DNA:
Oxidative damage
Spontaneous purine hydrolysis
Deamination
How to identify carcinogens (compounds that can cause mutations): Ames Test
Tris(2,3-dibromopropyl) phosphate used as a … in plastic and textiles. Furylfuramide used as an … in food in Japan
Using bacteria that require … to grow > growth means they attained the ability to … histidine as a result of mutation (revertant bacteria)
Flame retardant
Antibacterial additive
Histidine
Synthesise
Benzo(a)pyrene
Epidemiological studies suggested smokers had … incidence of lung cancer
Cigarette smoke contains about … carcinogens and they can induce mutations in different genes
60% of lung cancers have inactivating … in the p53 gene, which is a major tumour … gene
High
60
Mutations
Suppressor
Purine hydrolysis
DNA damage events in a single cell range from 10^4 to 10^6 per day
Under physiological conditions, spontaneous purine hydrolysis takes place leaving a sugar without the attached base leading to generation of AP site
Routes to oxidative DNA damage dependent mutagenesis
Mispairing of 8 oxoG with adenine during replication leads to a G to T change (C to A on the opposite strand), causing the GC base pair to mutate to TA
GC to TA mutation can also result from replication encountering an AP site
Deamination can induce point mutations and mismatch
Deamination of cytosine and of 5 methylcytosine are the most common deamination reactions
DNA bases can be deaminated and if unrepaired can cause mutations
Deamination can change a GC base pair to AT
Adenine and guanine can also be damaged by …
Both hypoxanthine and Xanthine can pair up with …
Deamination
Cytosine
Deamination changes GC base pair to … base pair
AT
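A minimal Python sketch of why an unrepaired cytosine deamination turns a GC pair into AT. The function names and the one-letter model are illustrative assumptions, not part of the lecture: C is deaminated to U, U templates A in the first round of replication, and that A templates T in the second round.

```python
# Base pairing used by the polymerase; U (deaminated C) pairs like T, so it templates A.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def deaminate_cytosine(base):
    """Cytosine loses its amino group and becomes uracil; other bases are left alone here."""
    return "U" if base == "C" else base

def replicate(template_base):
    """Return the base inserted opposite the template base."""
    return PAIR[template_base]

def fate_of_gc_pair():
    """Follow the damaged strand of a G:C pair through two rounds of replication."""
    damaged = deaminate_cytosine("C")  # C -> U, the pair is now G:U
    round1 = replicate(damaged)        # U templates A
    round2 = replicate(round1)         # A templates T, giving an A:T pair
    return damaged, round1, round2

print(fate_of_gc_pair())
```

The position that started as G:C ends up as A:T on the lineage descended from the damaged strand.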
Incidence of cancer increases with…
Age
Multi hit model of cancer induction explains why cancer incidence rises with age:
Predicts increase in cancer incidence with age
Cancers arise by an evolutionary selection process following the theory of ‘survival of the fittest’
What are the 4 mutations of the Multi hit model of cancer induction?
1st mutation - gives a slight growth advantage to mutant cells
2nd mutation - mutant cells grow more uncontrollably and form a small benign tumour
3rd mutation - mutant cells overcome constraints imposed by tumour surroundings. Outgrow others to form a mass of cells, each of which has all 3 genetic changes: 1st, 2nd and 3rd mutations
4th mutation - cells can escape (and survive) into the bloodstream and establish daughter colonies at other sites (hallmark of metastatic cancer)
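A rough Python sketch of why needing several hits makes cancer rise steeply with age. It assumes, purely for illustration, that each mutation arises independently at a constant rate; the rate value and the function name are made up, and the order of the mutations is ignored.

```python
import math

def prob_all_hits(age_years, hits=4, rate_per_year=0.01):
    """Probability that a cell lineage has acquired all `hits` mutations by a given age.

    Assumes independent mutations at a constant, hypothetical rate per year.
    """
    p_one = 1.0 - math.exp(-rate_per_year * age_years)  # chance of any one hit by this age
    return p_one ** hits                                # all hits must have happened

for age in (20, 40, 60, 80):
    print(age, round(prob_all_hits(age), 4))
```

Under these assumptions, doubling the age more than doubles the probability, which matches the idea that mutations accumulate over time.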
Exogenous and endogenous agents can …the DNA
Damage
Carcinogens are mutagens, they induce changes in DNA sequence
IF REMAIN UNCORRECTED can lead to …
Mutations
Ames test can be used to detect potential …
Carcinogens
Repeated mutations and their accumulation over time in cells explain why tumour formation is a gradual … process which can take several years for cancer to develop
Multi step
Pentose phosphate backbone
5’ phosphate
3’ hydroxyl
5’ to 3’ directionality
Phosphodiester bond
Bases on the same side
Negative charge outside
Sequence of bases forms the primary structure
3’ hydroxyl is the substrate for DNA polymerase
SSBs are constantly and spontaneously … in cells
Generated
SSB repair is promoted by Poly ADP ribose polymerase (PARP) enzyme.
This leads to … of one of the EXCISION REPAIR pathways to … the damaged DNA
Activation
Repair
Inability to repair SSB converts them into DNA double strand break (DSB) when the cells …
Divide
DSB are highly … to cells, any unresolved DSBs are sufficient to kill cells: this is what … causes
Toxic
Radiotherapy
DNA exonuclease can hydrolyse a phosphodiester bond on DNA …
Termini
DNA endonuclease creates a nick in between DNA chain by … (hydrolysis) a phosphodiester bond
Cutting
DNA ligase closes the nick by forming a … bond
Phosphodiester
DNA polymerase mediated proofreading
DNA pol is the first line of defence against mutations
DNA pol in E coli introduces 1 wrong base for every 10K bases incorporated, but the mutation rate is 1 wrong base per 1 billion
Low mutation rate is due to the proofreading (3’ to 5’ exonuclease) function
Proof reading is vital for all cells to avoid excessive mutations
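The two figures on this card can be compared with a short calculation. This is only a sketch using the numbers quoted above; the variable names are assumptions. The notes attribute the gap to proofreading, and mismatch repair (covered later) also contributes.

```python
# Figures quoted on the card, expressed as "bases incorporated per wrong base"
bases_per_error_polymerase = 10_000        # raw insertion error: 1 in 10K
bases_per_error_observed = 1_000_000_000   # observed mutation rate: 1 in 1 billion

def fold_improvement(raw, observed):
    """How many times more accurate the final result is than raw base insertion."""
    return observed // raw

print(fold_improvement(bases_per_error_polymerase, bases_per_error_observed))
```

So error correction improves fidelity by a factor of about 10^5.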
PARP1 is critical in recruiting DNA repair proteins
PARP identifies DNA damage and signals the need for repair
PARP1 detects DNA damage via its DNA binding domain
Activated PARP1 adds ADP ribose units to PARP1 and leads to the formation of long and branched chains of poly (ADP ribose) (PAR)
These PAR chains create a scaffold that recruits critical proteins for DNA repair
Replication errors are corrected by mismatch excision repair (MER)
Inherited loss of function mutations in the MSH2 or MLH1 genes cause predisposition to hereditary non polyposis colorectal cancer
What is DNA pol b?
A specialised DNA polymerase used in repair
Nucleotide excision repair (NER) corrects … DNA damage
UV induced
NER defects lead to …
Xeroderma pigmentosum
Repair of DNA double strand break (DSB) by homologous recombination
DNA DSBs are highly toxic, a few sustained DSBs are enough to kill a cell
Radiotherapy and many anti cancer drugs destroy tumours by causing DNA double strand breaks
Incorrect joining of DSBs can create hybrid genes and can place a low expression gene under control of a strong promoter
Cells use 2 processes to carry out DSBs repair:
- Non homologous end joining (NHEJ) - error prone
- Homologous recombination (HR)
What is synthetic lethality (SL)?
Exploiting tumour defects to cure cancer
It arises when a combination of deficiencies in the expression of two or more genes leads to cell death, whereas a deficiency in only one of these genes does not
PARP repairs DNA SSBs so they are … converted into double strand breaks
PARP inhibitor drugs cause … to convert to DSBs
Not
SSB
DNA damage is constantly occurring and therefore DNA repair pathways evolved to maintain …
Their failure can lead to …
Genomic integrity
Diseases
At least 3 types of excision repair mechanism can correct SS DNA damage in the DNA using … sets of enzymes in a sequential manner
Specific
PARP (by regulating SSB repair) and BRCA (by regulating DSB repair) coordinate in maintaining the … of the DNA in …
Integrity
Breast cancer
BRCA (in breast / AR in prostate) deficient tumours cannot efficiently repair DSBs, they rely on … activity to ensure DNA SSBs are repaired and DO NOT progress to …
PARP
DSB
BRCA and PARP are synthetically lethal in … cancer.
Androgen receptor AR and PARP are synthetically lethal in … cancer
Breast
Prostate
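Synthetic lethality between PARP and BRCA can be written as a small truth table. This is a toy Python model with hypothetical names, assuming only what the cards above state: PARP keeps SSBs from becoming DSBs, and BRCA is needed to repair DSBs.

```python
def cell_survives(parp_active, brca_functional):
    """Toy model: a cell dies only if SSBs become DSBs AND the DSBs cannot be repaired."""
    if parp_active:
        return True             # SSBs are repaired and never progress to DSBs
    return brca_functional      # SSBs became DSBs, so survival now depends on BRCA

# Print the full truth table
for parp in (True, False):
    for brca in (True, False):
        print(f"PARP active={parp}, BRCA functional={brca} -> survives={cell_survives(parp, brca)}")
```

Only the combination of a PARP inhibitor and a BRCA deficient tumour cell is lethal, which is why normal cells (functional BRCA) tolerate the drug.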
Hallmarks of cancer are biological capabilities acquired during multistep development of human tumours
What are they?
- Sustaining proliferative signalling - oncogenes
- Evading growth suppression - tumour suppressor genes
- Reprogramming of energy metabolism
- Inducing angiogenesis
- Genome instability and mutations
What are the 3 broad categories of genes implicated in cancer?
Proto oncogenes
Tumour suppressor genes
Genome maintenance genes - mutation allows propagation of gene mutations as the DNA repair is inefficient
Proto oncogenes and oncogenes
Rationale for their existence: required for normal growth and proliferation
Proto oncogenes (ras) - gain of function mutations convert them into oncogenes
Activation of oncogenes can trigger …
Cancer
Mechanisms of gain of function mutation
- Point mutation (a change in single base pair) - altered protein product
- Chromosomal translocation - fuses 2 genes together to produce a hybrid gene
- Amplification - generation of numerous proto oncogene copies, leading to overproduction of the encoded protein
Loss of ligand (growth factor) dependent receptor activation
As a result of ‘gain of function point mutations’ in the receptor tyrosine kinases, the requirement of growth factor (ligand) to trigger receptor dimerisation may be abolished allowing cells to grow rapidly and uncontrollably
Oncogenic receptor can promote proliferation without …
Growth factors
Growth factor can trigger autocrine activation in cancer
Normally the receptor and ligand are produced by different cells
In some cancers, BOTH of these can be made by the same cell, losing regulated control of growth / division
Eg EGF and EGFR being made by the same cells in cancer
Her 2 kinase is an oncogene
1/3 Breast tumours express Her2, triggering sustained proliferation of cancer
Trastuzumab (a monoclonal antibody) can target Her2 leading to 1) repression of Her2 mediated growth signalling
2) destruction of the tumour cells by the immune system
Transcription factors can drive cancers
Prostate cancer stimulated by androgen (testosterone), the male sexual hormone > activates androgen receptor AR
Breast cancer stimulated by estrogens (female sex hormones) > activate estrogen receptor ER
AR and ER are examples of nuclear hormone receptors
Nuclear hormone receptors are therapeutic targets
Enzalutamide = AR antagonist
Tamoxifen = ER antagonist
Tumour suppressor genes
Ensure cells with oncogenes are repaired / killed, hence anti cancer
As long as they are intact, cancer cells should NOT survive
Many cancers have inactivation of tumour suppressors and that’s how cells with oncogenes can make tumours
Rationale for existence: controlling cell cycle checkpoints and development. They also regulate ‘apoptosis’
Tumour suppressor genes
Eg Rb / APC / p53
Loss of function mutation deletes an important brake on the cell cycle or checkpoint control
Hypoxia =
Lack of oxygen
Angiogenesis =
Formation of new blood vessels
Apoptosis =
Programmed cell death
Senescence =
Cellular ageing
Normal cells
Proto oncogenes + tumour suppressor genes =
Regulated cell growth and proliferation
Cancer cells
Gain of function mutation in proto oncogene + loss of function mutation in tumour suppressor genes =
Abnormal growth and cancer
Hallmarks are …
Specific characteristics found in cancer cells
Oncogenes promote … while tumour suppressor genes repress ..
Tumour growth
Cancer
Nuclear receptors as well as many gene fusions can become … as a result of mutations
Oncogenes
AR in … and ER in .. are potential oncogenes
Prostate
Breast
One of the ways that ATM participates in DNA repair is via activation of … tumour suppressor function
P53
Rb protein puts a brake on … and thus cell division by trapping E2F transcription factors
DNA replication
Viruses not only carry oncogenes but some of their proteins can … tumour suppressor gene function
Inactivate
Blocking oncogene function forms the basis of …
Cancer therapy
Tumour suppressor genes maintain … and ensure cells don’t divide with damaged DNA
Genomic integrity
What are the effects of condensed chromatin structure? (Making heterochromatin)
Distance between nucleosomes decreases
Further packaging of DNA into solenoid then looped structures, which requires interaction with another level of proteins
Cellular effects: decrease in transcription from some gene promoters
What can we monitor if we want to determine the state of the chromatin structure?
What can we monitor to identify the effect that its changes in structure are having?
- mRNA levels
- Measure modification status of Histone complex - if acetylated lysine on histone H3 = transcription about to be turned on. If methylated = transcription turned off
What are the steps of Chromatin Immunoprecipitation (ChIP)?
- Cross link DNA and proteins and isolate chromatin
- Sonicate or digest chromatin
- Immuno precipitate, reverse cross linking, purify DNA
What comes after ChIP?
Detect by microarray (ChIP-chip)
Or
Detect by sequencing (ChIP-seq)
Or
Western blotting procedure
Methods of detecting DNA methylation:
BS Seq (bisulphite sequencing)
- Sonicate DNA
- Bisulphite conversion
- PCR amplification: Sanger sequencing or pyro sequencing (faster)
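The logic of bisulphite sequencing can be sketched in Python. This assumes the standard chemistry (unmethylated C is converted to U and reads as T after PCR, while 5mC is protected and still reads as C). The function names and the example sequence are made up for illustration.

```python
def bisulphite_convert(seq, methylated_positions):
    """Return the sequence read after bisulphite treatment and PCR.

    Unmethylated C reads as T; methylated C (positions given, 0-based) stays C.
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)

def call_methylation(original, converted):
    """Positions where a C in the reference is still a C after conversion."""
    return [i for i, (o, c) in enumerate(zip(original, converted)) if o == "C" and c == "C"]

seq = "ACGTCCGA"
converted = bisulphite_convert(seq, {1})  # only the C at position 1 is methylated
print(converted, call_methylation(seq, converted))
```

Comparing the converted read with the untreated reference shows which cytosines were methylated.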
Sanger (dideoxy) sequencing is based on…
DNA replication
Pyro sequencing is based on…
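Sequencing by synthesis (light is emitted when pyrophosphate is released as each nucleotide is incorporated)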
Viruses not only carry … but they can cause … by inhibiting tumour suppressor gene function
Oncogenes
Cancer
In normal cells, oxygen determines the destiny of …
Lactate is formed under conditions of …
Pyruvate
Hypoxia
Warburg effect
(Aerobic glycolysis)
Warburg observed that tumours were taking up enormous amounts of glucose compared to surrounding tissue
This glucose was fermented to produce lactate even in the presence of oxygen
Because aerobic glycolysis is energetically INefficient - cancer cells increase glucose uptake
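A back of the envelope Python sketch of why inefficient glycolysis forces higher glucose uptake. The ATP yields are approximate textbook values, not figures from these notes, and the ATP demand is an arbitrary number chosen for illustration.

```python
ATP_PER_GLUCOSE_LACTATE = 2     # net ATP when glucose is fermented to lactate
ATP_PER_GLUCOSE_OXIDATIVE = 30  # approximate yield with full oxidation in mitochondria

def glucose_needed(atp_demand, atp_per_glucose):
    """Glucose molecules required to meet a given ATP demand."""
    return atp_demand / atp_per_glucose

demand = 3000  # arbitrary ATP demand
fold = glucose_needed(demand, ATP_PER_GLUCOSE_LACTATE) / glucose_needed(demand, ATP_PER_GLUCOSE_OXIDATIVE)
print(fold)
```

With these assumed yields a cell relying on aerobic glycolysis needs roughly 15 times more glucose for the same ATP output.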
2 deoxyglucose phosphate (2 DG P) competitively inhibits the production of …
Fructose 6 phosphate
2 hydroxyglutarate (2HG) is an Oncometabolite
Mutant IDH (common in cancers) converts alpha keto glutarate into 2HG (normal IDH converts iso citrate into alpha keto glutarate)
2HG inhibits several enzymes that require alpha keto glutarate
Benign tumours
DO NOT invade or metastasise eg Wart
Malignant tumours
Can invade nearby tissue, spread and seed additional secondary tumours in distant organs
VEGF promotes angiogenesis.
What is this?
What stimulates VEGF production?
Formation of new blood vessels to allow nutrition to reach tumour
Stimulated by low oxygen conditions - hypoxia
How cells sense and adapt to oxygen availability
- Low oxygen conditions enhance erythropoietin (EPO) levels
- EPO is a hormone produced by the kidney
- Under normoxia (normal blood oxygen concentration) EPO is synthesised by inner cortex
- Under hypoxia, cells of kidney can produce more than 100x EPO over normoxia
- Tumour hypoxia is a COMMON phenomenon in solid tumours
Erythropoietin (EPO) stimulates red blood cell …
Production
Hypoxia induces HIF 1a transcription factor activation that promotes aggressive … phenotype
Tumour
Pathways regulated by hypoxia:
Angiogenesis, tissue invasion and metastasis
Expression of VEGF is responsible for induction of … which is associated with tumour …
Angiogenesis
Progression
Altered bioenergetics in cancer cells is energy … but makes cancer cells fitter and more …
Inefficient
Competitive
Oncometabolites can be produced as a result of … in a metabolic pathway
Mutations
Cancer cells make new highways by … to spread themselves in distant organs
Angiogenesis
HIF 1a promotes angiogenesis by activating genes such as … that promote angiogenesis in cancer
VEGF
What is the central dogma?
Genomic DNA > transcription > mRNA > translation > protein
Where are prokaryote 70S ribosomes (30S + 50S) found?
Free in the cell
Where are eukaryote 80S ribosomes (40S + 60S) found?
In the cytoplasm, ER associated
Can be localised to specific intracellular areas eg dendrites in neurons
What is the initiation step?
Recruitment of initiation factors and ribosomal subunits
Recognition of AUG start codon
Why is the initiation step the target for control mechanisms?
Because it is the rate limiting step of mRNA translation
What is the elongation step?
Recruitment of elongation factors and aminoacyl tRNAs
Building of polypeptide chain
What is the termination step?
Recruitment of release factors
Recognition of stop codon and release of polypeptide chain
How is eukaryotic translation different to prokaryotic translation?
Most of it is cap dependent - starts at the 5’ cap (ribosome has to find the beginning)
In prokaryotes, a Shine Dalgarno sequence helps to recruit ribosomes directly to RNA at any point of that mRNA
What is the function of the 5’ cap structure?
Protects against degradation and crucial for translation
What is the open reading frame ORF?
Only part that contributes to the code of the protein being produced
What is the function of the 3’ poly A tail?
Translational control
What happens during translation initiation?
Cap binding
5’ to 3’ scanning to find AUG, requires: unfolding of 5’ UTR and ATP hydrolysis
AUG recognition - codon / anti codon pairing
GTP hydrolysis / tRNA delivery
eIF release + large (60S) subunit joining to form the 80S ribosome
GTP hydrolysis x2 and eIF release
Why is it important to locate the right AUG start codon?
To avoid frame shifts
Particularly important for 1st codon to match up the right AA in the right place
Nothing else will be coded for properly for the remainder of translation otherwise
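A short Python sketch of what goes wrong when the ribosome starts at the wrong AUG. The mRNA is a made-up sequence and the codon table only contains the codons needed for this example; unknown codons would show as "?".

```python
# Partial codon table: only the codons used in this toy example ("*" = stop)
CODONS = {"AUG": "M", "GCA": "A", "UGG": "W", "GUA": "V", "UAA": "*"}

def translate_from(mrna, start):
    """Read codons in steps of 3 from `start` until a stop codon or the end of the mRNA."""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODONS.get(mrna[i:i + 3], "?")
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

mrna = "GCAUGGCAUGGUAA"                        # hypothetical mRNA
first_aug = mrna.find("AUG")                   # the correct start codon
second_aug = mrna.find("AUG", first_aug + 1)   # an out of frame AUG further downstream

print(translate_from(mrna, first_aug))
print(translate_from(mrna, second_aug))
```

Starting at the second AUG shifts the reading frame, so every downstream codon is read differently and the stop codon is missed.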
Prokaryotic and eukaryotic translation have similar pathways, especially in the … phase
Elongation
Rate limiting steps are key targets for … and …
Regulation
Dysregulation
What is involved in cap recognition?
eIF4E sequestration, phosphorylation
eIF4G cleavage
What is involved in ternary complex formation?
eIF2 - GDP recycling
eIF2 phosphorylation
Initiation is the most … step
Regulated
What are the key steps during translation initiation?
Cap recognition
Scanning
AUG start codon recognition
What happens to eIF4G structure in apoptosing cells?
Caspase 3/7 cleavage of eIF4G
Disruption of cap binding complex
Cap dependent translation
Overall down regulation of translation in apoptosing cells
What is an IRES?
A structured RNA domain that recruits the ribosome
IRES dependent translation
No need for the cap recognition and scanning
Requires only a few eIFs
Present on cellular mRNAs and viral RNAs
What is a cistron?
Open reading frame that encodes for reporter proteins
What normally happens with cistrons in mRNA?
If 2 open reading frames are next to each other in capped mRNA, the 1st cistron will be translated to = product
But at the end of translation, the ribosome will terminate so the downstream 2nd cistron will not be translated
How can we test whether RNA sequence is functioning as an IRES?
Put the sequence between two cistrons!!
What are bicistronic reporters ?
Gold standard to define IRES function
Unstructured 5 UTR =
Easy scanning
Structured 5 UTR =
Tough scanning - poorly translated in a cap dependent environment
How do eIF4E levels regulate cap recognition?
eIF4E over expression increases efficiency of cap binding
mRNAs with structured UTR include:
c myc
VEGF
cyclin D1
FGF
How does increased eIF4E levels lead to cancer?
Leads to increased translation of a subset of normally poorly translated mRNAs - eg VEGF, c myc
Causes angiogenesis, proliferation, evasion of apoptosis, metastasis
= CANCER
Breast / prostate cancers display high levels of…
Phospho 4E BP
Examples of mitogenic signals?
Cytokines
Hormones
Growth factors
Regulation of cap recognition
eIF4A / B phosphorylation, eIF4E over expression, 4E BP phosphorylation and eIF4E phosphorylation all lead to..
Increased cap recognition which leads to …
Increased translation of specific mRNAs
eIF2 GDP means translation is…
OFF
eIF2 GTP tRNA means translation is…
On
How is ternary complex formation regulated?
Regulated by phosphorylation of the alpha subunit of eIF2
dsRNA binding motifs are linked to…
Viral RNA recognition
Kinase (ser / thr) catalytic domain is linked to …
eIF2 phosphorylation
PERK (PKR like ER localised kinase) protects against protein … related to ER stress.
Overload
PERK defect can lead to Wolcott Rallison syndrome.
What is this?
Growth retardation and diabetes
Imbalance between folded and unfolded protein
What happens when there is a eIF2B mutation?
Impaired eIF2 recycling = less protein synthesis and failure to recover protein synthesis after stress
eIF2B mutations cause disease such as leukodystrophy (vanishing white matter)
What is this?
Neurological deterioration
Motor dysfunction
eIF4E has a key role in translation …
Regulation
eIF2 recycling is regulated via …
eIF2 alpha phosphorylation
eIF2 alpha phosphorylation leads to global … in translation
Reduction
What happens when things go right in translation?
Translation initiation proceeds normally
Multiple ribosomes (polysomes) translate nascent peptide chains
What happens when there’s a PTC (premature termination codon) in translation?
mRNA incorrect > NMD nonsense mediated decay > mRNA decay to avoid synthesis of non functional or potentially toxic truncated proteins
What are PTCs and what causes them?
Stop codon further upstream than we would expect
Could get introduced via transcription error or a mutation in DNA
What happens during nonsense mediated decay NMD?
mRNA export
Monosome stalled onto PTC
Stalled ribosome and EJC recruit the SURF complex triggering NMD
Truncated polypeptides are ubiquitinated and degraded by proteasome
What are the quality control pathways?
Nonsense mediated decay NMD - premature termination codons
No go decay NGD - mRNAs stalled in translation
Non stop decay NSD - mRNAs without natural stop codons
Most mRNA regulation occurs through the … and … UTRs
5’
3’
What are the elements of the 5 prime UTR?
IRES
5 UTR structures
Recruitment of RNA binding proteins - eg IRP
Upstream ORF - uORF
What are the elements of the 3 prime UTR?
Recruitment of RNA binding protein
miRNA binding site
How does an upstream ORF uORF repress translation?
Ribosomes dissociate after translating the uORF
Ribosomes stalled by the uORF encoded peptide
What happens under normal translational conditions?
Ternary complex abundant
GCN4 not produced
What is GCN4?
A transcription factor activating more than 50 genes involved in amino acid synthesis
What happens under amino acid stress conditions in translation?
Ternary complex limiting, only authentic AUG recognised
uORF translation impaired by eIF2 alpha P
GCN4 translation promoted
Caudal mRNA
Caudal is an important patterning molecule during embryogenesis
4E HP (4E homologous protein) blocks eIF4F binding and the translation of caudal mRNA
Bicoid
Bicoid (RBP) interacts with the 3’ UTR of caudal
Bicoid recruits 4E HP to the caudal cap
msl2 mRNA
Is translated in males but not females during development
SXL - sex lethal
And UNR are RBPs that repress msl2 translation by binding the 3 UTR
Prevents 43S binding by locking 5’ to 3’ interactions
SXL only expressed in females
What is the epitranscriptome?
The collection of chemical modifications on RNA bases
m6A
Promotes RNA translation by multiple mechanisms
Reader proteins or METTL3 can bind m6A and promote cap dependent translation
Translation factor eIF3 can be recruited by mRNA directly
m6A controls multiple reversible RNA … processes
Regulatory
5 UTR and 3 UTR contain a variety of regulatory elements that mediate translational ..
Control
RNA binding proteins are key to translational …
Regulation
Translational control impacts many areas of biology including:
Stress response
Early development
Sex determination
What do tRNAs do?
Decode the mRNA sequence during protein synthesis at the ribosome
Ribosomal RNA is the most abundant RNA in the cell!
80% of total RNA
Transcribed by Pol I - except for 5S, which is transcribed by Pol III
What are the two eukaryotic ribosome subunits?
40S - 18S rRNA
60S - 5S, 5.8S, 28S
What are the 5.8S, 18S and 28S rRNAs made from?
Single transcript - the 45S precursor
= rRNA processing in the nucleolus
What are snRNAs?
Small nuclear RNAs
U rich sequences involved in the splicing of pre mRNA (U1, U2, U4, U5, U6)
Highly expressed and evolutionarily conserved
What are small nucleolar RNAs snoRNAs?
Help to process and chemically modify rRNAs
Main function in nucleolus > RNA processing
Encoded within introns of Pol II transcribed genes
What RNA modifications do snoRNAs guide?
C/D box - directs 2’ O ribose methylation by recruiting a methyl transferase enzyme
H/ACA box - recruits an enzyme that converts uridine to pseudouridine
Current data suggest that … human miRNAs target about 45 000 miRNA sites in about … of human protein coding genes, mainly in 3’ UTRs
5000
60%
What are microRNAs and small interfering RNAs?
miRNAs and siRNAs
Specifically bind to complementary sequences located in 3 UTR regions of mRNAs
How do miRNAs and siRNAs affect mRNA and translation?
miRNAs - promote deadenylation, translation repression and decay
siRNAs - cleavage of mRNA and exosome mediated degradation
MicroRNAs repress translation by … hybridisation with target mRNAs in the cytoplasm
Imperfect
What are the seed sequence miRNA nucleotides that are critical for targeting specific sequences?
2-7
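Seed based targeting can be sketched in Python: take nucleotides 2-7 of the miRNA, reverse complement them, and look for that 6 nt site in a 3’ UTR. The sequences and function names are illustrative assumptions, and real target prediction tools use more rules than a plain seed match.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed(mirna):
    """Nucleotides 2-7 of the miRNA (1-based), which is the Python slice [1:7]."""
    return mirna[1:7]

def reverse_complement(rna):
    """Sequence that would base pair with `rna`, written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(rna))

def seed_sites(mirna, utr):
    """0-based start positions in the UTR that are perfectly complementary to the seed."""
    site = reverse_complement(seed(mirna))
    return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]

mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # example miRNA sequence used for illustration
utr = "AAACUACCUCAGGGUACCUCAA"    # made-up 3' UTR fragment
print(seed(mirna), seed_sites(mirna, utr))
```

Only the seed has to match perfectly; the rest of the miRNA can pair imperfectly, which is why a single miRNA can target many mRNAs.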
Small interfering RNAs cleave mRNAs upon perfect hybridisation
Defence mechanism against invading double stranded RNA viruses and transposable elements
Crucial in plants, worms and insects
Less in mammals
What does stable dicer gene knock out do?
Eliminates generation of miRNAs in mammals and is embryonic lethal
What does conditional dicer knockout do?
Leads to defects in tissue morphogenesis / development
miRNAs and disease
Control genes with crucial functions in cell proliferation, development, inflammation and ageing
Have been linked to cancer oncogenes and tumour suppressors
Target for bio markers and drug development
Long NON coding RNAs (lncRNAs)
Transcribed by Pol II
Contain 5’ cap and poly A tail at 3’ end
Not translated into proteins
Tissue / cell type specific expression
Involved in cell differentiation, development and anti viral responses
Many functions are unknown!
LncRNAs can interact with … to execute … functions in nucleus or cytoplasm
Proteins, RNA and DNA
Regulatory
Nuclear LncRNAs control chromatin structure / transcription in … or …
Cis
Trans
What is X inactivation specific transcript XIST?
A large 17kb cis acting regulatory LncRNA
Associates with X chromosome
Initiates histone modifications which results in heterochromatin formation
Cytoplasmic lncRNAs have diverse functions including:
microRNA decoy
RBP decoy
Protein turnover
Organelle function
Signalling
mRNA translation and degradation
How do lncRNAs regulate mRNA stability?
The lncRNA TINCR interacts with complementary sequences and recruits RNA binding protein STAU1
THIS promotes stability of mRNA
How do lncRNAs regulate mRNA translation?
Under stress conditions, an antisense lncRNA (antisense Uchl1) moves from the nucleus to the cytoplasm and binds to the 5’ end of Uchl1 mRNA to promote translation
LncRNAs like NORAD can act as a … for RNA binding proteins and can control cell …
Decoy
Mitosis
What does NORAD +/+ mean?
Normal PUMILIO activity
Normal mitosis
What does NORAD -/- mean?
Hyperactive PUMILIO
Aberrant mitosis
Circular RNAs (circRNAs) can act as decoys for miRNAs and RBPs
Generated via back splicing mechanism
Highly abundant and stable
What are the most abundant non coding RNAs?
tRNAs and rRNAs
What do miRNAs and siRNAs control?
Gene expression post transcriptionally by annealing to sequences in 3’ UTRs of mRNA targets
What does processing of miRNAs / siRNAs involve?
Processing by Drosha and Dicer, then loading into the RISC complex
What is the function of lncRNAs in eukaryotes?
Nuclear and cytoplasmic functions in gene expression and control
Does RNA processing happen in prokaryotes?
No!
In eukaryotes, only … of all RNA in the cell is mRNAs!
1.5%
Pre mRNA processing involves what?
Pre mRNAs are capped, polyadenylated and spliced in the nucleus
How are RNA processing factors recruited to pre mRNAs?
They are recruited co - transcriptionally
By the carboxy terminal domain (CTD) of Pol II
Where is the 7 methyl guanosine cap added?
And what does it do?
Added at the 5’ end of eukaryotic pre mRNAs
Protects the mRNA from degradation by nucleases
What does the Spliceosome do?
Catalyses 2 transesterification reactions that join 2 exons and REMOVE the intron as a Lariat structure
What do the network of interactions between snRNPs and splicing factors determine?
Splice site selection
What do RNA binding proteins (RBP) do?
Bind to specific sequences near the splice site
Regulate alternative splicing (SR proteins = activators, hnRNPs = repressors)
What does cleavage and polyadenylation at the 3’end of pre mRNAs involve?
Diverse RBPs and complexes
Alternative polyadenylation can have important physiological implications (eg antibodies)
What does an mRNP exporter ensure?
Directional export of mRNAs from nucleus to the cytoplasm through the NPC
The mRNP exporters are phosphorylated in the cytoplasm and release the mRNA cargo
RNA processing starts…
Co transcriptionally
The 5’ cap marks RNA molecules as mRNA
What is its function?
Regulates nuclear export of mRNAs
Protects mRNAs from RNA digesting enzymes (5’ exoribonucleases)
Promotes translation through interaction with a translation initiation factor (eIF4E - the cap binding protein)
What is splicing?
The removal of introns from the pre mRNA
What do consensus sequences do?
Define the splice sites in eukaryotic pre mRNAs
What is the spliceosome made up of?
5 small nuclear RNAs - snRNAs
U1, U2, U4, U5 and U6 between 107-210nts long
170 proteins
Exons are generally .. times shorter and more uniform than introns
Since introns can be very long, additional strategies are required to improve … selection
10
Splice site
What is alternative splicing in eukaryotes?
Generating mRNA variants from the SAME gene
What is the most extreme example of alternative splicing?
The Drosophila DSCAM gene
What are the patterns of alternative splicing?
Exon skipping - most prominent in eukaryotes
Intron retention
Alternative 5’ and 3’ splice site
Mutually exclusive exons
What diseases can mutation of splice sites or splicing factors cause?
β-thalassemia - autosomal recessive blood disorder. Mutation of splice site in β globin gene
Myotonic dystrophy - neuromuscular disease. Depletion of MBNL splicing factor
Cystic fibrosis
Parkinson’s disease
Premature ageing
Cancer
What is the 3’end of eukaryotic mRNAs determined by?
By the processing of pre mRNA
What is the 50-250 nts of adenosine added to the 3’ end of the last exon of eukaryotic mRNAs called?
The poly A tail
Cleavage requires a protein complex.
What does it consist of?
CPSF
CstF
TWO cleavage factors
Poly A polymerase - PAP
What is the function of the poly A tail in eukaryotes?
Required for export of mRNA from the nucleus to the cytoplasm = binding of PABPII
Promotes translation initiation and translation
Stabilises the mRNA > shortening of polyA tail may lead to reduced translation and eventual decay of the mRNA!
How does the processed mRNA get out of the nucleus?
Small molecules and proteins (<60 kDa) can diffuse through the nuclear pore complex (NPC)
Macromolecular complexes (eg RNPs) need active transport
What does the limited correlation between mRNA and protein levels in the cell provide evidence for?
Post transcriptional control at the global level
Many mRNAs are transported to specific sub cellular locations by what?
RNA binding proteins - that bind to particular elements within the mRNAs (often reside within the 3’ UTRs)
What can translation be controlled globally by?
Initiation factors
More specifically by RNA binding proteins and non coding RNAs
Eg IRP - control of intracellular iron concentrations
mRNAs are preferentially degraded in what?
What is deadenylation followed by?
Processing bodies (P bodies)
Followed by activity of decapping enzymes and exonucleolytic degradation
What is the problem with post transcriptional control by RNA binding proteins and non coding RNAs?
Gene activity / mRNA levels do NOT necessarily correlate with protein abundance
What are the possible fates of an mRNA molecule in the cytoplasm?
Translation (ribosome)
Localisation (organelle)
Storage (granule)
Decay (exo / endo nuclease in P bodies)
What are the reasons for RNA localisation?
Target protein to appropriate region in cell
Prevents expression elsewhere
Response to local requirement (NT production in neurons)
Independent control in different cellular regions
Localised synthesis necessary for assembly of protein complexes
More efficient transport (one mRNA molecule vs many proteins)
RNA localisation is … and occurs in … as well!
Universal
Prokaryotes and archaea
What are the molecular mechanisms for mRNA localisation? + examples
Directed transport on cytoskeleton - eg neuronal RNPs along microtubules, ASH1 mRNA to the bud tip in yeast
Random diffusion and trapping - eg Drosophila mRNAs enriched in pole plasm
Generalised degradation in combination with local protection by trapping - same as above
Cis acting elements in the mRNA provide the … for mRNA localisation
Code
What is the ASH1 protein?
It is a transcriptional repressor of HO = No HO transcription and No switching
HO expression is required for mating type switching
How is mRNA delivered to the bud tip?
She2, She3 and Myo4 (myosin) proteins form a complex with ASH1 mRNA for transport to the bud tip of the daughter cell
What is Khd1p?
A translational repressor that prevents translation of ASH1 mRNAs during transport to the bud tip
How do we visualise localised mRNAs in vivo?
U1A GFP fusion protein is tethered to the mRNA via U1A binding sites > this enables visualisation of mRNA via GFP
What are the 2 major modes of translational regulation?
Global regulation - modification of translation initiation factors or their regulators
mRNA specific regulation - RNA structure / specific RBPs / microRNAs
How do iron regulatory proteins prevent recruitment of small ribosomal subunit?
They bind to the iron response element (IRE) and prevent recruitment of the 43S pre initiation complex via steric hindrance
What happens in iron starvation post transcriptional control?
NO ferritin made - ferritin > storage (translation blocked)
Transferrin receptor made - transferrin > iron uptake!
(mRNA stable and translated)
What happens in excess iron post transcriptional control?
Ferritin made - mRNA translated
No transferrin receptor made - mRNA degraded
How does degradation of eukaryotic mRNAs work?
Shortening of poly A tail to <20 bases = stops translation
Degradation by exonucleases
What are P bodies?
Where are they especially visible?
Condensed aggregates of mRNAs and proteins in the cytoplasm
In stressed cells - leading to rapid shut down of translation
Anatomy of a prokaryotic gene
Enhancer
Promoter
Transcription start site
Anatomy of a eukaryotic gene
Enhancer
Promoter
TSS
5’ UTR
start codon
Exons
Introns
Donor, acceptor and splice sites
STOP codon
3’ UTR
Polyadenylation site
What are levels of protein generated controlled by?
Rate of transcription
Rate of mRNA degradation
Rate of protein synthesis
Rate of protein degradation
What does it mean if a small amount of RNA is transcribed?
How will this affect translation?
Slow rate of transcription or a high degradation rate
Small amount of protein will be translated
What is the promoter?
Non coding part of the gene that controls where transcription starts and which direction and what strand it occurs on!
What does the promoter contain?
-10 and -35 consensus sequences, TSS that recruit proteins able to assist in the control of initiation of transcription
What governs protein - DNA interactions?
DNA recognition sequence
Protein structure - can it bind as a monomer / dimer etc
Mediator contents
DNA packaging
What does DNA recognition depend on?
Protein conformation and DNA structure
Why must the protein interface match the DNA shape?
Enables electrostatic interactions
Fine tunes recognition of a specific sequence
Changes to shape of DNA or protein can affect the strength of binding - stronger the interaction, the more it will stimulate transcription! Needs to be a nice stable interaction
What are the types of DNA binding proteins?
Homeodomain containing proteins - eg HOX
Beta sheet recognition protein -eg p53
Zinc finger domain proteins - eg nuclear receptors
Leucine zipper proteins -eg Fos
Helix loop helix -eg Myc
What does gene expression in prokaryotes involve?
RNA polymerase and sigma factors (starts transcription) for progression from the promoter
Prokaryotes: what does the hairpin loop in mRNA do?
Stops RNA polymerase processing DNA = termination
What does gene expression in prokaryotes involve?
Polycistronic mRNAs (one transcript encoding multiple proteins) generated from a single promoter
Usually proteins / enzymes involved in regulation of a process
Cis regulatory regions and operator regions control initiation of transcription
1 promoter drives transcription of multiple genes
What are activators of expression?
Cis regulatory sequences in promoter and enhancer bind the protein
Trans regulatory factors - ie transcription factors
Control recruitment of accessory factors to RNA polymerase, to control initiation of transcription
How does tryptophan repress transcription?
It binds to the repressor > ACTIVE! This causes a shape change > repressor can bind to the operator site > RNA polymerase can’t bind > all genes under control of the promoter are turned OFF
Operon OFF
Absence of tryptophan > operon is ON > RNA polymerase can bind! = transcription
How do Operons control transcription?
Binding of activator to the cis regulatory region recruits RNA polymerase to ‘turn on’ transcription
Operon is ON
Eg: - glucose, + lactose
Why are prokaryotic genomes simpler?
Polycistronic mRNAs
Prokaryotes have less what?
Less non coding DNA and less elaborate control of gene transcription
What are common features of eukaryotic and prokaryotic gene control?
Consensus sequences
Transcription factors (trans activators)
Ability to inhibit transcription with binding of repressors
Eukaryotic transcription summary
Non coding DNA in the form of promoters and enhancers controls initiation of transcription
Promoters contain consensus sequences to recruit GTF
The components of the mediator complex control initiation through 3D positioning
Why is TFIID (GTF) essential for transcription initiation by eukaryotic RNA Pol II?
Quite long and attaches to multiple parts of promoter
RNA Pol II cannot work without:
Activators
Mediators
Chromatin modifying proteins
The eukaryotic enhancer
Non coding DNA
Cis regulatory and trans activating factors regulate this area
3D EFFECT that brings a complex of proteins together to form a mediator complex
What are the main readout mechanisms?
The recognition of bases and the recognition of DNA shape
What does the formation of higher order protein - DNA complexes depend on?
Sequence dependent DNA structures that are optimised to promote assembly
DNA binding proteins can move to the nucleus after activation.
What are the steps in this process?
Movement to the nucleus may require a change in shape or dissociation from another protein
The change reveals the NLS - key to bind to Importins
Then allows the protein to bind to Importins to allow it to move through the NPC and into the nucleus
Once in the nucleus it can dimerise with other proteins and DNA
And therefore can affect gene expression by controlling initiation of transcription
How do DNA binding proteins get into the nucleus?
Protein synthesis
Ligand binding
Covalent modification - post translational modification
Addition of a 2nd subunit - eg dimerisation
Unmasking - eg of a chaperone protein that may be covering up NLS
Release from membrane - eg protein kinase C that moves into nucleus
What are Homeobox (HOX) genes?
HOX proteins control body patterning during (foetal) development - eg HOX9 is involved in limb development
They contain homeodomains - contain 3 alpha helices > packed closely together by hydrophobic interactions
p53: a β sheet recognition protein
Typical tumour suppressor gene - stops us having continuous proliferation
Contains a DNA binding domain made of 2 β sheets - almost every mutation that causes cancer happens in the DNA binding domain!
Forms multimers through its oligomerisation domain (OD) - can modify its DNA sequence specificity
Zinc fingered nuclear hormone receptors
Finger like domains that interact with DNA
Regulate processes like bile acid detoxification - eg FXR
Must bind as a dimer (2 of the same protein)
Contents of the dimer are key to sequence specificity and effect
What are the functions of p53?
Stops the cell cycle but promotes DNA repair
Fos - is a leucine zipper that regulates bone
Always binds to DNA as dimers
Dimer formation enabled by hydrophobic interactions between alpha helices
Have a globular domain with basic leucine zipper (bZIP) - determines if it has open or closed structure
Activated by phosphorylation - by MAP kinase
How is Fos activated?
Signalling cascade of kinases > Fos phosphorylation
Heterodimerisation with Jun enables binding to the AP1 response element
Proliferation of fibroblasts and differentiation
Myc has a role in cancer (it is often up regulated)
Components of the heterodimer control the outcome of the signalling pathways
Myc associated with: cell cycle progression / apoptosis / proliferation / metabolism
Myc can associate with Max. When they bind to M21, they repress genes and promoters!
How does dimer formation affect protein concentration?
If dimer formed 1st: increase number of sites bound on DNA = quickly increases protein concentration
If dimer takes longer to form: protein concentration increases slowly
Bile acids act as …
Retinoic acids act as …
Co repressors
Co activators
What is precision medicine?
An emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment and lifestyle for each person
Unlike a one size fits all approach
How is precision medicine different to personalised medicine?
Personalised: older term with risk of misleading interpretation
Individual, unique treatment
Precision: treatments for groups of people based on genetic, environmental and lifestyle factors
How did we get to precision medicine?
Multiple sources of data
Electronic health records
Collaborative data sharing
Extensive research
Education and awareness in multiple levels
Why is DNA mostly used in genetic testing and not RNA?
Aberrant gene expression or splicing effects in RNA!
How can you use genetic testing for diagnosis? Eg Mendelian disorders
Sequencing a single gene - eg cystic fibrosis
Sequencing a panel of genes - eg Kabuki syndrome
The clinical exome
WES and WGS
Genetics for screening purposes
Not diagnostic!
To identify high risk sub groups who undergo diagnostic test
Statistically quantifying the performance of a test
At different times of life (prenatal, newborn, adulthood)
Must be evaluated: benefits vs costs
Genetics of complex traits
Eg type 2 diabetes
Complex = multi factorial (polygenic, lifestyle, environment)
Find people at high genetic risk to offer targeted advice
Polygenic risk scores
What are polygenic risk scores?
Sum the risk alleles of multiple variants, usually weighted by their effects as observed in large scale studies
Population specific!
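A minimal sketch of the weighted sum described above, in Python. The SNP names, effect sizes and dosages are made up for illustration only; real weights come from large scale association studies and are population specific.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk allele dosages (0, 1 or 2 per bi allelic SNP)."""
    if dosages.keys() != weights.keys():
        raise ValueError("dosages and weights must cover the same SNPs")
    return sum(weights[snp] * dosages[snp] for snp in dosages)


# Hypothetical SNP identifiers and effect sizes (assumptions, not real data)
weights = {"snp_a": 0.25, "snp_b": 0.5, "snp_c": 0.125}
# Number of risk alleles one person carries at each SNP
dosages = {"snp_a": 2, "snp_b": 0, "snp_c": 1}

score = polygenic_risk_score(dosages, weights)
print(score)
```

An unweighted score is the same calculation with every weight set to 1.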
Type 2 diabetes: even with 65 genetic variants the prediction is poor!
Why?
Individual effects are modest
Only 10% of genetic predisposition found
Currently over 500 variants across the genome associated with T2D
What is pharmacogenetics?
How people respond to drugs based on their genetics
Adverse side effects - eg Warfarin
Targeted cancer drugs for specific mutations
Future: the most efficient dose
Warfarin is a drug for people at risk for embolism or thrombosis
What were its side effects?
Decreases availability of vitamin K which is an essential cofactor in blood clotting
Excessive bleeding in people who have low activity enzymes involved in drug metabolism
Gene therapy
To treat or cure a disease by modifying a person’s genes - replace with a healthy copy, introduce a new gene
Mechanisms - plasmid DNA, viral and bacterial vectors, gene editing technologies
If targeting RNA transcripts > RNA therapeutics
Somatic vs germline
Tightly regulated
Examples of gene therapy: Haemophilia B
What is it?
A blood clotting disorder causing easy bruising and bleeding
Mutation in the gene causing deficiency of clotting factor IX
Ethical issues of gene therapy
Prenatal screening
Incidental findings - screen only selected genes and not the whole genome
Baby editing
Germline gene therapies
The safety of gene therapies
Costly treatments
Who gets treatment
What are SNPs?
A single nucleotide substitution characterised by its alleles and position in the chromosome
Most SNPs (99%) are bi allelic (two alleles) and uncommon (MAF <5%)
Indels are the … type of variant and typically exist in …
Second most common
Two alleles
The vast majority of SV are …
Insertions, deletions and duplications
SV are responsible for … of all protein truncating events in the genome
25-30%
Up to 4% of individuals carry …
Mega base pair SVs
Copy number variants are the … type of SV in the population
Most frequent
There are over … in the human genome, of which at least 373 000 are VNTR
55 million tandem repeats.
Short VNTR called … occur in multiple alleles and have a high mutation rate
They are … in the human genome
Micro satellites
Abundant
VNTR contribute to … in the expression levels of 12,494 genes through multiple mechanisms
Variation
Over 40 Mendelian disorders , such as Huntington disease, are attributed to …
Most are caused by … of tri nucleotide coding repeats
STR mutations
Large expansions
Haplogroup analysis of … and … can trace the origin of Homo sapiens and their expansion out of Africa
Mitochondria
Y chromosome DNA
… can help estimate when the first humans emerged from the last common ancestor with apes
Molecular clocks of DNA
What have humans developed through out migration?
Adaptations to new environments
What is the human genome?
The genome is the set of all genetic information of an organism
Do you think all genetic info is stored in DNA molecules?
Interspecies somatic cell nuclear transfer example
DNA is the only fundamental molecule that stores genetic info!
Good evidence that the chromosome and DNA are good candidates for being molecules that store genetic information
Also evidence that mitochondria have DNA, but NOT fundamental to define your species!
Trans generational epigenetic inheritance: mouse example.
What happened to the mice offspring?
Mother mouse was exposed to stress > didn’t properly groom offspring
So 1st generation didn’t properly groom their offspring
2nd and 3rd generation still didn’t groom offspring
So besides DNA, mice will also inherit behaviour from their mother!
What was the effect of the mother mouse being stressed during pregnancy?
Will have the phenotype of NOT grooming children properly > problem transferred to next generations!
DNA is still the same > but what changed is the way the DNA spreads!
So genome is still the DNA, but the way we read the genome can change depending on circumstances
Mitochondria is always inherited from …
Y chromosome inherited from …
Mother
Father
Why are introns useful?
(26% of genome composition)
Useful as allow recombination to occur
What are transposons?
Mobile DNA elements that can insert themselves into another position of DNA sequence
What is the role of transposons in bacteria?
Are responsible for most antibiotic resistance in bacteria
Can jump from chromosomal DNA to plasmids and vice versa
This facilitates the horizontal transfer of genetic material between species - including antimicrobial resistance genes
What characteristics enable transposons to jump between species?
Transposons include enzymes required for their movement in the coding sequence
They jump to any DNA molecule
Transposed plasmids and viruses may conjugate / infect another organism
New transposons can integrate into the human genome
Why is so much of our DNA transposons?
Because they keep copying themselves and replicating within our genome!
What prevents transposons from colonising the entire genome?
The human genome is relatively stable because most of the transposons are silent!
Why do defective transposons contribute to junk DNA?
Transposons are also called ‘junk DNA’ because they replicate without an apparent benefit to the host!
What traits evolved from transposons?
The adaptive immune system
The placenta
How do transposons contribute to genetic variation?
Transposition may re arrange DNA
Transposons facilitate horizontal gene transfer between species
Deactivated transposons accumulate mutations over generations
Mutation - processes that increase variation
Examples?
DNA replication and repair errors
Horizontal gene transfer
Transposition
Selection - processes that decrease variation
Examples?
Natural selection
Genetic drift
What is an Indel?
An insertion or deletion of up to 50 nucleotides in a specific location of the DNA sequence
Cystic fibrosis is a clinical example of an indel
What is it caused by?
The 3 nucleotides CTT are deleted in people that develop Cystic Fibrosis
Therefore Phenylalanine is NOT in the amino acid sequence > conformation change in protein > exchange of ions in surface of bronchi is affected > people develop disease later on in adulthood!
What are structural variants? SV
Are large >50bp rearrangements of DNA segments
What is the median number of SV that a human has?
7,439
What characteristic distinguishes an indel from an insertion or deletion SV?
The length, indels are ≤ 50 bp
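The length rule can be written as a tiny Python helper. This is only a sketch of the cut-off used in these notes (indel up to 50 nucleotides, SV above 50 bp); the function and constant names are made up.

```python
INDEL_MAX_BP = 50  # cut-off taken from the definitions in these notes


def classify_by_length(length_bp):
    """Label an insertion or deletion by the number of bases it changes."""
    if length_bp < 1:
        raise ValueError("length must be at least 1 bp")
    return "indel" if length_bp <= INDEL_MAX_BP else "structural variant"


print(classify_by_length(3))     # the 3 bp CTT deletion in cystic fibrosis
print(classify_by_length(5000))  # a multi kilobase deletion
```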
Why does the Y chromosome become degenerated?
Why does satellite DNA easily accumulate there?
Because recombination doesn’t occur there! (Heterochromatin)
A lot of mutations occur in Y chromosome that don’t occur in other chromosomes > mostly junk!
Autosomal DNA (Chr1 - 22) is inherited from …
Both parents
Mitochondrial and Y chromosome DNA is inherited from …
1 parent only
What is a haplotype?
What does haplotype analysis reveal?
A group of alleles that are inherited together from the same parental chromosome
Reveals relatedness between individuals
How have Homo sapiens adapted to high altitudes?
Environmental stresses: lower atmospheric pressure, lower oxygen, lower temps and changes in diet
Evolutionary responses: positive selection of variants conferring adaptation to environment
What may Mendelian diseases be caused by?
Example?
Coding and non coding SNPs may have phenotypic effects - ie may cause Mendelian diseases
Eg - Phenylketonuria PKU is an autosomal recessive disease caused by substitution in the sequence of the phenylalanine hydroxylase gene
The first draft of the complete sequence of the human genome…
Included DNA from multiple volunteers
Published in 2001
Human genome project
What do most phenotypes display?
NON - Mendelian patterns of inheritance
Apart from complex traits, how can most inheritance patterns be studied?
Using Punnett squares
How can complex traits be analysed?
With quantitative genetics methods, like heritability
What can twin studies estimate?
Using what?
Phenotypic heritability, without genotyping individuals
Using Falconer’s formula
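The notes do not write the formula out. In its usual form it is h2 = 2 x (rMZ - rDZ), where rMZ and rDZ are the phenotype correlations within monozygotic and dizygotic twin pairs. A small Python sketch, with made-up correlations:

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate: twice the MZ minus DZ twin correlation."""
    return 2 * (r_mz - r_dz)


# Hypothetical twin correlations, for illustration only
h2 = falconer_heritability(r_mz=0.75, r_dz=0.5)
print(h2)
```

No genotypes are needed, only the trait measured in both twins of each pair.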
What are Mendelian traits?
Eg pea colour
Traits explained by mutations in a single gene
Do NOT have intermediate phenotypes - are either dominant or recessive!
Phenotype is NOT influenced by environment or epigenetics
What are Co dominant traits?
Eg ABO blood group
Occurs when a phenotype is controlled by a gene with multiple alleles
2 or more alleles are expressed simultaneously
What is a polygenic trait?
Eg eye colour
Caused by multiple genes
Phenotype can show a range of continuous variation
The extremes tend to be less frequent - induces almost a normal distribution
What is Epistasis?
Eg mouse / dog coat colour
Genetic phenomenon when the effect of a mutation in a gene is dependent on mutations in other genes!
What is pleiotropy?
A phenomenon where mutations on a gene affect 2 or more apparently unrelated phenotypes!
Eg a single mutation on the β globin gene may cause sickle cell disease - which is protective against malaria.
Therefore, the SAME gene is a risk factor for sickle cell anaemia and a protective factor for malaria!
What are complex traits caused by?
Genetics and a multiplicity of unknown environmental effects
Cannot be described by a Punnett square
What is heritability? (H2)
The proportion of variance in a phenotype explained by genetic factors
Does not quantify whether a trait is genetic or not
What may single nucleotide substitutions on the coding sequence influence?
RNA and amino acid sequence
What may non exonic SNPs and Indels in the gene body do?
Insert stop codons and disrupt the protein!
Several diseases are caused by … via multiple …
SVs
Mechanisms
Diseases caused by copy number variants:
Velo cardio facial syndromes
Williams Beuren syndrome
Prader Willi and Angelman syndrome
Missense =
Single amino acid substitution
Nonsense =
Results in a shorter unfinished protein product typically by substitution of an amino acid codon with a STOP codon
Frameshift =
Results in alteration of all amino acids after an insertion or deletion
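The three definitions above can be checked on a toy coding sequence. The Python sketch below uses the standard genetic code; the sequence and the function names are made up, and the rules are simplified (for example a lost STOP codon would be labelled missense here).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first STOP codon."""
    protein = ""
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein += aa
    return protein


def classify(ref_cds, alt_cds):
    """Simplified label for the effect of a coding variant."""
    if (len(alt_cds) - len(ref_cds)) % 3 != 0:
        return "frameshift"        # reading frame changes after the indel
    if len(alt_cds) != len(ref_cds):
        return "in-frame indel"    # whole codons gained or lost
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if ref_protein == alt_protein:
        return "synonymous"
    if len(alt_protein) < len(ref_protein):
        return "nonsense"          # premature STOP shortens the protein
    return "missense"              # single amino acid substitution


ref = "ATGGAGTGGAAATAA"                 # hypothetical gene: M E W K STOP
print(classify(ref, "ATGGTGTGGAAATAA"))  # GAG > GTG
print(classify(ref, "ATGGAGTGAAAATAA"))  # TGG > TGA
print(classify(ref, "ATGGATGGAAATAA"))   # one base deleted
```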
Coding and non coding SNPs may cause monogenic and complex disease
An example?
Phenylketonuria - PKU - monogenic disease
Caused by substitutions in the sequence of PAH gene that codes for phenylalanine hydroxylase enzyme
Coding SNPs in exons and non coding SNPs in the introns, 5’ and 3’ UTR of phenylalanine hydroxylase gene cause PKU
GWAS = genome wide association studies
General characteristics?
Similar to linkage studies but in unrelated individuals
Better than linkage studies at detecting weak genetic effects
To identify personal genetic disease risk
To identify pharmacogenetic variants (drug response)
Identify variants contributing to gene environment interactions
What has enabled GWAS?
Advances in genotyping technologies and lowered costs
Advances in software
Biobanks
Large scale projects - HGP, international HapMap project
What is the focus on in GWAS?
Single nucleotide polymorphisms SNPs
Key concepts in GWAS?
SNP
Minor allele frequency MAF
Hardy Weinberg equilibrium
Linkage disequilibrium
Haplotypes
What is a single nucleotide polymorphism?
A DNA sequence variation occurring when a single nucleotide (A, T, C or G) in the genome differs between members of a species OR between paired chromosomes in an individual at a particular locus
What is the minor allele frequency?
The frequency of the SNP’s less frequent allele in a given population
Population specific
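A short Python sketch of how MAF could be counted from genotypes. The eight genotypes are invented; each person contributes two alleles.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """MAF of a bi allelic SNP from genotypes written like 'AG'."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) > 2:
        raise ValueError("expected a bi allelic SNP")
    if len(counts) < 2:
        return 0.0  # only one allele seen in this sample
    return min(counts.values()) / sum(counts.values())


# Hypothetical sample of 8 people (16 alleles)
genotypes = ["AA", "AG", "AA", "GG", "AG", "AA", "AA", "AG"]
maf = minor_allele_frequency(genotypes)
print(maf)  # G is the minor allele: 5 of 16 alleles
```

Running the same function on a sample from another population can give a different value, which is why MAF is population specific.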
What is the Hardy Weinberg equilibrium?
Both allele and genotype frequencies remain constant in a population unless specific disturbing influences are introduced
(mutation, migration, non random mating)
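The notes do not state it, but under Hardy Weinberg equilibrium the expected genotype frequencies for alleles with frequencies p and q = 1 - p are p2, 2pq and q2. A Python sketch with an arbitrary allele frequency:

```python
def hardy_weinberg_expected(p):
    """Expected genotype frequencies for allele frequencies p and q = 1 - p."""
    if not 0 <= p <= 1:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


expected = hardy_weinberg_expected(0.75)  # hypothetical frequency of allele A
print(expected)
```

In GWAS quality control, SNPs whose observed genotype counts deviate strongly from these expectations are often flagged as possible genotyping errors.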
Linkage disequilibrium and haplotypes
The non random association of alleles at 2 or more loci so they are inherited together more frequently than expected by chance
Genetic correlation between the markers
Observed in various regions of the genome
Decreases with physical distance
Segments of high LD = haplotype blocks
LD structure is population specific
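LD between two bi allelic loci is commonly summarised by D (observed haplotype frequency minus the frequency expected under independence) and by r2. These measures are not named in the notes, so treat the Python sketch below as one common way to quantify the non random association; the haplotype frequencies are invented.

```python
def linkage_disequilibrium(hap_freqs):
    """Return (D, r2) from haplotype frequencies for AB, Ab, aB and ab."""
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]  # frequency of allele A
    p_b = hap_freqs["AB"] + hap_freqs["aB"]  # frequency of allele B
    d = hap_freqs["AB"] - p_a * p_b          # deviation from independence
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Alleles always travel together: complete LD
complete = linkage_disequilibrium({"AB": 0.5, "Ab": 0.0, "aB": 0.0, "ab": 0.5})
# Alleles combine at random: no LD
independent = linkage_disequilibrium({"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25})
print(complete, independent)
```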
The TOPMed project has identified over 300M SNPs in humans
How could you cover all common variants?
Using tagSNP selection
Quality Control QC
The most important and time consuming part of GWAS!
To reduce systematic bias
What are QQ plots?
Used as a diagnostic tool by plotting observed p values against expected
If p values are close to the grey line (null hypothesis) this implies there are few systematic sources of spurious association
Deviation of small p values from null line indicates possible associations
Log scale helps to emphasise the smallest p values
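A sketch of how the points of a QQ plot could be computed in Python. The expected quantiles use (rank - 0.5) / n, which is one common convention and an assumption here; the p values are invented.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale."""
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = (rank - 0.5) / n  # quantile under the null hypothesis
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Hypothetical p values from four SNPs
pts = qq_points([0.5, 0.05, 0.9, 0.0001])
for expected, observed in pts:
    print(round(expected, 2), round(observed, 2))
```

Points with observed far above expected are the small p values that deviate from the null line.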
What are possibilities for an identified association in GWAS results?
- There is a causal relationship between the SNP and trait
- The marker is in linkage disequilibrium with a causal locus
- False positive
There are many potential sources of systematic error in GWAS that might lead to false positive results
How could you overcome this?
Genotyping quality control is particularly important
How could you find the missing heritability in GWAS?
Increase sample size - biobanks
Gene x environment interaction studies
Analysis of rare variants
Analysis strategies increasing power - eg multi phenotype analysis
Post GWAS
Identify the causal variant to understand molecular mechanisms and pathways behind them
Most identified SNPs are in NON coding regions, could still have consequences for nearby genes - enhancer elements,
DNase hypersensitive regions
What is DNA sequencing?
Working out the order of the 4 bases (A, T, C, G) in fragments of DNA
Usually after amplification by PCR or DNA cloning
Increasingly sophisticated and affordable
What is RNA sequencing?
RNA is sequenced indirectly through the sequencing of DNA
What is protein sequencing?
Time consuming and requires relatively high amounts of protein
Increasingly replaced by modern methods
DNA sequencing includes single read sequencing and massive parallel sequencing.
Examples of both?
SR:
Maxam Gilbert method - based on chemical degradation. Now obsolete.
Sanger method - based on primer extension chain termination. Still used but dominating method is NGS
MPS:
Next generation sequencing NGS - based on primer extension, fully automated, a revolution in progress
Sanger sequencing
DNA sequencing with chain terminating inhibitors = dideoxy sequencing
Used for sequencing SINGLE genes and fragments of DNA - mutation screening in specific genes, validations of findings from NGS
Highly accurate
What are the steps of pre sequencing?
- Need to produce multiple copies of DNA - clone DNA fragment into plasmid and grow in E.coli OR amplify DNA fragment by PCR
- Denature the sequence by heating (or adding NaOH) to produce single stranded DNA
- Prepare DNA polymerase, primer, dNTPs and ddNTPs
Primer extension
DNA polymerase will start synthesising a complementary strand
Starting from the primer (adding bases)
Forms a complementary copy to the template strand - extending the primer therefore
What are dideoxynucleotide triphosphates (ddNTPs)?
Terminator nucleotides
Modified version of the normal DNA building blocks (dNTPs)
Use the SAME bases (ATCG) but the sugar is modified!
Wherever a ddNTP has been incorporated, DNA synthesis CANNOT proceed any further
How does ddNTP terminate DNA synthesis?
Lack of the 3’ OH group in the dideoxynucleotide prevents the formation of the phosphodiester bond
Termination of DNA synthesis process
There are multiple strands due to DNA amplification
There is an excess of normal dNTPs against the amount of ddNTPs (100:1)
These compete against each other in the DNA synthesis
Termination happens at different places at different strands
The result is a set of DNA sequences varying in length - each ending with a ddNTP
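The idea can be mimicked in a few lines of Python. This is a simplified model, not a simulation of the chemistry: it assumes a ddNTP ends up incorporated at every position after the primer in at least one strand, and the template sequence is made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def chain_termination_fragments(template, primer_len):
    """All fragments formed when synthesis stops at each position.

    template is written 3' to 5', so the new strand reads 5' to 3'.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [new_strand[:end] for end in range(primer_len + 1, len(new_strand) + 1)]


def read_sequence(fragments):
    """Sort fragments by length (as a gel does) and read the last base of each."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


fragments = chain_termination_fragments("TACGGATC", primer_len=2)
print(fragments)
print(read_sequence(fragments))  # sequence downstream of the primer
```

Each fragment ends with the labelled terminator, so reading the labels from shortest to longest fragment gives the base order.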
Sanger sequencing
Primer, DNA polymerase and a mix of normal (unlabelled) dNTPs and labelled ddNTPs
Label used to be radioactive BUT today fluorescent dyes are used
The correct order is reached through gel electrophoresis and fluorescence detection
In genotyping where only a SINGLE SNP is genotyped, process is similar but NO normal unlabelled dNTPs are added!
How are nucleic acids separated by size in slab gel electrophoresis?
Nucleic acids carry numerous negatively charged phosphate groups
They will migrate towards the positive electrode when placed in electric field
Porous gel acts as a sieve - small molecules pass through more easily than larger fragments
How did Sanger sequencing work in 1977-1985?
Radioactive labelling requiring 4 separate reaction tubes for each ddNTP
Separated individually on large slab electrophoresis
X ray film and dark room manipulation
Manual reading and feeding into computer
1.5 days from setup to results
How did Sanger sequencing advance from 1986 onwards?
Fluorescent dye for labelling, use of a mixed reaction tube containing ALL 4 ddNTPs
Automated optical detection system using a laser
Direct automated entry of DNA base sequence into computer
Still time consuming due to handling of gel plates
How is capillary gel electrophoresis more advanced compared to slab gel electrophoresis?
Slab gel is manual - slow and prone to human error
Capillary is largely automated
Gel is not run for a finite time, fluorescent labelled DNA samples migrate through gel
Allows longer reads up to 1000 bases - each fragment is allowed to run to bottom of gel where resolution is highest
Faster and cheaper
Pros and cons of Sanger sequencing
Highly accurate sequences - about 99.95%
Several hundred bases long 800-1000bp
Gel electrophoresis is NOT suitable for handling large numbers of samples at a time because it is not fully automated
Therefore not suitable to genome sequencing
Next generation sequencing NGS
Also known as massively parallel sequencing - meaning it sequences millions of DNA fragments at same time
From 2005 onwards - a technological revolution
Vast increase in amount of sequence data per run (the sequence throughput) = dramatic drop in cost
Moving from sequencing single genes and exons to: whole genome and exome sequencing, methyl seq, ChIP seq and ribo seq
What is throughput?
The amount of sequence data processed in one run
What is the read length?
Length of DNA fragments measured in nucleotides
Short read lengths = high throughput
What is read depth?
The sequence coverage - how many times each sequence is represented
Important for small read lengths for genome assembly
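Average read depth is often approximated as (number of reads x read length) / genome size. The notes do not give this formula, so the Python sketch below is an assumption, and the run figures are invented.

```python
def mean_read_depth(n_reads, read_length, genome_size):
    """Average number of times each base is covered by a read."""
    return n_reads * read_length / genome_size


# Hypothetical run: 600 million reads of 150 bp on a 3 billion bp genome
depth = mean_read_depth(600_000_000, 150, 3_000_000_000)
print(depth)
```

Shorter reads need more reads to reach the same depth, which is why depth matters most for small read lengths.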
What is genome assembly?
Aligning and merging the sequenced pieces to make sense of the sequenced genome
What is second generation sequencing ?
(Based on amplified DNA templates)
From short (35 nucleotides) to medium length sequences (up to 800 nucleotides)
High to very high sequence throughput
Quite a high rate of sequencing errors in individual reads
What are the commonly used second generation sequencing platforms?
Roche / 454 pyrosequencing
The ABI SOLiD technique
Illumina / Solexa sequencing
Ion torrent systems
Sequencing with emulsion PCR
Uses bead surfaces, water and oil
Allows simultaneous amplification of each sequence without risk of contamination
Each bead acts as a microreactor for PCR, each containing one strand of DNA
The terminators are reversible - after chemical de protection synthesis can continue!
Third generation sequencing (single molecule sequencing SMS)
(Based on unamplified DNA)
Long sequences (thousands of nucleotides) - important for assembly of genomes from newly sequenced species, and for distinguishing large scale variations
Release of protons which is recorded as an electric current
Avoids problems related to DNA amplification - eg under or over representation of DNA
Simple and cheap technology - small portable machines eg Oxford Nanopore
High error rates
Single cell sequencing
Bulk sequencing works on cell populations > resulting data are aggregate values
For understanding cell to cell variation and identifying new cell types
A catalog of human cells - stable cell properties / cell positions / lineage relationships
Widely employed in cancer research
Analysis of NGS data
Output files contain millions of short (100bp) reads (2nd generation) or longer reads (3rd generation)
Reads are mapped to reference genome if species is known
Variants are identified
Requires several software tools and computational skills
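A toy Python illustration of the two middle steps (mapping a read, then identifying variants). Real pipelines use dedicated aligners and variant callers; this sketch simply slides the read along the reference and keeps the position with the fewest mismatches. The sequences and function names are made up.

```python
def map_read(reference, read):
    """Return (position, mismatches) of the best ungapped placement."""
    best_pos, best_mismatches = None, None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(window, read) if a != b)
        if best_mismatches is None or mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches


def call_variants(reference, read, pos):
    """List (position, reference base, read base) for each mismatch."""
    return [
        (pos + i, reference[pos + i], base)
        for i, base in enumerate(read)
        if reference[pos + i] != base
    ]


reference = "ACGTTAGCCATGGATCCA"  # hypothetical reference fragment
read = "AGCCTTGG"                 # hypothetical read with one substitution

pos, mismatches = map_read(reference, read)
variants = call_variants(reference, read, pos)
print(pos, mismatches, variants)
```

With a single read a mismatch could equally be a sequencing error, which is one reason read depth matters.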
Sanger vs NGS
Sanger:
Cheap, fast, simple
Highly accurate
Gives an answer about a single, specific question
NGS:
Getting cheaper with more technologies
More error prone
Highly versatile - WGS, WES, RNA seq, methyl seq etc
Computationally laborious
What is a property of DNA polymerase that is important for its proofreading function?
3’ to 5’ exonuclease activity
The role of nucleotide excision repair is to correct error in the DNA introduced by…
Exposure to UV light
What is CORRECT regarding tumour suppressor gene?
Loss of function mutations can inactivate them
Loss of function mutations can contribute to cancer
What is NOT true for cancer cells?
Cancer cells show growth repression by growth suppressive signals
Viruses can infect cells to cause oncogenic transformation.
What is NOT true regarding this process?
Viral borne cancers can be transmissible
Mutations in TCA cycle enzyme: isocitrate dehydrogenase can change its activity
What is the name of the product formed as a result of this reaction?
2 hydroxy glutarate
What element covers the greatest part of the human genome?
Repetitive and transposable elements
Is genetic info found in all organisms?
YES
What cells contain the genetic material?
Zygotes
Gametes
Brain
How many SNPs have been identified in the human genome?
100 million
What is the maximum dosage of a bi allelic SNP?
2
What is a large structural variant?
Deletion
What is a variable number tandem repeat?
A short DNA sequence repeated in tandem that varies in copy number
A polygenic trait is controlled by …
Multiple variants / multiple mutations
Is it possible to carry genes for a disease you do not have?
Yes!
Eg cystic fibrosis in Europeans - recessive allele carried by 0.8% of population
Is the offspring of bodybuilders naturally more muscular?
No!
Genetic info has not changed
What is heritability?
The proportion of phenotype variance explained by genetic factors
Does heritability indicate the proportion of the trait that is due to genetics?
No!
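A minimal sketch of heritability as a ratio of variances, assuming the simplest model where phenotypic variance is just genetic plus environmental variance (no interaction or covariance terms). The function name and the numbers are illustrative assumptions, not data from the course.

```python
def heritability(genetic_variance: float, environmental_variance: float) -> float:
    """Proportion of phenotypic variance explained by genetic factors."""
    total_variance = genetic_variance + environmental_variance
    if total_variance <= 0:
        raise ValueError("total phenotypic variance must be positive")
    return genetic_variance / total_variance


# Hypothetical trait: 40 units of genetic variance, 60 units of environmental variance.
h2 = heritability(40.0, 60.0)
```

The result describes variation across a population. It does not say that 40% of any one person's trait value is due to genetics.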
What protein bridges the 5’ and 3’ ends of the mRNA through protein-protein interactions?
eIF4G
What statement regarding eIF2 is FALSE?
eIF2 is a guanine exchange factor for eIF2B
What is true about eIF4E?
The mTOR signalling pathway can regulate eIF4E activity
What is FALSE regarding the m6A RNA methylation?
m6A and m1A modifications are mutually exclusive on an mRNA
What is FALSE about eukaryotic ribosome subunits?
Contain the same number of rRNAs as those of prokaryotic ribosomes
What is an mRNA quality control pathway?
Non stop decay
What is FALSE regarding mRNA surveillance pathways?
UAG is always a premature termination codon
Vanishing white matter disease …
Is a result of diminished ability to recycle eIF2-GDP to eIF2-GTP
Dosage compensation in flies is achieved by a translational control mechanism mediated by…
Translation repression by SXL
What is FALSE about HRI?
HRI is an ER resident eIF2 kinase that responds to unfolded proteins
Sanger sequencing is used for SINGLE genes
Sequencing process: ddNTPs
Gel electrophoresis
Highly accurate
Next generation sequencing for MULTIPLE genes / gene regions / genome
Second vs third generation
Genome assembly
WGS vs WES vs RNAseq
Analysis of NGS data
GWAS
A hypothesis free approach that has revolutionised complex traits genetics
Testing millions of imputed SNPs for association between the genome and the trait of interest
Results are combined from multiple studies (meta analysis) to provide evidence
Quality control PRIOR to analysis is very important to avoid systematic errors coming from data!
The difficulty starts after GWAS: understanding the underlying biology
Although successful, a lot of ‘missing heritability’ remains
Precision medicine
Genetic testing for Mendelian disorders
Genetic testing for screening
Polygenic risk scores PRS for complex traits to help prediction
Genetics to help understand response to drugs (pharmacogenetics)
Gene therapy
Challenges and future perspectives
Polygenic risk score graphs
Normal distribution
Population specific
Important to do genetic studies in different ancestries as genetics may differ between ancestries
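A rough sketch of how a polygenic risk score can be computed: a weighted sum of risk allele dosages (0, 1 or 2), then standardised within one population so scores can be placed on the distribution. The effect sizes, dosages and names below are hypothetical; real effect sizes would come from a GWAS in a matching ancestry.

```python
from statistics import mean, pstdev


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk allele dosages (0, 1 or 2) across SNPs."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("need one effect size per SNP")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical per-SNP effect sizes and a tiny made-up cohort.
effect_sizes = [0.10, 0.25, -0.05]
cohort = {
    "person_1": [0, 1, 2],
    "person_2": [2, 2, 0],
    "person_3": [1, 0, 1],
}

raw_scores = {name: polygenic_risk_score(d, effect_sizes) for name, d in cohort.items()}

# Standardise within this population only: the score is population specific.
values = list(raw_scores.values())
mu, sd = mean(values), pstdev(values)
z_scores = {name: (score - mu) / sd for name, score in raw_scores.items()}
```

Because the mean and standard deviation come from one cohort, a z-score computed this way should not be carried over to a population of different ancestry.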
ROC curve
For prediction purposes
Try to maximise area under the curve > the higher it is, the better the test!
Optimal combination of sensitivity and specificity
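A small sketch of the two ideas on this card, using made-up scores. The area under the curve is computed here as the chance that a randomly chosen case scores higher than a randomly chosen control (ties count as half), which is equivalent to the area under the ROC curve. Function names and data are assumptions for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test scores; label 1 = has the disease, 0 = does not.
scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.35)
```

Moving the threshold trades sensitivity against specificity; the AUC summarises the test across all thresholds.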
Factors affecting PCR
Melting temp Tm - temp that dictates the annealing step
GC content - Lower the number of Gs and Cs in the primer, the lower the melting temp!
Salt content - sodium acetate
Other buffer components - like denaturant > may change structure of DNA, may not allow base pairing!
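To see how GC content moves the melting temp, here is a minimal sketch using the Wallace rule, Tm = 2 x (A + T) + 4 x (G + C) in °C. This is only a rule of thumb for short primers and ignores salt and other buffer components; the primer sequences are made up.

```python
def gc_content(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in °C: 2 per A/T base plus 4 per G/C base."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Two hypothetical 16-base primers at opposite extremes of GC content.
at_rich = "ATATATATATATATAT"
gc_rich = "GCGCGCGCGCGCGCGC"

tm_at_rich = wallace_tm(at_rich)
tm_gc_rich = wallace_tm(gc_rich)
```

Same length, but the primer with fewer Gs and Cs has the lower estimated Tm.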
Quantitative PCR
Tells you how much mRNA was there to start with and can compare it to other mRNAs in sample
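One common way to make that comparison is the delta-delta Ct method, sketched below. It is not named in these notes, so treat it as an assumption: it presumes the amount of product doubles every cycle (100% efficiency) and that a reference mRNA is stable between samples. All Ct values are invented.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control):
    """Fold change of a target mRNA in a sample versus a control (delta-delta Ct)."""
    # Normalise the target to the reference mRNA within each condition.
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    # A lower Ct means more starting mRNA, hence the minus sign.
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in the sample.
fold_change = relative_expression(
    ct_target_sample=22.0, ct_reference_sample=18.0,
    ct_target_control=24.0, ct_reference_control=18.0,
)
```

Two cycles earlier at one doubling per cycle corresponds to about four times more starting mRNA.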
Electrophoretic mobility shift assay EMSA
Detects proteins binding to specific sequences (response elements)
Always a band at the bottom: that is the free probe that has not been bound by protein!
You will never get every piece of DNA bound by protein > some is always left over
What are ATP dependent processes in epigenetics?
Nucleosome eviction
Nucleosome sliding
Histone chaperones and ATP dependent DNA remodellers are required for what process?
Nucleosome sliding
What is the function of Histone tail modifications?
To allow transcription factors to bind to a promoter and therefore induce gene expression
To bring parts of DNA closer together in a complex that might enable gene expression to occur
To change interaction of Histones with DNA therefore changing chromatin structure
To work in concert with DNA methylation to manipulate the access of DNA to transcription factors and transcription machinery
What methods enable a scientist to identify areas of DNA that might regulate gene expression?
Hi-C and ChIA-PET
In a ChIP assay, what readout method enables identification of transcription factors that are part of a complex?
Western Blot
What enzymes carry out de novo DNA methylation which occurs mainly in embryonic development?
DNA methyl transferase 3 (DNMT3a and b)
What modifications can occur at a lysine in Histone tails?
Ubiquitinylation
Di methylation
Acetylation
Tri methylation
What is the name of the mechanism where nucleosomes are removed to transform DNA from heterochromatin to euchromatin structure?
Nucleosome eviction
Deamination of DNA bases can change…
Cytosine to uracil
Cancer cells often display the Warburg effect. It means:
They prefer to carry out aerobic glycolysis even in the presence of oxygen!
What is INCORRECT about proto oncogenes?
Proto oncogenes are derived from tumour suppressor genes
The 2019 Nobel Prize in Physiology or Medicine was awarded for…
How cells sense and adapt to oxygen availability
How kidneys stimulate erythrocyte production by erythropoietin up regulation
How hypoxia inducible factor is stimulated upon low oxygen
How cancer cells form new blood vessels to disseminate themselves
What defines a cancer hallmark
An exclusive characteristic feature of a cancer cell
Retinoblastoma is a … that functions by acting as a … regulator of transcription factor …
Tumour suppressor
Negative
E2F
What is correct about the Her2 gene?
A valine to glutamic acid mutation in this gene causes constitutive activation
What is the product formed as a result of the mutation in the TCA cycle enzyme isocitrate dehydrogenase?
2-hydroxyglutarate (2HG). The normal enzyme makes 2-ketoglutarate; the mutant enzyme converts it to 2HG
UV light directly damages DNA by…
Causing thymine thymine dimers
DNA glycosylase regulates the initial step of…
Base excision repair
What is true about Benz(a)pyrene?
It undergoes P-450 mediated activation in lungs causing G to T conversion
What is incorrect in regard to viruses and oncogenic transformation?
Viruses can mutate cellular proto oncogenes into oncogenes
As a sensor of DNA damage, ATM is a … which can … p53 through …
Kinase
Activate
Phosphorylation
What is incorrect about cancer cells?
Cancer cells show growth repression by growth repressive signals
What defines the DNA polymerase proof reading activity?
It also has a 3’ to 5’ exonuclease activity to remove any incorrect base pairing
What signal can activate p53 function?
Lack of nucleotides
What is correct regarding tumour suppressor gene?
Loss of function mutations can inactivate them!
What is incorrect regarding aerobic glycolysis?
Aerobic glycolysis produces HIGH ATP output
What is true regarding the androgen receptor?
It’s a nuclear receptor that drives prostate cancer
Found inactive in cytoplasm and activated by androgen binding
Regulates gene expression in prostate epithelial cells
A target of anti androgen drugs in the clinic
What is correct about p53?
It’s a tumour suppressor gene that opposes uncontrolled cell growth caused by oncogenes
HIF1a is … under … oxygen conditions by … mediated …
Degraded
Normal
Prolyl hydroxylase
Covalent modification
What is NOT a characteristic of cancer cells?
Limited proliferation
What is correct about 2 hydroxyglutarate. 2HG?
It is a metabolite specific to cancer cells!
Oncometabolite
Loss of function mutations in tumour suppressor genes can result in…
Tumour formation
Inability of a tumour suppressor to inhibit proliferation of a cell with DNA damage
Uncontrolled growth of cells that harbour oncogenes
Loss of balance between cell proliferation and cell death