The Regulatory Genome - Week 3 Flashcards
regulatory genome
non-coding genome
in humans the non-coding genome represents
99% of the genome
only 1% of the genome is
coding
Evidence for function in the non-coding genome
biochemical evidence, genetic evidence and evolutionary evidence
Evidence for function in the non-coding genome: biochemical evidence
Genome-wide studies (e.g. ENCODE) identifying molecular activity (transcription, transcription factor binding); up to 80% of the genome ‘functional’
Evidence for function in the non-coding genome: genetic evidence
Function defined by the phenotypic consequence of mutation; up to 15% of the genome ‘functional’
Evidence for function in the non-coding genome: evolutionary evidence
conservation of sequence across evolution, up to 10% of the genome ‘functional’
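To compare the three estimates on one scale, the percentages can be converted into approximate base counts. This is only a rough sketch: the haploid genome size of about 3.1 billion bp is an assumption that is not stated in these cards, and the names `GENOME_BP` and `functional_bp` are illustrative.

```python
# Assumed haploid human genome size in base pairs (approximate; not from the cards)
GENOME_BP = 3_100_000_000

# Upper-bound estimates of the 'functional' fraction quoted in the cards (percent)
estimates_pct = {"biochemical": 80, "genetic": 15, "evolutionary": 10}


def functional_bp(pct: int, genome_bp: int = GENOME_BP) -> int:
    """Convert a whole-number percentage of the genome into base pairs."""
    return genome_bp * pct // 100


for kind, pct in estimates_pct.items():
    megabases = functional_bp(pct) / 1e6
    print(f"{kind}: up to {pct}% = about {megabases:.0f} Mb")
```

The gap between the biochemical and evolutionary figures is what makes the definition of ‘functional’ matter.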
What does the non-coding genome contain?
Promoters, enhancers, silencers, insulators, noncoding RNA agents (microRNAs, piRNAs, structural RNAs, regulatory RNAs)
Non-coding functional elements associated with
distinctive chromatin structures that display signature patterns of histone modifications, DNA methylation, DNase accessibility and transcription factor occupancy.
Enhancers - how are they defined
A set of clustered short sequence elements that stimulate the transcription of a gene and whose function is not critically dependent on their precise position or orientation
Enhancers - mechanism of action
Enhancers serve as sites for pre-initiation complex (PIC) formation (the PIC is a complex of dozens of proteins, including RNA polymerase and general transcription factors, necessary for transcription).
By looping of the intervening DNA to make contact with the target gene promoter, the PIC is delivered to the required site and transcription proceeds.
This loop formation might increase the concentration of PIC at the target gene and thus transcription. This is one model, but there is also evidence for other models:
• Looping relocates the target gene to a ‘neighbourhood’ within the nucleus that is more favourable for transcription (so-called transcription factories).
• Looping recruits transcription elongation factors to promoters, which can increase the rate of transcription elongation (the step after transcription initiation).
Enhancers - what evidence is there that they are important to human health?
- Genetic evidence can be provided by finding mutations that cause disease. For Mendelian disease, mutations in a limb enhancer of SHH are the prototypical example.
- Single nucleotide polymorphisms associated with risk of developing complex, polygenic disease have also been shown to be enriched in disease-relevant cell type enhancers (e.g. type 2 diabetes and pancreatic islet enhancers)
- Evolutionary conservation of enhancer sequences is also good evidence of importance to human health
- As is evidence of a paucity of rare variation across human populations in human accelerated regions (HARs: regions with evidence of accelerated divergence from other primates and thus likely important for human-specific traits). Most HARs are believed to be enhancers.
Enhancers - What methods are used to identify them?
- Chromatin Conformation Capture (to capture looping interactions)
- Reporter Gene assays (e.g. luciferase) to show ability of sequence to increase expression of reporter gene transfected into cells
- ChIP-Seq (identifies DNA binding sites for transcription factors, and DNA associated with histone modifications linked to enhancer function, e.g. H3K4me1)
- DNase-Seq (treat chromatin with DNase I, which cleaves DNA in open chromatin (e.g. at enhancers), and sequence the resulting fragments)
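One common way to combine these methods is to keep only regions supported by more than one line of evidence. The sketch below is a minimal illustration, not a real pipeline: it assumes peaks have already been called, represents them as hypothetical `(chromosome, start, end)` tuples with half-open coordinates, and uses made-up positions.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def candidate_enhancers(h3k4me1_peaks: List[Interval],
                        dnase_sites: List[Interval]) -> List[Interval]:
    """Return H3K4me1 peaks that overlap at least one DNase-accessible site."""
    return [p for p in h3k4me1_peaks
            if any(overlaps(p, d) for d in dnase_sites)]


# Made-up coordinates, for illustration only
h3k4me1_peaks = [("chr7", 1000, 1500), ("chr7", 5000, 5400), ("chr2", 200, 600)]
dnase_sites = [("chr7", 1400, 1600), ("chr2", 900, 1000)]

print(candidate_enhancers(h3k4me1_peaks, dnase_sites))
# → [('chr7', 1000, 1500)]
```

Overlap of marks only nominates a candidate; a reporter assay or chromatin conformation capture would still be needed to show enhancer activity and identify the target gene.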
Enhancers - What are the current and future directions for research? (i.e. what remains to be found out and why is this important?)
- Cell sorting to identify spatially and temporally restricted enhancers (e.g. Human Cell Atlas)
- Is there an enhancer code? (The sequences of enhancers do not give good discrimination, but could other parameters, such as ChIP-Seq binding, provide better discrimination?)
- To what extent can human disease (Mendelian and complex, polygenic) be explained by genetic variation in enhancers?
Insulators - how are they defined?
DNA element that acts as a barrier to the spread of chromatin changes or the influence of cis-acting elements
Insulators - What is/are their mechanism of action?
- Insulator sequences have been identified by their ability to block the activity of enhancers to increase gene expression when placed between promoter and enhancer. This may be by preventing the spreading of heterochromatin or the looping and contact between enhancer and promoter.
- Insulators have been shown, in vertebrates, to be enriched in the CCCTC sequence and to bind CCCTC-binding factor (CTCF). CTCF plays a key role in establishing topologically associating domains (TADs). Interactions within TADs are enabled, whilst interactions between TADs are prevented.
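As a toy illustration of looking for the CCCTC sequence, the sketch below scans both strands of a DNA string for the five-base core. Two assumptions are worth flagging: the full CTCF recognition motif is longer than this core and is normally modelled with a position weight matrix, and the function names and example sequence are made up.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_core_motif(seq: str, motif: str = "CCCTC"):
    """Return (0-based position, strand) for each motif match on either strand."""
    seq = seq.upper()
    rc_motif = reverse_complement(motif)  # what a minus-strand site looks like
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif:
            hits.append((i, "-"))
    return hits


example = "AACCCTCTTGAGGGAA"  # made-up sequence
print(find_core_motif(example))
# → [(2, '+'), (9, '-')]
```

Strand is reported because the orientation of CTCF sites is recorded in many analyses; a five-base match on its own occurs far too often by chance to identify an insulator.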
Insulators - What evidence is there that they are important to human health?
- CTCF-binding sites that define insulator sites are frequently mutated in cancer
- Microdeletions in an insulator sequence that regulates IGF2 and H19 imprinting result in Beckwith-Wiedemann syndrome
Insulators - What methods are used to identify them?
- Reporter genes inserted either flanked by insulator sequences or not. If flanked sequences are insulating then the reporter gene will be active whether inserted into heterochromatin or euchromatin.
- The ability of an insulator sequence to block promoter-enhancer interactions can be assayed by placing the tested sequence in between promoter and enhancer.
Insulators - What are the current and future directions for research? (i.e. what remains to be found out and why is this important?)
• In species such as Drosophila many more proteins that bind insulators are known than for humans. Are there more insulator proteins in humans and will this explain the lack of CTCF-binding sites at some topologically associating domains (TADs)?
Non-coding RNAs - How are they defined?
Mature RNA transcript that is not translated to make a polypeptide
Non-coding RNAs - What is/are their mechanism of action? List the types of non-coding RNA
MicroRNAs (miRNAs), Transfer RNAs (tRNAs), Ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), long non-coding RNAs (lncRNAs), Circular RNAs (circRNAs), Piwi-interacting RNAs (piRNAs)
Non-coding RNAs - What is/are their mechanism of action? MicroRNAs (miRNAs)
Small (~20 nt) single-stranded RNAs that bind, via complementary base pairing, to transcripts and cause translational repression and/or mRNA degradation
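The complementary base pairing can be sketched as a simple string search. This example assumes the widely used ‘seed’ convention, in which nucleotides 2-8 of the miRNA drive target recognition; that convention is not stated in these cards. Both sequences are toy examples, and only perfect Watson-Crick matches are considered.

```python
def rna_reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase RNA sequence (A, C, G, U only)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def seed_sites(mirna: str, mrna: str):
    """Return 0-based mRNA positions perfectly complementary to miRNA nt 2-8."""
    seed = mirna[1:8]                      # nucleotides 2-8 (assumed seed region)
    target = rna_reverse_complement(seed)  # sequence the seed would pair with
    return [i for i in range(len(mrna) - len(target) + 1)
            if mrna[i:i + len(target)] == target]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # toy 22 nt miRNA, written 5' to 3'
mrna = "AAGCUACCUCAGG"            # toy transcript fragment, written 5' to 3'

print(rna_reverse_complement(mirna[1:8]))  # → CUACCUC
print(seed_sites(mirna, mrna))             # → [3]
```

Real target prediction also weighs site context, conservation and pairing outside the seed, so a match found this way is only a candidate site.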
Non-coding RNAs - What is/are their mechanism of action? Transfer RNAs (tRNAs)
Adaptor RNAs with a classic cloverleaf structure that carry amino acids covalently linked to their 3’ end and contain an anticodon sequence that base-pairs with the codons on mRNAs. Hence crucial for translation.
Non-coding RNAs - What is/are their mechanism of action? Ribosomal RNAs (rRNAs)
Ribosomes are protein:RNA macromolecular complexes that catalyse the reaction of joining amino acids together. rRNA makes up ~80% of RNA (by mass) in a human cell.
Non-coding RNAs - What is/are their mechanism of action? Small nuclear RNAs (snRNAs)
~150 nt long RNAs that most famously bind to regions near splice sites and make up part of the spliceosome, the complex that removes introns and joins exons together in the process called splicing.
Non-coding RNAs - What is/are their mechanism of action? Long non-coding RNAs (lncRNAs)
Long RNAs (>200 nt) that can play crucial roles in regulating gene expression by recruiting chromatin-modifying complexes to modify histones and/or DNA methylation.
Non-coding RNAs - What is/are their mechanism of action? Circular RNAs (circRNAs)
‘Back-splicing’ results in the formation of circular RNAs that may sequester miRNAs and thus regulate gene expression.
Non-coding RNAs - What is/are their mechanism of action? Piwi-interacting RNAs (piRNAs)
active in germ cells to repress transposon (mobile DNA) activity and crucial for germ cell and stem cell development.
Non-coding RNAs - What evidence is there that they are important to human health?
• Their involvement in ubiquitous cellular processes (e.g. protein synthesis (tRNAs & rRNAs), splicing (snRNAs) and gene regulation (miRNAs, lncRNAs)). Their abundance and numbers also argue for importance
Non-coding RNAs - What methods are used to identify them?
• Size or characteristic selection of RNA, conversion to complementary DNA and NGS.
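Size selection is normally done at the bench before library preparation, but the same idea can be mimicked on sequenced reads. The sketch below filters toy reads by length: the >200 nt cutoff for lncRNAs comes from these cards, while the 18-30 nt window around ~20 nt miRNAs and the function name `size_select` are assumptions for illustration.

```python
def size_select(reads, min_len, max_len):
    """Keep reads whose length in nucleotides falls within [min_len, max_len]."""
    return [r for r in reads if min_len <= len(r) <= max_len]


# Toy reads of different lengths (sequences are made up)
reads = ["A" * 22, "C" * 75, "G" * 21, "U" * 150, "A" * 250]

small_rna = size_select(reads, 18, 30)      # assumed window around ~20 nt miRNAs
long_rna = size_select(reads, 201, 10**6)   # >200 nt, the lncRNA cutoff in the cards

print([len(r) for r in small_rna])  # → [22, 21]
print([len(r) for r in long_rna])   # → [250]
```

Length alone cannot assign a read to an RNA class, which is why selection on other characteristics is also used.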
Non-coding RNAs - What are the current and future directions for research? (i.e. what remains to be found out and why is this important?)
- Projects such as the Human Cell Atlas will reveal whether low-level expression of ncRNAs in groups of cells/tissues is due to a small number of cells expressing high levels or generally low level expression across all cells.
- Cell-specific expression and involvement in disease pathology offer tremendous potential for using ncRNAs as biomarkers and to be targeted by therapeutics (see table below).
Promoters - how are they defined?
• A combination of short sequence elements, usually just upstream of a gene, to which RNA polymerase binds so as to initiate transcription of the gene
Promoters - what is their mechanism of action?
- General transcription factors sequentially assemble at the core promoter to recruit and activate RNA polymerase to unwind DNA and begin transcription.
- General transcription factors may bind to TATA box or BRE elements shown below. But it is important to note that not all promoters contain these sequence motifs (i.e. their presence and spacing are sufficient but not necessary).
Promoters - what evidence is there that they are important to human health?
- Mutations in promoter sequences of many genes are known to cause a plethora of Mendelian diseases
- Indicative of their importance, transcription factor binding, the binding sites and the characteristic epigenetic marks are highly conserved from mouse to human
Promoters - What methods are used to identify them?
- In a reporter gene assay, the candidate promoter region is placed upstream of the reporter gene, then truncated and sometimes split to identify smaller regions that can recapitulate most or all of the transcriptional activity
- More high-throughput genome-wide techniques to identify transcriptional start sites and core promoters are now in use, some of which make use of the chemical structure of the 5’ cap present at the start of mRNAs
Promoters - What are the current and future directions for research? (i.e. what remains to be found out and why is this important?)
• To predict the phenotypic consequences of changes in transcription factor activity it is necessary to understand how certain transcription factors are important to promoter but not enhancer function. Even greater throughput assays are required to understand the transcription factors and binding sites necessary in different cell types.
Silencers - How are they defined?
• Combination of short DNA sequence elements that suppress the transcription of a gene
Silencers - what is their mechanism of action?
• Silencers recruit negative transcription factors (so-called repressors) and co-repressors, including Polycomb repressive complex. In another cell type the same sequence may actually act as an enhancer.
Silencers - what evidence is there that they are important to human health?
- Lack of genome-wide approaches to identifying silencers means there are only isolated examples of their importance (e.g. deletion of D4Z4 repeats affecting expression of genes at chromosome 4q35, causing Facioscapulohumeral Muscular Dystrophy (FSHD)).
- The dual activity model for sequences that can act as silencers in one cell type and enhancers in another means that much of the evidence for the importance of enhancers could also apply to silencers (e.g. evolutionary conservation, mutations causing disease), but this remains to be determined.
Silencers - What method is used to identify them?
- Historically, reporter gene assays have been used to identify silencers. In these a sequence with putative silencer activity is inserted upstream/downstream (like enhancers they are orientation-independent) of a reporter gene being driven by a relatively strong proximal promoter
- Very recently, a more high-throughput method has been published which involves transfecting cells, stimulated to apoptose, with putative silencer sequences and identifying cells that survive (due to the silencing of the pro-apoptotic Caspase-9 gene)
Silencers - What are the current and future directions for research? (i.e. what remains to be found out and why is this important?)
- As alluded to above, genome-wide identification of silencers, spatially and temporally, will increase our understanding of how important they are and of their overlap with enhancers.
- This in turn will enable the phenotypic effect of genetic variation (somatic and germline) in these elements to be better understood.