MARCO - chromatin + RNA-protein Flashcards
Chromatin
Very dynamic: gene regulation requires interaction of TFs bound at different enhancers and promoters, plus many other regulatory factors, over long distances, forming 3D chromatin interactions
- The interactions are limited to regions that lie within the same TAD (topologically associating domain)
- The components in one TAD do not interact with other TADs. Enhancers and promoters can only regulate genes inside the same TAD (with very few exceptions)
- DNase-seq, etc. identify regulatory regions but not their long-distance interactions
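A minimal sketch of the "same TAD" rule above, assuming TADs are given as sorted, non-overlapping half-open intervals on one chromosome. The coordinates and the names `TADS`, `tad_index`, and `same_tad` are hypothetical and only for illustration.

```python
from bisect import bisect_right

# Hypothetical TAD boundaries on one chromosome: sorted (start, end) half-open intervals.
TADS = [(0, 500_000), (500_000, 1_200_000), (1_200_000, 2_000_000)]

def tad_index(pos, tads=TADS):
    """Return the index of the TAD containing pos, or None if pos is outside all TADs."""
    starts = [start for start, _ in tads]
    i = bisect_right(starts, pos) - 1  # last TAD whose start is <= pos
    if i >= 0 and tads[i][0] <= pos < tads[i][1]:
        return i
    return None

def same_tad(enhancer_pos, promoter_pos, tads=TADS):
    """True if both positions fall inside the same TAD."""
    a = tad_index(enhancer_pos, tads)
    b = tad_index(promoter_pos, tads)
    return a is not None and a == b
```

Under this rule, an enhancer at 100,000 and a promoter at 450,000 could interact, while an enhancer at 450,000 and a promoter at 550,000 sit in different TADs and would not (ignoring the rare exceptions).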
ChIA-PET: Chromatin Interaction Analysis with Paired-End Tag
*Identification of 3D Chromatin interaction technique:
- Based on the assumption that DNA fragments that are in close proximity due to 3D chromosome interaction are more likely to re-ligate with each other. (Not naturally close to each other but since interacting, end up being near each other)
*Other similar techniques: 3C, 4C, Hi-C
ChIA-PET: (method)
- Cross-linking of TF-bound chromatin, which also has long-distance interactions with other gene-specific TFs (GSTFs) on distal enhancers, then shearing
- Immunoprecipitation of region bound by a selected TF to also pull down the distal DNA region associated with the TF
- Proximity ligation of interacting regions forming paired-end sequences
- Reversal of cross-links to release the proteins from the DNA, PCR amplification, and purification
- Next-generation sequencing obtaining reads of paired-end sequences
- Mapping back to genome to identify distal interacting regions
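The final mapping step can be sketched as counting how often two genomic bins are joined by a paired-end tag. This is an illustration only, not a real ChIA-PET pipeline: the bin size, the `min_span` cutoff for discarding short-range pairs (assumed here to be self-ligations), and the function names are all assumptions.

```python
from collections import Counter

BIN_SIZE = 10_000  # assumed bin width in bp

def bin_of(chrom, pos, bin_size=BIN_SIZE):
    """Assign a mapped tag to a fixed-width genomic bin."""
    return (chrom, pos // bin_size)

def count_interactions(pets, min_span=20_000, bin_size=BIN_SIZE):
    """Count paired-end tags per pair of bins.

    pets: iterable of ((chrom1, pos1), (chrom2, pos2)) mapped tag pairs.
    """
    counts = Counter()
    for (c1, p1), (c2, p2) in pets:
        # Assumption: very short same-chromosome pairs are self-ligations, so skip them.
        if c1 == c2 and abs(p1 - p2) < min_span:
            continue
        # Sort the two bins so that A-B and B-A count as the same interaction.
        a, b = sorted([bin_of(c1, p1, bin_size), bin_of(c2, p2, bin_size)])
        counts[(a, b)] += 1
    return counts
```

Bin pairs supported by many tags are candidate distal interactions; a real analysis would also need a statistical test against background ligation.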
RIP-seq:
RNA ImmunoPrecipitation
- Identification of RNA-protein / RNA-DNA interactions
ex. long non-coding RNAs, proteins that regulate RNA post-transcriptionally
RIP-seq: method
- Immunoprecipitation of a specific RNA-protein complex using antibody that recognizes a specific RNA binding protein (RBP)
- RNase digestion of unbound regions & degradation of RBP
- Extraction of bound RNA
- Reverse transcription into cDNA
- Next-gen sequencing on cDNA & mapping to reference genome
RIP-seq: problem
Cannot perform cross-linking to keep the RBP associated with the RNA because it would be too damaging to the RNA; without cross-linking, washes must stay gentle, which results in low stringency and specificity
CLIP-seq:
Cross-linking ImmunoPrecipitation
Relies on UV cross-linking: treating the RNA-protein complex with UV light covalently fixes the interaction, which allows increased stringency during immunoprecipitation
CLIP-seq method
- UV cross-linking
- Immunoprecipitation
- RNase digestion of unbound regions & degradation of cross-linked RBP
- Reverse transcription into cDNA
- Next-gen sequencing on cDNA & mapping to reference genome
* The degradation of the cross-linked proteins uses proteinase K, which cannot digest the whole protein (it leaves a peptide at the binding site)
* During reverse transcription, the peptide that is left behind will induce mutations, resulting in cross-linking induced mutation sites – CIMS
* Therefore, the cDNA will not be identical to the original DNA sequence that coded for the bound RNA.
* When mapped back to the reference genome, the CIMS can be identified to indicate protein-RNA binding sites
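A minimal sketch of how CIMS could be flagged after mapping, assuming ungapped alignments and looking only at substitutions (deletions are ignored here for simplicity). The thresholds `min_reads` and `min_fraction` and the function name `find_cims` are hypothetical.

```python
from collections import Counter

def find_cims(reference, reads, min_reads=3, min_fraction=0.5):
    """Return reference positions where mapped cDNA reads repeatedly mismatch.

    reference: reference sequence as a string.
    reads: iterable of (start, sequence) ungapped alignments (0-based start).
    """
    coverage = Counter()
    mismatches = Counter()
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break  # read runs past the end of the reference
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    # Keep positions with enough mismatching reads and a high mismatch fraction.
    return sorted(
        pos for pos, n in mismatches.items()
        if n >= min_reads and n / coverage[pos] >= min_fraction
    )
```

Requiring several independent reads to mismatch at the same position helps separate candidate binding sites from random sequencing errors.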
CLIP-seq variants
- PAR-CLIP - improves crosslinking with photoreactive RNA nucleotides
- iCLIP - uses reverse transcriptase stalling to map individual nucleotide-protein interactions
- miCLIP - modifies an RNA methylase to map its binding sites
-seq method key features
- identify region of interest (protein binding site, open chromatin, etc.)
- isolate sequence (e.g. fragmentation and immunoprecipitation, phenol/chloroform, etc.)
- sequence fragments and map back to genome
ENCODE – Encyclopedia of DNA elements
aims to identify all functional elements in the human genome involved in gene regulation across ~50-60 cell types using these -seq methods
ENCODE controversy
Identified millions of enhancers / 100s, 1000s of promoters / millions of transcription factor binding sites across the genome, which together correspond to a large proportion of the genome; this caused controversy in the early days (now known to be true)
Limitations of ENCODE
Most of the work was conducted in immortalized cell lines – cell lines derived from human cancers (ex. HeLa cells):
* cancer cells model tissue of origin BUT are not the exact same as the primary cell type
* Identification of those regions from a primary cell type (closer to natural state) is preferred; partially accomplished by a recent consortium called ROADMAP, which identified histone modifications across 25 human tissues
Both ENCODE and ROADMAP are ongoing and keep adding to our knowledge of these regulatory regions