Sudbery Flashcards
Why do we care about transcriptomics?
- on average ~85% of protein coding seq is identical human to mouse
- only 1-2% of the genome is coding, so coding seq alone can't explain why we are so diff from mice
- metazoan genomes are not selected for size → much is repetitive seq or decaying pseudogenes
- every cell has the same DNA, but cells are diff → dep on which genes are active and their exp levels
What did ENCODE find about ‘junk’ DNA?
- claimed most of what was thought to be junk has a function in controlling gene exp
How much of the genome did ENCODE claim was functional?
- ~80% (80.4%), where functional = takes part in at least 1 biochemical event (not all of this is CRMs)
What are cis-regulatory modules?
- regions of DNA that bind DNA BPs (eg. TFs) and reg gene exp
- inc promoters, enhancers, silencers and insulators
What seq motifs do DNA BPs bind, and what is the result of this?
- bind degenerate sequence motifs
- binding sites vary, but certain seqs are more likely
- but just because the motif is present doesn’t mean the prot will bind there
- eg. 8 mil potential GATA1 binding sites (motif matches) in the genome, only 0.2% bound by GATA1 (by ChIP-seq)
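A minimal sketch of what "degenerate motif" means in practice, plus the GATA1 arithmetic above. Assumptions not in the notes: the consensus WGATAR (W = A/T, R = A/G) is used as the GATA1 motif, the toy sequence is made up, and only the forward strand is scanned.

```python
import re

# IUPAC degenerate codes (a subset is enough for WGATAR).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def motif_to_regex(motif):
    """Turn a degenerate motif such as WGATAR into a regular expression."""
    return "".join(IUPAC[base] for base in motif)

def find_sites(seq, motif):
    """Return 0-based start positions of every match, including overlapping ones."""
    # The lookahead lets the regex engine report overlapping matches.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(seq)]

# Made-up sequence: two sites fit WGATAR, the third GATA is followed by C so it fails.
seq = "CCAGATAAGGTTGATAGCCTGATACT"
sites = find_sites(seq, "WGATAR")

# Arithmetic from the card: 0.2% of 8 million potential sites are actually bound.
potential_sites = 8_000_000
bound_sites = potential_sites * 2 // 1000
```

So a motif match is only a candidate site: of ~8 mil matches, roughly 16,000 are bound.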
Does all DNA exist as heterochromatin or euchromatin?
- no, packing is a sliding scale between the two
What is hetero and euchromatin?
- heterochromatin = tightly packed
- euchromatin = loosely packed
What regions tend to be nucleosome free, or have v few nucleosomes?
- CRMs
How tightly packed is chromatin in transcribing genes?
- intermediate
How can nucleosome free regions of genome be mapped?
- map w/ DNase-seq
- DNase I preferentially cuts where there are no nucleosomes, so seq the cut sites to build up a genome-wide pic of where nucleosome free regions are
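A toy sketch of the idea of going from DNase cut sites to nucleosome free regions: count cuts in fixed-width bins and keep the bins with many cuts. The bin size, threshold and cut positions are all made-up assumptions; real DNase-seq analysis uses dedicated peak callers.

```python
from collections import Counter

def accessible_bins(cut_positions, bin_size=100, min_cuts=5):
    """Return sorted (start, end) bins holding at least min_cuts DNase cuts."""
    counts = Counter(pos // bin_size for pos in cut_positions)
    return [(b * bin_size, (b + 1) * bin_size)
            for b in sorted(counts) if counts[b] >= min_cuts]

# Hypothetical cut positions on one chromosome.
cuts = [105, 110, 120, 131, 150, 199, 420, 870, 875, 880, 881, 899]
regions = accessible_bins(cuts)
```

Here bins 100-200 and 800-900 pass the threshold, while the single cut at 420 is treated as background.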
What did ENCODE measure and how?
- RNA expression –> RNA-seq, CAGE-seq and RNA-PET
- DNA/protein interactions –> ChIP-seq
- chromatin accessibility –> DNase-seq and FAIRE-seq
- 3D structure –> ChIA-PET and 5C
- methylation –> RRBS
How does ChIP-seq work?
- crosslink prots to the DNA they are bound to
- chop up (fragment) the DNA
- use Ab against the prot to select the DNA fragments crosslinked to it
- seq the selected DNA and map reads back to the genome → shows where in genome the prot binds
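A toy sketch of the last step, working out where the prot binds from the sequenced reads. Assumptions not in the notes: reads have already been counted per genomic bin, an input (no Ab) control is available, and the fold cutoff and pseudocount are arbitrary illustrative values. Real peak callers use proper statistical models.

```python
def enriched_bins(ip_counts, input_counts, min_fold=3.0, pseudocount=1.0):
    """Return indices of bins where the ChIP (IP) sample is enriched over input."""
    ip_total = sum(ip_counts)
    input_total = sum(input_counts)
    hits = []
    for i, (ip, ctrl) in enumerate(zip(ip_counts, input_counts)):
        # Divide by library totals so differences in sequencing depth cancel out.
        ip_frac = (ip + pseudocount) / ip_total
        ctrl_frac = (ctrl + pseudocount) / input_total
        if ip_frac / ctrl_frac >= min_fold:
            hits.append(i)
    return hits

# Hypothetical read counts in five consecutive bins.
ip_reads = [3, 2, 40, 4, 1]
input_reads = [5, 5, 5, 5, 5]
peaks = enriched_bins(ip_reads, input_reads)
```

Only the third bin (index 2) is called, since it is the only one where Ab-selected DNA is over-represented relative to the control.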
What assays were carried out on what cells in ENCODE?
- tier 1 cells = all assays
- tier 2 cells = a selected subset of assays
- tier 3 cells = everything else, eg. a specific assay or combination
What did ENCODE prod?
- lots of data sets and continues to gen new data
What did ENCODE claim?
- vast majority (80.4%) of human genome participates in at least 1 biochemical RNA and/or chromatin-assoc event in at least 1 cell type
- 19.4% covered by at least 1 DHS or TF ChIP-seq peak across all cell lines
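A minimal sketch of how a "% of genome covered by at least 1 peak" figure can be computed: merge overlapping peaks so no base is counted twice, then divide by genome length. The peak coordinates and the 1000 bp "genome" are made up so the answer lands on 19.4%; this is not ENCODE's actual pipeline.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of bases covered by at least one half-open (start, end) interval."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: bank the finished merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current block: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length

# Hypothetical peaks pooled across cell lines; the first two overlap.
peaks = [(0, 100), (50, 150), (400, 444)]
fraction = covered_fraction(peaks, 1000)
```

The overlapping peaks merge to 0-150 (150 bp), plus 44 bp from the third peak, giving 194 of 1000 bases.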