Lesson 5 Flashcards
what project was created to investigate “junk” DNA?
the Human Encyclopedia → ENCODE
what does ENCODE stand for?
Encyclopedia of DNA Elements
what was the ENCODE Project Consortium focused on?
a specific 30-megabase sequence; it was organized as an international consortium of computational and laboratory-based scientists working to develop high-throughput approaches for detecting all sequence elements that have biological function
what are functional elements?
a discrete genome segment that encodes a defined product or displays a reproducible biochemical signature
how does junk DNA work differently from genes?
genes tend to be highly conserved across species while these regulatory elements are not as conserved
what are four major reasons scientists had a hard time identifying junk DNA?
- junk DNA works differently than other genes
- functional elements are made of small or fragmented sequences that can be interrupted by other nonspecific sequence
- they can lie in repetitive regions of the genome (hard to recognize)
- they evolve very rapidly, or else they can be nearly neutral to evolutionary processes (they become many different things during evolution without any constraints)
describe DNA methylated regions:
regions layered with chemical methyl groups, which regulate gene expression
Describe open chromatin:
areas in which DNA and proteins that make up chromatin are accessible to regulatory proteins
What are RNA binding sites?
positions where regulatory proteins attach to RNA
why were RNA sequences an experimental target?
to identify regions that are transcribed into RNA
why were ChIP-seq experiments performed?
to reveal where proteins were bound to DNA
describe modified histones:
histone proteins, which package DNA into chromosomes, that carry chemical marks
What are transcription factors?
proteins that bind to DNA and regulate transcription
what does it mean if there are “local micro-environments in culture”?
there could be a lot of variation across different places that might cause bias
describe an enhancer:
not stable features of the genome → undergo epigenetic changes
what are two main challenges of ENCODE?
the massive amount of work (about 1,800 transcription factors have to be looked for in each cell type), and each cell type may exhibit a diverse array of responses to exogenous stimuli (environmental conditions or chemical agents)
in a Manhattan plot, what does the p-value indicate?
the strength of the statistical association between a certain locus in the genome and a certain disease (plotted as -log10 of the p-value, so stronger associations sit higher)
in a Manhattan plot, describe the meaning of the data surrounding the line:
points above the significance threshold line are significant, points below it are not
what result is a Manhattan plot used to show?
GWAS results - genome-wide association studies
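A minimal sketch of the threshold idea on a Manhattan plot. The 5e-8 cutoff is an assumption (a commonly used genome-wide significance level; the cards do not say which threshold the line represents), and the loci and p-values are made up for illustration.

```python
import math

# Assumption: 5e-8 is a commonly used genome-wide significance threshold;
# the cards do not state which threshold the plotted line represents.
GENOME_WIDE_THRESHOLD = 5e-8


def manhattan_height(p_value):
    """Return the y-axis value for a locus: -log10 of its p-value."""
    return -math.log10(p_value)


def is_significant(p_value, threshold=GENOME_WIDE_THRESHOLD):
    """A point lies above the line when its p-value is below the threshold."""
    return manhattan_height(p_value) > manhattan_height(threshold)


# Hypothetical loci and p-values, for illustration only.
loci = {"locus_A": 1e-12, "locus_B": 3e-4, "locus_C": 4e-8}
significant = sorted(name for name, p in loci.items() if is_significant(p))
```

Smaller p-values give taller points, so only the loci whose p-value falls below the cutoff end up above the line.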
what do most GWAS identify?
an association between the disease trait and a surrogate marker (tag SNP) rather than the causal variant - only 3% of the variants in the genome of a patient affected with a certain disease are located within genes
what are single nucleotide polymorphisms (SNPs)?
single-base variations in the genome; disease-associated SNPs are enriched in non-coding functional elements, with a majority residing in or near ENCODE-defined regions that are outside of protein-coding genes
when ChIP-seq analysis was performed, what was discovered?
various cell lines had very strong binding sites for GATA2 that coincided with open chromatin (DNase I) - some SNPs were associated with the sites where GATA2 was binding
what was one major finding of ENCODE related to every gene in the genome?
almost every gene undergoes alternative splicing
when does splicing occur?
during transcription → while the RNA is still bound to DNA, so it's a local effect and splicing occurs immediately
describe genes with relation to isoforms:
genes tend to express many isoforms simultaneously reaching a plateau at approximately 10-12 expressed isoforms per gene
describe the expression when there are many isoforms present:
one is usually most predominant in a given condition and captures a large fraction of total gene expression
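A small sketch of what "captures a large fraction of total gene expression" means numerically. The isoform names and abundance values are hypothetical, and `dominant_isoform` is an illustrative helper, not part of any ENCODE tool.

```python
def dominant_isoform(expression):
    """Return (isoform name, fraction of total gene expression) for the top isoform."""
    total = sum(expression.values())
    if total == 0:
        raise ValueError("gene has no expressed isoforms")
    name = max(expression, key=expression.get)
    return name, expression[name] / total


# Hypothetical isoform abundances (arbitrary units) for one gene in one condition.
example = {"isoform_1": 60.0, "isoform_2": 25.0, "isoform_3": 10.0, "isoform_4": 5.0}
top_name, top_fraction = dominant_isoform(example)
```

In this toy gene four isoforms are expressed at once, but one of them accounts for more than half of the total.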
what is one reason why an alternative 3’ UTR is important?
to define how proteins are localized in the cell
what two types of 3’ UTR does CD47 have?
long and short
what happens when CD47 has a short 3’ UTR?
protein remains inside the cell and doesn’t translocate to the membrane because the tail doesn’t have anything to bind to
what happens when CD47 has a long 3’ UTR?
the long 3’ UTR on the RNA itself binds a few components, and then two other proteins bring the whole complex to the cell surface
in neurons, where do long and short 3’UTRs drive the RNA?
a long 3’ UTR (distal poly A site) keeps the RNA in the soma and a short 3’ UTR (proximal poly A site) drives the RNA to the axon
what does the 3’ UTR act as?
a scaffold to regulate membrane protein localization
besides localization, what else is a shorter 3’ UTR associated with in cancer biology?
increased proliferation
how does shortening of the 3’ UTR affect cancer cells?
it activates oncogenes in cancer cells through alternative cleavage and polyadenylation
why is a shorter 3’ UTR associated with proliferation?
oncogene transcripts use an earlier poly A site, giving a shorter 3’ UTR, so that fewer microRNAs can bind to it and prevent translation / induce elimination of the transcript
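A toy sketch of why a shorter 3’ UTR escapes microRNA repression: cutting the UTR at an earlier poly A site removes downstream target sites. The sequences are invented, `SITE` is only a stand-in for a microRNA target site, and `count_sites` is a hypothetical helper.

```python
def count_sites(utr, site):
    """Count (possibly overlapping) occurrences of a target site in a UTR sequence."""
    width = len(site)
    return sum(1 for i in range(len(utr) - width + 1) if utr[i:i + width] == site)


# Hypothetical sequences: SITE stands in for a microRNA target site, and the
# short isoform is the long 3' UTR cut at an earlier (proximal) poly A site.
SITE = "CUACCUC"
long_utr = "AAGCUACCUCGGAUUUCUACCUCAAUGGCUACCUCUUAAUAAA"
short_utr = long_utr[:12]

long_sites = count_sites(long_utr, SITE)
short_sites = count_sites(short_utr, SITE)
```

The short isoform keeps fewer target sites than the long one, so fewer microRNAs can bind it.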
how can proteins and transcription factors be compared?
both work together in complexes - most transcription factors associate nonrandomly with other transcription factors
where are transcription factors found?
on the promoters of actively transcribed genes and, more often, in intergenic regions
what might transcription factors be bound to in intergenic regions?
possibly regulatory regions that exert some sort of activity
what three categories can chromatin modifiers be divided into?
writers, erasers, and readers
describe the specificity of chromatin modifiers:
one writer can add marks to many different residues, and one eraser can remove many different modifications as well → not very specific
what is another name for the three chromatin modifiers?
histone-modifying genes
do histone-modifying genes act only on genes inside the nucleus?
no → some of the sequences of histone 3 are shared by other proteins that have nothing to do with histones
what is the most frequent histone modification at promoters (if a gene's promoter is active, which methylation are we looking for)?
it's the tri(3)-(me)thylation of lysine (K) residue number (4) in (H)istone (3): H3K4me3
what is H3K4 methylation almost always associated with?
active transcription of genes
where does H3K4me3 occur?
trimethylation: on promoters of active genes
what is the H3K9me3 mark?
classical heterochromatin marker → marks repressed chromatin
what is the H3K27me3 mark?
associated with repressed genes
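A short sketch of how the mark names above decompose (histone, residue, position, number of methyl groups). `parse_mark` and its regular expression are illustrative assumptions that only cover simple names of the form used in these cards; the lookup table repeats the associations stated in the cards.

```python
import re

# Associations as stated in the cards above.
MARK_MEANING = {
    "H3K4me3": "promoters of active genes",
    "H3K9me3": "classical heterochromatin marker (repressed chromatin)",
    "H3K27me3": "repressed genes",
}

# Assumed naming scheme: H<histone number><residue letter><position>me<1-3>.
MARK_PATTERN = re.compile(r"^H(\d)([A-Z])(\d+)me([123])$")


def parse_mark(mark):
    """Split a methylation mark name such as 'H3K4me3' into its parts."""
    match = MARK_PATTERN.match(mark)
    if match is None:
        raise ValueError(f"unrecognized mark name: {mark}")
    histone, residue, position, methyl_groups = match.groups()
    return {
        "histone": "H" + histone,
        "residue": residue,  # K is the one-letter code for lysine
        "position": int(position),
        "methyl_groups": int(methyl_groups),
    }


parsed = parse_mark("H3K4me3")
```

Reading `parsed` left to right gives histone 3, lysine, residue 4, three methyl groups.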
what is ATAC-seq used for?
a technique used to find all the chromatin that is open
what does it mean if the chromatin is open?
the region is accessible to regulatory proteins, which implies a region where there is active transcription
when do transcription factors bind?
only when the chromatin is open