TEST 1 REVIEW Flashcards
types of repetitive sequences
tandem and interspersed sequences
types of tandemly repeated sequences
satellite DNA, minisatellite DNA, microsatellite DNA
types of interspersed repeated sequences
transposons, MITEs, SINEs, LINEs
satellite DNA
large tandem arrays reiterated millions of times (10’s-100’s bp in size), usually AT rich e.g. centromeres
minisatellite DNA
repeat units up to 25 bp in length, clustered in arrays of up to 20 kb, e.g. telomeres are TTAGGG x100’s
microsatellite DNA
clusters of up to about 150 bp made of repeat units of 2-6 bp, located in euchromatin and generated via replication slippage; polymorphisms are used in genetic profiling
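A minimal sketch of how microsatellite runs could be located in a sequence, assuming a simple regular-expression search. The function name, the minimum of 4 tandem copies, and the example sequence are illustrative assumptions, not part of the course material.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for each tandem run of a 2-6 bp unit."""
    # The lazy group captures the shortest candidate unit; \1{n,} then
    # demands at least (min_repeats - 1) further back-to-back copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical sequence: a (CA)8 run between two short flanks.
example = "GGATT" + "CA" * 8 + "TTGAC"
print(find_microsatellites(example))
```

Note that a homopolymer run such as AAAAAAAA would also be reported (as unit AA), so a real analysis would need extra filtering.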
transposons
contain single gene encoding transposase flanked by inverted terminal repeats, ~1-2 kb
MITEs
mini inverted repeat transposable elements – can regulate gene expression by acting as cis-regulatory motifs, palindromic and contain ITRs
SINEs
less than 500 bp, related to retroviruses but do not contain LTRs, transpose through RNA intermediate
LINEs
more than 5 kb, appear to be remnants of retroviruses, contain 2 ORFs encoding 2 proteins (1 being reverse transcriptase), often have degenerate 5’ end
Mechanisms of tandem repetitive sequence generation
replication slippage, unequal crossing over, unequal sister chromatid exchange, errors in single strand break repair
replication slippage
generates diversity in short repeats – deletion error correction leads to addition of extra bases, implicated in trinucleotide repeat expansion diseases
unequal crossing over
for longer repeats – unequal crossing of pairs of homologous chromosomes
unequal sister chromatid exchange
one gets both homologous sequences
errors in single strand break repair
during DNA replication
mechanisms of interspersed repeated sequence generation
transposons, retrotransposons, LINEs, SINEs
transposons interspersed repeated sequence generation
transposase makes blunt end cuts in transposon and sticky end cuts in target DNA and ligates transposon in place
retrotransposons interspersed repeated sequence generation
transcribed by RNAPII, processed into mRNA, then reverse transcribed into dsDNA somewhere else in the genome
LINE generation
ORF protein (attached to LINE RNA) nicks genome at AT rich site, reverse transcription primed by chromosomal DNA is completed by ORF protein, insertion completed by cellular enzymes, e.g. L1; promoter sequences for LINEs direct RNAPII-dependent transcription
SINE generation
5’ end contains RNAPIII promoter, does not code for transposase or integrase as it hijacks LINE machinery e.g. Alu
techniques to measure sequence copy number
PCR, FISH, DNA microarrays
PCR
use primers to amplify alleles in DNA sample, run on electrophoretic gel, determine size – tandem repeats
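A minimal sketch of the sizing step, assuming the band size read off the gel equals the flanking sequence (primers included) plus a whole number of repeat units. The function name and all numbers are hypothetical.

```python
def repeat_count(amplicon_bp, flank_bp, unit_bp):
    """Infer the number of tandem repeat units from a PCR product size."""
    repeat_bp = amplicon_bp - flank_bp  # length contributed by the repeat array
    if repeat_bp < 0 or repeat_bp % unit_bp != 0:
        raise ValueError("size is not flank + a whole number of repeat units")
    return repeat_bp // unit_bp

# Hypothetical locus: 100 bp of flanking sequence, 4 bp repeat unit.
# Two bands on the gel correspond to the two alleles of a heterozygote.
for band in (148, 160):
    print(band, "bp ->", repeat_count(band, flank_bp=100, unit_bp=4), "repeats")
```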
FISH
fluorescence in situ hybridization – fluorescently labeled DNA probes (amplified by PCR) are hybridized to fixed cells/tissue and observed under a fluorescence microscope, used to detect interspersed repeats, measure copy number
DNA microarrays for detection of deletions and duplications
Immobilize probe sequence on chip, extract and shear DNA sample, generate labeled genomic fragments in vitro, hybridize to array, measure intensity with scanner, Comparative genome hybridization - comparing individual to reference
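A minimal sketch of the comparison step in comparative genome hybridization, assuming each probe has one intensity for the individual and one for the reference. The log2 ratio is the usual way to express the comparison; the +/-0.3 cutoffs, probe names, and intensities here are illustrative assumptions.

```python
import math

def call_copy_number(sample, reference, gain=0.3, loss=-0.3):
    """Classify each probe as gain / loss / normal from log2(sample/reference)."""
    calls = {}
    for probe, intensity in sample.items():
        ratio = math.log2(intensity / reference[probe])
        if ratio >= gain:
            status = "gain"      # candidate duplication
        elif ratio <= loss:
            status = "loss"      # candidate deletion
        else:
            status = "normal"
        calls[probe] = (round(ratio, 2), status)
    return calls

# Hypothetical scanner intensities for three probes.
sample = {"p1": 1000, "p2": 1500, "p3": 500}
reference = {"p1": 1000, "p2": 1000, "p3": 1000}
print(call_copy_number(sample, reference))
```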
stabilizing selection
both copies retain function until they subfunctionalize
selective pressure on both copies
genes stay similar
selective pressure on one copy
either one copy will degrade or one will acquire a new function
globin loci at different stages of development
Fetal – a2g2, embryonic – a2e2, adult – a2b2, Theory that each globin is derived from one copy and subfunctionalized to optimize oxygen binding
Dlx genes
many species have different copies on different chromosomes, the sequences are conserved but expression profiles now vary considerably
Model mechanisms of divergent gene expression after duplication event
DDC model, whole genome duplication, horizontal gene transfer, de novo creation from random transcription, expression first model, ORF first model
DDC model
changes in expression due to complementary loss of region specific regulatory sequences (a type of subfunctionalization), decreased breadth and increased specificity in expression
example of DDC model
gsb and prd are paralogous genes in Drosophila; in a gsb knockout, prd can rescue the phenotype if expressed using the gsb promoter
whole genome duplication
aberrant meiosis at prophase two and fusion of diploid gametes
expression first model
sequence is already transcribed but interrupted by stop codons, then mutates to form an ORF
ORF first model
ORF present but only later gains promoter to attract TF
chromatin organization in the nucleus
euchromatin surrounds heterochromatin all over nucleus, with nucleolus in centre, rich protein based matrix facilitates protein-DNA interactions
chromosome territories
chromosomes maintain spatially defined volume, radially organized with some in inner circle and some at periphery but positions are not absolute in all cells
chromosome territories experiment
Photobleaching experiment showed that fluorescently labeled genes in each hemisphere generally retained localization after mitosis
organization of heterochromatin
contains satellite DNA, transposable elements, and some functional genes
position effect variegation
heterochromatic regions can produce variegated expression of euchromatic genes when the two are juxtaposed
constitutive heterochromatin
permanently condensed regions e.g. centromeres and many regions of Y chromosome
facultative heterochromatin
non-permanently condensed regions associated with inactive genes (due to cell specialization)
SAR
scaffold attachment region (attached to scaffold protein, AT rich)
MAR
matrix associated region (AT rich)
functional chromatin domains defined by
DNase I sensitivity or being bound by insulator sequences
3C
technique to study genome organization and structure - chromosome conformation capture – done by cross linking the proteins that hold a DNA loop together, digesting the DNA, ligating the cross linked ends, reversing the cross linkage and amplifying/determining the sequence using PCR
insulators
maintain independence of a functional domain (1-2 kb) by blocking interactions between enhancers and promoter and often include binding domain for CTCF-zinc finger protein (leading to nucleoprotein complex assembly), recruit chromatin modifying enzymes, protect against PEV
nucleosomes
fundamental unit of chromatin containing 2 molecules each of H2A, H2B, H3, H4 (the variant H2A.Z can replace H2A); centromeres contain H3 variant cenH3 leading to tetrameric histones (hemisomes)
methods of next generation sequencing
Illumina, transcriptomics, DNA microarrays, DNA chips
Illumina
sequencing by synthesis approach – randomly fragment DNA sample and ligate adaptors to either end of the fragments, bind ss fragments randomly to inside surface of the flow channels, bridge amplification; use modified bases that emit light
transcriptomics
study of full set of mRNAs present in a cell at any given time or conditions (main goals identifying mRNAs present and their relative abundance)
technique to study transcriptomics
RNA-seq
RNA-seq
convert all mRNA into cDNA and sequence that, can be done using DNA hybridization methods
DNA microarrays for NGS
ability to monitor expression of thousands of genes simultaneously by hybridization using glass or nylon surfaces spotted with DNA molecules
DNA chip
same process as microarray but using glass or silicon wafer spotted with an array of immobilized oligos with segments matching each gene
cluster analysis
distinguishing complex differences in gene expression, genes that display similar expression profiles under different temporal or environmental conditions may have related functions, can be grouped by a method of hierarchical clustering where expression intensity is assigned a value that indicates degree of relatedness between the expression levels
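A minimal sketch of hierarchical clustering on expression profiles, assuming average linkage and a distance of 1 minus the Pearson correlation (one common choice; the card does not specify which measure is used). Gene names and expression values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[gene] for gene in profiles]

    def dist(c1, c2):
        # Average linkage: mean pairwise distance between members.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical expression levels at four time points.
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],   # rises like geneA
    "geneC": [4, 3, 2, 1],
    "geneD": [8, 6, 4, 3],   # falls like geneC
}
print(hierarchical_clusters(profiles, n_clusters=2))
```

Genes that end up in the same cluster share an expression pattern, which is the basis for guessing that they may have related functions.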
complications with transcriptomic analysis
o Transcripts from more than one gene may hybridize to the same probe
o Different mRNAs from the same gene are difficult to distinguish
o Only tells abundance in mRNA which is not necessarily indicative of rate of transcription or rate of transcript degradation or changes in protein levels
ChIP-chip process
isolate and shear chromatin, add antibody specific for acetylated histone N-terminal tail, immunoprecipitate, release and amplify DNA, fluorescently label, hybridize to chip
ChIP-seq process
isolate and shear chromatin, add antibody specific for acetylated histone N-terminal tail, immunoprecipitate, release and amplify DNA, fractionate, do NGS
CRISPR-Cas9 editing
RNA guided double stranded DNA cleavage using ssDNA oligo for repair
known uses for CRISPR-Cas9
- Indel creation/repair
- Gene insertion or replacement
- Large deletion or rearrangement
- Gene activation
- Chromatin or DNA modifications
- Imaging location of genomic locus
Class 1 chromatin remodelling enzymes
covalent modification of histones, modifications that indirectly regulate chromatin structure through recruitment of chromatin-associated proteins e.g. histone tail modifications (HAT, HMT, HDAC, HDM)
class 2 chromatin remodelling enzymes
ATP dependent multiprotein remodeling complexes, directly overcoming repressive nucleosomes, use energy from ATP hydrolysis to alter physical properties of nucleosomes so DNA is more accessible (nucleosome displacement) – all have highly conserved helicase-like ATPase domain
histones
o Globular domain that interacts with other histones and DNA
o Flexible N- and C-terminal tail regions that act as substrate for various post translational modifications
histone acetylation regulation in yeast
DNA specific element binds to TF which binds HAT which adds acetyl OR repressor binds HDAC to remove acetyl
ATP dependent chromatin remodeling complexes
SWI/SNF, CHD
SWI/SNF
contains bromodomain that binds acetylated histone
CHD
contains chromodomain that binds methylated histone
nucleosome remodelling
change in structure, sliding – displacement along DNA, transfer – removing and transferring to non-adjacent region of DNA
DNA hypersensitive sites (DHS)
DNase I assays demonstrate that sensitive sites in DNA tend to be upstream of promoter sites
drosophila Hsp70
after heat shock, hsp70 promoter is remodeled by a HAT complex to create a new DHS
mechanisms of nucleosome dependent transcription in yeast
TFs bind and decondense chromatin, enable acetylation of histones, then acetylated histones attract remodeling complexes (SWI/SNF) which remove H2A.Z (histone eviction) and transfer them to chaperone proteins which will recycle them into new histones
regulation of histones during transcription
downstream of loose chromatin, nucleosome occupancy is maintained to prevent inappropriate transcriptional initiation; regions that have already been transcribed are quickly deacetylated because the Set2 protein associated with RNAPII methylates H3K36, which is recognized by the Rpd3 deacetylase complex
promoters
integrates all of the regulatory inputs in order to cause transcription to occur, minimum region required for docking of transcription machinery
core promoter
surrounds transcription start site (80 bp) containing features that are sufficient for recognition of transcriptional machinery
proximal promoter
includes additional regulatory information (~300 bp upstream of the core promoter)
experimental characterization of promoters
5’ end deletion experiments, mutation scans, DNase footprinting
5’ end deletion experiments
help define the minimal promoter that is required for transcription
mutation scans
making sequential deletions along regulatory regions and measuring the level of reporter expression – for finding the precise elements within the promoter that are required
DNase footprinting
take labeled fragments of promoter, digest using DNase I, which will cut all regions of DNA besides where any proteins are bound
consensus sequences
short regulatory DNA elements are highly conserved across eukaryotic species, but there is no universal core promoter – genes contain many combinations of promoter elements within their core promoters
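A minimal sketch of how a consensus sequence can be derived, assuming a set of already aligned, equal-length sites and taking the most common base at each position. The four TATA-like sites are made up for illustration.

```python
from collections import Counter

def consensus(sites):
    """Return the most frequent base at each column of aligned sites."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be aligned to the same length")
    bases = []
    for column in zip(*sites):
        # most_common(1) gives [(base, count)]; ties go to the base seen first.
        base, _count = Counter(column).most_common(1)[0]
        bases.append(base)
    return "".join(bases)

# Hypothetical aligned promoter elements from four genes.
sites = ["TATAAA", "TATATA", "TATAAG", "CATAAA"]
print(consensus(sites))
```

No single site has to match the consensus exactly, which fits the point that genes carry varied combinations of promoter elements.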
base pair substitution analyses
comparing consensus sequences helps define the precise sequence of each functional element in the promoter
types of promoter sequences
TATA box, Inr element, downstream promoter element
TATA box
usually 25-30 bp upstream of TSS, typically recognized by the binding subunit of TFIID
Inr element
pyrimidine, any nucleotide, usually within 10 bp of TSS, can be identified in promoters that may or may not have TATA boxes, >50% of promoters in all animals
downstream promoter element
usually identified in TATA-less promoters (DPE+Inr common), positioned about 30 bp downstream of TSS, not recognized by TBP of TFIID, 40-50% of animal promoters
TF domains
functional and transcription activation
functional TF domains
DNA binding domain, transcription activation domain, other protein interaction domain
transcription activation TF domain
acidic, glutamine rich, proline rich, beta sheet domains
methods of protein DNA interactions
EMSA, SELEX, co-transfection, ChIP
EMSA
electrophoretic mobility shift assay: mix labeled DNA fragments with a cell extract suspected to contain binding proteins, then run on a gel next to the free fragments; a fragment that runs at a different (slower) position has bound a protein
SELEX
gene regulatory protein of unknown specificity added to a large pool of short DNA double helices, then run on a gel to determine which fragment the protein bound to, then sequence that fragment
co-transfection
in vivo - add two plasmids to yeast at the same time, one with DNA elements and one with protein elements, and see if they work together (requires that the cell be incapable of endogenous transcription of the reporter gene)
classes of DNA binding domains
basic helix loop helix, leucine zipper, C2H2 zinc finger
basic helix loop helix
specific dimerization region of the domain, basic residues make favorable interactions with negative DNA, helix fits in to the major groove
leucine zipper
basic region responsible for binding, formed by two polypeptides, each one is an alpha helix with leucine spaced 7 residues apart so they all face inside the helix, two monomers form a parallel coiled coil
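A minimal sketch of scanning a protein sequence for the heptad leucine spacing described above, assuming the simple pattern L-x(6)-L-x(6)-L-x(6)-L (a leucine at every seventh position, four times). The peptide is made up, and this pattern alone gives many false positives, so a hit is only a candidate zipper.

```python
import re

# Lookahead so that overlapping candidate heptad runs are all reported.
HEPTAD_LEUCINES = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def find_leucine_heptads(protein):
    """Return 0-based start positions of L-x6-L-x6-L-x6-L matches."""
    return [m.start() for m in HEPTAD_LEUCINES.finditer(protein.upper())]

# Hypothetical peptide: leucines at positions 2, 9, 16 and 23.
peptide = "MK" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_heptads(peptide))
```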
C2H2 zinc finger
anti parallel beta strand connected to an alpha helix by a short loop, two cysteines in the beta strand and two histidines in the alpha helix coordinate the zinc ion, the alpha helix interacts with the bases of DNA, the beta strand binds to DNA backbone and positions the recognition helix for optimal interaction
nuclear hormone receptors
contain DNA binding domain and ligand binding domain, and depending on how they dimerize will have different binding specificities
transcriptional repression
competition, inhibition, direct repression, indirect repression
transcriptional repression - competition
repressor can inhibit binding of an activator to a gene by binding to overlapping DNA sequences – short range
transcriptional repression - inhibition
repressor can bind to a separate DNA site close to/or actually with the activator and prevent the activator from interacting with the other components – short range
direct transcriptional repression
repressor binds to basal transcriptional machinery directly (independently of the activator) – short or long range
indirect transcriptional repression
repressor recruits chromatin modifying factors (E.g. HDACs) – short range or long range
transcriptional repressors recruiting PRCs process
repressors have a DNA binding domain and a repressor domain which interacts with PRC2, which transfers methyl groups to the histone tails; the methylated tails are recognized by PRC1, which creates a very tightly packed repressor complex that stays on the chromosome, turning things off
Domains of the Polycomb repressor complex
E(z), Eed, Pc, HMT
E(z) domain of PRC
subunit contains Set domain with HMT activity
Eed domain of PRC
critical repressor binding component of PRC2
Pc domain of PRC
contains chromodomains
HMT domain of PRC
other histone methyltransferases that maintain methylation even after repressors are no longer expressed
Enhancers
segments of DNA containing multiple TF binding elements (usually non-coding DNA), enhance transcription from a gene containing a core promoter, variable in size, position independent, usually recognized by activators and act to recruit chromatin modifying enzymes
silencers
similar features to enhancers except that silencers repress transcription, sequences are recognized by repressors and act to recruit chromatin modifying enzymes
cis-regulatory modules
contain a variety of elements that can activate or repress transcription, receive complex combinatorial inputs and results in a functionally integrated response
example of cis-regulatory modules
mRNA expression pattern of eve - tested using in vivo transgenic analyses in embryos; modular because they work independently, there is a different one for different stripes
locus control region
how enhancers meet promoters, DNA looping into 3D conformations, may have binding sites for looping proteins that hold together the conformational changes
cis-determinant
carried on same chromosome, linked to the same gene whose expression they affect, e.g. enhancers, silencers, binding sites, promoters
trans-determinant
encoded in other places in the genome e.g. HDACs, HATs, TFs
cis-regulatory code
a particular combination of TF binding sites codes for gene expression
functional conservation without sequence conservation
accounted for by small changes in TF binding sequences, rearrangement of the order of elements within these larger enhancers or other minute changes that are not detected using current computational methods
TBP component of TFIID
recognition of TATA box and possibly Inr element, forms a platform for TFIIB binding
TAF component of TFIID
recognition of core promoter, regulation of TBP binding
TFIIA
stabilizes TBP and TAF binding
TFIIB
intermediate in recruitment of RNAPII, influences selection of TSS
TFIIF
recruitment of RNAPII, interaction with non-template strand
TFIIE
intermediate in recruitment of TFIIH, modulates activities of TFIIH
TFIIH
helicase activity responsible for the transition from closed to open promoter complex, possibly influences promoter clearance by phosphorylation of CTD of RNAPII
PIC assembly sequence
o TFIID (TBP and TAF 1-13) recognizes TATA box, Inr, and DPE with TBP subunit which binds to the promoter element, interacts with the minor groove of DNA and bends helix, facilitating TFIIB attachment
o TFIIB adds to TFIID
o TFIIF/RNAPII complex is positioned on top of TSS
o TFIIE binds to create TFIIH docking site
o TFIIH binds and uses ATP hydrolytic helicase activity to unwind DNA, once transcription starts, most components are released but TBP and other TFIID subunits remain for quick facilitation of reinitiation
mediator complex
25-30 separate subunit proteins, also highly conserved, stimulates basal RNAPII transcription in vitro and associates with RNAPII to create a stable holoenzyme, activator elements bound to enhancer elements facilitate activity of general TFs by using mediator as bridge, facilitates multiple rounds of transcription from one initial PIC
example of mediator subunit activator specificity
transcriptional activators ELK1 and E1A interact with Med23 at Egr1 but when Med23 is mutated the mediator complex still assembles and acts normally except for Egr1
mediator, Gal10 and UAS in yeast
mediator could interact with the Gal10 while bound to UAS in the absence of functional core promoters and a mutation preventing PIC assembly
Covalent modifications that occur at initiation of transcription
phosphorylation of CTD of large subunit of RNAPII, methylation of H3 lysine 4 by Set1 complex
CTD of RNAPII
composed of tandem repeats of a conserved heptad AA sequence 52 times in vertebrates and 26 times in yeast – TFIIH subunit phosphorylates S5 to recruit capping factors in initiation and P-TEFb phosphorylates S2 in elongation
classes of TSS in mammalian promoters
short and broad
RNAPII elongation
once transcription is initiated, most TFs are released and replaced by TFIIS, Spt5 and other elongation factors involved in RNA processing – perform functions once recruited to phosphorylated CTD
formation of the 5’ cap
RNA 5’ triphosphatase, guanylyltransferase, guanine 7-methyltransferase
functions of the 5’ cap
ensures proper exit to cytosol through binding the CBC, prevents 5’-3’ exonuclease digestion, serves as docking site for translational machinery
TFIIS
limits length of time RNAPII pauses during transcription, helps RNAPII proofread the transcript, Aids in NTP removal, stimulating RNase activity of RNAPII to remove misincorporated NTPs
N-TEF
the ATP analog DRB inhibits transcription elongation so that RNAPII comes under control of N-TEF resulting in a trapped complex near the promoter
NELF, DSIF
In vitro experiments show NELF and DSIF only work combined to slow elongation, and could block TFIIS activity so that RNAPII stays paused
P-TEFb
phosphorylates the Spt5 subunit of DSIF causing NELF release, phosphorylates S2 of CTD which recruits RNA processing factors during elongation
P-TEFb regulation
Regulated by autophosphorylation of Cdk9 C-term, T-loop phosphorylation by Cdk activating kinase, ubiquitination of cyclin T1 by ubiquitin ligase Skp2, recruited by specific TFs or chromatin remodelling complexes
HEXIM 1/2, 7SK
P-TEFb inhibitors
tat locus
gene encoding HIV sequence specific RNA binding protein
5’ TAR of HIV transcript
has sequences recognized by Tat and cellular cyclin T, which position and activate CDK9 at the CTD of RNAPII, allowing efficient transcription elongation
TAR + tat
recruits P-TEFb, can overcome premature termination of transcription by N-TEF