TEST 1 REVIEW Flashcards
types of repetitive sequences
tandem and interspersed sequences
types of tandemly repeated sequences
satellite DNA, minisatellite DNA, microsatellite DNA
types of interspersed repeated sequences
transposons, MITEs, SINEs, LINEs
satellite DNA
large tandem arrays reiterated up to millions of times (repeat units 10’s-100’s of bp in size), usually AT rich, e.g. at centromeres
minisatellite DNA
repeat units up to 25 bp in length, clustered in groups of up to 20 kb, e.g. telomeres consist of 100’s of copies of TTAGGG
microsatellite DNA
clusters of ~150 bp made up of repeat units of 2-6 bp, located in euchromatin and generated via replication slippage; length polymorphisms are used in genetic profiling
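The microsatellite card above can be illustrated with a short scan for tandem runs of a 2-6 bp unit. This is only a sketch: the function name, the minimum of 4 copies, and the toy sequence are assumptions for illustration, not course material.

```python
import re

# Assumed thresholds: repeat unit of 2-6 bp, at least 4 tandem copies in a row.
# The lazy quantifier makes the regex try the shortest unit first.
MICROSAT = re.compile(r"([ACGT]{2,6}?)\1{3,}")

def find_microsatellites(seq):
    """Return (start, unit, copies) for each tandem run of a 2-6 bp unit."""
    hits = []
    for match in MICROSAT.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Toy sequence: an (AC)6 run and a (CAG)5 run separated by unrelated bases.
toy = "GG" + "AC" * 6 + "TT" + "CAG" * 5 + "TT"
print(find_microsatellites(toy))  # [(2, 'AC', 6), (16, 'CAG', 5)]
```

Two individuals with different copy numbers at the same locus would give different `copies` values, which is the polymorphism that profiling relies on.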
transposons
contain single gene encoding transposase flanked by inverted terminal repeats, ~1-2 kb
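"Inverted terminal repeats" means the end of the element reads as the reverse complement of its start. A minimal check, assuming a made-up 8 bp repeat and a made-up internal sequence (neither is a real transposon):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def has_inverted_terminal_repeats(element, itr_len):
    """True if the last itr_len bases are the reverse complement of the first itr_len."""
    element = element.upper()
    if itr_len <= 0 or 2 * itr_len > len(element):
        return False
    return element[-itr_len:] == reverse_complement(element[:itr_len])

# Hypothetical element: left ITR + internal sequence + right ITR.
toy_element = "CAGGGTTG" + "ATGAAACCCGGGTTTTAA" + "CAACCCTG"
print(has_inverted_terminal_repeats(toy_element, 8))  # True
```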
MITEs
miniature inverted-repeat transposable elements – can regulate gene expression by acting as cis-regulatory motifs, palindromic and contain ITRs
SINEs
less than 500 bp, related to retroviruses but do not contain LTRs, transpose through RNA intermediate
LINEs
more than 5 kb, appear to be remnants of retroviruses, contain 2 ORFs encoding 2 proteins (1 being reverse transcriptase), often have degenerate 5’ end
Mechanisms of tandem repetitive sequence generation
replication slippage, unequal crossing over, unequal sister chromatid exchange, errors in single strand break repair
replication slippage
generates diversity in short repeats – strands misalign within the repeat during replication, so repeat units are added or deleted; implicated in trinucleotide repeat expansion diseases
unequal crossing over
for longer repeats – misaligned crossing over between homologous chromosomes, so one chromosome gains repeat copies and the other loses them
unequal sister chromatid exchange
misaligned exchange between sister chromatids – one chromatid ends up with both copies of the sequence and the other with none
errors in single strand break repair
during DNA replication
mechanisms of interspersed repeated sequence generation
transposons, retrotransposons, LINEs, SINEs
transposons interspersed repeated sequence generation
transposase makes blunt end cuts in transposon and sticky end cuts in target DNA and ligates transposon in place
retrotransposons interspersed repeated sequence generation
transcribed by RNAPII, processed into mRNA, then reverse transcribed into dsDNA somewhere else in the genome
LINE generation
ORF protein (attached to LINE RNA) nicks the genome at an AT-rich site; reverse transcription, primed by the nicked chromosomal DNA, is carried out by the ORF protein; insertion is completed by cellular enzymes, e.g. L1. Promoter sequences for LINEs direct RNAPII-dependent transcription
SINE generation
5’ end contains RNAPIII promoter, does not code for transposase or integrase as it hijacks LINE machinery e.g. Alu
techniques to measure sequence copy number
PCR, FISH, DNA microarrays
PCR
use primers to amplify alleles in DNA sample, run on electrophoretic gel, determine size – tandem repeats
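The sizing step on the PCR card is simple arithmetic: product size = flanking sequence (including primers) + repeat unit length × copy number. A sketch with hypothetical numbers (120 bp of flank, a 4 bp unit, two allele sizes read off a gel):

```python
def repeat_copies(amplicon_bp, flank_bp, unit_bp):
    """Infer tandem-repeat copy number from a PCR product size.

    Returns (copies, leftover_bp); a non-zero leftover suggests a sizing
    error or a partial repeat.
    """
    repeat_bp = amplicon_bp - flank_bp
    if repeat_bp < 0 or unit_bp <= 0:
        raise ValueError("amplicon shorter than flank, or invalid unit size")
    return divmod(repeat_bp, unit_bp)

# Hypothetical heterozygous sample: two bands on the gel.
for band in (180, 192):
    copies, leftover = repeat_copies(band, flank_bp=120, unit_bp=4)
    print(band, "bp ->", copies, "copies, leftover", leftover)
# 180 bp -> 15 copies, leftover 0
# 192 bp -> 18 copies, leftover 0
```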
FISH
fluorescence in situ hybridization – fluorescently labeled DNA probes (amplified by PCR) are hybridized to fixed cells/tissue and observed under a fluorescence microscope, used to detect interspersed repeats, measure copy number
DNA microarrays for detection of deletions and duplications
Immobilize probe sequences on a chip, extract and shear the DNA sample, generate labeled genomic fragments in vitro, hybridize to the array, measure intensity with a scanner; comparative genomic hybridization – comparing an individual to a reference
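The comparison step in comparative genomic hybridization is usually expressed as a log2 ratio of test to reference intensity per probe. A sketch, where the ±0.3 cutoff and the intensity values are assumptions chosen for illustration:

```python
import math

def call_copy_change(test_intensity, ref_intensity, threshold=0.3):
    """Return (log2 ratio, call) for one probe.

    threshold is an assumed cutoff; real analyses tune it and smooth
    across neighbouring probes.
    """
    ratio = math.log2(test_intensity / ref_intensity)
    if ratio > threshold:
        call = "duplication"
    elif ratio < -threshold:
        call = "deletion"
    else:
        call = "no change"
    return ratio, call

# Hypothetical probes: 3 copies vs 2, 1 copy vs 2, and equal copy number.
for test, ref in [(1500, 1000), (500, 1000), (1000, 1000)]:
    ratio, call = call_copy_change(test, ref)
    print(f"{ratio:+.2f} {call}")
```

With a diploid reference, a one-copy gain gives log2(3/2) ≈ +0.58 and a one-copy loss gives log2(1/2) = -1.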
stabilizing selection
both copies retain function until they subfunctionalize
selective pressure on both copies
genes stay similar
selective pressure on one copy
either one copy will degrade or one will acquire a new function
globin loci at different stages of development
Fetal – α2γ2, embryonic – α2ε2, adult – α2β2. Theory that the globin genes are all derived from one ancestral copy and subfunctionalized to optimize oxygen binding at each stage
Dlx genes
many species have different copies on different chromosomes, the sequences are conserved but expression profiles now vary considerably
Model mechanisms of divergent gene expression after duplication event
DDC model, whole genome duplication, horizontal gene transfer, de novo creation from random transcription, expression first model, ORF first model
DDC model
changes in expression due to complementary loss of region specific regulatory sequences (a type of subfunctionalization), decreased breadth and increased specificity in expression
example of DDC model
gsb and prd are paralogous genes in Drosophila; in a gsb knockout, prd can rescue the phenotype if expressed using the gsb promoter
whole genome duplication
aberrant meiosis at prophase two and fusion of diploid gametes
expression first model
sequence is already expressed but contains stop codons, then mutates to form an ORF
ORF first model
ORF present but only later gains a promoter to attract transcription factors
chromatin organization in the nucleus
euchromatin surrounds heterochromatin all over nucleus, with nucleolus in centre, rich protein based matrix facilitates protein-DNA interactions
chromosome territories
chromosomes maintain spatially defined volume, radially organized with some in inner circle and some at periphery but positions are not absolute in all cells
chromosome territories experiment
Photobleaching experiment showed that fluorescently labeled genes in each hemisphere generally retained localization after mitosis
organization of heterochromatin
contains satellite DNA, transposable elements, and some functional genes
position effect variegation
heterochromatic regions can produce variegated expression of euchromatic genes when the two are juxtaposed
constitutive heterochromatin
permanently condensed regions e.g. centromeres and many regions of Y chromosome
facultative heterochromatin
non-permanently condensed regions associated with inactive genes (due to cell specialization)
SAR
scaffold attachment region (attached to scaffold protein, AT rich)
MAR
matrix associated region (AT rich)
functional chromatin domains defined by
DNase I sensitivity or being bounded by insulator sequences
3C
technique to study genome organization and structure – chromosome conformation capture – done by cross-linking segments of DNA that are held together (e.g. at the base of a loop), digesting with a restriction enzyme, ligating the cross-linked fragments, reversing the cross-links, and amplifying/determining the ligated sequence using PCR
insulators
maintain independence of a functional domain by blocking interactions between enhancers and promoters; 1-2 kb in size and often include a binding site for the zinc finger protein CTCF (leading to nucleoprotein complex assembly); recruit chromatin modifying enzymes, protect against PEV
nucleosomes
fundamental unit of chromatin containing 2 molecules each of H2A, H2B, H3, H4 (variants such as H2A.Z can replace H2A); centromeres contain H3 variant cenH3 leading to tetrameric histones (hemisomes)
methods of next generation sequencing
Illumina, transcriptomics, DNA microarrays, DNA chips
Illumina
sequencing by synthesis approach – randomly fragment DNA sample and ligate adaptors to either end of the fragments, bind ss fragments randomly to inside surface of the flow channels, bridge amplification; then add fluorescently labeled, reversibly terminated bases one cycle at a time and image the signal to read each base
transcriptomics
study of full set of mRNAs present in a cell at any given time or conditions (main goals identifying mRNAs present and their relative abundance)
technique to study transcriptomics
RNA-seq
RNA-seq
convert all mRNA into cDNA and sequence that; transcript levels can alternatively be measured using DNA hybridization methods
DNA microarrays for NGS
ability to monitor expression of thousands of genes simultaneously by hybridization using glass or nylon surfaces spotted with DNA molecules
DNA chip
same process as microarray but using glass or silicon wafer spotted with an array of immobilized oligos with segments matching each gene
cluster analysis
distinguishing complex differences in gene expression; genes that display similar expression profiles under different temporal or environmental conditions may have related functions; can be grouped by hierarchical clustering, where expression intensity is assigned a value that indicates degree of relatedness between the expression levels
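The grouping described on the cluster analysis card can be sketched in plain Python. The card does not name a distance measure or linkage rule, so this example assumes 1 - Pearson correlation as the "degree of relatedness" and average linkage; the four gene profiles are invented.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def hierarchical_clusters(profiles, n_clusters):
    """Average-linkage agglomerative clustering with distance = 1 - Pearson r."""
    clusters = [[gene] for gene in profiles]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters; i < j, so deleting j keeps i valid.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical expression intensities over four time points.
profiles = {
    "geneA": [1, 2, 4, 8],    # rising
    "geneB": [2, 4, 8, 16],   # rising, same shape as geneA
    "geneC": [8, 4, 2, 1],    # falling
    "geneD": [9, 5, 2, 1],    # falling
}
print(hierarchical_clusters(profiles, 2))
# [['geneA', 'geneB'], ['geneC', 'geneD']]
```

Correlation groups genes by the shape of the profile rather than absolute intensity, which is why geneA and geneB cluster together despite a twofold difference in level.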
complications with transcriptomic analysis
- Transcripts from more than one gene may hybridize to the same probe
- Different mRNAs from the same gene are difficult to distinguish
- Only reports mRNA abundance, which is not necessarily indicative of the rate of transcription, the rate of transcript degradation, or changes in protein levels