euk genome architecture - barnes Flashcards
define c-value
amount of dna in a haploid nucleus for a given species
define c-value paradox
the complexity of an organism doesn't correlate with its genome size
compare human and yeast genome architecture
human genome much less gene dense than yeast - much more space devoted to introns and repeat sequences
how does genome length act as a bacterial selection pressure?
the longer a bacterium’s genome, the longer it takes to replicate and reproduce, so bacteria are under strong selection for compact genomes - mammals are under much less pressure to remove ‘junk’ DNA
define micro/minisatellites and satellite DNA
• Microsatellites: repeat unit 1-13bp long, fewer than 150 repeats
• Minisatellites: repeat unit 14-100bp long, 1-5kb tandem arrays in the genome
• Satellite DNA: repeat unit 100-500bp, especially important at mammalian centromeres
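The size classes on this card can be written as a simple lookup. This is a minimal Python sketch that assumes the repeat-unit cut-offs given above; the function name `classify_repeat` is made up for illustration, and the overlap at 100bp is resolved arbitrarily in favour of minisatellite. Other sources draw the boundaries differently.

```python
def classify_repeat(unit_bp: int) -> str:
    """Classify a tandem repeat by the length of its repeat unit (bp).

    Cut-offs follow the card above; they are not universal definitions.
    """
    if unit_bp < 1:
        raise ValueError("repeat unit must be at least 1 bp")
    if unit_bp <= 13:
        return "microsatellite"
    if unit_bp <= 100:  # 100 bp sits on the boundary; treated as minisatellite here
        return "minisatellite"
    if unit_bp <= 500:
        return "satellite"
    return "unclassified"
```

For example, a CAG trinucleotide repeat (unit of 3bp) is classed as a microsatellite and the 171bp alphoid repeat as satellite DNA.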
name two mechanisms for different lengths of satellite dna between individuals
replication slippage - polymerase has a tendency to slip at repetitive seqs - can either increase or decrease repeat number
unequal crossing over in meiosis - repetitive DNA confuses alignment of homologues - one gamete will get more repeats and one will get fewer
name a case where extra satellite DNA isn't phenotypically normal
expanded trinucleotide repeats of CAG (Gln) give a protein with a long polyglutamine tract that isn't degraded properly (these proteins form toxic accumulations in neurons and stop them functioning)
this is Huntington's disease - dominant
more repeats = greater risk to offspring
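Counting the repeats is the first step in assessing a CAG expansion. A minimal Python sketch, assuming the input is a plain DNA string read on the coding strand; `longest_cag_run` is a hypothetical helper and the example sequence is invented. No clinical thresholds are encoded because the card does not give any.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return the number of repeats in the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Invented sequence: a tract of 5 CAGs, an interruption, then 2 more CAGs.
example = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2
tract_length = longest_cag_run(example)
```

Only uninterrupted tracts are counted, so the interrupting CCG splits the example into tracts of 5 and 2 repeats.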
how does dna fingerprinting work?
use polymorphism in minisatellite length between individuals to identify them
Steps:
extract dna
digest with restriction enzyme
separate fragments on gel
southern blot using minisatellite as probe
observe characteristic bands for each individual
repeat this at a number of loci
how can pcr be used instead of a southern blot for dna fingerprinting?
use PCR primers that anneal to conserved seqs flanking the minisatellite - the length of the PCR product reflects the number of repeats
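Because the primers sit in conserved flanking sequence, the size of the PCR product depends only on how many repeats lie between them. A worked sketch in Python; the flank length, repeat unit and repeat counts are hypothetical values chosen for illustration.

```python
def pcr_product_length(flank_bp: int, unit_bp: int, n_repeats: int) -> int:
    """Amplicon length = flanking sequence (including primers) + repeat array."""
    return flank_bp + unit_bp * n_repeats


# Hypothetical locus: 120 bp of flanking sequence, 30 bp repeat unit.
allele_a = pcr_product_length(120, 30, 10)  # 120 + 300 = 420 bp
allele_b = pcr_product_length(120, 30, 14)  # 120 + 420 = 540 bp
```

An individual heterozygous for these two alleles would give two bands on a gel, 120bp apart.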
name the 2 types of Tn
cut and paste (nonreplicative)
copy and paste (replicative)
briefly describe Tn structure
genes (sometimes incl. transposase) surrounded by inverted repeats, surrounded by direct repeats
why are direct and inverted repeats present in Tns?
direct - created by insertion into the target site (transposase makes a staggered cut and the single-stranded gaps on each side are filled in by pol, duplicating the target seq)
inverted - recognition site for transposase
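The two kinds of repeat can be told apart by a simple sequence test: inverted repeats are reverse complements of each other, direct repeats are identical copies. A minimal Python sketch on an invented toy element; all sequences and helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_terminal_inverted_repeats(element: str, ir_len: int) -> bool:
    """True if the last ir_len bases are the reverse complement of the first."""
    element = element.upper()
    return element[-ir_len:] == reverse_complement(element[:ir_len])


def has_flanking_direct_repeats(insertion: str, dr_len: int) -> bool:
    """True if the same dr_len bases sit at both ends of the insertion."""
    insertion = insertion.upper()
    return insertion[:dr_len] == insertion[-dr_len:]


# Toy layout: direct repeat + inverted repeat + "gene" + inverted repeat + direct repeat
left_ir = "GGCCAT"
toy_tn = left_ir + "AAAATTTTCC" + reverse_complement(left_ir)
toy_site = "ACGTA" + toy_tn + "ACGTA"
```

Here `toy_tn` passes the inverted-repeat test with `ir_len=6` and `toy_site` passes the direct-repeat test with `dr_len=5`.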
how can Tn copy number increase?
during s-phase after a Tn has been replicated, it can cut itself out and paste itself ahead of replication machinery thereby copying itself
describe the difference between activator and dissociation Tns
activator - encodes its own transposase
dissociation - requires transposase encoded elsewhere
what are P-elements? which animal are they present in? what do they do?
P-elements = type of Tn
in drosophila melanogaster
they transpose at very high rates, causing v high mutation rates and infertility (hybrid dysgenesis)
how can P-elements be silenced?
In P+ female flies the embryo cytoplasm has a silencing mechanism
describe the outcome of a P+ (Male) and P- (female) cross
no silencing in embryo cytoplasm
high transposition in the germline - gene disruption
unsuccessful cross
describe the outcome of a P+ (female) and P- (male) cross
silencing in embryo cytoplasm
no transposition
successful cross
what are retrotransposons? give 3 examples
what is their mechanism of transposition
Tn that moves via rna intermediate
LTRs, LINEs, SINEs
replicative (copy and paste transposition)
describe the structure and mechanism of action of LTR retrotransposons
pol and gag gene (sometimes env gene)
surrounded by long terminal repeats
surrounded by direct repeats (from target site integration)
pol encodes reverse transcriptase, RNaseH and integrase
mechanism:
host RNA polymerase transcribes the LTR retroTn into RNA; pol is translated to give RT, RNase H and integrase
RT: RNA –> cDNA –> dsDNA
RNase H degrades RNA template
dsDNA goes into nucleus guided by integrase bound to the LTRs
insertion of dsDNA into genome
name 2 types of LTR retroTn
mammals: ERV (endogenous retrovirus)
often only the LTRs are left due to homologous recombination
yeast: Ty elements
name 2 non-LTR retroTns
LINEs and SINEs
describe LINE structure and mechanism
structure: 2 ORFS (ORF1: RNA-BP), (ORF2: RT and DNA endonuclease) surrounded by A/T-rich region and direct repeats
mechanism:
Tn transc and transl by host machinery
ORF1 protein binds LINE RNA
ORF2 protein binds LINE polyA in cytoplasm
RNA transported into nucleus
ORF2/polyA RNA binds to complementary DNA (polyT) sequence somewhere in genome
endonuclease activity from ORF2 nicks DNA
ORF2 RT activity primed by host DNA sequences
(often RT doesn't reach the end - Tn is truncated)
ORF2 continues synthesis now using host DNA as a template
2nd strand of DNA is made by host enzymes
describe the structure and mechanism of SINEs
structure: AT-rich sequences
mechanism: compete with LINEs for ORF1 and ORF2 proteins
name a common SINE in primates and explain its structure
Alu element
contains an AluI restriction site
structure: Alu element duplicated - in right half there is an insertion/ next to right half there is a polyA tail
are humans mainly retrotransposon or dna Tn?
retrotransposon
similar to yeast
name 2 methods of exon shuffling by Tn presence
1) crossing over between transposons - as Tns are homologous CO can occur, swapping around parts of genes (this may result in genes having added functions if exons are added in)
2) mistakes in transposition - if 2 Tns surround a bit of DNA then instead of 2 transposition events occurring, 1 may occur, using the outermost inverted repeats of each Tn. this would result in the entire bit of DNA and the 2 Tns being transposed into another gene
another example of transposition mistakes is LINE Tns using 3’ polyA signals of genes instead of its own, therefore adding the end of the gene to the end of the Tn
what are the 3 main mechanisms for gene duplication?
replication slippage
unequal crossing over
(these 2 covered in prev. flashcards about satellites)
retrotransposition of mRNA
what happens when a gene is duplicated and when non-transcribed spacer seqs are duplicated?
genes are usually well conserved and continue to be transc and transl
NTS seqs tend to be quite divergent
what is the difference between orthologues and paralogues? give an example of each
orthologues: evolved by speciation (ie same gene in two different species) - human and mouse alpha-tubulin - evolving separately since the divergence of humans and mice
paralogues: evolved by gene duplication - mouse alpha tubulin and mouse beta tubulin - gene duplication event with differences accumulating between the 2 versions
what are the 3 fates of a duplicated gene?
accumulation of mutations and degradation into a pseudogene (nonfunctionalization)
one gene copy gains a new function (neofunctionalization)
each copy specialises (subfunctionalization)
describe the difference between a conventional and a processed pseudogene
conventional: one gene copy has accumulated mutations that stop it from functioning - 2nd copy of gene (non-mutated) means that the 1st copy doesn't have a selection pressure applied to it - carries on getting mutations
processed: generated when LINE RT reverse-transcribes an mRNA (cDNA reintegrates into genome) - these pseudogenes lack introns and generally lack a promoter, so are generally non-functional
describe the evolutionary history of the globin gene family
haemoglobin consists of 2 alpha and 2 beta chains - beta chains generated by unequal crossing over between 2 Tns (paralogues)
different types of haemoglobin throughout devel. - foetal haemoglobin generated by gene duplication event and then subfunctionalization
how can 3n embryos be generated?
2n gametes generated via nondisjunction
2n + 1n = 3n embryo
(same can happen forming a 4n embryo)
describe the difference between autotetraploidy and allotetraploidy
autotetraploidy is the generation of a 4n embryo with 2n gametes from the same species
allotetraploidy is the generation of a 4n embryo with 2n gametes from different species
how and why does diploidisation occur?
diploidisation is the return to a diploid state from a polyploid state
this happens because the presence of 2 copies of a gene results in a less rigorous selection pressure (ie allows for specialisation and divergence of the 2 copies)
over time lots of duplicated material is lost by mutation or deletion
describe the yeast point centromere
3 regions: I and III highly conserved, II = v a/t rich region
only 120bp needed to direct microtubule attachment and mitotic segregation
describe the human ‘regional’ centromere
alphoid satellite dna: AT rich sequences, each repeat = 171bp
these 171bp monomeric repeats form higher-order structures (lots of 171bp monomers) - these structures have slightly divergent sequences
these higher order structures then form tandem repeats - present in centromeres of all human chromosomes
what histone replaces H3 at the centromere? why?
CENP-A replaces H3 at the eukaryotic centromere. marks the nucleosome as ‘different’ and dictates kinetochore binding
describe the histones and epigenetic markers in centromeres and pericentric regions in humans
H2A.Z instead of H2A
H3 is methylated (H3K4Me2)/CENP-A
around the centromere there is pericentric heterochromatin - specific methylation of histones at these nucleosomes
what are holocentric chromosomes? in which organism are they found?
holocentric chromosomes have holocentric kinetochores ie the attachment of kinetochores occurs at the entire length of the chromosome
cenH3 histones across chromosome length in c. elegans allow this to happen
name a similarity and a difference in the origins of replication in e. coli and humans
e. coli has one origin of replication whereas humans have 10,000’s
both form bidirectional replication forks
describe (v briefly) the early stages of euk replication
binding of origin recognition complex (ORC) to origin of replication
assembly of prereplication complex - MCM proteins and CDC6 and CDT1
cascade of other proteins until initiation of replication
describe the autonomous replication sequences in S. cerevisiae and S. pombe
S. cerevisiae:
250-400 origins of replication
only some of the 100’s of ARS consensus sequences actually initiate replication - essential but not sufficient for origin activity
transcriptionally silent areas are more likely to be bound by replication proteins
3 regions:
B3, B2 and (B1 and A)
B1 and A are the origin recognition sequence (where origin recognition complex binds)
S. pombe
AT-rich intergenic regions - at least half of intergenic regions have the capacity to serve as origins of replication (this is because there are only 2 H-bonds between A and T, so AT-rich DNA is easier to melt)
how can you find simple origins of replication?
eg ARS
isolate plasmid with a selectable marker, transform into yeast - grow on selective conditions - should be no growth as the plasmid has no origin of replication
then fragment yeast genome and clone pieces into the plasmid, insert plasmid into yeast, grow on selective conditions - ones that have grown have had origins of replication inserted into the plasmid
how are origins of replication found in higher eukaryotes?
grow cells in medium containing a thymidine analogue, bromodeoxyuridine (BrdU), which can be immunoprecipitated, instead of thymidine.
near origins of replication there will be nascent strand synthesis - BrdU incorporated and can be pulled down then identified by microarray or high-throughput sequencing
name and explain the four categories of features found in animal replication origins. what is the significance of the presence of these structures?
sequence: AT rich/CpG islands
structure: DNA topology/loop MAR (matrix attachment region)
chromatin: nucleosome/DNase-I sensitive site (open chromatin important)
transcription: promoter, enhancer, insulator, start site features
not all of these features are always present therefore does combination determine use of different origins in different conditions? or are there changes throughout development?
describe constitutive, flexible and inactive origins
constitutive: always used - minority of origins like this
flexible: used sometimes - stochastically (random)
inactive: never used under normal conditions - can be used in stressful situations eg DNA damage
do origins of replication fire at the same time or at different times?
origins fire at different stages of s phase
not really sure why this happens
why are telomeres required?
lagging strand of DNA replication requires an RNA primer which is later removed
gap filled in from the adjacent Okazaki fragment
this isn't possible at the end of a linear chromosome, as there is no adjacent fragment to extend from
telomeres are repeat, G-rich sequences with a 3’ overhang
give the sequence of mammal telomeres and approx how many repeats there are
TTAGGG
tandem repeats spanning approx 20-25kb
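The card asks for a number of repeats but the answer is a length, so the conversion is worth spelling out. A rough calculation in Python, assuming the 20-25kb figure is the total length of the repeat tract and that the tract is made up entirely of perfect TTAGGG units; `repeats_in_tract` is a hypothetical helper.

```python
REPEAT = "TTAGGG"


def repeats_in_tract(tract_bp: int, unit: str = REPEAT) -> int:
    """Whole copies of the repeat unit that fit in a tract of tract_bp bases."""
    return tract_bp // len(unit)


low = repeats_in_tract(20_000)   # 20,000 // 6 = 3333
high = repeats_in_tract(25_000)  # 25,000 // 6 = 4166
```

Under these assumptions the tract holds roughly 3,300 to 4,200 copies of the hexamer.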
what is a shelterin complex? why do chromosomes require a shelterin complex?
shelterin complex is a number of specialised proteins that bind telomere DNA and each other including TRF1, TRF2 and POT1
these form a cap on the telomere to differentiate it from DNA breaks, promote formation of a special tertiary structure called the t-loop, recruit telomerase and protect DNA from nucleases. all this prevents the ends of chromosomes being ‘repaired’ and NHEJ occurring
describe the mechanism of a mitotic clock
telomeres shorten as the organism gets older, at a certain length the telomeres are too short to bind shelterin causing cell cycle arrest, senescence, apoptosis and genome instability - this limits the number of mitotic cycles
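The clock can be pictured as a countdown. A toy Python model, assuming a fixed loss per division and a single critical length below which shelterin can no longer bind; both the model and every number in it are illustrative assumptions, not measured values.

```python
def divisions_until_arrest(start_bp: int, loss_per_division_bp: int,
                           critical_bp: int) -> int:
    """Count divisions possible before the telomere would fall below critical_bp."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss per division must be positive")
    divisions = 0
    length = start_bp
    # Keep dividing while the next division still leaves enough telomere.
    while length - loss_per_division_bp >= critical_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions


# Hypothetical values: 10 kb telomere, 100 bp lost per division, 5 kb critical length.
limit = divisions_until_arrest(10_000, 100, 5_000)
```

With these made-up values the cell gets 50 divisions. Raising the loss per division, as oxidative stress is thought to do, lowers the limit.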
how are telomeres important in aging and disease?
oxidative stress can speed up telomere shortening - correlates with ageing related disease
overexpression of telomerase in mice may prevent ageing
elevated telomerase activity in cancer cells - makes them essentially immortal
state which structures out of telomeres, centromeres and origins of replication are required for a linear and a circular chromosomes to replicate
linear: requires telomeres, origins of replication and centromeres
circular: requires origins of replication and centromeres
how does the telomerase protein increase telomere length?
telomerase consists of a protein component (TERT) and an RNA component (TERC)
TERC base pairs with 3’ overhang
elongation of the overhang using RNA as template
telomerase translocates further out along the 3’ overhang (using RNA template)
this allows further elongation of 3’ overhang
RNA is removed, synthesis of 2nd strand of DNA by DNA pol using overhang as template
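The elongate-translocate cycle can be sketched as repeated copying of a short RNA template onto the 3’ end. A minimal Python sketch; the 6-nt template `CCCUAA` is a simplification chosen so that one round adds exactly one TTAGGG, and the function names are hypothetical.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")


def templated_dna(rna_template: str) -> str:
    """DNA (written 5'->3') synthesised against an RNA template written 5'->3'."""
    return rna_template.upper().translate(RNA_TO_DNA)[::-1]


def extend_overhang(overhang: str, rna_template: str, rounds: int) -> str:
    """Append one templated repeat per round of elongation and translocation."""
    return overhang + templated_dna(rna_template) * rounds


TEMPLATE = "CCCUAA"  # simplified template region, an assumption for illustration
extended = extend_overhang("TTAGGG", TEMPLATE, 2)
```

Two rounds turn a one-repeat overhang into three repeats, and the longer overhang then serves as the template for second-strand synthesis by DNA pol.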
what bases and structures are generally present in gene-poor regions?
AT-rich
contain lots of repetitive sequences
closed chromatin structure
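AT-richness is easy to measure from sequence. A minimal Python sketch that computes the AT fraction in non-overlapping windows; the helper names and the choice of non-overlapping windows are assumptions for illustration.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def window_at(seq: str, window: int) -> list[float]:
    """AT fraction of each complete, non-overlapping window along seq."""
    return [at_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


profile = window_at("AAAAGGGG", 4)  # first window all A, second all G
```

Windows with a high AT fraction would be candidate gene-poor regions, though AT content alone does not prove a region is gene-poor.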
describe the process of G-banding
treat metaphase chromosomes with trypsin to remove proteins
stain with Giemsa or modern alternative which stains the AT rich regions of genome (ie transcriptionally inactive/silent regions)