Eukaryotic genome structure. Flashcards
What is the ‘C value’?
Amount of DNA in a haploid nucleus for a given species; basically the genome size.
What is the ‘genome paradox’?
Complexity of an organism does not correspond to its genome size.
What is an example of a group of organisms that have a very varied genome size?
Flowering plants.
What fold difference is there between the sizes of insect genomes?
100.
What are the differences in genome size not down to?
The protein coding sequence.
How large is the human genome?
3200Mb.
The human genome is 3200Mb, how much of this is made of exons?
48Mb.
How much of the human genome is made of ‘genes and related sequences’?
1200Mb.
How much of the human genome is made of ‘Intergenic DNA?’
2000Mb.
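The three sizes above can be turned into shares of the genome. A minimal Python sketch, using only the figures quoted on these cards (the dictionary and function names are illustrative, not from the source):

```python
# Sizes in Mb, taken from the cards above.
HUMAN_GENOME_MB = 3200
COMPONENTS_MB = {
    "exons": 48,
    "genes and related sequences": 1200,
    "intergenic DNA": 2000,
}

def percent_of_genome(size_mb, genome_mb=HUMAN_GENOME_MB):
    """Return a component's share of the genome as a percentage."""
    return 100 * size_mb / genome_mb

for name, size in COMPONENTS_MB.items():
    print(f"{name}: {percent_of_genome(size):.1f}%")
```

On these numbers exons are about 1.5% of the genome, and genes and related sequences plus intergenic DNA account for the full 3200Mb.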
What is meant by gene density?
The average number of genes per Mb.
What organism does this data set correspond to?
Gene density: 96
Av introns per gene: 0.04
Amount of genome taken up by genome wide repeats: 3.4%
Yeast.
What organism does this data set correspond to?
Gene density: 76
Av introns per gene: 3
Amount of genome taken up by genome wide repeats: 12%
Fruit fly.
What organism does this data set correspond to?
Gene density: 11
Av introns per gene: 9
Amount of genome taken up by genome wide repeats: 44%
Humans.
The human genome is _____ densely packed than lower eukaryotes such as yeast.
Less.
What is the median intron length in humans?
3.3kb - this is big.
Why can humans accommodate longer introns in their genome?
Human cells spend longer in cell division, so there is less selection pressure to get rid of large introns.
How much of the human genome is made up of satellite DNA?
6%.
Satellite DNA is made up of tandem repeats of ______.
1-500bp.
How long are microsatellite repeat units?
1-13bp; however, most are between 1 and 4bp.
How many repeats are microsatellites normally made from?
Less than 150.
How long are minisatellites?
14-100bp.
How large are the tandem arrays that minisatellites form?
1-5kb tandem arrays spread around the genome.
How long are satellites?
100-500bp.
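The size classes on the cards above can be written as a small lookup. This is only a sketch: the function name is illustrative, and because the cards give 14-100bp for minisatellites and 100-500bp for satellites, treating exactly 100bp as a minisatellite is an assumption.

```python
def classify_tandem_repeat(unit_bp):
    """Classify a tandem repeat by the length of its repeat unit (bp)."""
    if unit_bp < 1 or unit_bp > 500:
        raise ValueError("satellite repeat units are 1-500bp on these cards")
    if unit_bp <= 13:
        return "microsatellite"
    if unit_bp <= 100:  # assumption: exactly 100bp counts as a minisatellite
        return "minisatellite"
    return "satellite"

for unit in (3, 30, 171):
    print(unit, classify_tandem_repeat(unit))
```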
Where are satellite DNA repeats especially important?
Mammalian centromeres.
What gene has a length of 1.4 Kb and 2 introns which make up 69% of the gene?
Insulin.
What gene has a length of 2400 Kb and 78 introns which make up 98% of the gene?
Dystrophin.
Each individual has a _______ set of satellite DNA.
Different.
Pol can often reattach at the wrong place in a repeat array. What can this result in?
Pol going backwards and copying the same repeats twice. It can also miss out a section if it moves forward.
What is the consequence of pol slippage?
Lengthening or shortening of the daughter chromosome.
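A toy model of slippage, assuming the only thing that matters is how many repeat units the polymerase recopies or skips. The function name and the sign convention for `slip` are illustrative choices, not from the source.

```python
def replicate_with_slippage(template_repeats, slip=0):
    """Return the repeat count in the daughter strand.

    slip < 0: polymerase slips backwards and recopies |slip| units (lengthening).
    slip > 0: polymerase skips forward over `slip` units (shortening).
    """
    daughter = template_repeats - slip
    if daughter < 0:
        raise ValueError("cannot skip more repeats than the template has")
    return daughter

unit = "CA"
for slip in (-2, 0, 3):
    n = replicate_with_slippage(10, slip)
    print(slip, n, unit * n)
```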
What can cause an increase in the number of repeats in the gametes?
Crossing over between misaligned repeats.
Techniques have recently been developed so you can tell where you are in an array. True or false?
False, it is impossible to tell.
What type of DNA is normally phenotypically neutral as it is found in-between genes?
Satellite DNA.
What is repeated in Huntington’s?
CAG.
What happens to proteins with expanded CAG repeats?
They are degraded into toxic fragments which accumulate in neurones and stop them working properly.
What age does Huntington’s normally kick in?
Around the age of 40.
What replication error can cause Huntington’s?
Slippage.
What repeat count of CAG doesn’t cause Huntington’s and results in no risk to offspring?
26 or fewer.
What repeat count of CAG doesn’t cause Huntington’s but elevates the risk of Huntington’s in offspring?
27-35.
What repeat count of CAG can result in Huntington’s and causes the offspring to have a 50% risk of getting the disorder?
36-39.
What repeat count of CAG will result in Huntington’s and a 50% risk in offspring?
40+.
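The repeat-count bands above can be applied to a sequence by counting the longest uninterrupted CAG tract. A hedged sketch: the function names are illustrative, and the lowest band (26 or fewer) is an assumption inferred from the 27-35 band rather than a figure stated on the cards.

```python
import re

def longest_cag_run(seq):
    """Length, in repeats, of the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_cag(repeats):
    """Map a CAG repeat count onto the bands used on these cards."""
    if repeats <= 26:  # assumption: inferred from the 27-35 band
        return "no disease, no risk to offspring"
    if repeats <= 35:
        return "no disease, elevated risk to offspring"
    if repeats <= 39:
        return "may cause disease, 50% risk to offspring"
    return "causes disease, 50% risk to offspring"

example = "TT" + "CAG" * 30 + "CAA" + "CAG" * 5  # made-up sequence
count = longest_cag_run(example)
print(count, classify_cag(count))
```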
What does DNA fingerprinting involve?
Polymorphisms in the microsatellite DNA length in individuals.
Who developed DNA fingerprinting in the 1980s?
Alec Jeffreys.
What are three roles of DNA fingerprinting?
- Paternity testing.
- Forensics.
- Genetic mapping.
What does RFLP, involved in DNA fingerprinting, stand for?
Restriction fragment length polymorphism.
What are the 6 steps of restriction fragment length polymorphism?
- Extract DNA.
- Digest with a convenient restriction enzyme.
- Separate fragments on agarose gels.
- Southern Blot using microsatellite sequences as a probe.
- Observe characteristic bands.
- Do this for a number of microsatellite sequences.
What is used nowadays instead of DNA fingerprinting?
PCR with AFLP.
What else will be present on the DNA fingerprint Southern blot?
Bands from other parts of the genome.
What does AFLP stand for?
Amplified Fragment Length Polymorphism.
Describe the process of AFLP.
PCR uses primers annealing to conserved sequences on either side of the microsatellite tandem array. These PCR products can then be visualised on an agarose gel.
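The PCR step described above can be mimicked in silico: with primers in the conserved flanks, product length depends only on the number of repeats in between. This is a sketch under stated assumptions; the flank sequences, primers and helper names are all made up for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon_length(template, fwd_primer, rev_primer):
    """Length of the product between two primers on the top strand.

    rev_primer anneals to the top strand, so its reverse complement is what
    appears in `template`. Returns None if either primer fails to anneal.
    """
    start = template.find(fwd_primer)
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return end + len(rev_site) - start

# Hypothetical flanks and primers; only the repeat count differs per allele.
LEFT, RIGHT = "GATTACAGGCTA", "TCCGATAGGCAT"
fwd, rev = LEFT, reverse_complement(RIGHT)
for n in (8, 12):
    allele = "AAAA" + LEFT + "CA" * n + RIGHT + "TTTT"
    print(n, amplicon_length(allele, fwd, rev))
```

Two alleles that differ by four CA repeats give products that differ by 8bp, which is the length difference resolved on the gel.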
Each individual has different repeat tracks in their Satellite DNA. Are these tracks still in the same place in the genome?
Yes.
What is the definition of a transposon?
A DNA sequence that can change its position within a genome, sometimes creating or reversing a mutation and altering the cell’s genome size. Also known as jumping genes.
What method is used by DNA transposons?
Cut and paste.
What method is used by RNA transposons?
Copy and paste.
Are direct or inverted repeats on the outside of a DNA transposon?
Inverted repeats.
What type of repeat found at a DNA transposon is generated from the host genome during transposition?
Direct repeats.
What type of repeat found in a DNA transposon is part of the transposon and is known as the transposase recognition site?
Inverted.
Transposase is transcribed and translated by the host machinery. Where could the gene have come from?
That particular transposon or one from another transposon elsewhere in the genome.
Where does transposase bind to on the transposon?
The inverted repeats.
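"Inverted" here means that one end of the element is the reverse complement of the other. A minimal check, assuming a made-up element with 6bp terminal repeats (the sequences and function names are illustrative only):

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def has_terminal_inverted_repeats(element, k):
    """True if the first k bases are the reverse complement of the last k."""
    if k <= 0 or 2 * k > len(element):
        return False
    return element[:k] == reverse_complement(element[-k:])

# Hypothetical element: 6bp inverted repeats around a stand-in coding region.
element = "CAGGGT" + "ATGAAACCCGGGTAA" + "ACCCTG"
print(has_terminal_inverted_repeats(element, 6))
```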
What does the transposase do?
Cuts the DNA to remove the transposon and creates staggered cuts at the target site allowing the transposon to be inserted into the new location.
Transposons insert at random. True or false?
False, some do have a preference for a target site.
What is generated at the transposon target site?
Direct repeats, these remain once the transposon has moved again.
What creates the direct repeats at the transposon insertion site?
Host DNA repair enzymes.
How long are the direct repeats generated when the transposon moves?
5bp.
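The way staggered cuts plus gap repair produce 5bp direct repeats can be sketched on a single strand. The target sequence, the placeholder element and the function name are hypothetical; the 5bp stagger is the figure from the card above.

```python
def insert_with_tsd(target, pos, element, stagger=5):
    """Insert `element` at `pos`, duplicating `stagger` bases of the target.

    Staggered cuts leave the `stagger` bases after `pos` single-stranded on
    each side of the element; filling in both gaps copies them, so the same
    bases end up flanking the insertion as direct repeats.
    """
    if pos < 0 or pos + stagger > len(target):
        raise ValueError("staggered cut must lie inside the target")
    site = target[pos:pos + stagger]
    return target[:pos] + site + element + site + target[pos + stagger:]

target = "AAAACGTAGTTTT"
result = insert_with_tsd(target, 4, "<TRANSPOSON>")
print(result)
```

The host sequence grows by the length of the element plus one extra copy of the 5bp target site.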
What increases the copy number of transposons?
Transposition at S phase.
The number of transposons builds up over evolutionary time. How many DNA transposon copies are in the human genome currently?
300,000.
What percentage of the human genome is made up of DNA transposons?
3%.
Where are Ac / Ds transposons found?
Maize.
Are Ac or Ds transposons autonomous?
Ac.
What is the same in Ac and Ds transposons found in maize?
The inverted repeats they contain.
The Ds transposon is nonautonomous. Where does it get its transposase from?
Ac.
Are long terminal repeats linked to retrotransposons or DNA transposons?
Retrotransposons.
Retrotransposons contain LTR and target site direct repeats. Which are found on the outside?
LTRs.
How many base pairs are LTRs made of?
200-600bp.
When are the target site direct repeats, found in DNA transposons generated?
Upon integration.
What two genes make up a retrotransposon?
gag and pol.
What three proteins are encoded by pol?
- Reverse transcriptase.
- RNase H.
- Integrase.
Reverse transcriptase, RNase H and integrase are encoded by pol, found in retrotransposons. What are these proteins all important for?
Reverse transcription.
What is the role of RNase H in reverse transcription?
Degrades the RNA template.
How does a retrotransposon integrate?
- Generation of an RNA molecule using host machinery.
- Reverse transcription of the retrotransposon RNA to generate a double stranded DNA molecule.
- Integrase bound to the LTR allows transport into the nucleus.
- Insertion into the genome, also causing direct repeats to be created.
What are LTR retrotransposons closely related to?
RNA viruses.
What RNA virus gene has been lost and what other gene will probably be lost in LTR retrotransposons?
Env has already been lost and gag will probably be too.
What is an example of an LTR retrotransposon that is found in mammals?
ERV - Endogenous retrovirus.
What is an example of an LTR retrotransposon found in yeast?
Ty elements.
How many copies of the LTR retrotransposon are found in the haploid yeast genome?
35.
How much of the human genome do LTR retrotransposons make up?
8%.
What can happen to the ERV retrotransposon found in mammals?
All of the transposon apart from the LTRs can be lost through homologous recombination between two LTRs.
What are Non LTR retrotransposons also called?
Long interspersed nuclear elements (LINEs).
How long are LINEs?
6kb.
Do LINEs or LTRs contain the pol and gag genes?
LTRs.
Do LINEs or LTRs contain the ORF1 and ORF2 genes?
LINEs.
What are LINEs, SINEs and LTRs examples of?
Retrotransposons.
What is found between the target site and the protein coding genes in LINEs?
AT rich regions.
What does ORF1 encode in LINEs?
RNA binding protein.
What does ORF2 encode in LINEs?
RT and DNA endonuclease.
LINEs L1, L2 and L3 are all found in the human genome. Which is still functional?
L1.
What percentage of the human genome is made of LINEs?
21%.
What is the mechanism for LINEs?
- Transposon is transcribed and polyadenylated.
- ORF1 protein binds LINE RNA while the ORF2 protein binds RNA poly(A). This occurs in the cytoplasm.
- RNA is transported into the nucleus.
- ORF2/Poly(A) binds to complementary poly(T) sequence in the genome.
- Endonuclease activity of ORF2 nicks both strands.
- ORF2 RT activity is primed by host DNA sequence.
What transposon needs to be polyadenylated?
LINEs.
Full length LINEs are theoretically 6kb, however they are often only 900bp in length. Why is this?
RT often doesn’t reach the end of the transcript.
Are direct repeats generated with LINEs?
Yes.
What are two examples of non-LTR retrotransposons?
SINEs and LINEs.
What does SINEs stand for?
Short interspersed nuclear elements.
Are LINEs or SINEs nonautonomous, requiring the proteins encoded by the other?
SINEs.
What do SINEs contain which allows them to bind to ORF1 and ORF2?
AT rich sequences.
What transposable element is common in primate genomes?
Alu.
What is the most common transposable element found in the human genome with more than a million copies?
Alu.
What other transposable elements, apart from LINEs, are often truncated?
Alu elements.
How long is the consensus sequence for Alu elements?
282bp.
What are the Alu transposable elements named after?
The AluI restriction site they contain.
Transposition is not that common in the human genome due to cellular defence mechanisms. How often does it occur?
Once in every 1000 generations.
When will a transposition be fixed?
When it occurs in the germline, in this case 1/4 of the gametes will have it.
In which genome are 100% of the transposons retrotransposons?
S. cerevisiae.
What percentage of transposons found in the human genome are retrotransposons?
90%.
What percentage of transposons found in the human genome are retrotransposons?
95%.
What percentage of transposons found in the C. elegans genome are retrotransposons?
15%.
What causes exon shuffling?
Crossing over between transposons in different parts of the genome. Exons flanked by those transposons are swapped when the transposon sequences recombine.
Exon shuffling is always nonsense. True or false?
False, it can occasionally provide a new function which can be selected for.
How can transposons result in additional exons being added to a transcript?
A LINE transposon may use a Poly(A) signal from a neighbouring gene.
What are the three main mechanisms which cause gene duplications?
- Replication slippage.
- Unequal crossing over.
- Retrotransposition of an mRNA.
Are gene duplications common?
Yes, less than half of the genes in the human genome are solitary genes.
What is an example of a duplicated gene and what caused it?
Globin genes, duplicated by crossing over between L1 repeats in a globin gene cluster.
What is an example of a gene which has been amplified into tandem repeats?
rRNA genes.
Why has the tandem repeat of the rRNA gene been conserved?
Because high amounts of rRNA need to be transcribed.
What is found between the tandem array of rRNA genes?
Non transcribed spacer sequences (NTS).
Are the NTS or the rRNA sequences quite divergent?
Non transcribed spacer sequences (NTS); the rRNA gene is quite well conserved.
What is the definition of a pseudogene?
A gene that has lost the ability to code for a functional protein.
What is an example of a mutation that can cause a pseudogene?
Frameshifts and point mutations that can generate early stop codons or cause incorrect splicing.
Why even when a gene is duplicated do you normally only get one functional gene copy?
As there is no selection pressure on the second copy, mutations accumulate in it without being selected against.
For a eukaryotic gene to be functional what does it need?
All introns.
What is a processed pseudogene?
A pseudogene generated by the RT of a functional mRNA, creating a cDNA that can be inserted into the genome by LINE proteins. These inserted sequences do not have the proper signals so are generally not functional.
What two types of homologous genes are there?
Orthologous and paralogous.
What is an orthologous gene?
A homologous gene that has evolved by SPECIATION in two species separately since the early divergence of the two species.
What is a paralogous gene?
A homologous gene that has evolved by DUPLICATION in a single species. The gene would have originally duplicated and then changes would have occurred to both.
What type of homologous gene is tubulin and tubulin 2?
Orthologous.
What type of homologous gene is alpha tubulin and beta tubulin?
Paralogous.
What are the three fates of duplicated genes?
- Accumulation of mutations to make a pseudogene.
- Neofunctionalization.
- Subfunctionalization.
Neofunctionalization can happen to duplicated genes. What is this?
Mutations leading to slightly different gene functions.
Subfunctionalization can happen to duplicated genes. What is this?
Mutations that lead to spatial partitioning with each copy being expressed in different conditions.
How was the globin gene family generated?
Various gene duplication events.
What two forms of globins diverged but are still recognisably homologous?
Alpha and beta.
What is the more distant relative of the globin family which is now only found in muscles?
Myoglobin.
When did alpha and beta globin diverge?
500 mya.
What chromosome is myoglobin found on?
22
What chromosome is alpha globin found on?
16.
What chromosome is beta globin found on?
11.
What were the three steps in the evolution of the globin gene family?
- Unequal crossing over between 2 transposons.
- Chromosome with two beta globins passed on in the germline.
- Two copies evolved independently to generate paralogues.
Different globin forms are found during different stages of development. What form is found in embryonic Hb?
Epsilon.
Different globin forms are found during different stages of development. What form is found in foetal Hb?
Gamma.
Different globin forms are found during different stages of development. What form is found in adult Hb?
Beta.
What subunit is consistent in all Hb?
Alpha.
Why can oxygen pass from maternal blood to foetal blood?
Foetal haemoglobin has a higher affinity to oxygen than adult haemoglobin.
What type of globin has the highest affinity to oxygen?
Myoglobin.
How can 2n gametes be created?
Meiotic nondisjunction.
In what type of organism are tetraploids viable?
Plants.
What is an autotetraploid?
A tetraploid formed from two 2n gametes of the same species, i.e. all four chromosome sets come from one species.
What is an allotetraploid?
A tetraploid formed from two 2n gametes of different species, i.e. it carries the full paternal and maternal sets.
What does diploidisation occur in?
Autotetraploids and allotetraploids.
What is diploidisation?
The loss of duplicated material by mutation or deletion over evolutionary time.
What do autotetraploids and allotetraploids form before they form diploids in diploidisation?
Partially diploidised tetraploids.
How many times is it thought that a whole genome duplication followed by diploidisation has happened in the evolution of vertebrate animals?
Twice.
What is an advantage of polyploid species?
Allows divergence as there is still an intact copy of the gene.
What is a disadvantage of a polyploid species?
Increases the likelihood of mitosis and meiosis errors.
Organisms with _____ n tend to be more stable?
Even.
Apart from plants, what other species can be tetraploid?
Some frogs.
What value is n in wheat?
6, it is a hexaploid.
What value is n in some plants, including bananas?
3, it is a triploid.
What type of genes are an example of a genome duplication followed by the evolution of the duplicated genes?
Hox genes.
What are Hox genes?
Transcription factors that determine the anterior and posterior axis of animal development.
How many copies of hox genes are there in mammals compared to ancestral genes and why is it thought that this is the case?
4, due to two ancestral genome duplications.
What is the definition of a centromere?
Specialised chromosomal region upon which kinetochores assemble and direct equal segregation in mitosis and meiosis.
Do centromeres need to be in the centre of the chromosome?
No.
Is the centromere structure or sequence conserved?
Structure.
What occurs at the centromere?
Specialised nucleosomes mark the centromeric chromatin; the kinetochore then binds, allowing the recruitment of microtubules. This allows segregation in mitosis and meiosis.
How many bp make up the yeast point centromere?
120.
What regions make up the yeast point centromere?
I, II, III.
What regions in the yeast point centromere are very well conserved?
I and III.
What region in the yeast point centromere is very AT rich?
II.
How many base pairs are needed to direct microtubule mitotic segregation and attachment?
120.
What region of the yeast point centromere varies in size?
II, on cen 3,6 and 11 it is 84bp while on cen4 it is only 78 bp.
What type of DNA is the human ‘regional’ centromere made from?
Alphoid satellite DNA.
How long is the repeat monomer of the human ‘regional’ centromere?
171bp.
Where is alphoid satellite DNA found?
In tandem arrays at the centromeres of all human chromosomes.
What type of structure can the human regional centromere be described as?
A higher order structure made of several repeats of slightly divergent sequences.
What is the higher order repeat that makes up the human regional centromere?
- 171bp monomers.
- Higher order repeat 1-3kb.
- Homogeneous higher order alpha satellite array 200-5000kb.
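The three size levels above imply rough monomer counts. A back-of-envelope sketch using only the sizes on the cards; the rounding and the function name are illustrative choices.

```python
MONOMER_BP = 171  # alphoid monomer length from the card above

def monomers_in(length_bp, monomer_bp=MONOMER_BP):
    """Approximate number of monomers that fit in a stretch of DNA."""
    return round(length_bp / monomer_bp)

# Higher order repeat of 1-3kb, and full array of 200-5000kb.
hor_range = (monomers_in(1_000), monomers_in(3_000))
array_range = (monomers_in(200_000), monomers_in(5_000_000))
print(hor_range, array_range)
```

So a higher order repeat holds roughly 6 to 18 monomers, and a whole array holds on the order of a thousand to tens of thousands.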
How many histone proteins is a standard nucleosome made from?
8, with 2 H3.
What replaces the histone protein H3 at the S. cerevisiae centromere to mark the nucleosome as different?
CENP-A.
What is the role of CENP-A at the yeast centromere?
It is recognised by the kinetochore microtubules to direct kinetochore binding.
What replaces the histone protein H2A at the human centromere?
H2A.Z.
What modification occurs to the histone protein H3 at the human centromere?
H3K4me2.
What type of chromatin is found at the human centromere?
Pericentric heterochromatin. Makes up a large part of the chromosome.
What are the two main roles of the kinetochore?
- Recognises centromeric epigenetic markers.
- Allows centromeres to attach to the microtubules for segregation in mitosis.
How many proteins make up the kinetochore complex?
Dozens.
What is the kinetochore complex called in budding yeast?
Dam1.
What is the kinetochore complex called in humans?
Ndc80 complex.
What is the Ndc80 complex and what two components make it up?
It is the human kinetochore complex. It is made from the ska complex and Cdt1.
Name one example of an organism with holocentric chromosomes?
C.elegans.
What is meant by the term ‘holocentric chromosome’?
Attachment sites for the kinetochore complex along the whole chromosome.
What histones are distributed throughout the whole of C. elegans chromosome allowing attachment to the kinetochore?
cenH3.
Is the sequence of the centromere conserved throughout the eukaryotes?
No.
There is a single origin of replication in the E. coli genome. How many are there in a eukaryotic genome?
Half a million.
What can the origins of replication in both E.coli and eukaryotic genomes be described as?
Bidirectional.
What are the three early stages of eukaryotic replication?
- Binding of the origin recognition complex (ORC).
- Assembly of the pre initiation complex.
- Initiation of replication.
What MCM protein binds to the origin recognition complex first?
MCM9.
What MCM complex binds to the origin recognition complex second?
MCM 2-7.
What three proteins bind to assemble the pre initiation complex at the origin of replication?
CDC6, CDT1, MCM9.
What is ORC bound to CDC6, MCM9, CDT1 and two MCM2-7 complexes called?
The pre replication complex.
What happens to the pre replication complex?
Additional factors bind, making it the pre initiation complex. Pol(epsilon) and Pol(alpha) then bind to form the initiation complex which can enter S phase.
What has to happen to the initiation complex for it to enter S phase?
DNA replication inhibitor geminin is converted to replication factor CDT1.
What are lower eukaryotic replication origins also called?
Autonomously replicating sequences (ARS).
How many ARS are found in each round of replication in S. cerevisiae?
250-400.
How long is the consensus sequence of the ARS in S. cerevisiae?
11bp.
All ARS sequences initiate replication in S.c. True or false?
False, only some of the hundreds do.
To initiate replication in S.c all you need is ARS sequences. True or false?
False, they are essential but not sufficient to initiate replication.
What are transcriptionally silent areas in S.c more likely to be bound to?
Replication proteins.
What intergenic regions does S.pombe have?
AT rich ones.
How many of the intergenic regions in S.pombe have the capacity to serve as origins of replications?
At least half.
What two features are often found at animal replication origins regarding sequence?
AT rich sequences and CpG islands.
What two features are often found at animal replication origins regarding structure?
DNA topology and loop MAR.
What two features are often found at animal replication origins regarding chromatin?
Nucleosomes present and the DNaseI sensitive site.
What two features are often found at animal replication origins regarding transcription?
Promoter enhancer or insulator and start site level.
Are all 8 features seen at animal replication origins found at every replication origin in animals and why?
No, different combinations of the features are used, so origin usage can change in different conditions.
In what type of organism has no consensus sequence been found for the origin of replication?
Metazoans.
How many origins of replication are there per replication cycle in animals?
Tens of thousands.
Most animal origins are not used in any given cell cycle. True or false?
True, they will only be activated in specific conditions such as after DNA damage.
What are the three types of origins of replications found in animals?
- Flexible
- Constitutive
- Inactive
When will inactive origins of replication be used in animals?
Under stress.
When will flexible origins of replication be used in animals?
Stochastically/ randomly.
What origin of replication in animals is the minority?
Constitutive.
What can occur at animal replication origins with reasons that are not clear?
Early and late firing, i.e. at different stages of S phase.
Why is the replication of the ends of linear chromosomes an issue?
The lagging strand requires an RNA primer which is later removed. In circular chromosomes these gaps are filled in by Okazaki fragments; however, this is not possible at the ends of linear chromosomes.
What do telomere regions at the end of chromosomes consist of?
A number of repeats and a 3’ overhang.
What repeat is found at the telomeres of Tetrahymena?
TTGGGG. Lots of early research was done on this organism as it has a lot of short chromosomes, so many ends.
What repeat is found at the telomeres in mammals?
TTAGGG.
The number of repeats found at the telomere varies. How many are there in Tetrahymena?
A few.
The number of repeats found at the telomere varies. How many are there in humans?
10-150kb.
Why does the cell need a mechanism to differentiate between chromosome ends and double strand breaks?
The cell needs to avoid sticking chromosomes together through the NHEJ DNA repair pathway.
What are three examples of telomere binding proteins in humans?
TRF1, TRF2 and RAP1.
What is the role of telomere binding proteins?
They form a cap on the telomere to differentiate it from a break.
What do telomere binding proteins promote the formation of?
A specialised tertiary structure called the T loop.
What do telomere binding proteins protect the telomeres from?
Nucleases.
What can telomere binding proteins recruit?
Telomerase.
What two components make up telomerase?
TERT and TERC
What is the protein component of telomerase?
TERT.
What is the RNA component of telomerase?
TERC.
What is the role of TERC in telomerase?
Provides a template for the reverse transcriptase activity of TERT.
When is telomerase activated?
Once the telomeres fall below a certain threshold level.
Telomerase is active in all cells. True or false?
False.
What are the five main steps of telomerase?
- RNA component base pairs with the 3’ overhang.
- Elongation of the overhang using the RNA as a template.
- Telomerase translocates further down the 3’ overhang.
- Elongation of the overhang using the RNA as a template.
… - RNA is removed and polymerase synthesises the second strand using the overhang as template.
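The elongate-then-translocate cycle above can be caricatured as repeatedly adding one repeat to the 3’ overhang. This is a deliberately simplified sketch: it ignores the base-pairing register of the RNA template, and the function name and default of three cycles are illustrative. TTGGGG is the Tetrahymena repeat quoted earlier in these cards.

```python
def telomerase_extend(overhang, repeat="TTGGGG", cycles=3):
    """Extend a 3' overhang by `cycles` rounds of elongate-then-translocate.

    Each round copies one repeat from the RNA template onto the 3' end;
    translocation then repositions the enzyme on the new end.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    for _ in range(cycles):
        overhang += repeat  # elongation using the RNA as template
    return overhang

start = "TTGGGGTTGGGG"
print(telomerase_extend(start, cycles=2))
```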
Why can single cell eukaryotes be immortal?
High telomerase activity.
What cells in humans have a high telomerase activity?
Stem and germline cells.
In what human cells is the telomerase activity undetectable?
Somatic cells.
What four things can occur when the telomeres fall below a certain level?
- Cell cycle arrest.
- Senescence (biological ageing).
- Apoptosis.
- Genome instability.
How many bp on average do the telomeres shorten every cell division?
30bp.
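With a fixed loss per division, the number of divisions before telomeres drop below a threshold is simple arithmetic. In this sketch the 30bp loss comes from the card above, while the starting length and threshold in the example are hypothetical values chosen only to show the calculation.

```python
def divisions_until_below(start_bp, threshold_bp, loss_per_division_bp=30):
    """Number of divisions after which telomere length is first below threshold."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss per division must be positive")
    if start_bp < threshold_bp:
        return 0
    return (start_bp - threshold_bp) // loss_per_division_bp + 1

# Hypothetical example: 10kb telomeres and a 5kb threshold.
print(divisions_until_below(10_000, 5_000))
```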
What type of disease are shorter telomeres correlated with?
Age related diseases.
Overexpression of telomerase in mice has been shown to prevent ageing. True or false?
False, it has only been shown that this may be the case.
In what percentage of cancer cells is there elevated telomerase activity?
90%.
What experiment was carried out to determine the importance of telomeres, the centromere and the origin of replication?
An artificial chromosome containing a leu gene was placed into S. pombe, which was grown on minimal media, meaning it would only grow if there was proper segregation of the chromosomes.
What were the results in the S. pombe experiment determining the importance of the origin of replication, the centromeres and the telomeres for the circular chromosome?
- With only the Leu gene: did not grow.
- With Leu and origin: 5-10%.
- With Leu, origin and centromere: >90%.
What were the results in the S. pombe experiment determining the importance of the origin of replication, the centromeres and the telomeres for the linear chromosome?
- With Leu and origin: unstable.
- With Leu, telomeric repeats, origin of replication and centromere: stable.
What do transcriptionally active and silent regions of the genome roughly correspond to?
Gene rich and gene poor areas.
How are transcriptionally active and silent regions characterised?
Differences in chromatin structure and histone modifications.
What areas of the genome does G banding stain?
AT rich regions.
What is the role of trypsin in G banding?
Removes proteins.
What stain, now often replaced with modern alternatives, can be used in G banding?
Giemsa.
What technique does chromosome painting involve?
FISH.
What is chromosome painting specific to?
Sequences in each chromosome.
What does chromosome painting allow (3 points)?
- The identification of chromosomes
- Studying of their rearrangements which can be used as a diagnostic tool.
- Evolutionary studies.