Eukaryotic Genome Structure Flashcards

1
Q

What is the C-value paradox?

A

• The genomes of eukaryotes are orders of magnitude larger than those of archaea and bacteria
• Scientists suggested that this size range correlated with organismal ‘complexity’
• This would make sense as more genes would be required for this complexity
• However:
• Genome sizes vary widely within a single class of organisms
• E.g. approx 100-fold difference between the smallest and largest amphibian genomes
• Even though they have the same body plan and metabolism
• Gene numbers don’t scale with genome size
• E.g. yeast has many more genes than would be expected from the size of its genome relative to the human genome
• This confusion was called the C-value paradox
• C-value = amount of DNA in the haploid genome

2
Q

What explains the C-value paradox?

A

• Some organisms have large organelle genomes
• Some organisms have duplicated genomes (polyploidy)
• Non-coding DNA
• Less than 5% of human DNA contains the approx 25,000 genes
• The amount of non-coding DNA does increase dramatically with organism complexity (and can explain class differences)

3
Q

What is non-coding DNA?

A

• Eukaryotes have more DNA than prokaryotes that does not code for protein or for any other functional product molecule
• Approx 98% of the human genome is non-coding, as opposed to 11% of the E. coli genome
• Sometimes called “junk DNA”

4
Q

What do we think is the function of non-coding DNA?

A

• Eukaryotes have evolved sophisticated gene regulation
• This correlates with biological examples:
• Approx 9% of Homo sapiens genes encode transcription factors
• Compared with approx 5% in Drosophila and 3% in S. cerevisiae
• Other processes, including splicing and alternative splicing, also explain increased complexity

5
Q

Complexity of DNA (C0t analysis)

A

• C0 is the initial DNA concentration, t is the time allowed for renaturation
• Based on renaturation of ssDNA
• Indicates the types of DNA present (unique and repetitive)
• As DNA cools, complementary sequences find each other and base pair
• Since a sequence of ssDNA needs to find its complementary strand to reform a double helix, common and repetitive sequences renature more rapidly than rare sequences
• The rate at which DNA reanneals is a function of the species’ genome characteristics (size and complexity)
• The bigger the genome, the longer it takes for two complementary sequences to meet
• Repetitive DNA renatures at low C0t values
• Unique DNA renatures at high C0t values
• Eukaryotic genomes have a range of sequences with different repetition levels
• 1) Single-copy DNA (some functional genes)
• 2) Middle repetitive DNA (100-5000 bp, up to 10^6 copies), e.g. transposons
• 3) Highly repetitive DNA (up to 10 bp, > 10^6 copies), i.e. tandem repeats
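
The link between repetition and C0t can be sketched with the ideal second-order renaturation equation, C/C0 = 1 / (1 + k x C0t), which is not stated on this card but is the standard model behind C0t curves. The rate constants below are hypothetical values chosen only to show the trend; real values depend on the genome and the experimental conditions.

```python
def fraction_single_stranded(c0t, k):
    """Fraction of DNA still single-stranded at a given C0t.

    Ideal second-order renaturation: C/C0 = 1 / (1 + k * C0t).
    """
    return 1.0 / (1.0 + k * c0t)


def c0t_half(k):
    """C0t at which half of the DNA has renatured (C/C0 = 0.5)."""
    return 1.0 / k


# Hypothetical rate constants (illustrative only, not measured values):
# more copies of a sequence -> larger k -> renaturation at lower C0t.
rate_constants = {
    "highly repetitive": 1e3,
    "middle repetitive": 1e0,
    "single copy": 1e-3,
}

for name, k in rate_constants.items():
    print(f"{name}: C0t(1/2) = {c0t_half(k):g}")
```

In this sketch the highly repetitive class reaches half-renaturation at the lowest C0t and the single-copy class at the highest, which is the ordering described above.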

6
Q

What’s in a genome?

A

• About 2% of eukaryotic DNA is coding (encodes proteins)
• 25-50% of the protein-coding genes in eukaryotes are represented only once in the haploid genome
• But even these have non-coding DNA associated with them

7
Q

Pseudogenes

A

• Pseudogenes: once-functional genes (can become functional again)
• Evolutionary relics
• Under some definitions, genes for non-coding RNAs (e.g. tRNAs) count as non-coding but functional sequences (a very protein-centric definition)
• Regulatory sequences such as enhancers (transcription factor binding sites) are also non-coding but functional

8
Q

Multigene families

A

• Groups of identical or very similar sequences
• Can be tandemly arrayed (head-to-tail fashion)
• Examples include the tRNA genes (at approx 50 sites, each containing 10-100 genes) and the histone genes in some species

9
Q

Dispersed multigene family

A

• Some genes have not been tandemly repeated but have become dispersed to several locations in the genome through chromosomal rearrangements
• They may have different functions
• E.g. the aldolase gene family has 5 members
• They are located on chromosomes 3, 9, 10, 16 and 17

10
Q

Transposons

A

• Some repeat sequences are transposable elements (TEs), which presumably have increased in copy number through transposition
• TEs are found in all organisms and are transposed via a DNA or RNA intermediate
• Some “retrotransposons” resemble retroviruses but only move within a cell rather than between cells
• Genome-wide repeats: several thousand copies per element
• LTR retroelements are important in some genomes (e.g. maize)
• (Degraded in others: endogenous retroviruses make up 4.7% of the human genome)
• Not all types of RNA transposons have LTR elements
• In mammals the most important are LINEs (long interspersed nuclear elements) and SINEs (short interspersed nuclear elements)
• SINEs: highest copy number in the human genome
• 1.7 million copies (14% of genome)
• LINEs: less frequent but longer
• Approx 1 million copies (>20% of genome)
• DNA transposons are less common than retrotransposons
• The human genome contains 350,000 copies but most are inactive

11
Q

Why do we need DNA condensation?

A

• All life on earth uses DNA to store genetic information
• The DNA is often much longer than the cell it has to fit into
• Contour length: the length of the DNA assuming a B-form double helix
• Result: the genome needs to be divided up (e.g. into linear double-helical molecules: chromosomes)
• All eukaryotes have at least 2 chromosomes
• Variability in chromosome number is unrelated to an organism’s biological features and genome size

12
Q

Why do we need DNA condensation in humans?

A

• The 23 human chromosomes each contain from 50 to 250 Mbp
• DNA molecules of this size are 1.7 to 8.4 cm long when uncoiled
• A typical human cell contains 46 chromosomes, equal to 6x10^9 bp
• The cell nucleus has a diameter of 10-20 µm
• If chromosomes were not condensed it would be impossible to replicate and transcribe them correctly or segregate them to daughter cells
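
The lengths quoted above can be checked with a rough contour-length calculation. This is a minimal sketch that assumes a rise of 0.34 nm per base pair for B-form DNA (a standard figure, not given on this card); the function name is illustrative.

```python
RISE_PER_BP_NM = 0.34  # assumed rise per base pair of B-form DNA, in nm


def contour_length_cm(base_pairs):
    """Contour length in cm of a fully extended B-form DNA molecule."""
    return base_pairs * RISE_PER_BP_NM * 1e-7  # 1 nm = 1e-7 cm


smallest = contour_length_cm(50e6)    # a 50 Mbp chromosome
largest = contour_length_cm(250e6)    # a 250 Mbp chromosome
whole_cell = contour_length_cm(6e9)   # all 46 chromosomes of a diploid cell

print(f"50 Mbp: {smallest:.1f} cm, 250 Mbp: {largest:.1f} cm")
print(f"Diploid cell: {whole_cell / 100:.2f} m of DNA")
```

Under this assumption 50 Mbp comes to about 1.7 cm and 250 Mbp to about 8.5 cm, close to the range quoted on the card, and the whole diploid genome to roughly 2 m, which has to fit in a nucleus 10-20 µm across.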

13
Q

What is the nucleosome?

A

• DNA is wrapped around histone octamers to form nucleosomes
• Each nucleosome consists of a little less than 2 turns of DNA wrapped around a set of 8 proteins called a histone octamer
• The octamer contains two copies each of H2A, H2B, H3 and H4 (102-135 aa)
• Histones are highly conserved
• H4 from pea and cow differ by only 2 aa
• Histone proteins form a barrel-shaped core octamer:
• H3-H4 dimers form
• Two H3-H4 dimers form a tetramer
• The tetramer interacts with two H2A-H2B dimers
• The octamer interacts with 146 bp of DNA
• The nucleosome core particle (NCP) is 11nm in diameter and 6nm in height
• Histones contact the minor groove, leaving the major groove available for proteins that regulate gene expression
• Histone H1 (the linker histone) locks the complex in place, with 20-90 bp of linker DNA between nucleosomes
• The chromatosome is the DNA + octamer complex plus histone H1
• Nucleosome distribution varies between organisms and chromosomal loci
• DNA binding is sequence dependent
• The octamer can migrate (aids polymerase access etc.)
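
As a rough illustration of the scale involved, the figures on this card can be combined to estimate how many nucleosomes a human cell might contain. This is a back-of-envelope sketch: it reads the 20-90 bp figure as linker DNA between nucleosomes, assumes evenly spaced nucleosomes over the whole genome, and takes the 6x10^9 bp diploid genome size as given.

```python
CORE_DNA_BP = 146           # DNA in contact with one octamer (from the card)
LINKER_RANGE_BP = (20, 90)  # assumed to be the linker DNA range
DIPLOID_GENOME_BP = 6e9     # diploid human genome size (assumed input)


def nucleosome_count(genome_bp, linker_bp):
    """Estimate nucleosome number from the repeat length (core + linker)."""
    return genome_bp / (CORE_DNA_BP + linker_bp)


for linker in LINKER_RANGE_BP:
    n = nucleosome_count(DIPLOID_GENOME_BP, linker)
    print(f"linker {linker} bp: about {n / 1e6:.0f} million nucleosomes")
```

With these assumptions the estimate lands in the tens of millions of nucleosomes per cell; shorter linkers give more nucleosomes for the same amount of DNA.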

14
Q

30nm fibre/solenoid

A

• The polynucleosome is thought to be an infrequent structure
• The chromatin condenses by zig-zag folding (the solenoid)
• Histone H1 stabilises this structure
• The histone H2A-H2B dimer and H4 are also involved
• Sequential NCPs are rotated approx 71 degrees
• Extent of compaction depends on coupling DNA around the fibre
• Responds to cell environment (pH, DNA binding proteins, etc.)

15
Q

Scaffold association

A

• A 30nm fibre of a typical human chromosome would be 1mm long, so more compaction is needed
• 30nm fibre organised as looped domains
• Protein scaffold made of histone H1 and other proteins (Sc1 and Sc2)
• Scaffold attachment points are AT-rich regions
• Radial arrangement of loops

16
Q

Condensed scaffold / giant supercoil

A

As loops are fixed at the base, the structure can generate coils and supercoils

Chromatids may consist of helically packed loops of 30nm fibre

17
Q

Types of interphase chromatin

A

• Euchromatin (open, transcriptionally active)
• Heterochromatin (condensed, less active), which is either:
• Facultative (can be changed to euchromatin)
• Constitutive (condensed throughout the cell cycle)

18
Q

How can chromosomes be recognised?

A

By the lengths of their chromatids, the position of the centromere and their staining pattern

19
Q

What are Arabidopsis centromeres?

A

• Arabidopsis centromeres span up to 1.2 Mb
• Contain 180 bp repeat sequences, genome-wide repeats and some genes
• Function as attachment points for kinetochores

20
Q

Telomeres

A

• Telomeres are the terminal regions of the chromosome
• Enable the cell’s machinery to distinguish between real ends and a double-stranded break
• Made up of a repeated motif (e.g. 5’-TTAGGG-3’ in humans and other vertebrates)
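
The repeated-motif structure can be illustrated by counting how many uninterrupted copies of the motif sit at the 3’ end of a sequence. This is a minimal sketch; the function name and the example sequence are made up for illustration.

```python
import re


def count_terminal_repeats(seq, motif="TTAGGG"):
    """Count uninterrupted copies of `motif` at the 3' end of `seq`."""
    match = re.search(f"(?:{re.escape(motif)})+$", seq.upper())
    return len(match.group(0)) // len(motif) if match else 0


# Hypothetical chromosome end: some unrelated sequence, then three repeats.
example_end = "ACGTCA" + "TTAGGG" * 3
print(count_terminal_repeats(example_end))
```

A different motif can be passed for species whose telomere repeat differs from TTAGGG.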

21
Q

Explain the nature of Eukaryotic RNA Processing

A

• Although there are similarities (RNA polymerisation and polymerase structure), eukaryotes display more pre-mRNA processing than prokaryotes

22
Q

RNA capping:

A

• RNA polymerase II contains a C-terminal domain (CTD)
• When phosphorylated it recruits the capping enzyme complex
• Capping modifies the 5’ end by adding a 7-methylguanosine, joined by a 5’-5’ triphosphate bridge
• Reactions:
• Guanylyl transferase attaches a guanosine from GTP to the 5’ end: the gamma phosphate of the 5’ nucleotide and the beta and gamma phosphates of the GTP are removed
• The new terminal guanosine is converted to 7-methylguanosine by attaching a methyl group to nitrogen 7 of the purine ring (by guanine methyltransferase, with S-adenosylmethionine as the methyl donor)

23
Q

Polyadenylation:

A

• Most eukaryotic mRNAs have defined 3’ ends terminating in approx 250 adenosines
• These are added by poly(A) polymerase and are not encoded in the DNA sequence
• Polyadenylation is a part of transcription termination
• It may influence mRNA stability
• The tail forms a complex with poly(A) binding protein (PABP), which prevents degradation

24
Q

Intron consensus sequences:

A

• Found in the GU-AG introns (the intron starts with GU and ends with AG)
• These sequences probably act as recognition sites for RNA-binding proteins
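
The GU-AG rule can be expressed as a simple check on the two ends of an intron. This is only a sketch of the two-nucleotide boundaries; real consensus sequences extend further and also include the branch site, and the function name and example sequences are made up.

```python
def is_gu_ag_intron(intron):
    """True if the intron begins with GU and ends with AG.

    Accepts RNA or DNA letters (T is treated as U).
    """
    seq = intron.upper().replace("T", "U")
    return len(seq) >= 4 and seq.startswith("GU") and seq.endswith("AG")


print(is_gu_ag_intron("GUAAGUCCUAACAG"))  # starts GU, ends AG
print(is_gu_ag_intron("AUAUCCUUAAC"))     # does not fit the rule
```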

25
Q

Splicing can be divided into 2 steps:

A

• Transesterification reactions:
• In the first transesterification reaction, the ester bond between the 5’ phosphorus of the intron and the 3’ oxygen of exon 1 is exchanged for an ester bond with the 2’ oxygen of the branch-site adenosine
• Begins to form a lariat structure
• In the second reaction, the ester bond between the 5’ phosphorus of exon 2 and the 3’ oxygen of the intron is exchanged for an ester bond with the 3’ oxygen of exon 1
• The intron is released as a lariat structure and the two exons have been spliced
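
The outcome of the two steps can be mimicked on a string. This is a toy sketch of the bookkeeping only (which pieces are cut and joined), not of the chemistry; the sequences, positions and function names are hypothetical.

```python
def first_transesterification(pre_mrna, five_prime_site):
    """Cut at the 5' splice site.

    Returns free exon 1 and the lariat intermediate (intron + exon 2).
    """
    return pre_mrna[:five_prime_site], pre_mrna[five_prime_site:]


def second_transesterification(exon1, lariat_intermediate, intron_length):
    """Cut at the 3' splice site and join exon 1 to exon 2.

    Returns the spliced RNA and the released intron.
    """
    intron = lariat_intermediate[:intron_length]
    exon2 = lariat_intermediate[intron_length:]
    return exon1 + exon2, intron


# Hypothetical pre-mRNA: exon 1, a GU...AG intron, exon 2.
exon1_seq, intron_seq, exon2_seq = "AAGG", "GUAAGUCUAACAG", "CCUU"
pre_mrna = exon1_seq + intron_seq + exon2_seq

exon1, intermediate = first_transesterification(pre_mrna, len(exon1_seq))
mature, intron = second_transesterification(exon1, intermediate, len(intron_seq))
print(mature, intron)
```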

26
Q

Problems with intron removal

A

• Chemically, intron removal is not difficult, but topologically it is
• There can be kilobases of distance between splice sites, and all sites show similarity to one another, so how does the cell know where splicing should occur?

27
Q

Spliceosome

A

• Spliceosomes carry out splicing
• snRNAs + proteins = snRNPs
• (Small nuclear ribonucleoproteins)
• snRNPs assemble on the pre-mRNA in a series of complexes, including the spliceosome
• Complex, highly regulated process
• Site discrimination depends on U1 and U2 snRNA base pairing
• Intron sequences are paradoxical
• They are a source of vulnerability for the cell
• A mutation in a splice site can lead to disease

28
Q

Alternative splicing

A

• In the 1980s it was found that some primary transcripts could be spliced in more than one way (alternative splicing)
• The differentially spliced transcripts may lead to proteins with differing destinations in the cell, and different catalytic or interactive properties
• E.g. Dscam (Drosophila)
• The final mRNA contains 24 exons, four of which (4, 6, 9, 17) are each chosen from an array of alternative exons
• If all possible splicing combinations are used, there are 38,016 possible mRNAs
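
The 38,016 figure is the product of the number of alternatives at each of the four variable positions. The array sizes used below (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17) are the commonly cited values for Drosophila Dscam and are not given on this card.

```python
from math import prod

# Number of alternative exons at each variable position (not from the card).
ALTERNATIVE_EXON_VARIANTS = {4: 12, 6: 48, 9: 33, 17: 2}

# One alternative is chosen independently at each position,
# so the total is the product of the array sizes.
combinations = prod(ALTERNATIVE_EXON_VARIANTS.values())
print(combinations)  # 38016
```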

29
Q

Group I introns:

A

• Found in pre-rRNA
• 2 transesterifications
• 1st induced by a free nucleoside/nucleotide (e.g. GTP)
• Attacks the 5’ splice site
• G transferred to the 5’ end of the intron
• 2nd involves the 3’-OH of the upstream exon and causes cleavage at the 3’ splice site
• Autocatalytic ribozyme

30
Q

Group II introns:

A

• Found in organelle genomes
• They self-splice in a test tube but use a splicing mechanism similar to that of GU-AG introns

31
Q

Purpose of introns

A

• Comparison between related genes in one organism, or between the same gene in different organisms, shows that intron sequences are poorly conserved
• Introns usually undergo rearrangements caused by transposable elements rather than point mutations
• Some introns are so big that they include complete genes
• Introns may also include regulatory sequences that control expression of that gene
• Most introns can probably be deleted without immediate major effect on that gene, i.e. there is no functional selection
• This makes introns useful playgrounds for genome evolution
• Enhance coding potential? Due to alternative splicing?

32
Q

Non functional repetitive sequences

A

• Approx 65% of the human genome comprises intergenic regions
• Unknown function
• Yeast has about 30%
• Some repetitive sequences (thousands of repeats in tandem) are associated with heterochromatin (so not transcriptionally active)
• Simple sequence DNA (aka microsatellites): repeat units <13 bp (clusters <150 bp)
• Scattered throughout genomes
• 3% of human genome
• Variable number tandem repeats (including minisatellites)
• Repeat units up to 25 bp in length (clusters up to 20 kb)