L1, Genome Organisation Flashcards
Human genome vs Mitochondrial Genome: Details
Human…
- Approx. 3 billion bps
- 23 pairs of linear chromosomes
Mit…
- 16,569 bps
- Circular DNA
Prokaryotes: Number of protein-coding genes, examples
- Mycoplasma genitalium (not free-living): 480
- E. coli (free-living): 4000
- S. cerevisiae (Brewer’s yeast; a single-celled eukaryote, listed here for comparison): 6000
Eukaryotes: Number of protein-coding genes, examples
- Often a lot of redundancy in mammals
- Arabidopsis thaliana: 15000
- Fruit flies: 13000
- Mice: 23000
- Human: 20000 (protein-coding sequence is approx. 1% of the human genome)
C-value paradox
- ‘Lack of correlation between biological complexity and the intuitively expected protein-coding genomic information or DNA content’
- C-value: the DNA complement (amount of DNA per haploid genome)
- Proportion of junk DNA found to be higher in salamander than human
- In salamander, total DNA is around 5x greater than humans
DNA Melt-Reassociation aka Reassociation Kinetics
- Technique for establishing broad types of DNA
- Able to separate into highly repeated, moderately repeated and unique fragments
- Measuring how much ssDNA remains and how much dsDNA has formed at given times
- More repetition = more rapid reassociation, easier to find a match
Demonstrate the Cot curve for DNA melt-reassociation, comment
- See slide 10
- Fraction reassociated plotted against Cot (initial DNA concentration x time allowed for reassociation), usually on a log scale
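The shape of the curve can be sketched with ideal second-order kinetics, a standard textbook model that is assumed here rather than taken from these notes: the fraction still single-stranded is 1 / (1 + k x Cot), so the fraction reassociated is 1 minus that. The rate constants k below are made-up illustrative values, chosen only to show that more repetitive DNA (larger k) reaches half-reassociation at a lower Cot.

```python
def fraction_reassociated(cot, k):
    """Fraction of DNA that is double-stranded at a given Cot value.

    Assumes ideal second-order kinetics: C/C0 = 1 / (1 + k * Cot),
    where C/C0 is the fraction still single-stranded.
    """
    return 1.0 - 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reassociated (1 / k)."""
    return 1.0 / k


# Hypothetical rate constants (illustrative only, not measured values).
components = {
    "highly repeated": 1000.0,
    "moderately repeated": 1.0,
    "unique": 0.001,
}

for name, k in components.items():
    print(f"{name}: half-reassociated at Cot = {cot_half(k):g}")
```

A real genome is a mixture of these classes, so its measured curve is a weighted sum of such components and shows several steps.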
Current Understanding: Broad classes of DNA sequences
- Single copy
- Gene families
- Tandem Gene Arrays
- Intermediate repeats (mostly transposable elements)
- Simple sequence repeat DNA
Single copy DNA: % of genome and exon content
- Makes up about 25% of genome
- Only 1% contained in exons
- Average gene 27kb with 9 exons
Functions of non-coding DNA
- Majority can be transcribed
- 22,219 non-coding genes
- Structural RNAs - rRNAs, tRNAs, snRNAs
- miRNAs - involved in gene regulation
- lncRNA: Target regulatory proteins, disease markers, possible causative agents in disease (e.g. BACE1)
Human Gene Families: What are they? Give 6 examples with no. members
Similar sequences:
- alpha-globins (4)
- beta-globins (5)
- actin (15)
- keratin type I (19)
- beta-tubulin (19)
- alpha-tubulin (10)
What is a pseudogene?
An inactive (non-functional) copy of a gene, often found within a gene cluster
TAGs (tandemly arrayed genes): What are they, proportion of genomes
- Gene clusters created by tandem duplications
- One gene is duplicated, the copy is next to the original
- Can encode large numbers of genes at a time
- 14-17% of the human, mouse and rat coding genomes
- Multiple copies allow a higher total transcriptional output
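As a rough illustration of "the copy is next to the original", the sketch below scans genes in chromosomal order and reports runs of neighbours from the same family. The gene names and family labels are placeholders, and treating only strictly adjacent genes as an array is a simplifying assumption (published TAG definitions may allow intervening genes).

```python
from itertools import groupby


def find_tandem_arrays(genes, min_size=2):
    """Return runs of adjacent genes that belong to the same family.

    `genes` is a list of (gene_name, family) tuples in chromosomal order.
    """
    arrays = []
    # groupby only merges consecutive items, which is exactly "adjacent".
    for family, run in groupby(genes, key=lambda gene: gene[1]):
        names = [name for name, _ in run]
        if len(names) >= min_size:
            arrays.append((family, names))
    return arrays


# Hypothetical gene order on one chromosome (placeholder names).
chromosome = [
    ("geneA", "kinase"),
    ("geneB1", "globin-like"),
    ("geneB2", "globin-like"),
    ("geneB3", "globin-like"),
    ("geneC", "transporter"),
]

print(find_tandem_arrays(chromosome))
```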
TAGs in the human embryo: Why are they particularly useful?
- Human embryo has 5-10 million ribosomes
- Embryonic cell number doubles within 24 hrs; a single rRNA gene may not be sufficient to meet rRNA demand, but tandem repeats of rRNA-encoding genes allow a higher output (many RNA polymerases transcribing simultaneously)
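The demand can be put into numbers with a back-of-envelope sketch. It assumes the 5-10 million ribosomes must all be made within one 24 hr doubling and that each ribosome needs one rRNA precursor transcript. The per-gene output of 1 transcript per second is a placeholder parameter, not a measured rate.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def transcripts_needed_per_second(ribosomes, doubling_time_s=SECONDS_PER_DAY):
    """rRNA transcripts required per second to make `ribosomes` in one doubling."""
    return ribosomes / doubling_time_s


def gene_copies_needed(ribosomes, per_gene_rate, doubling_time_s=SECONDS_PER_DAY):
    """Gene copies needed if each copy yields `per_gene_rate` transcripts per second."""
    demand = transcripts_needed_per_second(ribosomes, doubling_time_s)
    return math.ceil(demand / per_gene_rate)


for ribosomes in (5_000_000, 10_000_000):
    demand = transcripts_needed_per_second(ribosomes)
    # per_gene_rate=1.0 is a hypothetical placeholder value.
    copies = gene_copies_needed(ribosomes, per_gene_rate=1.0)
    print(f"{ribosomes:,} ribosomes: {demand:.0f} transcripts/s, {copies} gene copies")
```

Whatever the true per-gene rate, the required copy number scales inversely with it, which is why arrays of rRNA genes help.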
Transposable elements in the human genome: Proportion, MEs, LINEs
Class length, acronyms
- IM class length (see: Melt Curve Study)
- Make up around 30% of the human genome
- ME: Mobile element
- LINE: Long interspersed nuclear element
Outline the two key types of transposable element in eukaryotes (with examples)
By transposition route
Retrotransposons
Transpose via an RNA intermediate:
- Viral (retrovirus like e.g. Endogenous retroviruses or LINE-like e.g. LINE1, LINE2)
- Non-viral (e.g. SINEs, processed pseudogenes)
DNA-DNA transposable elements
Transpose directly from DNA to DNA. Similar to bacterial transposons
- Not active in the human genome
Importance of eukaryotic transposable elements
- Play an important role in genome evolution (see reading)
- Source of regulatory elements, sites of recombination
- Insertions can cause disease
Life-cycle of retroviruses:
- Enter host cell; RNA genome is reverse-transcribed into DNA
- Viral DNA integrated at essentially random sites by integrase
- Provirus transcribed by host machinery
- Co-proteins and more integrase thus produced
- Assembly of full genome of retrovirus progeny
Components of viral retrotransposons with functions:
- Gag (group-specific antigen): Binding to RNA
- Pol: Reverse transcriptase
- Env: Envelope protein
- Int: Integrase
What is HERV?
- Human Endogenous Retrovirus
- Generally, highly defective genomes
- See reading
LINE-1 (L1) element:
How many copies, length, extra features
- > 500 000 copies in human genome
- 1-6 kb in length
- Only 40-50 are active
- 2 open reading frames (ORF1, ORF2)
- No long terminal repeats (LTRs)
Timing and tissue specificity of L1 transposition
- Mostly repressed by methylation
- In tumours, demethylation increases transposition
- Many unique insertions take place in germ cells
- Also: Early embryos, neural progenitors during childhood -> negative impact on brain function if activity and therefore mobility are high in childhood
Characteristics of Non-viral elements (types and features)
SINEs; Short interspersed nuclear elements (13% of genome):
- Genomic copies of small RNAs
- Most belong to Alu family (derived from 7SL RNA) - identifiable feature of human genome
- Also copies of snRNAs and rRNAs
Processed pseudogenes:
- Genomic copies of mRNAs
Alu sequences: (stats, comparisons)
- 150-300bp
- 1 million copies, 10% of human genome
- Occur approx. every 3 kb on average (about 1 million copies across a genome of about 3 billion bp)
- Transcribed to give RNA
- Site of recombination (hotspots) can differ (particularly between humans and chimpanzees)
- Insertions have caused inherited disease
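A quick arithmetic check of the copy number and length figures above, assuming a haploid genome size of 3 billion bp (an assumption added here). The 10% figure corresponds to copies near the upper end of the 150-300 bp range.

```python
def genome_fraction(copies, element_length_bp, genome_size_bp):
    """Fraction of the genome occupied by `copies` elements of a given length."""
    return copies * element_length_bp / genome_size_bp


GENOME_SIZE_BP = 3_000_000_000  # assumed haploid human genome size
ALU_COPIES = 1_000_000          # copy number from the card above

for length_bp in (150, 300):
    fraction = genome_fraction(ALU_COPIES, length_bp, GENOME_SIZE_BP)
    print(f"{length_bp} bp per copy: {fraction:.0%} of genome")
```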
What are SVAs?
SINE-VNTR-Alu:
- Non-autonomous hominid specific retrotransposons
- Don’t exist in Old World monkeys
- Several subtypes
- Can be transcribed
- Mobilised by the LINE-1 (L1) retrotransposition machinery
- Associated with disease in humans (see examples)
STR stats:
- 5% of genome
- Repeat unit length of 1-6 bp
- Total array length up to 100 bp
- Length variations can affect gene expression in some hereditary diseases (e.g. Huntington's disease (HD), autism, schizophrenia)
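Array length is just the number of back-to-back copies of the repeat unit, which can be counted directly from sequence. The sketch below is a minimal, unoptimised counter. The example sequence is made up, and CAG is used because it is the repeat unit expanded in Huntington's disease.

```python
def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    n = len(unit)
    best = 0
    # Try every start position so that no reading frame is missed.
    for start in range(len(sequence) - n + 1):
        count = 0
        pos = start
        while sequence[pos:pos + n] == unit:
            count += 1
            pos += n
        best = max(best, count)
    return best


# Made-up sequence: flanking bases around five CAG repeats.
example = "TTC" + "CAG" * 5 + "CCG"
print(longest_tandem_run(example, "CAG"))
```

Comparing this count between individuals, or against a disease threshold, is the basis for treating STR length as a variable trait.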
Minisatellite/microsatellite DNA lengths/stats, uses
- Repeat unit length 15-100 bp (mini) and 2-5 bp (micro)
- Total array length 0.5-30 kbp (mini), 60-200 bp (micro)
- Array length is variable - VNTRs or STRs
- Can be used in paternity and forensic analysis and in gene mapping
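The paternity use rests on a child inheriting one allele (array length) per locus from each parent. Below is a simplified exclusion check under that assumption: a candidate parent should share at least one repeat count with the child at every locus typed in both. Locus names and repeat counts are hypothetical, and the statistical weighting and allowance for mutation used in real casework are not modelled.

```python
def consistent_with_parentage(child, alleged_parent):
    """True if the two profiles share at least one allele at every common locus.

    Each profile maps a locus name to a pair of repeat counts (the two alleles).
    """
    shared_loci = sorted(child.keys() & alleged_parent.keys())
    if not shared_loci:
        raise ValueError("no loci typed in both profiles")
    return all(
        set(child[locus]) & set(alleged_parent[locus]) for locus in shared_loci
    )


# Hypothetical STR profiles (repeat counts per allele).
child = {"locus_1": (12, 15), "locus_2": (7, 9)}
candidate_a = {"locus_1": (15, 18), "locus_2": (9, 9)}
candidate_b = {"locus_1": (11, 13), "locus_2": (7, 8)}

print(consistent_with_parentage(child, candidate_a))
print(consistent_with_parentage(child, candidate_b))
```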
+ Transposable elements make up …. of human DNA
Around 30%
+ SINEs and LINEs make up … and …. of the human DNA
- 13% (SINEs)
- 21% (LINEs)
+ HERVs are:
- Human endogenous retroviruses
- About 8% of human genome