L6 Positive selection and dN/dS Flashcards
dN/dS
- ratio of non-synonymous to synonymous substitution rates; a metric for selection acting on protein-coding sequence
- accounts for lineage effect and regional mutation bias
number of syn sites does not equal number of non-syn sites
roughly 2/3 of possible changes would be non-syn according to the codon table
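A rough check of this card, as a sketch: the code below enumerates every single-base change to each sense codon in the standard genetic code and counts how many leave the amino acid unchanged. The names (`CODON_TABLE`, `count_single_base_changes`) are illustrative, and changes to a stop codon are counted as non-synonymous here, which is a simplifying assumption. A direct count like this tends to come out nearer three quarters non-synonymous than 2/3, so the 2/3 figure is best treated as a rule of thumb.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_single_base_changes():
    """Count synonymous and non-synonymous single-base changes
    over all 61 sense codons (stop codons are skipped as starting points)."""
    syn = 0
    nonsyn = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn += 1
                else:
                    # Assumption: changes to a stop codon count as non-synonymous.
                    nonsyn += 1
    return syn, nonsyn


syn, nonsyn = count_single_base_changes()
print(f"synonymous: {syn}, non-synonymous: {nonsyn}, "
      f"fraction non-syn: {nonsyn / (syn + nonsyn):.2f}")
```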
dN
changes per non-synonymous site
dS
changes per synonymous site
dN vs dS
- dN usually less than dS
- synonymous sites are evolving faster
- synonymous site rates show less variance because there is less heterogeneity in selective constraint
dN < dS
purifying selection
dN = dS
neutrally evolving
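The purifying and neutral cards above correspond to ranges of the dN/dS ratio; the third case, dN > dS, is conventionally read as positive selection. A minimal sketch of that reading follows. The function name and the `tolerance` band around 1 are assumptions for illustration, not a statistical test.

```python
def interpret_dn_ds(dn, ds, tolerance=0.1):
    """Give the usual textbook reading of a dN/dS ratio.

    tolerance is a hypothetical band around 1 treated as "neutral";
    real analyses use a likelihood-ratio or similar test instead.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega < 1 - tolerance:
        return "purifying selection"
    if omega > 1 + tolerance:
        return "positive selection"
    return "neutrally evolving"


for dn, ds in [(0.02, 0.2), (0.2, 0.2), (0.5, 0.2)]:
    print(dn, ds, interpret_dn_ds(dn, ds))
```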
dN/dS calculations
see OneNote
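This is not the OneNote working, just one common way to do the calculation, sketched under assumptions: count the observed differences and the number of sites in each class, take the proportions, and apply the Jukes-Cantor correction for multiple hits. The counts in the example are made up for illustration.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple hits."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from difference and site counts."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)  # changes per non-syn site
    ds = jukes_cantor(syn_diffs / syn_sites)        # changes per syn site
    return dn, ds, dn / ds


# Hypothetical counts: 10 non-syn differences over 700 non-syn sites,
# 20 syn differences over 230 syn sites.
dn, ds, omega = dn_ds(10, 700, 20, 230)
print(f"dN = {dn:.4f}, dS = {ds:.4f}, dN/dS = {omega:.2f}")
```

Note that the site counts differ between the two classes, which is why raw difference counts cannot be compared directly.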
Paralogs
genes whose lineages diverged at a gene duplication event
Orthologs
genes whose lineages diverged at a speciation event
Molecular mechanisms of gene duplication
- unequal crossing over at meiosis
- retrotransposition: mRNA is reverse transcribed and the DNA copy is inserted back into the genome
- genome duplication
How can you tell if the duplication arose from retrotransposition or unequal crossing over?
If retrotransposition:
- Presence of polyA tail
- Lack of introns
retrocopy usually considered a “processed pseudogene”
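The two signs above can be turned into a simple rule of thumb. The sketch below is illustrative only: the function names, the `min_run` length for a poly-A tract, and the "DNA-based duplication" label for the opposite case are all assumptions, and real annotation would need alignments to the parent gene.

```python
def has_polya_tail(flank_3prime, min_run=10):
    """True if the 3' flank contains a run of at least min_run A's.
    min_run is an arbitrary illustrative cutoff."""
    return "A" * min_run in flank_3prime.upper()


def likely_mechanism(parent_introns, copy_introns, copy_3prime_flank):
    """Guess the duplication mechanism from intron counts and the 3' flank."""
    polya = has_polya_tail(copy_3prime_flank)
    intronless_copy = parent_introns > 0 and copy_introns == 0
    if intronless_copy and polya:
        return "retrotransposition"
    if copy_introns == parent_introns and not polya:
        # Assumption: retained introns point to a DNA-based mechanism.
        return "DNA-based duplication (e.g. unequal crossing over)"
    return "ambiguous"


print(likely_mechanism(5, 0, "GATTACA" + "A" * 15))
print(likely_mechanism(5, 5, "GATTACAGGC"))
```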
Ohno’s Model of Gene Duplication
- inactivating mutation, e.g. nonsense mutation - the copy decays over time
- beneficial mutation - new function, selected for, increases in frequency, becomes fixed, new gene born = “neofunctionalization”
Fates of duplicate genes
- pseudogene
- new function
How do you tell whether the extra copy is a pseudogene?
Pseudogenes will accumulate inactivating mutations over time, e.g. frameshifts or nonsense mutations
If there is substantial divergence but no inactivation some time after the duplication, you can argue that the gene has a new function
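A minimal sketch of scanning a coding sequence for the inactivating changes named above. It assumes the input is a hypothetical CDS already in frame from the start codon and ending in its normal stop codon. A length that is not a multiple of 3 is only a crude hint of a frameshift, since real detection needs an alignment to the functional copy.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def inactivating_features(cds):
    """List crude signs of inactivation in an in-frame coding sequence."""
    cds = cds.upper()
    features = []
    if len(cds) % 3 != 0:
        features.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # The final codon is expected to be the normal stop, so skip it.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            features.append(f"premature stop codon {codon} at codon {index + 1}")
            break
    return features


print(inactivating_features("ATGGCCAAATAA"))
print(inactivating_features("ATGTAGGCCTAA"))
```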
In pseudogenes expect:
dN = dS
rate of change in exons = rate of change in introns
Jingwei gene
See OneNote
Phylogenetic branches
- if neofunctionalization = one copy has a longer branch, as more aa changes accumulate for the new function
- if subfunctionalization = branches are of equal length, copies equally diverged
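The branch-length comparison above can be sketched as a ratio of the two paralog branch lengths since the duplication. The function names and the `ratio_cutoff` value are assumptions for illustration; a real analysis would test rate asymmetry statistically, and the branch lengths below are invented.

```python
def branch_asymmetry(branch_a, branch_b):
    """Ratio of the longer to the shorter post-duplication branch."""
    shorter, longer = sorted((branch_a, branch_b))
    if shorter <= 0:
        raise ValueError("branch lengths must be positive")
    return longer / shorter


def suggest_fate(branch_a, branch_b, ratio_cutoff=2.0):
    """ratio_cutoff is an arbitrary illustrative threshold."""
    if branch_asymmetry(branch_a, branch_b) >= ratio_cutoff:
        return "consistent with neofunctionalization (one long branch)"
    return "consistent with subfunctionalization (similar branches)"


print(suggest_fate(0.30, 0.05))
print(suggest_fate(0.10, 0.11))
```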
Subfunctionalization
- specialization
- complementary degenerative mutations
- degenerative mutations increase the chance that both duplicates are maintained
How many protein coding genes are redundant?
- Genetic manipulation gives us some idea
- dN/dS also provides useful clues
Tandem arrays of whole genes
If the copies perform the same function, then any one copy is redundant
Tandem arrays of whole genes
Gene repeated over and over again because a lot of product is needed, e.g. housekeeping genes such as the ribosomal DNA (rDNA) locus
Concerted evolution
the repeated copies evolve together rather than independently, e.g. copies of the ribosomal DNA evolve in the same way
Why so many copies?
- more of the product is produced
- the copies perform the same function and any one copy is probably redundant