MT Host + Genomics Flashcards

1
Q

Intro to Genome Evolution

How do chromosomes evolve?

A
  1. Chromosome number can increase or decrease through fissions or fusions
  2. Chromosome length can vary through duplications
  3. Multiple sex chromosomes can be acquired through translocations
  4. Translocations, duplications, insertions and deletions can all affect the shape of a chromosome, making it more tightly or loosely packed
2
Q

Intro to Genome Evolution

What does the human genome contain, and how is it subdivided?

A

The human genome consists of a small mitochondrial genome and a large nuclear genome.
1. The nuclear genome is either transcribed or not transcribed into RNA; transcribed regions are either protein-coding genes (mRNA, 1%) or non-coding genes such as tRNA, rRNA, snRNA, snoRNA and siRNA (3%)
2. The non-transcribed genome can be structural or non-structural
3. Structural regions are either telomeres (1%) or centromeres (10%)
4. Non-structural regions are further split into unique or repetitive DNA
5. Unique DNA contains conserved non-coding elements and non-conserved non-coding elements
6. Repetitive DNA includes satellite DNA, retrotransposons, transposons, LINEs and SINEs

3
Q

Intro to Genome Evolution

Why do closely related species such as humans and chimps, and more distant ones such as humans (23) and Drosophila (4), have such different numbers of chromosomes?

A

This is a result of chromosome evolution. About 90% of the genome within chromosomes is non-coding, so changes in chromosome number are most likely to involve the ‘junk’ DNA.

4
Q

Intro to Genome Evolution

What are some questions which need to be answered about genome evolution?

A
  1. How do chromosomes evolve?
  2. What is the origin of introns?
  3. How does genome size evolve, and by what mechanisms?
  4. Where do new genes come from?
  5. Why is there so much junk DNA?
5
Q

Intro to Genome Evolution

What ways do new chromosomes evolve?

A
  1. Chromosomes can evolve through chromosome fusion/fission, which leads to a reduction or increase in the number of chromosomes (sex chromosomes + autosomal chromosomes)
  2. Chromosomes can evolve via translocation (evolution of the 2nd pseudoautosomal region in humans)
  3. Chromosomes can evolve via inversions and segmental duplications (chromosome shape evolution)
  4. Chromosomes can evolve via homologous recombination in male meiosis (sex chromosomes)
6
Q

Intro to Genome Evolution

What happens when chromosomes evolve via fusion/fission, what is 1 example?

A

Two chromosomes fuse together to become one, or one chromosome splits into two.
* An example of fusion is the origin of human chromosome 2:
* The chromosomes that remain separate in chimpanzees as 2a and 2b fused telomere-to-telomere to form human chromosome 2

7
Q

Intro to Genome Evolution

What happens when chromosomes evolve via translocation? What is an example in sex chromosomes?

A

Translocation is a genetic change in which a piece of one chromosome breaks off and attaches to another chromosome. Sometimes pieces from two different chromosomes trade places with each other.
1. The 2nd PAR (pseudoautosomal region) of the human X and Y chromosomes arose from a translocation from the X to the Y chromosome in the human lineage after its split from the chimpanzee lineage.
2. The PARs are the regions at the ends of the X and Y that pair and cross over (recombine) in male meiosis

8
Q

Intro to Genome Evolution

How can chromosomes evolve via inversions and segmental duplications, leading to chromosome shape evolution? Does this affect gene expression?

A
  1. Gene duplication doubles the number of copies of the duplicated genes (so the chromosome becomes longer)
  2. Inversion can cause silencing by changing how tightly the region is coiled, which changes chromatin structure
  3. This affects gene expression massively and can even create new genes or switch genes on and off by changing their epigenetic state
  4. Multiple inversions have occurred in humans, and around 13.7% of the genome is segmentally duplicated
9
Q

Intro to Genome Evolution

What is a good example of chromosome evolution via fusion in Muntjac Deer

A

Tandem chromosome fusions in the karyotypic evolution of muntjac deer
* Karyotype: the number and appearance of the chromosomes visible in the cell nuclei of a species
* Muntiacus gongshanensis (M. gongshanensis) has the lowest chromosome number for mammals (4)
* It was closely compared with its relative M. reevesi, which had 9 chromosomes, using in situ hybridisation probes for telomeres.
* M. gongshanensis showed evidence of chromosome fusion because its telomere sequences were found fused into the middle sections of the chromosomes

Source: Huang et al., 2006

10
Q

Intro to Genome Evolution

What is another example of how translocation and homologous recombination during male meiosis can lead to chromosome evolution?

A

Multiple sex chromosomes in the platypus: 5 X and 5 Y chromosomes form a chain during male meiosis through homologous recombination.
1. Formation of this chain is due to translocation and recombination between a sex chromosome and an autosome
2. During evolution, a translocation between the end of a Y chromosome and an autosome caused part of that Y chromosome (Y1) to become homologous with the autosome it translocated with.
3. The rest of the Y1 chromosome is still homologous to the original X1 chromosome.
4. The autosome it translocated with then becomes the X2 chromosome, and X2 is in turn homologous to another autosome, which becomes the neo-Y2. As this continues, the result is 5 X and 5 Y chromosomes, some of which were originally autosomes.
5. With each further round of translocation, the number of sex chromosomes increases as autosomes become neo-X/Y chromosomes, lengthening the chain of sex chromosomes formed during male meiosis
6. Surprisingly, this chain still manages to segregate correctly during meiosis.
7. Multiple sex chromosomes are also seen in the dioecious plant S. diclinis

11
Q

Intro to Genome Evolution

How did sex chromosomes evolve

A
  • Sex chromosomes evolved multiple times independently
  • Sex chromosomes in birds and mammals arose independently at ~170 and ~100 MYA.
  • These 2 independent origins are why mammals have male heterogamety (XY males) while birds have female heterogamety (ZW females)
  • Sex chromosomes evolve from a pair of autosomes that acquired a sex-determining gene and stopped recombining with each other. The non-recombining region then expands by inversions until almost the entire Y chromosome is non-recombining
  • The regions added by this stepwise expansion are called EVOLUTIONARY STRATA on sex chromosomes
  • Through inversions a genomic region becomes non-recombining; it then starts to accumulate deleterious mutations and gradually degenerates
  • Deleterious mutations accumulate because, without recombination, they cannot be separated from the rest of the chromosome, so natural selection cannot remove them efficiently
  • Therefore the Y chromosome is almost entirely degenerate in most organisms
  • This process is called Y chromosome degeneration
12
Q

Intro to Genome Evolution

What is the Y chromosome Evolutionary Strata and what causes it to be degenerate? What about Y chromosome gene loss?

A

Y chromosomes slowly become degenerate. This occurs at different rates, and studies have investigated how fast genes are lost or become degenerate once a region of the Y chromosome stops recombining.
One study did this in 8 mammalian species and showed that
* genes are lost very quickly (almost immediately on a long evolutionary timescale) once a region becomes non-recombining, and only a few indispensable, highly conserved genes remain functional on the Y chromosome
* The authors reconstructed the evolutionary dynamics of gene loss and showed that natural selection cannot work effectively in non-recombining regions.
* This results in gradual loss of genes from the Y and W
* The number of genes on the human Y chromosome has reached a baseline, with little further degeneration
* Y chromosome genes are unlikely to be lost altogether because the decline is non-linear: loss of the last remaining genes is very slow (they are more conserved)
* Ongoing Y-chromosome degeneration

Nature 2014

13
Q

Intro to Genome Evolution

Describe the example of vitamin C synthesis gene loss in animals

A
  • Loss of the ability to synthesise vitamin C, and of the related genes, occurred independently several times
  • Many birds and mammals cannot synthesise vitamin C
  • The inability to synthesise vitamin C is due to mutations in the L-gulono-lactone oxidase (GLO) gene, which encodes the enzyme that catalyses the last step of vitamin C biosynthesis.
  • It is thought that this gene is lost whenever sufficient vitamin C is present in food. This reflects a general tendency to lose genes that become unnecessary.

Current Genomics 2011

14
Q

Intro to Genome Evolution

Describe gene loss for loss of teeth in birds turtles and mammals

A
  • Genes are lost when they become unnecessary
  • A genome comparison study showed that mineralised teeth in birds were lost ~120 MYA
  • Mammals such as toothless whales lost their teeth through loss of the genes for making teeth

Meredith et al., 2004

15
Q

Intro to Genome Evolution

What often causes loss of genes

A

Genes are often lost when they become unnecessary
* teeth in some birds
* vit C synthesis in birds + mammals

16
Q

Intro to Genome Evolution

How does a study on Fungi genomics show that genes are being lost and gained all the time?

A

Across many related species of fungi, genes are constantly being lost, duplicated and so on;
even whole genome duplication occurs

17
Q

Intro to Genome Evolution

What are some mechanisms by which new genes originate?

A
  • Exon shuffling: ectopic (abnormal) recombination of exons and domains from distant genes
    e.g. jingwei
  • Gene duplication: the classic model of duplication with divergence
    e.g. CGβ, RNASE1B
  • Retroposition: new gene duplicates are created in new genomic positions by reverse transcription or other processes
    e.g. PGAM3
  • Gene fusion/fission: 2 adjacent genes fuse into a single gene, or a single gene splits into two genes
    e.g. fatty-acid synthesis enzymes
  • De novo origination: a coding region originates from a previously non-coding region
    e.g. AFGPs
18
Q

Intro to Genome Evolution

How do new proteins evolve?

A

Apart from new genes giving rise to new proteins:
* many proteins evolve by ‘borrowing’ domains from other proteins
* this is done by exon shuffling: protein domains (which correspond to certain exons) from other proteins can be shuffled in to add a domain or function to an existing protein
* this can reduce, change, or add function to the protein
* or this can be done by alternative splicing

19
Q

Intro to Genome Evolution

What is an example of exon shuffling

A

The jingwei gene in Drosophila

20
Q

Intro to Genome Evolution

What does exon shuffling change

A

it allows the evolution of new proteins
or proteins with new function/changes to protein function

21
Q

Intro to Genome Evolution

Describe exon shuffling in terms of the origin of the jingwei gene in Drosophila

Nature genetics 2003

A
  • The jingwei gene in Drosophila originated as a duplication of the ancestral gene ‘yande’
  • This was followed by exon shuffling, or retroposition, of the alcohol dehydrogenase (Adh) gene into the middle of yande, creating a gene fusion
  • The new chimeric gene consists of 3 exons from yande and an exon derived from the coding region of Adh
  • The new gene gained the function of Adh, acting as an alcohol dehydrogenase, but it works more effectively on longer-chain alcohol molecules
  • This is the creation of a new gene, ‘jingwei’, with sub-functionalisation
22
Q

Intro to Genome Evolution

What do you call when a gene gains a completely new function and a gene that only gains some new functions

A
  1. Neo-functionalisation: creation of a gene with completely different, new functions compared with the old gene
  2. Sub-functionalisation: creation of a gene with a similar function to the old gene
23
Q

Intro to Genome Evolution

how did the evolution of introns occur?

A
  • 2 main theories: Intron first and Intron Late
    1. Intron first: evolution of introns in RNA world; Introns are very ancient and are gradually lost (e.g. lost completely in bacteria)
    2. Intron late: introns evolved in the ancestor of eukaryotes; Introns evolved in early eukaryotes and keep spreading
  • Across the tree of life, introns are clearly lost and gained all the time, but which of the two theories is correct is still unclear
24
Q

Intro to Genome Evolution

How did alternative splicing occur?

A
  • major role in evolution of new gene functions/new proteins
  • existence of introns and genes being in ‘pieces’ of exons and introns allow alternative splicing of same gene into different proteins depending on mRNA splicing
  • The evolution of alternative splicing is thought to be a by-product of splicing noise
    1. Imperfect or incorrect splicing occasionally occurs
    2. If the resulting new combination of exons is advantageous, selection can make that splicing variant more likely to occur.
    3. Selection changes the relative abundances of the resulting proteins, for example through changes to promoters
25
Q

Intro to Genome Evolution

advantages of alternative splicing

A
  • major role in evolution of new gene functions/new proteins
  • existence of introns and genes being in ‘pieces’ of exons and introns allow alternative splicing of same gene into different proteins depending on mRNA splicing
  • doesn’t require completely new synthesis of an exon, one copy is sufficient
  • hence saves energy to copy exon
26
Q

Intro to Genome Evolution

what is the 2R hypothesis

A

2 rounds of whole genome duplication in animals, specifically in vertebrate ancestry
* hypothesis that vertebrates originated after 2 rounds of WGD

27
Q

Intro to Genome Evolution

why is gene duplication important in evolution

A
  • Evolution by gene duplication is a major source of new genes and evolutionary novelty
  • It involves duplication of individual genes and sometimes entire genomes (as in the fungi example)
  • WGD (whole genome duplication) in animals is thought to have driven the origin of vertebrates
  • WGD is much more common in plants and fungi
28
Q

Intro to Genome Evolution

What happens in gene duplication, and how does it affect protein/gene function?

A

* Once a gene is duplicated, some functional redundancy is created. This reduces purifying selection and allows the two copies to accumulate mutations and diverge in function.
* Often the duplicated copies perform very similar roles, but in slightly different ways, as was the case for the Adh and jingwei genes in Drosophila
* The two copies of a gene can specialise (and be optimised) to work in different tissues or in slightly different ways.

29
Q

Intro to Genome Evolution

what are possible outcomes of genes which have been duplicated

A
  1. sub-functionalisation
  2. neo-functionalisation
  3. specialised and optimised to work in different tissues or via different mechanisms
  4. can also be selected against and be removed
30
Q

Intro to Genome Evolution

What is an example of gene duplication leading to subfunctionalisation

A
  • Evolution of trichromatic colour vision in primates
  • The ancestral state is dichromatic, because the ancestors were nocturnal and colour vision was not necessary
  • Primates are diurnal (active in the day) and rely on colour vision to find ripe fruit, hence this trait is more useful
    * Trichromatic colour vision in primates evolved as a result of gene duplication: the L gene (for long wavelength) was duplicated and the resulting copies diverged a little, giving the L and M genes (M for medium wavelength).
  • After the duplication that created the L and M opsin genes, the duplicate copies diverged to acquire different spectral sensitivities
  • In humans, spectral coverage is relatively poor compared with bees and birds, whose sensitivities cover the entire visible light spectrum nearly evenly
  • Birds, which have better colour vision, have evolved high spectral sensitivity across all wavelengths and a new VS opsin gene

2010 paper

31
Q

Intro to Genome Evolution

What causes increases and decreases in genome size?

A
  1. gene duplication
  2. genome duplication
  3. spread of transposable elements

The three mechanisms above increase genome size. ONLY DELETION reduces genome size; the frequency and size of deletions determine the efficiency of genome downsizing

32
Q

Intro to Genome Evolution

Are there any links to genome size and number of proteins

A
  • Genome expansion causes genome sizes to vary over several orders of magnitude.
  • This does not mean that bigger genomes contain more genes or encode more complex organisms:
    * A larger genome DOES NOT EQUAL more genes, and more genes DOES NOT EQUAL a more complex organism
  • Evidence: the genomes of various flowering plants (which have relatively similar ‘design/function’ and complexity) vary over three orders of magnitude, suggesting that the size of the genome has little to do with organism complexity.
33
Q

Intro to Genome Evolution

What is an example showing that a larger genome does not mean a more complex organism?

A

Evidence: the genomes of various flowering plants (which have relatively similar ‘design/function’ and complexity) vary over three orders of magnitude, suggesting that the size of the genome has little to do with organism complexity.

  • The C-value paradox shows this as well: eukaryotic genome size is not linearly correlated with the number of genes, because of the existence of junk DNA
34
Q

Intro to Genome Evolution

what is the C-value paradox, define

A
  • The number of genes and the genome size show good correlation in viruses and prokaryotes, correlation is much weaker for eukaryotes – the so called “C-value paradox”.
35
Q

Intro to Genome Evolution

what is the simplified reason for C-value paradox

A
  • The reason for this is the abundance of non-coding DNA in eukaryotic genomes.
  • More DNA doesn’t mean more genes in eukaryotes due to existence of introns
36
Q

Intro to Genome Evolution

genome size varies example?

A

Genome size has changed rapidly across multiple lineages over the last ~200 million years. This shows how variable genome size is, even among closely related species, and suggests that changes in genome size can accompany large changes in function

Organ et al., 2007

37
Q

Intro to Genome Evolution

How did they estimate the genome sizes for dinosaurs

A

Genome sizes for extinct animals were estimated from the size of cells inside the bones: the larger the genome, the larger the cell. For dinosaurs, cell size was measured from the size of pores in the bones.

This revealed that the genomes of dinosaurs and mammals are relatively large, while bird genomes appear to have been downsized, possibly as an adaptation to a faster life cycle and flight.

Organ et al., 2007

38
Q

Intro to Genome Evolution

what might be an explanation for larger genome larger cell?

A

The size of the genome may have a structural role in maintaining cell size and cell shape in animals.
Cell size and genome size are large in dinosaurs and mammals, which are physically larger and have bigger cells, and both are smaller in birds (which are smaller)

39
Q

Intro to Genome Evolution

How do transposable elements affect the evolution of genome size?

A
  • Transposable elements (TEs) are a major component of most genomes:
  • > 50% of the human genome and > 70% of the wheat genome consists of various TEs
  • Activation of transposable elements (‘jumping genes’) can increase genome size very quickly, even doubling it
  • Activation of a single family of TEs can lead to a rapid increase in genome size, as was reported for S. latifolia and cotton
  • Example 1: Silene latifolia: the spread of a TE family (SIOgr1) ~5 MYA increased its genome size from 2Gb to 3.5Gb (comparing S. latifolia with its closest relative S. vulgaris)
  • Example 2: Gossypium (cotton) family: there is an almost three-fold difference in genome size between closely related species, from 880Mb in G. raimondii to 2460Mb in G. exiguum
  • This is because of the spread of the TE family Gorge3 (which increased from 61Mb in G. raimondii to 831Mb in G. exiguum)
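
The cotton figures above allow a quick back-of-the-envelope check of how much of the genome size difference a single TE family accounts for. The sketch below uses only the sizes quoted in this card; the function and constant names are made up for illustration.

```python
# Sizes in megabases (Mb), as quoted in the card above.
G_RAIMONDII_GENOME_MB = 880
G_EXIGUUM_GENOME_MB = 2460
GORGE3_IN_RAIMONDII_MB = 61
GORGE3_IN_EXIGUUM_MB = 831


def te_share_of_difference(genome_small, genome_large, te_small, te_large):
    """Fraction of the genome size difference explained by growth of one TE family."""
    genome_difference = genome_large - genome_small
    te_growth = te_large - te_small
    return te_growth / genome_difference


share = te_share_of_difference(
    G_RAIMONDII_GENOME_MB, G_EXIGUUM_GENOME_MB,
    GORGE3_IN_RAIMONDII_MB, GORGE3_IN_EXIGUUM_MB,
)
print(f"Gorge3 growth: {GORGE3_IN_EXIGUUM_MB - GORGE3_IN_RAIMONDII_MB} Mb")
print(f"Share of genome size difference: {share:.0%}")
```

On these numbers, Gorge3 grew by 770Mb, which is roughly half of the 1580Mb difference between the two genomes.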

Filatov et al., 2008; Hawkins et al., 2009

40
Q

Intro to Genome Evolution

What are 2 examples of how TEs (jumping genes) affect genome size?

A

Activation of TEs leads to a rapid increase in genome size in:
1. Example 1: Silene latifolia: the spread of a TE family (SIOgr1) ~5 MYA increased its genome size from 2Gb to 3.5Gb (comparing S. latifolia with its closest relative S. vulgaris)
2. Example 2: Gossypium (cotton) family: there is an almost three-fold difference in genome size between closely related species, from 880Mb in G. raimondii to 2460Mb in G. exiguum, because of the spread of the TE family Gorge3 (which increased from 61Mb in G. raimondii to 831Mb in G. exiguum)
41
Q

Intro to Genome Evolution

If deletions are the only way to downsize a genome, how is this done effectively?

A
  • **The frequency and size of deletions occurring in the genome determine how efficient genome downsizing is.**
  • A study analysed and compared the frequency and size of deletions occurring in the small genome of Drosophila and the very large genome of the cricket Laupala.
  • This showed that Drosophila has frequent deletions and many of them are relatively long (>16 nucleotides).
  • The cricket, on the other hand, had fewer deletions and most of them were very short.
  • The study demonstrated that the half-life of a piece of junk DNA in Drosophila is only 14 million years, while in the cricket it is over half a billion years
  • Effectively, junk DNA is never removed from the Laupala genome.
  • This is likely to be the reason why these species have such different genome sizes.
  • Genomes are in a constant process of gaining and losing DNA, and the rate and size of both gains and losses are variable
  • In a large genome, junk DNA is almost never removed
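
The half-life figures above can be turned into a rough picture of how much of a junk DNA sequence survives over time. The sketch below assumes simple exponential decay, which is an assumption of this example and not something the card states. The 500-million-year value for Laupala is only an illustrative lower bound for "over half a billion years", and the function name is hypothetical.

```python
def fraction_remaining(elapsed_my, half_life_my):
    """Fraction of a junk DNA sequence left after elapsed_my million years,
    assuming exponential decay with the given half-life (in million years)."""
    return 0.5 ** (elapsed_my / half_life_my)


DROSOPHILA_HALF_LIFE_MY = 14   # from the card above
LAUPALA_HALF_LIFE_MY = 500     # assumed lower bound for "over half a billion years"

for elapsed in (14, 50, 100):
    fly = fraction_remaining(elapsed, DROSOPHILA_HALF_LIFE_MY)
    cricket = fraction_remaining(elapsed, LAUPALA_HALF_LIFE_MY)
    print(f"{elapsed:>3} My: Drosophila {fly:.1%}, Laupala {cricket:.1%}")
```

Under these assumptions, a sequence that has largely decayed away in Drosophila after 100 million years is still mostly intact in the cricket, which matches the card's point that junk DNA is effectively never removed from the large genome.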
42
Q

Intro to Genome Evolution

why does it take so long to remove a junk DNA from a very large genome

A

Possibly because, in a very large genome, an individual piece of junk DNA is under too little negative selection for its removal to be favoured: with so much DNA to process, removing it may not be worthwhile from an evolutionary perspective. This explanation is uncertain.

43
Q

Intro to Genome Evolution

What are examples of extreme genome reduction?

A

Buchnera
* Buchnera is a mutualistic intracellular symbiont of aphids.
* Their association began about 200 million years ago, with host and symbiont lineages co-evolving in parallel since that time.
* During this co-evolutionary process, Buchnera has experienced a dramatic decrease in genome size (from ~4Mb to ~0.5Mb), retaining only the genes essential for its specialised lifestyle; the majority of its biochemical pathways have been removed
* This is an adaptation to life as an intracellular symbiont: it does not need processes for which it can rely on the host
* Genes are lost because selection is not keeping them intact
* There is strong pressure to lose non-essential genes because of the intracellular symbiotic lifestyle
Mitochondria
* Similarly, under the endosymbiosis theory, mitochondria originated as a symbiont
* Mitochondria have an extremely reduced genome only 16kb long, containing essential mitochondria-specific genes

44
Q

Intro to Genome Evolution

Genome comparison of buchnera for evidence of genome reduction

A
  • Comparison of the genomes of two Buchnera species that diverged 50 million years ago revealed very similar gene content
  • Buchnera underwent genome reduction a long time ago and little has changed since then
  • It is effectively in genomic stasis, with no signs of further genome reduction.
  • The genome is unlikely to be reduced further
  • This may be because the remaining genes are VERY ESSENTIAL to life, and further reduction would bring no more evolutionary benefit for its current lifestyle and environment
45
Q

Intro to Genome Evolution

what causes multiple sex chromosome evolution

A

Translocation and homologous recombination between autosomes and the original sex chromosomes produce multiple sex chromosomes.

46
Q

Intro to Genome Evolution

what caused the first evolution of sex chromosomes

A

A pair of autosomes acquired sex-determining genes and became non-recombining through accumulating mutations that prevent the two chromosomes from recombining

47
Q

Intro to Genome Evolution

how do proteins evolve

A

alternative splicing
exon shuffling

48
Q

impacts of gene duplication

A

divergence in functionality and results in sub functionalisation or neofunctionalisation

49
Q

Intro to Genome Evolution

What does a larger genome mean?

A

It DOES NOT mean a more complex organism or more proteins,
but it can be related to cell size and the life strategy of the organism

50
Q

Intro to Genome Evolution

what do Transposable elements do?

A

When activated, they can increase genome size very rapidly by creating lots of junk DNA

51
Q

Intro to Genome Evolution

what is genome size dependent on, and what varies genome size

A

Increases: duplication, TEs, etc.
Decreases: deletion. The deletion rate depends on the frequency of deletions, the size of the genome and selection pressure. Junk DNA is often harder to remove from larger genomes

52
Q

Intro to Genome Evolution

Extreme examples of genome reduction (brief)

A
  • Genome reduction in mitochondria and in symbionts like Buchnera are extreme examples: their genomes have been reduced to an extreme degree because of their intracellular, symbiotic lifestyle
53
Q

Lecture 2 (HG) intro to human evo genomics

What is the importance of population genetics in the study of human evolution?

A
  • Addresses questions about recent evolution
  • Where did humans originate?
  • When have humans spread across the world?
  • While spreading, have humans been adapting to a diverse set of environmental conditions?
  • Have humans interbred with other closely related species, such as Neanderthals?
54
Q

Lecture 2 (HG) intro to human evo genomics

how is MtDNA diversity used to research evolutionary genetics in human

A

MtDNA more useful than nuclear DNA because it is non-recombining in humans

mtDNA: High copy number per cell, small genome (16kb), High mutation rate, No recombination

easy and cheap to research/sequence

* Because mtDNA does not recombine, it can be used to build consistent phylogenetic trees within a species

55
Q

Lecture 2 (HG) intro to human evo genomics

what is the benefit of using MtDNA for studies?

A

MtDNA more useful than nuclear DNA because it is non-recombining in humans

shows the ‘maternal’ side of the story

mtDNA: High copy number per cell, small genome (16kb), High mutation rate, No recombination

easy and cheap to research/sequence

* Because mtDNA does not recombine, it can be used to build consistent phylogenetic trees within a species

56
Q

Lecture 2 (HG) intro to human evo genomics

results of MtDNA diversity study in human evolutionary genetics

A
  • African genetic diversity significantly higher than others
  • The study built a phylogenetic tree from different human populations and concluded that the root of the mtDNA phylogeny is in Africa
  • Consistent with the fact Africa has the biggest genetic diversity
  • Whole genome mtDNA comparisons gave same conclusion (using molecular clock)

LIMITATION: a selective sweep of a single strongly advantageous mutation would produce the same pattern, so the result does not necessarily reflect migration out of Africa

57
Q

Lecture 2 (HG) intro to human evo genomics

Use of Y-chromosome to study human evolution + migration

A
  • non-recombining
  • independent evidence about the human history as it is unlinked to mtDNA.
  • paternally inherited and allows us to look at the ‘male history’,
  • Y- based phylogeny is consistent with mtDNA. It also has a root in Africa and African lineages are most diverse
  • Supports previous hypothesis
58
Q

Lecture 2 (HG) intro to human evo genomics

Use of autosomal markers to investigate human evo gen + migration

A
  • However, using autosomes (nuclear genes) is a problem because they do recombine
  • This makes it difficult to build a phylogenetic tree
  • But Principal Component Analysis (PCA) can be used to analyse polymorphisms in autosomal DNA sequences.
  • Recombination makes the evolutionary histories of different genes independent, so each gene provides additional independent data
  • provided further support to the idea that Africa is the source of all modern humans
59
Q

Lecture 2 (HG) intro to human evo genomics

How do nuclear genes and mtDNA differ in the information they provide about human ancestry?

A
  • Timescale:
  • Nuclear genes – deeper phylogenies explore human ancestry
  • mtDNA would only give us the most recent common ancestor for humans, but nothing further due to the single mitochondrial lineage
60
Q

Lecture 2 (HG) intro to human evo genomics

How does selection impact polymorphism

A
  • A lower recombination rate means a stronger hitchhiking effect: alleles at genes neighbouring the selected gene are more likely to stay linked to it
  • Hence the spread and fixation of an adaptive allele results in loss of genetic variation around the target locus (the alleles around it are carried along with the advantageous allele)
  • The size of the region affected by a selective sweep depends on the recombination rate
  • e.g. on the Y chromosome, where no recombination occurs, the entire chromosome loses genetic variation after an advantageous (adaptive) allele is fixed
  • e.g. if recombination is frequent, only a short region around the advantageous allele loses variation once that allele is fixed
61
Q

Lecture 2 (HG) intro to human evo genomics

What is Fst and how does it work?

A

The fixation index (FST) is a measure of population differentiation due to genetic structure

FST is the proportion of the total genetic variance that is due to differences between subpopulations; it ranges from 0 to 1

A low FST means a lot of gene flow, interbreeding and connectivity between subpopulations, which keeps each one similar to the overall genetic variance
A high FST (above about 0.15) means that the subpopulations are substantially differentiated

62
Q

Lecture 2 (HG) intro to human evo genomics

what happens to DNA polymorphism after multiple selective sweeps

A

After the first selective sweep, genetic variation in the genes surrounding the advantageous allele decreases.
As new mutations accumulate, genetic variation in that region recovers (until the next sweep removes it again).

63
Q

Lecture 2 (HG) intro to human evo genomics

how can biased new selective sweeps be detected?

A

Using a statistic called Tajima’s D. After a sweep, ALL the new mutations in the region are at very low frequencies; this excess of rare variants is a bias in the frequency spectrum that the statistic detects (as a negative D).
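A minimal sketch of how Tajima’s D can be computed, assuming a list of equal-length, ungapped, aligned sequences. The function name `tajimas_d` and the choice to return 0.0 when there are no segregating sites are assumptions made for this illustration. The statistic compares the mean number of pairwise differences (pi) with the number of segregating sites (S); an excess of rare variants makes D negative.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, ungapped, aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    s = sum(1 for i in range(length) if len({q[i] for q in seqs}) > 1)
    if s == 0:
        return 0.0  # convention chosen here; D is undefined without variation
    # pi: mean number of differences over all pairs of sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima's formula, which depend only on sample size n
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Every variant is a singleton (rare): the pattern expected after a sweep
rare = ["AAAA", "CAAA", "ACAA", "AACA", "AAAC", "AAAA"]
# Every variant is at intermediate frequency
common = ["AAAA", "AAAA", "AAAA", "CCCC", "CCCC", "CCCC"]
print(tajimas_d(rare), tajimas_d(common))
```

The first sample gives a negative D and the second a positive D, matching the idea that an excess of low-frequency mutations is the signal of a recent sweep.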

64
Q

Lecture 2 (HG) intro to human evo genomics

What effect do two contrasting conditions have on alleles in populations?

A

Adaptation to contrasting conditions (e.g. high/low altitude) leads to spread and fixation of different locally adaptive alleles in the populations.
This is a typical footprint of local adaptation, and it can be used to identify genetic variants that are evolving under this type of selection pressure.

65
Q

Lecture 2 (HG) intro to human evo genomics

How can the differentiation of local subpopulations be measured in a quantifiable way?

A

Fst statistic = (Ht - Hs)/Ht
Ht = total heterozygosity across all populations pooled, and Hs = average heterozygosity within the subpopulations
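A small worked sketch of the formula, assuming one biallelic locus, equally sized subpopulations and expected heterozygosity 2p(1 - p). The function names `expected_heterozygosity` and `fst` are illustrative only.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2 * p * (1 - p)


def fst(freqs):
    """Fst = (Ht - Hs) / Ht from per-subpopulation allele frequencies.

    Assumes equally sized subpopulations, so simple averages are used.
    """
    p_bar = sum(freqs) / len(freqs)
    ht = expected_heterozygosity(p_bar)  # heterozygosity of the pooled population
    hs = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)  # mean within
    if ht == 0:
        return 0.0  # no variation anywhere, so no differentiation to measure
    return (ht - hs) / ht


print(fst([0.5, 0.5]))  # identical subpopulations
print(fst([0.2, 0.8]))  # moderately differentiated
print(fst([0.0, 1.0]))  # fixed for different alleles
```

Identical frequencies give Fst = 0, and subpopulations fixed for different alleles give Fst = 1, which are the two ends of the 0 to 1 range.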

66
Q

Lecture 2 (HG) intro to human evo genomics

How can we identify whether a local population has adapted to environmental changes?

A

Local adaptation can be identified through population differentiation: adapting to different environments leaves a ‘signature’, because populations in the two environments become genetically different at the adapted loci.

67
Q

Lecture 2 (HG) intro to human evo genomics

what is an example of human adaptation to environments

A

In a study of population differentiation between Han Chinese and Tibetans, EPAS1 stands out from all other genes with a very high Fst. This means EPAS1 shows the strongest signal of population differentiation and is a locally adapted gene.

The EPAS1 haplotype in Tibetans is very divergent from haplotypes in other human populations, and was shown to have been inherited from the Denisovan genome through early interbreeding. Even if early interbreeding was rare, the genetic diversity it provided can be highly advantageous and spread through natural selection.

68
Q

Lecture 2 (HG) intro to human evo genomics

what is the neutral theory?

A

It suggests that most evolutionary changes at the molecular level (such as changes in DNA or protein sequences) and genetic variation/diversity are not caused by natural selection acting on advantageous traits. Instead, these changes are the result of random genetic drift of mutant alleles that are neutral.

Rare beneficial mutations do occur, and a selective sweep then rapidly fixes them.

69
Q

Lecture 2 (HG) intro to human evo genomics

what is balancing selection

A

a type of natural selection where genetic diversity is maintained within a population. Unlike directional selection, which favors a single allele and can lead to a decrease in genetic diversity over time, balancing selection ensures that multiple alleles are preserved at a particular gene locus.

70
Q

Lecture 2 (HG) intro to human evo genomics

what are some mechanisms of balancing selection?

A
  1. Heterozygote Advantage (Overdominance): being a heterozygote (having two different allele copies) gives an advantage. For example, in Africa the sickle cell heterozygote (HbS/HbA) is favoured, as it is protected from malaria and does not have the severe disease seen in HbS/HbS homozygotes
  2. Frequency-Dependent Selection: the fitness of a phenotype depends on its frequency relative to other phenotypes in the population. There are two types: positive frequency-dependent selection, where the fitness of a phenotype increases with its frequency, and negative frequency-dependent selection, where the fitness of a phenotype decreases as it becomes more common. Example: coloration to avoid predation. If one coloration becomes common, predators learn to recognise it and prey more on that color, so the fitness of the phenotype decreases as it becomes common. This negative frequency dependence allows multiple color alleles to persist in the population
  3. Disruptive selection: not necessarily balancing selection, but it can contribute to maintaining multiple alleles
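
A rough sketch of why heterozygote advantage keeps both alleles, assuming a single locus with relative fitnesses AA = 1 - s, AS = 1 and SS = 1 - t in a large, randomly mating population. The values of s and t and the function names are made up for illustration. Under this model the S allele settles at frequency s / (s + t) whatever its starting frequency.

```python
def next_freq(q, s, t):
    """One generation of selection on q, the frequency of allele S.

    Assumed relative fitnesses: AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    # S alleles come from heterozygotes (half of 2pq) and SS homozygotes
    return (p * q + q * q * (1 - t)) / mean_fitness


def equilibrium(q0, s, t, generations=2000):
    """Iterate the recursion from starting frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_freq(q, s, t)
    return q


s, t = 0.1, 0.8  # hypothetical costs: malaria for AA, sickle cell disease for SS
print(equilibrium(0.01, s, t), equilibrium(0.90, s, t), s / (s + t))
```

Starting rare or starting common, the allele ends up at the same intermediate frequency, so neither allele is lost.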
71
Q

Lecture 2 (HG) intro to human evo genomics

how can the rate of selective sweep change depending on the type of adv mutation?

A
  1. all mutations are rare to start with
  2. if the advantageous mutation is dominant, the selective sweep may occur more quickly
  3. if the advantageous mutation is recessive, it takes much longer, because it is only exposed to selection once two copies meet in a homozygote
  4. if it is linked to another dominant beneficial allele, it can spread more quickly through hitch-hiking
  5. or it can spread more quickly through migration, genetic drift, the founder effect, etc.
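
A minimal deterministic sketch of points 2 and 3, assuming relative fitnesses AA = 1 + s, Aa = 1 + h*s and aa = 1, with h = 1 for a dominant and h = 0 for a recessive beneficial allele. The values s = 0.1 and starting frequency 0.01 are arbitrary, and drift, migration and linkage are ignored.

```python
def generations_to_reach(target, h, s=0.1, p0=0.01, max_gen=100_000):
    """Generations for a beneficial allele A to rise from p0 to `target`.

    h is the dominance coefficient: 1.0 = dominant, 0.0 = recessive.
    Returns None if the target is not reached within max_gen generations.
    """
    p = p0
    for gen in range(max_gen):
        if p >= target:
            return gen
        q = 1 - p
        mean_fitness = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
        # Frequency of A after selection
        p = (p * p * (1 + s) + p * q * (1 + h * s)) / mean_fitness
    return None


dominant = generations_to_reach(0.5, h=1.0)
recessive = generations_to_reach(0.5, h=0.0)
print(dominant, recessive)
```

While the allele is rare, almost all copies sit in heterozygotes, so a recessive allele is nearly invisible to selection and takes far longer to reach 50%.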
72
Q

Lecture 3 HG: Molecular Phylogenetics

what is phylogenetics

A

Reconstructing patterns of shared ancestry among organisms (within or between species), really about ancestry

73
Q

Lecture 3 HG: Molecular Phylogenetics

what is taxonomy

A

Taxonomy: describing, naming, identifying and classifying species, grouping organisms into groups

74
Q

Lecture 3 HG: Molecular Phylogenetics

what is phylogeny

A

the evolutionary history, ancestry and relationships between groups of organisms

75
Q

Lecture 3 HG: Molecular Phylogenetics

how do we now represent phylogenies

A

we combine phylogenetics and taxonomy together to create phylogenies.
* modern taxonomic classification uses phylogenetic techniques too
* phylogenetic trees: diagrams that show the evolutionary branching of a phylogeny

76
Q

Lecture 3 HG: Molecular Phylogenetics

What was the beginning of the modern phylogenetics era?

A

Molecular sequence data: using the sequence of cytochrome c to build phylogenetic trees by comparing mutations between species

77
Q

Lecture 3 HG: Molecular Phylogenetics

Definition of homology

A

similarity essentially.

The state of having the same or similar relation, relative position, or structure. Examples: "many proteins show homology across their whole length"; "a region of homology with another gene".

78
Q

Lecture 3 HG: Molecular Phylogenetics

Why is phylogenetics based on the principle of homology?

A

because when comparing organisms in terms of evolution, their characteristics can either be homologous or analogous.

Characteristics of organisms are homologous if they are similar and have descended from a common ancestor.

Characteristics are analogous if they are similar but have descended from different ancestors.

e.g. bird and bat wings are homologous when considered as forelimbs, but analogous as wings.

79
Q

Lecture 3 HG: Molecular Phylogenetics

what is phylogenetics based on, from the very fundamental level

A

homology, by comparing similarities and differences

80
Q

Lecture 3 HG: Molecular Phylogenetics

What is molecular phylogenetics

A

Using molecular sequences which contain information about evolutionary history to build phylogenetic trees.

This information is often hidden or fragmented in the DNA,
hence modern phylogenetics uses statistics and technology to try to recover and interpret information from DNA about phylogeny and evolutionary history

81
Q

Lecture 3 HG: Molecular Phylogenetics

What types of conclusions/descriptions do sequence comparisons give about evolutionary relationships?

A
  1. Homologous sequences: sequences that share a common ancestry and are therefore related. A very broad umbrella term for sequences related by descent from a common ancestral sequence.
  2. Orthologous sequences: found in different species, arising after a speciation event. They were inherited from the same ancestral sequence, and then diverged when the species split. Orthologous genes in different species have not undergone gene duplication and tend to retain a similar function
  3. Paralogous sequences: sequences related through a gene duplication event within a genome. The copies can evolve new functions (a new gene from an old gene).
82
Q

Lecture 3 HG: Molecular Phylogenetics

What are simple descriptions of homologous, orthologous and paralogous genes/sequences?

A

Homologous is the broad category indicating genetic relatedness due to common ancestry.
Orthologous genes diverge after a speciation event, leading to similar genes in different species.
Paralogous genes result from gene duplication within the same organism, potentially leading to genes with new or specialized functions.

83
Q

Lecture 3 HG: Molecular Phylogenetics

Why do we use molecular characteristics for phylogenetics instead of morphological ones?

A
  • Molecular characters have many advantages over morphological ones:
  • Very common
  • Objective, easy to quantify
  • Available when morphology is uninformative (micro-organisms)
  • Cheap, fast
  • Can be obtained without specialist training

However, there are many cases in phylogenetics where morphological and molecular data initially disagreed, and resolving them drove progress in the field

84
Q

Lecture 3 HG: Molecular Phylogenetics

What is the one significant disadvantage about molecular sequences

A

unavailable in extinct species or fossils

85
Q

Lecture 3 HG: Molecular Phylogenetics

How were the 3 domains classified?

A

The phylogenetic tree of Bacteria, Archaea and Eukarya was constructed using rRNA sequences: 16S rRNA and 18S rRNA

86
Q

Lecture 3 HG: Molecular Phylogenetics

what are examples of when molecular data and morphological data initially disagreed

A
  1. The placement of whales: morphology looks like that of other marine animals, but molecular data place them among mammals
  2. Fungi classification: morphology = plant-like, molecular = more closely related to animals
  3. Protists: grouped together because of ‘animal-like’ and ‘plant-like’ morphological characteristics, but molecular data revealed that the group includes paraphyletic lineages. Hence it is still very much a ‘dump’ classification
87
Q

Lecture 3 HG: Molecular Phylogenetics

what are the types of mutations

A
  1. transition mutations: purine to purine or pyrimidine to pyrimidine (A-G, C-T), quite common
  2. transversion mutations: purine to pyrimidine or vice versa, rarer (less frequent than transitions)
  3. silent/synonymous mutations: the encoded amino acid is unchanged; about 70% of changes at the 3rd codon position do not alter the amino acid sequence at all (redundancy of the genetic code)
  4. replacement/non-synonymous mutations: the encoded amino acid is changed (can be subject to selection pressure)
  5. insertion: addition of one or more nucleotides to a sequence
  6. deletion: removal of one or more nucleotides from a sequence
  7. indels can cause nonsense mutations (e.g. through frameshifts that create premature stop codons); replacements can cause missense mutations
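
A short sketch of how points 1 and 2 can be checked in code, assuming uppercase DNA sequences of equal length with no gaps. The names `classify` and `count_changes` are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Classify a single-base difference between nucleotides a and b."""
    if a == b:
        return "identical"
    both_purines = a in PURINES and b in PURINES
    both_pyrimidines = a in PYRIMIDINES and b in PYRIMIDINES
    # Same chemical class = transition; different class = transversion
    return "transition" if (both_purines or both_pyrimidines) else "transversion"


def count_changes(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        kind = classify(a, b)
        if kind != "identical":
            counts[kind] += 1
    return counts


print(count_changes("AACGT", "GACTA"))
```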
88
Q

Lecture 3 HG: Molecular Phylogenetics

give a brief overview of the process of constructing a phylogenetics tree

A
  1. First obtain molecular sequences
  2. using alignment methods, align the sequences correctly
  3. then use sequence evolution models to work out the genetic distance between the sequences
  4. then use phylogenetic methods to build an evolutionary tree where time scale = genetic distance
  5. then use molecular clock models to build an evolutionary tree where time scale = years
  6. then use either coalescent theory to infer population-level processes, or macroevolution models to infer species-level processes
89
Q

Lecture 3 HG: Molecular Phylogenetics

examples of population-level processes and species level processes

A

Population-level processes: changes to a population of a single species over time
* natural selection
* genetic drift
* gene flow
* mutation
* sexual selection
Species-level processes: changes that affect the emergence, evolution, and extinction of species.
* speciation
* extinction
* adaptive radiation
* coevolution
* hybridization

90
Q

Lecture 3 HG: Molecular Phylogenetics

What are some sequence alignment methods?

A
  1. BLAST can be used for finding general matches (it is a local alignment search tool)
  2. but for MSA, other algorithms such as Clustal and MUSCLE may be better
  3. Global alignment
91
Q

Lecture 3 HG: Molecular Phylogenetics

why is BLAST not

A
92
Q

Lecture 3 HG: Molecular Phylogenetics

Why do we need to align sequence

A

Because there are multiple ways to align and compare sequences. Depending on the alignment chosen, the interpretation of the sequences will differ, potentially giving incorrect evolutionary histories

93
Q

Lecture 3 HG: Molecular Phylogenetics

what is molecular sequence alignment

A
  • molecular sequence alignment is based on the concept of positional homology.
  • nucleotides or aa have positional homology if they exist at equivalent positions in their compared sequences
  • A set of nucleotide or amino acid sequences is converted into an alignment by proposing positional homologies for each site.
  • There are many possible ways to align and compare a sequence
94
Q

Lecture 3 HG: Molecular Phylogenetics

what are 2 methods of sequence alignment

A
  1. multiple sequence alignment (MSA): alignment of three or more sequences of similar length.
  2. Global alignment: a method used to align two sequences from beginning to end, maximizing the number of matches and minimizing the number of mismatches and gaps across the entire length of the sequences. It’s a type of pairwise sequence alignment.
95
Q

Lecture 3 HG: Molecular Phylogenetics

describe the Multiple Sequence alignment methods

A
  1. for multiple sequence comparisons
  2. helps identify conserved sequences across multiple organisms, which may be indicative of functional or structural importance.
  3. useful for understanding phylogenetic relationships, predicting the function of unknown proteins, and identifying conserved motifs.

Challenge:
as number of sequences increase, more computationally challenging to align

Example:
CLUSTAL and MUSCLE

96
Q

Lecture 3 HG: Molecular Phylogenetics

Global alignment

A
  • The goal is to find the best possible alignment that includes all characters from both sequences, which is particularly useful for comparing sequences of similar length and identifying overall similarities and differences.
  • The Needleman-Wunsch algorithm systematically compares all possible alignments (using dynamic programming) and selects the one with the highest score based on a scoring matrix.
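
A compact sketch of Needleman-Wunsch global alignment, assuming a very simple scoring scheme (match = +1, mismatch = -1, gap = -2) rather than a full scoring matrix. These scores and the function name are choices made for this illustration. The table cell `score[i][j]` holds the best score for aligning the first i characters of `a` with the first j characters of `b`.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row/column: aligning a prefix against nothing costs only gaps
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: each cell is the best of diagonal, up and left moves
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, row_a, row_b = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(row_a)
print(row_b)
```

Because every character of both sequences must appear in the result, the shorter sequence is padded with gaps, which is why global alignment suits sequences of similar length.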
97
Q

Lecture 3 HG: Molecular Phylogenetics

compare and contrast MSA and Global alignment methods, and pros and cons

A
  1. MSA is used to align three or more sequences and is essential for analyzing conserved regions across multiple sequences, while global alignment is designed for comprehensively aligning two sequences from start to finish.
  2. MSA is widely used in evolutionary studies, functional annotation, and identification of conserved motifs across multiple sequences. Global alignment is more suited for comparing two sequences in their entirety, such as when determining the overall similarity between two genes or proteins from different species.

MSA pros:
* useful in evolutionary studies of multiple organisms and their phylogenetic relationships
* identify conserved regions

MSA cons:
* Gap penalty ambiguity: how indels of different lengths are penalised affects how valid the alignment is, especially in low-similarity regions
* highly divergent sequences and sequences of varied lengths can be difficult to align
* computational limitations if there are too many sequences

GA pros:
* Optimality: algorithms such as Needleman-Wunsch systematically explore all possible alignments to find the optimal one.
* Complete Alignment: It aligns two sequences from beginning to end, useful for closely related sequences of similar length.
* Scoring System: The use of a scoring system (for matches, mismatches, and gaps) allows for quantifiable comparison of alignments, making it easier to assess the quality of the alignment.

GA cons:
* only pairwise, so less suited for evolutionary and phylogeny studies
* varying lengths with large indels are difficult to compare and align

98
Q

Lecture 3 HG: Molecular Phylogenetics

How do most alignment methods work?

A

by assigning a different “cost” to each type of sequence difference (transitions, transversions, insertions, deletions etc).
Algorithms then add up these costs, so each possible alignment has a total cost.
The alignment with the lowest total cost is chosen
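A toy sketch of the cost idea, assuming made-up costs (transition = 1, transversion = 2, gap = 3) and two candidate alignments written by hand. Real aligners search over alignments automatically; this only shows how one alignment is scored and how the cheaper one would be picked.

```python
PURINES = {"A", "G"}


def alignment_cost(row1, row2, transition=1, transversion=2, gap=3):
    """Total cost of one pairwise alignment given as two equal-length rows."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row1, row2):
        if a == b:
            continue  # identical column costs nothing
        if a == "-" or b == "-":
            total += gap  # insertion or deletion
        elif (a in PURINES) == (b in PURINES):
            total += transition  # both purines or both pyrimidines
        else:
            total += transversion
    return total


# Two candidate alignments of GATTACA against GATACA
candidates = [("GATTACA", "GAT-ACA"), ("GATTACA", "GATACA-")]
best = min(candidates, key=lambda rows: alignment_cost(*rows))
print(best, alignment_cost(*best))
```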

99
Q

Lecture 3 HG: Molecular Phylogenetics

compare and contrast clustal and MUSCLE

A

Clustal
* Clustal is an algorithm for MSA
* it first uses a scoring system for pairwise comparisons between sequences, then builds a guide tree from those comparisons and aligns progressively along it, making adjustments with pairwise and MSA considerations
* can align sequences of varying length and divergence (as long as they are not too diverged)

MUSCLE
* good for large datasets, with high speed and accuracy
* three-step model

100
Q

Lecture 3 HG: Molecular Phylogenetics

what is one limitation for sequence alignment algorithms

A
  1. if too diverged and low similarity, less accurate
  2. varying lengths can be difficult
  3. large dataset/too many sequences = computational complexity
  4. too large/too many indels can be difficult and lead to low accuracy
101
Q

Lecture 3 HG: Molecular Phylogenetics

Alignment to genetic distance: why do we need to measure genetic distance

A
  • to identify whether sequences have undergone convergent or divergent evolution, and how many substitutions occurred at each nucleotide/amino acid position
  • to measure the evolutionary process of sequences, not simply count their differences and mismatches
  • also need to account for multiple substitutions at the same site, e.g. A-C-A looking the same as an unchanged A-A, which the raw proportion of differences (p-distance) misses
102
Q

Lecture 3 HG: Molecular Phylogenetics

How do we estimate how many substitutions actually occurred when calculating genetic distances in phylogenetic tree construction?

A

When divergence is low, the observed number of changes is similar to the true genetic distance

When divergence is high, the observed number underestimates the true genetic distance.
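
One standard way to see this gap is the Jukes-Cantor correction, d = -3/4 ln(1 - 4p/3), which converts an observed proportion of differing sites p into an estimated number of substitutions per site. The sketch below assumes the simple Jukes-Cantor model (all substitutions equally likely); the function name is illustrative.

```python
from math import log


def jc_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion p."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the sequences look random and the formula is undefined
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * log(1 - 4 * p / 3)


for p in (0.01, 0.10, 0.30, 0.60):
    print(f"observed p = {p:.2f}   corrected d = {jc_distance(p):.3f}")
```

At low p the corrected distance is almost the same as the observed value, while at high p it is much larger, reflecting hidden multiple substitutions.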

103
Q

Lecture 3 HG: Molecular Phylogenetics

what is the multiple hits problem

A

The same position in the genome undergoes multiple mutations over time, so later changes hide earlier ones (including convergent changes back to the same base)

104
Q

Lecture 3 HG: Molecular Phylogenetics

what is p-distance

A

p-distance= Number of differing positions/ Total number of positions compared

proportion of differences, measure of genetic distance in evolutionary biology.

It quantifies the genetic difference between two sequences (DNA or protein sequences) by calculating the proportion of sites (nucleotide or amino acid positions) at which the sequences differ.

Limitations: The p-distance can underestimate the true evolutionary distance between sequences because it does not account for multiple substitutions at the same site (back mutations or parallel mutations).

For sequences that are highly divergent, use algorithms that account for multiple hits
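
A minimal sketch of the p-distance calculation, assuming two aligned sequences of equal length. Skipping columns that contain a gap ("-") is an assumption made here; other conventions exist.

```python
def p_distance(seq1, seq2):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq1, seq2):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are not compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


print(p_distance("ACGTACGT", "ACGTTCGA"))  # 2 differences over 8 sites
```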

105
Q

Lecture 3 HG: Molecular Phylogenetics

What is the nucleotide substitution model?

A
  • it is a mathematical model which aims to represent the processes of mutation and natural selection at the molecular level. It considers the probabilities of changes from one nucleotide (A, T, C, or G) to another across a phylogenetic tree.
  • describes the rate of nucleotide change from one to another over time
106
Q

Lecture 3 HG: Molecular Phylogenetics

what are nucleotide substitution models useful for?

A
  • Estimating Divergence Times: By calculating the rates of nucleotide substitutions, we can estimate how long ago two species or sequences diverged from a common ancestor.
  • Understanding Evolutionary Forces: These models can provide insights into the forces shaping genetic evolution, such as mutation rates, selection pressures, and genetic drift.
  • Evolutionary Inference: By understanding genetic distances, we can reconstruct evolutionary histories.
107
Q

Lecture 3 HG: Molecular Phylogenetics

What are some common models of nucleotide substitution models and how do they work?

A

* These models assign relative rates to different types of mutations
* Jukes-Cantor (JC) model: the simplest model, assuming that all substitutions occur at the same rate and each nucleotide has an equal chance of changing into any other nucleotide.
* K2P model: also quite simple, introduces 2 different rates, one for transitions and one for transversions
* HKY model: more advanced and more realistic. It takes unequal base frequencies into account and, like K2P, gives different rates to transitions (pyrimidine to pyrimidine, purine to purine) and transversions (purine to pyrimidine, or pyrimidine to purine).
* The General Time Reversible (GTR) model: the most general and complex model, allowing different rates for all possible changes between nucleotides and different equilibrium nucleotide frequencies. GTR encompasses the simpler models as special cases. It uses 6 different substitution rates, one for each pair of nucleotides.
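
As an illustration of how a model with separate transition and transversion rates turns an alignment into a distance, here is a sketch of the K2P distance, d = -1/2 ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the observed proportions of transitions and transversions. It assumes equal-length, ungapped, uppercase DNA sequences; the function name is illustrative.

```python
from math import log, sqrt

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    p = transitions / len(seq1)
    q = transversions / len(seq1)
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * log((1 - 2 * p - q) * sqrt(1 - 2 * q))


seq_a = "A" * 20
seq_b = "GCT" + "A" * 17  # one transition (A-G), two transversions (A-C, A-T)
print(k2p_distance(seq_a, seq_b))
```

When transversions are exactly twice as frequent as transitions, as in this example, K2P gives the same distance as the Jukes-Cantor formula, which shows how the simpler model is a special case of the richer one.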

108
Q

Lecture 3 HG: Molecular Phylogenetics

What should realistic nucleotide substitution models do?

A

Realistic models include the relative frequency of each nucleotide, i.e if lots of As, more likely to see A mutate than T mutate (looks at percentage mutation rather than abundance)

So far GTR is the most flexible and most complex, as it uses 6 different substitution rates and takes the frequencies of the bases into account.

109
Q

Lecture 3 HG: Molecular Phylogenetics

Compare and contrast the simple JC model with the more complex GTR model. When would you use each?

A

JC model
* Pros: easy to use; good if the dataset of sequences is limited, the base frequencies are largely similar and the substitution rates can be assumed to be roughly the same.
* Cons: over-simplifies the reality of mutations; in large datasets it is not very realistic and doesn’t reflect evolutionary history if the rates differ
GTR model
* Pros: reflects evolutionary histories more realistically; more flexible and complex. More adaptability to different substitution rates
* Cons: overfitting: may over-interpret noisy data or small datasets. Requires more computation

110
Q

Lecture 3 HG: Molecular Phylogenetics

what are Amino acid substitution models?

A
  • nucleotide models for DNA/RNA seqs, aa models for protein seq
  • These models are used to study the evolution of protein-coding genes by describing how amino acids change over evolutionary time.
  • calculating the probabilities of one amino acid being replaced by another in a protein sequence over time.
  • These models account for the fact that some amino acid changes occur more frequently than others due to factors like the physicochemical properties of the amino acids and the functional constraints of the protein. The models define rates of substitution for all possible pairs of the 20 amino acids.
111
Q

Lecture 3 HG: Molecular Phylogenetics

what can you infer from substitution models

A

Understanding these changes helps scientists infer protein function, evolutionary relationships, and the dynamics of molecular evolution.

112
Q

Lecture 3 HG: Molecular Phylogenetics

how do protein substitution models work?

A
  • These models account for the fact that some amino acid changes occur more frequently than others due to factors like the physicochemical properties of the amino acids and the functional constraints of the protein. The models define rates of substitution for all possible pairs of the 20 amino acids.
  • hence it is a 20x20 matrix, for all 400 possibilities
  • These rates are obtained from large surveys of protein variation (not from your particular data set).
  • Equilibrium Frequencies: Most models also consider the equilibrium frequencies of the amino acids, which represent the expected frequencies of each amino acid during evolution
113
Q

Lecture 3 HG: Molecular Phylogenetics

what are some common models of amino acid substitution models, and how do they work?

A
  • JTT Model: model uses a large database of known protein sequences to deduce rates. It adjusts the substitution rates and amino acid frequencies based on empirical data, making it useful for a wide range of evolutionary distances.
  • These rates are obtained from large surveys of protein variation (not from your particular data set).
114
Q

Lecture 3 HG: Molecular Phylogenetics

Nucleotide substitution models vs Amino acid substitution models, pros and cons

A

amino acid models
* Pros: capture the functional evolution of proteins; more suitable for studying highly diverged protein sequences, as they can capture more information when nucleotide saturation (multiple hits) occurs. Simpler analysis that looks at a larger scale rather than at small synonymous changes in nucleotides. Directly reflect selection pressures at the phenotypic/protein level
* Cons: Loss of Information: losing information about synonymous changes (which don’t alter the amino acid sequence) and potentially informative patterns in codon usage or RNA secondary structure.

Nucleotide models
* Pros: Detailed Evolutionary Insights, including synonymous mutations, which can be crucial for understanding selective pressures at the molecular level. Can construct higher resolution phylogenetic tree, when the protein models don’t produce detailed enough evolutionary differences
* Cons: Saturation Issues: For highly divergent sequences, saturation—where multiple hits occur—can make nucleotide models less effective over long evolutionary timescales. Only for short time scale

115
Q

Lecture 3 HG: Molecular Phylogenetics

when to use amino acid model and when to use nucleotide model?

A

amino acid model:
* when focus is on protein-coding genes/phenotype selections that have experienced significant evolutionary divergence,
* when interested in functional aspects of protein evolution,
* or when nucleotide sequences are so divergent that saturation obscures their evolutionary history.

nucleotide models
* when analyzing closely related sequences, (identifying phylogenetic relationships between closely related species)
* where synonymous changes are informative, or when studying non-coding DNA regions.

ultimately depends on whether interested in coding/non-coding, and how specific the researched evolutionary distance is, and how long/large the divergence/evolutionary history may be

116
Q

Lecture 3 HG: Molecular Phylogenetics

what are the major biological assumptions and limitations of using substitution models

A
  • assumptions lead to errors when assumptions aren’t true
  • Evolution homogeneity: These models often assume that substitution rates are homogeneous/same across the entire sequence being studied. However, different regions of a gene or protein may evolve at different rates due to functional constraints or varying levels of selective pressure.
  • Independence: Another assumption is that substitutions at different sites occur independently of one another. In reality, the evolutionary process can be influenced by interactions between sites (epistasis), where the effect of a mutation at one site depends on the sequence at another site. Especially this assumption doesn’t consider 3D shapes and structures of proteins
  • Saturation in nucleotide: multiple hits problem, can’t infer long evolutionary history or hugely divergent sequences
117
Q

Lecture 3 HG: Molecular Phylogenetics

How can the assumption of evolutionary homogeneity in substitution models be relaxed?

A

Use models of among-site rate heterogeneity (usually the gamma model), which allow different sites to evolve at different rates

It incorporates the realistic scenario that not all parts of a sequence evolve at the same rate. Some regions might be highly conserved due to functional constraints, while others might evolve more rapidly.

118
Q

Lecture 3 HG: Molecular Phylogenetics

how does the gamma distribution model for among site variation work?

A

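It provides a distribution of evolutionary rates across sites. This is done by adding one extra parameter, alpha (the shape of the gamma distribution), which controls how strongly rates vary among sites: a small alpha means most sites evolve slowly while a few evolve very fast, and a large alpha means all sites evolve at nearly the same rate.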
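
A minimal sketch of how the shape parameter changes the spread of site rates. It assumes the common convention that the gamma distribution is scaled to a mean rate of 1 (shape = alpha, scale = 1/alpha); the function names and the numbers of sites are hypothetical.

```python
import random

def sample_site_rates(alpha, n_sites, seed=0):
    """Draw one relative rate per site from a gamma distribution with mean 1."""
    rng = random.Random(seed)
    # random.gammavariate(shape, scale): shape = alpha, scale = 1/alpha gives mean 1
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

for alpha in (0.5, 5.0):
    rates = sample_site_rates(alpha, 10000)
    slow = sum(r < 0.1 for r in rates) / len(rates)
    print(f"alpha={alpha}: mean={mean(rates):.2f} variance={variance(rates):.2f} "
          f"fraction of nearly invariant sites={slow:.2f}")
```

With a small alpha many sites are almost invariant and a few are very fast; with a large alpha the rates cluster around 1.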

119
Q

Lecture 3 HG: Molecular Phylogenetics

What are some phylogenetic methods which allow us to turn genetic distances into an evolutionary tree (time scale = genetic distance)?

A
120
Q

Lecture 3 HG: Molecular Phylogenetics

what does constructing phylogenetics tree with genetic distance tell us?

A

helps understand relationships among different species or genes
and also when their divergences occurred in terms of genetic distance
construct new branches and nodes on phylogenetic tree based on genetic distances/genetic changes

121
Q

Lecture 3 HG: Molecular Phylogenetics

what are 2 ways to construct phylogenetic tree

A
  1. rooted tree: has evolutionary direction, and horizontal lines represent genetic distance
  2. unrooted tree: all lines represent genetic distance, and there is no evolutionary direction
122
Q

Lecture 3 HG: Molecular Phylogenetics

what are phylogenetic methods

A

techniques used in evolutionary biology to infer the evolutionary relationships and history among groups of organisms, genes, or other units of biological interest.

These methods aim to reconstruct the “phylogeny” or evolutionary tree that represents hypotheses about the ancestral relationships and divergence events that have led to the current diversity of life.

They require other molecular information (like genetic distances) as a basis

123
Q

Lecture 3 HG: Molecular Phylogenetics

Examples of phylogenetics methods

A
  1. UPGMA
  2. Neighbour Joining
  3. Maximum Parsimony
  4. Maximum Likelihood
  5. Bayesian Inference
124
Q

Lecture 3 HG: Molecular Phylogenetics

what data do phylogenetic methods/models require before they can generate a tree?

A
  • Different for different models.
  • UPGMA and Neighbour-Joining rely on genetic distances, and hence require a sequence alignment from which a matrix of pairwise genetic distances is calculated; this matrix is the input to the algorithm
  • others like Maximum Likelihood, Maximum Parsimony and Bayesian Inference only require aligned sequences, but other parameters have to be set when running the algorithm
  • some programs can run multiple steps together, from alignment to genetic distance to generating a tree. But choosing the most suitable model before constructing the tree is important
125
Q

Lecture 3 HG: Molecular Phylogenetics

What additional steps should you take and be aware of when making a phylogenetic tree

A
  1. choosing right model based on sequences/dataset. i.e how divergent sequence is, how large data set is
  2. changing the parameters when using the models
  3. adding additional time estimates
  4. making sure prior alignment + genetic distance estimates are accurate
  5. adding in bootstrap value
126
Q

Lecture 3 HG: Molecular Phylogenetics

what does a phylogenetic tree actually interpret?

A
  1. being able to look at divergences and homology of species
  2. evolutionary history and relatedness of species
127
Q

Lecture 3 HG: Molecular Phylogenetics

What ‘types’ of phylogenetic methods are there, how do you classify them?

A
  • they can be classified in several ways

Algorithmic/distance-based methods
* These methods begin with a genetic distance for each pair of sequences. A ‘clustering algorithm’ then transforms the genetic distances into a tree.
* e.g. UPGMA, Neighbour-Joining (NJ)

Optimality methods
* These methods define some kind of score for each possible tree. An optimisation algorithm then searches for the tree with the best score (the most optimal tree).
* e.g. Maximum Parsimony (MP), Maximum Likelihood (ML), Bayesian Inference

Statistical methods
* These methods calculate a probability for each possible tree. They frame phylogeny estimation as a formal statistical problem.
* e.g. Maximum Likelihood, Bayesian Inference

128
Q

Lecture 3 HG: Molecular Phylogenetics

what is molecular clock

A
  • molecular clock is not a separate method for constructing phylogenetic trees by itself, but rather a concept or parameter that can be integrated into various phylogenetic methods.
  • hypothesis: genetic mutations accumulate at a roughly constant rate over time in a given genomic region. In other words, the amount of genetic change (mutations) that occurs is proportional to the time that has passed.
129
Q

Lecture 3 HG: Molecular Phylogenetics

Types of molecular clock?

A

* Strict Molecular Clock: Assumes the same rate of mutation accumulation for all lineages being studied. This is a simpler but often less realistic assumption.

* Relaxed Molecular Clock: Allows for different rates of mutation accumulation in different lineages. This is more complex but often more accurate, as it accounts for the fact that different species or genes might evolve at different rates.

* Local Clock: Rate varies, but is inherited, so adjacent branches have more similar rates

130
Q

Lecture 3 HG: Molecular Phylogenetics

what is the application of molecular clocks in phylogenetics study/constructing phylogenetics tree

A
  • used to estimate the time of divergence between different species or lineages based on their genetic differences, hence constructing tree
  • a parameter that can change genetic distances to time estimates in tree
  • Because of this constant rate, the molecular clock can be used to estimate the time of divergence between different species or lineages.
  • By comparing the genetic differences between two species and knowing the rate at which mutations accumulate, scientists can estimate how long ago their common ancestor lived.
131
Q

Lecture 3 HG: Molecular Phylogenetics

How to integrate molecular clock into phylogenetics in applications?

A
  • Rate Measurement: if a certain gene accumulates one mutation every million years on average along each lineage, and two species differ by ten mutations in that gene, they likely diverged about five million years ago, because both lineages have been accumulating changes since the split. The number of differences is obtained by measuring genetic distances
  • Calibration: To use a molecular clock effectively, it often needs to be calibrated with independent data, such as fossil records, geological events, or known evolutionary events. These calibrations provide reference points to determine the mutation rate. For example, using a known divergence time between two species
132
Q

Lecture 3 HG: Molecular Phylogenetics

what are some limitations of molecular clock?

A
  • Rate Variation: Not all genes or regions of DNA evolve at the same constant rate. Some may evolve more quickly or slowly, affecting the accuracy of the clock.
  • Calibration Accuracy: The accuracy of a molecular clock depends heavily on the accuracy of the calibration points used.
  • Evolutionary Pressures: Natural selection and other evolutionary forces can affect substitution rates, complicating the assumption of constant rates.
133
Q

Lecture 3 HG: Molecular Phylogenetics

Molecular clock integration with Phylogenetic Methods:

A

Distance-Based Methods (e.g., UPGMA, Neighbor Joining):
* molecular clock can be an underlying assumption.
* For instance, UPGMA assumes a strict molecular clock (constant rate of evolution across all lineages), which can be a limitation.
* Neighbor Joining does not assume a strict molecular clock, making it more flexible and broadly applicable.

Maximum Likelihood (ML) and Bayesian Inference (BI):
* incorporate the molecular clock as an optional parameter in their models.
* can use “relaxed” molecular clock models, which allow for variation in the rate of mutation accumulation among different lineages. This is more realistic for many datasets.
* In Bayesian analysis particularly, the molecular clock can be integrated with prior information to estimate divergence times along with the phylogenetic relationships.

134
Q

Lecture 3 HG: Molecular Phylogenetics

what are comparative methods?

A
  • comparative methods involve comparing various biological traits (like anatomical, physiological, or molecular traits) across different species or groups.
  • These methods are used to understand the evolutionary relationships and processes that have shaped these traits.
  • Comparative methods can be used to test hypotheses about evolutionary processes, like adaptation, co-evolution, or the impact of environmental factors on evolution.
  • used to investigate evolutionary processes after tree is constructed: for example determining divergent or convergent evolution by analysing tree
135
Q

Lecture 3 HG: Molecular Phylogenetics

What is UPGMA, how does it work, and its limitations

A
  • distance based/algorithmic methods
  • hence requires genetic distance in matrix and accurate alignment in advance
  • The distances measure how different the sequences are, which is assumed to reflect evolutionary time.
  • a strict molecular clock is assumed, i.e. a constant rate of evolution across all lineages (same rate of mutation)
  • constructs a phylogenetic tree by clustering taxa based on their pairwise distances. It begins with the pair of taxa separated by the smallest distance and builds up the tree by sequentially joining the next-closest clusters, averaging distances as clusters merge.

Limitations:
* cannot compare for ‘best’ tree as this method only forms one tree.
* Assumes a constant rate of evolution, which is often unrealistic.
* Not as accurate for datasets where evolutionary rates vary.
* not accurate for highly divergent sequences

Pros:
* useful for quick analysis
* useful if the dataset fits assumption of molecular clock (where not much selection pressure is present, constant rate of mutation..which is unlikely)
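
A minimal sketch of the UPGMA clustering loop described on this card. The distance values are made up for illustration, and the function name and the bracketed string output are hypothetical conventions, not those of any particular program.

```python
def upgma(dist):
    """dist maps frozenset({x, y}) -> distance between taxa x and y.
    Returns (tree as a bracketed string, list of (left, right, node height))."""
    d = dict(dist)
    size = {}                      # cluster label -> number of taxa inside it
    for pair in d:
        for taxon in pair:
            size[taxon] = 1
    merges = []
    while len(size) > 1:
        pair = min(d, key=d.get)   # 1. find the closest pair of clusters
        a, b = sorted(pair)
        new = f"({a},{b})"
        # 2. strict clock: the new node sits at half the distance between them
        merges.append((a, b, d[pair] / 2))
        # 3. distance from the merged cluster to every other cluster is the
        #    average of the old distances, weighted by cluster size
        for other in size:
            if other in (a, b):
                continue
            d_a = d.pop(frozenset((a, other)))
            d_b = d.pop(frozenset((b, other)))
            d[frozenset((new, other))] = (size[a] * d_a + size[b] * d_b) / (size[a] + size[b])
        del d[pair]
        size[new] = size.pop(a) + size.pop(b)
    return next(iter(size)), merges

pairs = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 4,
         ("A", "D"): 8, ("B", "D"): 8, ("C", "D"): 8}
distances = {frozenset(p): v for p, v in pairs.items()}
tree, merges = upgma(distances)
print(tree)
for left, right, height in merges:
    print(f"join {left} + {right} at height {height}")
```

Placing each node at half the pairwise distance is exactly where the strict clock assumption enters: both descendants are assumed to have changed by the same amount.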

136
Q

Lecture 3 HG: Molecular Phylogenetics

What is Neighbour Joining, how does it work, and its limitations

A
  • distance based/algorithmic method
  • hence requires genetic distance + accurate alignment
  • NJ builds a tree by iteratively joining pairs of taxa (neighbours), choosing each pair using distances corrected for how far each taxon is from all the others, so it does not assume a constant rate of evolution
  • very similar to UPGMA but without assuming a strict molecular clock

Limitations:
* only produces one tree, so there is no alternative ‘more ideal’ tree to compare it with
* Accuracy can diminish with highly divergent sequences
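
A minimal sketch of the first step of Neighbour Joining, using the standard Q criterion, Q(i,j) = (n - 2) d(i,j) - (sum of distances from i) - (sum of distances from j). The five-taxon distance values are made up for illustration and the function name is hypothetical.

```python
from itertools import combinations

def nj_pick_pair(names, dist):
    """Return the pair with the smallest Q value, plus the whole Q table."""
    n = len(names)

    def d(x, y):
        return dist[frozenset((x, y))]

    # total distance from each taxon to all the others
    total = {x: sum(d(x, y) for y in names if y != x) for x in names}
    q = {(x, y): (n - 2) * d(x, y) - total[x] - total[y]
         for x, y in combinations(names, 2)}
    return min(q, key=q.get), q

names = ["a", "b", "c", "d", "e"]
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(p): v for p, v in pairs.items()}
pair, q = nj_pick_pair(names, dist)
print("first pair joined:", pair)
```

In this example d and e have the smallest raw distance, yet a and b are joined first. Correcting for each taxon's distance to everything else is what frees NJ from the strict clock assumption.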

137
Q

What is Maximum Parsimony, how does it work, and its limitations

A
  • Optimality based method
  • doesn’t require genetic distances (so no substitution model needed), only requires alignment
  • MP constructs a tree by minimizing the total number of evolutionary changes (like mutations) required to explain the observed data.
  • It compares all possible tree topologies and selects the one with the fewest changes.
  • uses a parsimony score: the minimum number of evolutionary changes/mutations required to explain the observed changes in sequences

Pros:
* fast
* doesn’t require substitution models
* Most useful when applied to morphological character data.

Limitations:
* Can be misled by convergent evolution (independent evolution of similar traits)
* poorly suited to fast-evolving or highly divergent sequences.
* does not specifically account for different evolutionary rates/models
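
A minimal sketch of how a parsimony score can be computed for one alignment site on one candidate tree, using Fitch's algorithm (a standard technique, not named on this card). The tree encoding as nested pairs and the example states are hypothetical.

```python
def fitch(tree, states):
    """tree: a leaf name (str) or a pair (left, right).
    states: dict leaf name -> observed character state at one site.
    Returns (possible ancestral states, minimum number of changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # children agree on at least one state: no extra change needed here
        return shared, left_cost + right_cost
    # children disagree: one change is required on this part of the tree
    return left_set | right_set, left_cost + right_cost + 1

site = {"A": "A", "B": "A", "C": "G", "D": "G"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print("tree 1 score:", fitch(tree_1, site)[1])
print("tree 2 score:", fitch(tree_2, site)[1])
```

Summing this score over all sites gives the parsimony score of a tree, and the tree with the lowest total is preferred.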

138
Q

Lecture 3 HG: Molecular Phylogenetics

What is Maximum Likelihood, how does it work, and its limitations

A
  • optimality + statistical model
  • most commonly used
  • ML evaluates different tree topologies based on the probability of observing the given data under different evolutionary models.
  • evolutionary models = substitution models
  • Chooses the tree that maximizes the likelihood of the observed data given a particular model of sequence evolution (substitution model)
  • the highest likelihood = the best tree
  • the likelihood is calculated from the tree topology, the branch lengths of the tree (which represent genetic distances under the substitution/evolutionary model), and the rate parameters of the substitution model
  • can use a relaxed clock
  • Tree searching is used to find the topology with the highest likelihood

Pros:
* Statistically robust; uses explicit models of sequence evolution.
* Handles variable rates of evolution across lineages and sites well.
* High accuracy in tree estimation.
* sophisticated

Limitations:
* slow
* Computationally demanding, especially with large datasets.
* The choice of model can greatly influence the results.
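
A minimal sketch of the maximum likelihood idea on the smallest possible problem: estimating the distance between two aligned sequences. It assumes the Jukes-Cantor (JC69) model as an illustrative choice, and the counts of sites and differences are made up. The log-likelihood omits a constant that does not depend on the distance.

```python
import math

def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood (up to a constant) of seeing n_diff differing sites
    out of n_sites when the true distance is d substitutions per site."""
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))   # probability a site differs
    # under JC69 each of the 3 possible mismatches is equally likely
    return n_diff * math.log(p / 3.0) + (n_sites - n_diff) * math.log(1.0 - p)

def ml_distance(n_sites, n_diff, step=0.0001, max_d=3.0):
    """Try many candidate distances and keep the one with the highest likelihood."""
    grid = [i * step for i in range(1, int(max_d / step) + 1)]
    return max(grid, key=lambda d: jc69_log_likelihood(d, n_sites, n_diff))

estimate = ml_distance(n_sites=100, n_diff=10)
closed_form = -0.75 * math.log(1.0 - (4.0 / 3.0) * (10 / 100))
print(f"grid search: {estimate:.4f}  closed form: {closed_form:.4f}")
```

A full ML tree search does the same thing on a much larger scale, optimising every branch length for each candidate topology, which is why it is slow.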

139
Q

Lecture 3 HG: Molecular Phylogenetics

what is tree searching in Maximum likelihood?

A
  • A tree search is used to find the tree topology with the highest likelihood.
    Tree Searching
    1. Exhaustive Search:
  • Tries every possible tree. Only feasible with small numbers of taxa.
    2. Hill Climbing:
  • Searches through trees by iterative trial and error.
  • Start with one tree, try a slightly rearranged tree, and keep it if its likelihood is higher; repeat until no rearrangement improves the likelihood
  • Doesn’t check all possible trees and isn’t guaranteed to find the optimal one.
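
A minimal sketch of why exhaustive search only works for small numbers of taxa. It uses the standard count of unrooted bifurcating trees, (2n - 5)!!, which is not stated on this card; the function name is hypothetical.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted bifurcating topologies for n_taxa tips:
    3 * 5 * 7 * ... * (2n - 5)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(f"{n} taxa -> {count_unrooted_trees(n)} possible unrooted trees")
```

The count grows so fast that hill climbing, which only evaluates a small fraction of trees, is the practical choice for most datasets.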
140
Q

Lecture 3 HG: Molecular Phylogenetics

What is Bayesian Inference, how does it work, and its limitations

A
  • optimality and statistical model
  • Similar to ML in using probability models for evolutionary change,
  • BI incorporates prior knowledge and calculates the probability of a tree given the data. It provides a statistical framework for estimating the uncertainty in phylogenetic inferences.
  • Suitable for complex datasets where incorporating prior knowledge or hypothesis testing is important.
  • Ideal when you need to estimate the uncertainty in your phylogenetic inferences.

Pros:
* Incorporates prior knowledge and uncertainty into the analysis.
* Statistically rigorous, using explicit models like ML.
* Provides estimates of the probability of different phylogenetic trees.

Limitations:
* slow
* Even more computationally intensive than ML.
* The choice of priors and models can significantly affect the results.
* Requires careful interpretation
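
A minimal sketch of the Bayesian step itself: posterior probability is proportional to prior times likelihood. The three candidate trees, their log-likelihoods and the flat priors are all made-up values, and real programs sample tree space rather than listing every tree.

```python
import math

def posterior_probabilities(log_likelihoods, priors):
    """Combine a prior and a log-likelihood for each candidate tree
    into posterior probabilities that sum to 1."""
    # subtract the best log-likelihood first so exp() does not underflow
    best = max(log_likelihoods.values())
    weights = {tree: priors[tree] * math.exp(ll - best)
               for tree, ll in log_likelihoods.items()}
    total = sum(weights.values())
    return {tree: w / total for tree, w in weights.items()}

log_likelihoods = {"((A,B),C)": -1000.0, "((A,C),B)": -1001.0, "((B,C),A)": -1003.0}
flat_priors = {tree: 1.0 / 3.0 for tree in log_likelihoods}
posterior = posterior_probabilities(log_likelihoods, flat_priors)
for tree, prob in posterior.items():
    print(f"{tree}: posterior probability {prob:.3f}")
```

Unlike ML, the output is a probability for every tree, which is how BI reports uncertainty directly.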

141
Q

Lecture 3 HG: Molecular Phylogenetics

which phylogenetic model is suitable for what types of scenarios?

A
  • UPGMA: where the molecular clock assumption is valid. for quick, preliminary analyses.
  • NJ: Ideal for large datasets where computational speed is a concern. Useful when the molecular clock assumption is questionable.
  • MP: dealing with well-sampled data where convergent evolution is not a significant issue. Good for analyzing morphological data, not just molecular.
  • ML: for all types, Ideal for analyses where accuracy is more important than computational speed.
  • BI: for complex datasets where incorporating prior knowledge or hypothesis testing is important.
    Ideal when you need to estimate the uncertainty in your phylogenetic inferences. Good for testing evolutionary hypotheses
142
Q

Lecture 3 HG: Molecular Phylogenetics

what is phylogenetic uncertainty and why is it important in phylogenetic analysis

A
  • Most phylogenetic methods (UPGMA, NJ and ML, but not BI) provide a single estimate of the ‘true’ tree
  • A single tree on its own gives no measure of how uncertain that estimate is
  • Different parts of a tree (clusters) can be assessed individually for their reliability.
  • “Bootstrapping” is the most common technique. It involves resampling the original data with replacement to create a large number of pseudoreplicate data sets.
143
Q

Lecture 3 HG: Molecular Phylogenetics

What is bootstrapping, and how does it work

A
  • method in statistical analysis, especially used to measure uncertainty and confidence for branches in phylogenetics tree
  • uses resampling technique: repeatedly resampling the dataset with replacement. This means creating many new datasets (called bootstrap samples) by randomly selecting data points from the original dataset, allowing for the same point to be picked more than once.
    1. generating many 100s-1000s of bootstrap samples by sampling with replacement
    2. then generate a phylogenetic tree with each bootstrap sample. This process results in many trees, each slightly different depending on the particular alignment sites (columns) included in the bootstrap sample.
    3. The bootstrap value for a particular branch in the original tree is calculated by determining how often that branch appears in the bootstrap trees. It’s usually expressed as a percentage. For example, if a certain branch of the original tree appears in 75 out of 100 bootstrap trees, it would have a bootstrap value of 75%.
    4. Interpretation (rule of thumb): 75% is good, 95% is very robust, and anything below 50% is rejected
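
A minimal sketch of the resampling step, assuming the data points being resampled are the columns of a sequence alignment. The tiny alignment and the function names are hypothetical, and tree building for each pseudoreplicate is left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """alignment: dict name -> aligned sequence (all the same length).
    Returns a pseudoreplicate built by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

def bootstrap_support(n_trees_with_branch, n_replicates):
    """Percentage of bootstrap trees that contain a given branch."""
    return 100.0 * n_trees_with_branch / n_replicates

alignment = {"human": "ACGTACGTAC", "chimp": "ACGTACGTAT", "mouse": "ACCTAGGTTT"}
rng = random.Random(1)
replicate = bootstrap_alignment(alignment, rng)
for name, seq in replicate.items():
    print(name, seq)
print("support:", bootstrap_support(75, 100), "%")
```

Each pseudoreplicate keeps the same taxa and the same length, but some columns appear several times and others not at all.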
144
Q

Lecture 3 HG: Molecular Phylogenetics

Neutral theory

A

The neutral theory holds that most variation at the molecular level does not affect fitness and, therefore, the evolutionary fate of genetic variation is best explained by stochastic processes.

145
Q

Lecture 3 HG: Molecular Phylogenetics

why is the molecular clock used even though selection pressure can change rates of evolution

A
  • this is because morphology/phenotype changes is not equal to molecular changes
  • according to the neutralist approach: MOST mutations are neutral and hence won’t be affected by selection. Therefore, the molecular clock fits the majority of the genome, as most mutations accumulate at a fairly constant rate over time

Limitations
* however, not all parts of the genome mutate at the same rate (constant rate yes, but not the same; for example, animal mitochondrial genes tend to mutate much faster than nuclear genes)
* therefore can use relaxed clock to model this

146
Q

Lecture 3 HG: Molecular Phylogenetics

why is molecular clock useful in phylogenetics?

A
  • Molecules can estimate the date of common ancestors for which no fossils are known (filling in gaps in the fossil record).
  • Molecules can estimate divergence dates when there is no obvious morphological change (particularly important for micro-organisms).
147
Q

Lecture 3 HG: Molecular Phylogenetics

Should we use morphology or molecular data for phylogenetics?

A
  • use both!
  • it seems contradictory to say the molecular clock is constant when morphology changes so rapidly in evolution
  • this is because of the integrated explanation of neutralist and selectionist approach
  • neutralist: mutations are mostly neutral and molecular changes are mostly due to random genetic drift
  • selectionist: genetic variations such as mutations become prevalent and fixed in the population due to selection
  • both approaches integrated explain why morphological data changes so rapidly in evolution, whilst molecular mutations are more constant
  • the selectionist approach explains morphological data, stating that advantageous mutations impact morphology very quickly, whereas deleterious ones are eliminated
  • neutralist explains most mutations are neutral, and those that are advantageous will spread out, but these mutations are rarer than average mutation rates
148
Q

Lecture 3 HG: Molecular Phylogenetics

what are modern molecular clocks used for?

A
  1. To understand why some genes/gene regions/species evolve faster than others
  2. To estimate a timescale for phylogenies and evolutionary history
149
Q

Lecture 3 HG: Molecular Phylogenetics

why is the molecular clock not so accurate

A
  • Species’ genomes also evolve at different rates, depending on the size of the genome and the C-value paradox
  • different parts of genes and different proteins have different rates
  • maybe they are constant, but they are definitely not the SAME rate
150
Q

Lecture 3 HG: Molecular Phylogenetics

what is the difference between substitution vs mutation rates

A
  • The substitution/fixation rate is the rate at which sequences in different populations diverge through time.
  • The mutation rate is the rate at which individuals incorporate errors during replication.
    Mutation rate can affect substitution rate
  • The probability of fixation determines the difference between the two rates:
    k = Nμp = substitution rate (per generation)
  • μ = mutation rate (per individual per generation)
  • p = probability of fixation
  • N = population size (2N for diploids)
151
Q

Lecture 3 HG: Molecular Phylogenetics

What factors affect substitution rates, and which part of the equation does it affect?

A

affecting probability of fixation (p)
1. differences in population size
2. differences in selective pressure

affecting mutation rate per individual per generation (u)
1. differences in generation time
2. differences in metabolic rate
3. differences in efficiency of DNA repair

152
Q

Lecture 3 HG: Molecular Phylogenetics

How can differences in selective pressure affect the probability of fixation (p), and ultimately the substitution rate

A
  • variation in constraints for genes: different genes have different fixation rates, and more essential genes have lower fixation rates
  • Sites not under selection evolve fastest (e.g. pseudogenes, which are non-functioning genes arising from gene duplication etc., have higher substitution rates)
153
Q

Lecture 3 HG: Molecular Phylogenetics

what are the changes to the substitution rate equation, when different selective pressures occur

A

For neutral mutations
* (Ns=0) p = 1/N ,
* k = Nμp = μ

For strongly advantageous mutations
* (Ns>1) p ≈ 2s ,
* k = Nμp ≈ 2Nμs

For strongly disadvantageous mutations
* (Ns<-1) p ≈ 0 ,
* k = Nμp ≈ 0

s: The selection coefficient, which measures the effect of selection on a particular mutation. It represents the relative fitness advantage (or disadvantage) of a particular genotype compared to the standard genotype. A positive value of s indicates an advantageous mutation, and a negative value a disadvantageous one.

Ns: The product of the effective population size and the selection coefficient, an indication of the strength of selection relative to the size of the population. It is used to assess the importance of genetic drift versus selection.
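
A minimal sketch of k = Nμp in the three regimes above, using the card's own approximations for p. Treating every mutation with -1 ≤ Ns ≤ 1 as effectively neutral is a simplifying assumption made here, and the values of N, μ and s are made up.

```python
import math

def fixation_probability(N, s):
    """Approximate fixation probability of a new mutation."""
    Ns = N * s
    if Ns > 1:
        return 2 * s          # strongly advantageous
    if Ns < -1:
        return 0.0            # strongly disadvantageous
    return 1.0 / N            # assumption: treated as effectively neutral

def substitution_rate(N, mu, s):
    """k = N * mu * p, per generation."""
    return N * mu * fixation_probability(N, s)

N, mu = 10_000, 1e-8
for label, s in (("neutral", 0.0), ("advantageous", 0.01), ("disadvantageous", -0.01)):
    print(f"{label:16s} k = {substitution_rate(N, mu, s):.2e}")
print("neutral k equals mu:", math.isclose(substitution_rate(N, mu, 0.0), mu))
```

For neutral mutations N cancels out, so the substitution rate equals the mutation rate regardless of population size.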

154
Q

Lecture 3 HG: Molecular Phylogenetics

how do differences in population size affect fixation probability (p) and ultimately substitution rates

A
  • Fixation of mutations depends on the product Ns (size of population and selection coefficient)
  • Mutations are controlled by drift when -1 < Ns < 1.
  • When N is small, slightly deleterious mutations (which are very common) are controlled by drift and can occasionally become fixed.
  • When N is large, these deleterious mutations are controlled by negative selection and never get fixed.
  • Hence substitution rates can increase in smaller populations.
  • But organisms in small populations tend to have long generation times, which may cancel out this effect.
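
A minimal sketch of how population size changes the fate of a slightly deleterious mutation. It assumes Kimura's diffusion approximation in the form p = (1 - e^(-2s)) / (1 - e^(-2Ns)), with N counted as the number of gene copies so that the neutral case gives p = 1/N; this formula is not on the card, and the values of N and s are made up.

```python
import math

def kimura_fixation_probability(N, s):
    """Fixation probability of a single new mutation with selection coefficient s
    among N gene copies (diffusion approximation, assumed form)."""
    if s == 0:
        return 1.0 / N
    # expm1(x) = e^x - 1, used for accuracy when s is tiny
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

s = -1e-4   # slightly deleterious
for N in (100, 100_000):
    p = kimura_fixation_probability(N, s)
    relative = p / (1.0 / N)   # compared with a neutral mutation
    print(f"N={N:>7}: Ns={N * s:>6.2f}  p={p:.3e}  relative to neutral={relative:.3g}")
```

In the small population Ns is close to 0, so the mutation behaves almost neutrally and can drift to fixation; in the large population selection removes it.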
155
Q

Lecture 3 HG: Molecular Phylogenetics

how can differences in generation time affect mutation rate (μ) and ultimately affect substitution rates

A
  • μ = mutation rate (per individual per generation)
  • generation time (g) = time between germ line replications, (time between generation1 reproduce and generation 2 reproduce)
  • substitution rates per unit time are inversely related to generation time, i.e. shorter generation times = faster fixation rates
  • generation time is a particularly important factor for selectively neutral polymorphisms (e.g. silent sites, pseudogenes), where selection has no effect on the fixation probability (p)
  • example: substitution rates at silent sites for orangutan, gorilla and chimpanzee are 1.3x, 2.2x and 1.2x those in humans, which corresponds to their proportionally shorter generation times.
  • Not all generations are equal: variation in number of cell divisions (hence more opportunities for mutations per organismal generation).
  • In some species there may be more cell division events in the male germ line than the female, leading to faster Y chromosome evolution than in the X chromosome (e.g. in plants and animals).
156
Q

Lecture 3 HG: Molecular Phylogenetics

what is a simple summary of how fixation probability p can be changed and how it affects fixation rates

A
  • neutral: dependent on mutation rate
  • positive selection: increases fixation probability
  • negative selection: fixation probability close to 0

* small population: higher chance of deleterious fixation
* large population: no chance of deleterious fixation, because negative selection outweighs genetic drift
* However, organisms in small populations tend to have long generation times, which can cancel out the increase in substitution rate caused by drift fixing deleterious mutations

157
Q

Lecture 3 HG: Molecular Phylogenetics

how can differences in metabolic rates affect mutation rate (μ) and ultimately affect substitution rates

A

μ = mutation rate (per individual per generation)

  • Smaller bodied vertebrates tend to have higher substitution rates than larger bodied ones
  • Could be due to higher basal metabolic rates (BMR) in smaller species
  • This is because aerobic respiration produces more oxygen free radicals, which can generate mutations
  • Higher conc of oxygen radicals could also explain why mtDNA genomes tend to evolve faster than nuclear genome
  • Not certain of this yet because too many factors in this,
  • i.e body size and BMR also correlated with generation time, could just be correlation not causation
158
Q

Lecture 3 HG: Molecular Phylogenetics

how can differences in efficiency of DNA repair affect mutation rate (μ) and ultimately affect substitution rates

A

μ = mutation rate (per individual per generation)

  • RNA viruses and retroviruses have mutation rates many times higher than those of eukaryotes because they replicate using different polymerases.
  • Highly transcribed genes more efficiently repaired.
  • Sometimes mutation is deliberate and can be adaptive
  • i.e bacterial hypermutator strains or hypermutation during antibody maturation
159
Q

Lecture 3 HG: Molecular Phylogenetics

what is a simple summary of how mutation rate (μ = mutation rate per individual per generation) can be changed and how it affects fixation rates

A
  • generation time short = more opportunity for mutation/faster mutation rate = faster fixation
  • metabolic rate faster = more mutations = faster fixation
  • DNA repair more efficient = fewer mutations = slower fixation
160
Q

Lecture 3 HG: Molecular Phylogenetics

how do we calculate evolutionary rate from genetic distance?

A

Genetic distance = evolutionary rate x (2 x divergence time)

161
Q

Lecture 3 HG: Molecular Phylogenetics

why do we need to know evolutionary rate to be able to use the molecular clock/convert genetic distance into time in years

A

evolutionary rate essentially represents how many mutations per unit time. This allows us to be able to work out evolutionary history in terms of years, and give a timescale.

162
Q

Lecture 3 HG: Molecular Phylogenetics

how can we calculate evolutionary rate?

A

Genetic distance = evolutionary rate x (2 x divergence time)

We know how to obtain genetic distances. If we know at least one divergence time (T) then we can “calibrate” the timescale of the phylogeny using that time.
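
A minimal sketch of calibrating a clock with the formula above and then using it to date a second split. The distances and the calibration time are made-up numbers, and the function names are hypothetical.

```python
import math

def calibrate_rate(genetic_distance, divergence_time):
    """Rearranged formula: rate = distance / (2 * time).
    The factor 2 is there because both lineages change after the split."""
    return genetic_distance / (2.0 * divergence_time)

def estimate_divergence_time(genetic_distance, rate):
    """Rearranged formula: time = distance / (2 * rate)."""
    return genetic_distance / (2.0 * rate)

# Calibration pair: 0.10 substitutions per site, split dated at 5 million years
rate = calibrate_rate(0.10, 5.0)
# Undated pair: 0.04 substitutions per site
time = estimate_divergence_time(0.04, rate)
print(f"rate = {rate:.3f} substitutions per site per million years")
print(f"estimated divergence = {time:.1f} million years ago")
```

This only works if the calibrated rate also applies to the undated pair, which is the strict clock assumption.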

163
Q

Lecture 3 HG: Molecular Phylogenetics

what are the 4 ways to calibrate a phylogeny/calibrate molecular clocks

A
  1. Fossils
  2. Biogeography
  3. Co-evolution
  4. Measurable evolving populations
164
Q

Lecture 3 HG: Molecular Phylogenetics

what do the branch lengths mean in time-calibrated trees generated by maximum likelihood methods?

A
  • Maximum likelihood methods are used to estimate the phylogeny, with the addition that calibrated nodes are fixed to a timepoint.
  • Branch lengths are then in units of time, not genetic distance.
165
Q

Lecture 3 HG: Molecular Phylogenetics

Calibration using fossil dates

A
  • Point calibration: fix a node in time (due to fossil record) and use this to calibrate
  • Range calibration: when the exact date is uncertain, give the node a range of possible times instead
166
Q

Lecture 3 HG: Molecular Phylogenetics

Calibration using Biogeography

A
  • e.g. The volcanic origin of the Hawaiian islands has produced a chain of islands of increasing geological age.
  • Hence for example, fruit flies (Drosophila spp.) from the oldest islands form the deepest branch of the tree, and those from the younger islands are phylogenetically more derived.
  • For different genera, the same linear relationship is found between genetic distance and island age
167
Q

Lecture 3 HG: Molecular Phylogenetics

calibration using co-evolution

A
  • If coevolution between 2 species occurred, then timescale for the phylogeny of one can calibrate a phylogeny of the other.
168
Q

Lecture 3 HG: Molecular Phylogenetics

calibration using measurable evolving populations

A
  • Phylogenies can be calibrated using tips (rather than internal nodes) if sequences are from evolutionarily different points in time.
  • e.g. 1. Ancient DNA
  • Long periods of time (up to 750K years) between samples.
  • Samples are radiocarbon-dated
  • e.g. 2. Rapidly evolving pathogens
  • RNA viruses, small DNA viruses, and some bacteria
  • Evolve very quickly: e.g. 10^-3 to 10^-5 substitutions per site per year for RNA viruses
  • Allow the generation of ‘distance trees’ and the creation of a phylogenetic tree calibrated by time, using sequences sampled from a species/individual at different times
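
A minimal sketch of tip calibration by regressing genetic distance from the root against sampling date, a common approach for rapidly evolving pathogens that is not spelled out on this card. The sampling years and distances are made-up values, and the function name is hypothetical.

```python
import math

def estimate_rate_and_root_date(sampling_years, root_to_tip):
    """Least-squares line through (sampling year, distance from root).
    The slope is the substitution rate per site per year; the year where
    the line crosses zero distance estimates the date of the root."""
    n = len(sampling_years)
    mean_x = sum(sampling_years) / n
    mean_y = sum(root_to_tip) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sampling_years, root_to_tip))
    sxx = sum((x - mean_x) ** 2 for x in sampling_years)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, -intercept / slope

years = [2000, 2005, 2010, 2015]
distances = [0.010, 0.015, 0.020, 0.025]
rate, root_year = estimate_rate_and_root_date(years, distances)
print(f"rate = {rate:.1e} substitutions per site per year")
print(f"estimated date of common ancestor = {root_year:.0f}")
```

Later samples sit further from the root, and the steepness of that trend is the clock rate, so no fossil or internal node calibration is needed.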
169
Q

Lecture 3 HG: Molecular Phylogenetics

what is the equation for fixation rate

A
  • Substitution rate (per generation) = population size * mutation rate (PER INDIVIDUAL PER GENERATION) * probability of fixation
170
Q

Lecture 3 HG: Molecular Phylogenetics

difference between substitution rate and mutation rate

A
  • Substitution rate is different to mutation rate: substitution rate is the rate at which new mutations become FIXED in the population (not removed)
171
Q

Lecture 3 HG: Molecular Phylogenetics

what are molecular clocks used for

A
  • Molecular clocks are used to understand why some genes/gene regions/species evolve (fix substitutions) faster than others;
  • and form a timescale for phylogenetic trees
172
Q

MT HG 5/6: Population Genetics

what does population genetics focus on?

A

Genetic diversity, Evolutionary forces, Population structure

173
Q

MT HG 5/6: Population Genetics

what is the definition of population genetics

A

Population genetics is the study of the distributions and changes of allele frequency in a population, as the population is subject to the four main evolutionary processes: natural selection, genetic drift, mutation, and gene flow. It also takes into account factors like recombination and population structure.

  • seeks to understand diversity or variation, including both genotypic and phenotypic diversity
  • models diversity using mathematical models along with empirical data (molecular markers, phenotype, environmental)
174
Q

MT HG 5/6: Population Genetics

what does understanding pop gen allow us to understand more?

A
  • Population genetics ultimately underpins all phenomena in evolutionary biology and
  • is an essential tool in diverse areas of investigation and application
  • (e.g. conservation, breeding, ecology, phylogeny, evolutionary history, interactions)
175
Q

MT HG 5/6: Population Genetics

why is studying genetic diversity related to pop gen studies

A
  • The study of genetic diversity in biological populations (mostly within species) and of the processes that cause genetic diversity to change
  • Genetic diversity is synonymous with intra-specific diversity
  • Inter-specific diversity refers to differences between species, mostly involves other processes and models NOT IN POP GEN
176
Q

MT HG 5/6: Population Genetics

what is genetic diversity

A
  • the total number of genetic characteristics in the genetic makeup of a species.
  • It is the variability in the genes among individuals within a population, and among populations within a species.
  • This diversity is what enables populations to adapt to changing environments and is a fundamental component of biodiversity.
  • High genetic diversity usually indicates a healthy and resilient population, capable of surviving changes and challenges in their environment.
  • genetic diversity of a species = intra-specific diversity
177
Q

MT HG 5/6: Population Genetics

what is relevant about evolutionary forces in studying pop gen

A
  • evolutionary forces like natural selection, genetic drift, mutation and gene flow are used to explain patterns seen in pop gen
  • they explain shifts in equilibria and the appearance of certain population structures
178
Q

MT HG 5/6: Population Genetics

what created/started the modern synthesis of population genetics

A
  • integration of Mendelian genetics and Darwinian natural selection.
  • Key figures were RA Fisher, JBS Haldane, and Sewall Wright
  • RA Fisher: The Genetical Theory of Natural Selection. Fisher’s work helped to reconcile Mendelian genetics with Darwinian evolution, particularly through his explanation of how genetic variation is maintained in a population through balancing selection, overdominance (heterozygote advantage), etc.
  • JBS Haldane: explored the rate at which favorable mutations can spread through a population, and provided insights into other evolutionary processes such as genetic drift
  • Wright: explored inbreeding and genetic drift and how genotypes are linked to reproductive success
179
Q

MT HG 5/6: Population Genetics

how did discovery of DNA and other technologies help pop gen research?

A
  • Discovery of DNA has helped to provide better understanding/quantitative understanding of pop gen
  • more investigations on many more species
  • Can sequence genomes to bring new understandings to pop gen
  • More species to research, research across species quickly
  • giving quantitative methods and models to test and verify hypotheses.
  • can sequence DNA and label DNA to test certain hypotheses
  • do more quantitative research and have more quantitative empirical data
180
Q

MT HG 5/6: Population Genetics

what is one thing to keep in mind when researching pop gen that differs from the lectures taught on pop gen

A
  • lectures focus on the effect of a single gene on pop gen and variation
  • but keep in mind that many traits are controlled by multiple genes (multigenic)
181
Q

MT HG 5/6: Population Genetics

what are the types of genetic variation

A
  • Single Nucleotide Polymorphisms (SNPs):
  • Insertions and Deletions (Indels):
  • Proteins: blood groups (1900),
  • allozymes (electrophoretically-distinct proteins; 1966)
  • Structural variations (gene duplications/losses, chromosomal arrangements)
  • Variable Number Tandem Repeats (VNTRs):
182
Q

MT HG 5/6: Population Genetics

what is the link between genetic variation and genetic markers

A
  • genetic markers are a subset of genetic variations
  • genetic markers: they are specific sequences that have been identified as useful for a particular purpose, like mapping the genome, studying population genetics, or forensic analysis.
  • genetic variation: diversity of gene frequencies in a population
183
Q

MT HG 5/6: Population Genetics

what are types of genetic markers

A
  1. Proteins: blood groups (1900),
  2. allozymes (electrophoretically-distinct proteins; 1966)
  3. VNTRs
  4. SNPs (VERY USEFUL)
184
Q

MT HG 5/6: Population Genetics

what are SNPs and why is it useful in studying population genetics

A
  • the most common type of genetic variation among people. A SNP is a variation at a single position in a DNA sequence among individuals.

they are useful because:
* High Abundance:
* Tracking Inheritance and Evolution
* Population Structure and Diversity
* Adaptation and Selection

**SNPs and variation within species are important, which is also why DNA sequencing of one individual may not tell you everything about that species**

185
Q

MT HG 5/6: Population Genetics

what is genotype?

A
  1. The genotype is the allelic make up of an individual
  2. the genetic makeup of an individual
  3. whether they are homozygous or heterozygous for certain gene(s)
186
Q

MT HG 5/6: Population Genetics

heterozygous and homozygous definition

A

Diploid organisms are heterozygous if they carry different alleles at a genetic locus, and homozygous if they carry two copies of the same allele at that locus

187
Q

MT HG 5/6: Population Genetics

what is polymorphic?

A
  1. At population level: a locus is polymorphic when more than one allele is present in the population (typically with the rarer allele at a frequency above 1-5%)
    a) polymorphic site - when 2 or more alleles are detected at a locus
188
Q

MT HG 5/6: Population Genetics

How to calculate/measure genetic variation using average pairwise difference per site for this example:

  • Aim to detect polymorphism within 6 sequences
A
  1. There are n=6 sequences, each L=500 nucleotides long
  2. Number of nucleotide sites with genetic differences (S) = 4
  3. Number of distinct sequences = 5
    a) Proportion of variable (segregating) sites = S/L = 4/500 = 0.008
    b) Average pairwise difference (PD) = 2.0
    c) Average PD per site = PD/L = 2.0/500 = 0.004
189
Q

MT HG 5/6: Population Genetics

how do I calculate pairwise difference for this question

A

The average pairwise difference (PD) is the average number of nucleotide differences per comparison between two sequences. It is obtained by comparing each sequence with every other sequence and then averaging the number of differences.

  • Compare each sequence to every other sequence.
  • Count the number of differences for each comparison.
  • Sum all the differences.
  • Divide by the number of comparisons made.
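
The steps above can be sketched in Python as follows. The toy alignment is hypothetical (it is not the set of 6 sequences from the lecture example), and the function name is only for illustration. For n sequences there are n(n-1)/2 comparisons, e.g. 15 for n = 6.

```python
from itertools import combinations


def pairwise_stats(seqs):
    """Return (S, average PD, average PD per site) for aligned sequences."""
    length = len(seqs[0])
    assert all(len(s) == length for s in seqs), "sequences must be aligned"
    # S: number of sites where the sequences do not all carry the same base.
    segregating = sum(1 for site in zip(*seqs) if len(set(site)) > 1)
    # Count the differences for every pair of sequences.
    diffs = [sum(a != b for a, b in zip(s1, s2))
             for s1, s2 in combinations(seqs, 2)]
    avg_pd = sum(diffs) / len(diffs)
    return segregating, avg_pd, avg_pd / length


# Toy alignment (hypothetical data): 4 sequences, 8 sites each.
toy = ["ACGTACGT",
       "ACGTACGA",
       "ACCTACGT",
       "ACGTACGT"]
S, avg_pd, pd_per_site = pairwise_stats(toy)
print(S, avg_pd, pd_per_site)
```

With the lecture's numbers the same final step gives 2.0 / 500 = 0.004.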
190
Q

MT HG 5/6: Population Genetics

why and when would we use pairwise distance calculations to measure/account for genetic variations

A
  • Measuring Genetic Diversity: Pairwise distances can be used to measure the genetic diversity within a population. High average pairwise distances indicate high genetic variation, while low pairwise distances suggest genetic homogeneity.
  • Phylogenetic Analysis: When constructing phylogenetic trees, pairwise distances between sequences can be used to infer evolutionary relationships.
  • Population Structure: For example, if certain groups within a population have smaller pairwise distances amongst themselves than with other groups, this suggests a structured population with potential subpopulations.
  • Identifying subpopulations and gene flow: If the genetic distance is significantly larger between populations than within them, it might indicate limited interbreeding or historical separation.
  • Epidemiology: tracking the evolution and outbreak of disease, e.g. observing when large increases in genetic diversity occurred and when the pathogen evolved fastest
191
Q

MT HG 5/6: Population Genetics

what is heterozygosity (h) and how to measure it

A
  • Heterozygosity (h) is the fraction of individuals in a population that are expected to be heterozygous (normally for one gene/locus, but can be used to calculate average heterozygosity, which involves averaging across multiple genes/loci)
  • useful measure of genetic variation, even when there are many alleles
  • Individual allele frequencies are less convenient, especially when there are multiple alleles. Heterozygosity is a single fraction, so it can be compared more easily and summarises the genetic diversity of a population without listing the frequency of every allele
  • the ratio/fraction: better comparison and better picture of genetic variation in population
192
Q

MT HG 5/6: Population Genetics

What is average heterozygosity (H)

A
  1. Average heterozygosity (H), is obtained by averaging h across many loci/many genes; it represents the proportion of loci observed to be heterozygous in an average individual.
193
Q

how to calculate probability that a single locus (locus i) is heterozygous

A
  • h is the probability that any two alleles randomly sampled from the population are different.
  • It is greatest when there are many alleles, all at equal frequency
  • $h = 1 - \sum_{i=1}^{m} p_i^2$
  • m = number of alleles for this gene
  • $p_i$ = frequency of allele i
  • This formula subtracts the sum of the squared allele frequencies (which represents the probability of homozygosity) from 1 to give the probability of heterozygosity.
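
A minimal sketch of this calculation (1 minus the sum of squared allele frequencies). The function name and example frequencies are hypothetical; the examples illustrate that h is highest when there are many alleles at equal frequency.

```python
def heterozygosity(allele_freqs):
    """h = 1 - sum(p_i^2) for one locus; the frequencies must sum to 1."""
    assert abs(sum(allele_freqs) - 1.0) < 1e-9, "frequencies must sum to 1"
    return 1.0 - sum(p * p for p in allele_freqs)


h_two_equal = heterozygosity([0.5, 0.5])     # two alleles, equal frequency
h_four_equal = heterozygosity([0.25] * 4)    # four alleles, equal frequency
h_skewed = heterozygosity([0.9, 0.1])        # one common, one rare allele
print(h_two_equal, h_four_equal, h_skewed)
```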
194
Q

MT HG 5/6: Population Genetics

how to calculate average heterozygosity (H) for multiple loci, what is the equation?

A
  • first find the heterozygosity (h) at each of the n genes/loci (from locus 1 to locus n)
  • then sum them all up, and divide by the number of loci n: $H = \frac{1}{n}\sum_{j=1}^{n} h_j$
  • this gives the average heterozygosity (fraction of heterozygotes) across multiple loci
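
As a small sketch of the averaging step, assuming each locus is described by a list of its allele frequencies (the three loci below are hypothetical):

```python
def average_heterozygosity(loci):
    """H = (1/n) * sum of h over n loci.

    Each locus is given as a list of allele frequencies that sum to 1.
    """
    h_values = [1.0 - sum(p * p for p in freqs) for freqs in loci]
    return sum(h_values) / len(h_values)


# Hypothetical loci: h = 0.5, 0.18 and 0 (a monomorphic locus).
loci = [[0.5, 0.5], [0.9, 0.1], [1.0]]
H = average_heterozygosity(loci)
print(H)
```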
195
Q

MT HG 5/6: Population Genetics

what is Hardy-Weinberg Equilibrium, and what does it look at?

A
  • a mathematical model for understanding allele and genotype frequencies in a population, if the population were not affected by evolutionary forces (if the population were in a state of equilibrium),
  • this means assuming no random genetic drift, mutation, migration (gene flow), natural selection or non-random mating
  • HW Principle is an example of a null-model
  • Predicts genotype frequencies based on allele frequencies, when stable across generations in a stable population.
  • looks at population stability
    * the equilibrium is the fractions of homozygotes and heterozygotes in the population predicted by the Hardy-Weinberg equation
196
Q

MT HG 5/6: Population Genetics

What are the assumptions of HW equation/principle and why?

A
  1. Diploid organism with sexual reproduction (random and independent chromosome transmission to offspring): ensures Mendelian inheritance, so that both homozygous and heterozygous individuals occur
  2. Non-overlapping generations
  3. Infinite population size (no random genetic drift)
  4. Random mating (no inbreeding or assortative mating)
  5. Males and females have equal allele frequencies
  6. A closed population (no migration/no gene flow)
  7. No mutation
  8. No natural selection effects
197
Q

MT HG 5/6: Population Genetics

what does Hardy-Weinberg Equilibrium explain/show

A
  • the Hardy-Weinberg equilibrium provides a fundamental null hypothesis for population genetics, showing that sexual reproduction alone does not alter the genetic makeup of a population.
  • It is the interplay of genetic inheritance with evolutionary forces that shapes the genetic structure of populations over time.
  • Inheritance alone can maintain genetic diversity, hence biological variation is not lost through time
198
Q

MT HG 5/6: Population Genetics

what are the Hardy-Weinberg equations

A

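i. For one locus with two alleles at frequencies p and q (p + q = 1), the expected genotype frequencies are $p^2 + 2pq + q^2 = 1$ (homozygotes $p^2$ and $q^2$, heterozygotes $2pq$)
ii. The principle extends to >2 alleles and to multiple loci that segregate independently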
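
A minimal sketch of the two-allele case, where alleles A and a have frequencies p and q = 1 - p and the expected genotype frequencies are p^2, 2pq and q^2. The function name and the value p = 0.7 are hypothetical.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) given the frequency p of allele A."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


f_AA, f_Aa, f_aa = hardy_weinberg(0.7)
print(f_AA, f_Aa, f_aa)   # the three frequencies always sum to 1
```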

199
Q

MT HG 5/6: Population Genetics

what does the Hardy-Weinberg EQUATION show?

A

It shows that in absence of evolutionary forces:
1. Genotype frequencies are in equilibrium, i.e. they remain unchanged indefinitely, as long as nothing happens to population
2. This equilibrium is reached quickly,
a) after only one generation of random mating, regardless of the frequencies in the parental generation
3. If genotype frequencies are different from those predicted, then at least one evolutionary force is acting
a) Population is not in equilibrium, some type of evolutionary force is present
4. When observed ratios are different to expected, important to look at the evolutionary forces
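
Points 1 and 2 can be checked with a short sketch. The starting genotype frequencies are hypothetical and deliberately far from Hardy-Weinberg proportions (no heterozygotes at all). The model assumes random mating and none of the evolutionary forces.

```python
def next_generation(genotypes):
    """One round of random mating; genotypes = (AA, Aa, aa) frequencies."""
    f_AA, f_Aa, f_aa = genotypes
    p = f_AA + 0.5 * f_Aa      # frequency of allele A among the gametes
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)


parents = (0.5, 0.0, 0.5)      # hypothetical: only homozygotes
gen1 = next_generation(parents)
gen2 = next_generation(gen1)
print(gen1, gen2)              # gen2 equals gen1: equilibrium after one generation
```

The allele frequency (p = 0.5) never changes here; only the genotype frequencies move, and they stop moving after a single generation.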

200
Q

MT HG 5/6: Population Genetics

how to calculate HW when not in equilibrium

A
201
Q

MT HG 5/6: Population Genetics

How and what evolutionary forces act in population genetics

A
  • while there are several evolutionary forces that shape the genetic makeup of a population, they operate through different mechanisms, yet their outcomes can influence and interact with the outcomes of natural selection.

  • Molecular Level
    * Mutation
    * Recombination

  • Population Level
    * Non-random mating (inbreeding)
    * Random Genetic Drift
    * Migration/gene flow

  • Natural Selection or Adaptation

202
Q

MT HG 5/6: Population Genetics

what is Linkage Disequilibrium and recombination

A

Linkage Disequilibrium:
* Definition: LD refers to the non-random association of alleles at two or more loci. It means that the combination of alleles on a chromosome occurs together more or less often than would be expected by chance if the loci were segregating independently.

  • Causes: LD can be caused by several factors, including physical proximity of genes on a chromosome (they are ‘linked’), reduced recombination (e.g. at centromeres or in X/Y chromosomes) and genetic drift

Recombination
* Definition: Recombination is the process by which pieces of DNA are broken and rejoined, which leads to new combinations of alleles. This occurs during meiosis, the cell division process that leads to the production of gametes (sperm and eggs).

  • Mechanism: During meiosis, homologous chromosomes pair up and exchange segments in a process known as crossing over. This shuffles the alleles and creates new combinations on each chromosome, which increases genetic diversity.
203
Q

MT HG 5/6: Population Genetics

What are the implications of recombination on linkage disequilibrium

A
  • Impact on LD: Recombination tends to break down LD over time because it creates new allele combinations. The more frequently recombination occurs between two loci, the less likely those loci are to show LD. Conversely, loci that are close together on a chromosome are less likely to be separated by recombination and are therefore more likely to exhibit LD.
  • LD decreases due to random re-assortment of genes (alleles) resulting from recombination
  • However, LD may be preserved due to selection: a selective sweep/hitchhiking, Background Selection (deleterious mutations and their closely linked genes are removed together) or Epistatic Selection
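
The breakdown of LD by recombination can be sketched with two standard textbook relationships that are not stated on the card and are assumed here: the LD coefficient D = p_AB - p_A * p_B, and its decay under random mating, D_t = D_0 * (1 - r)^t, where r is the recombination fraction between the two loci. All numbers are hypothetical.

```python
def ld_coefficient(p_ab, p_a, p_b):
    """D = observed haplotype frequency minus the frequency expected by chance."""
    return p_ab - p_a * p_b


def ld_after(d0, r, generations):
    """LD left after t generations of random mating: D_t = D_0 * (1 - r)^t."""
    return d0 * (1.0 - r) ** generations


d0 = ld_coefficient(p_ab=0.5, p_a=0.5, p_b=0.5)   # alleles A and B always together
unlinked = ld_after(d0, r=0.5, generations=5)      # e.g. loci on different chromosomes
tight = ld_after(d0, r=0.01, generations=5)        # loci close together
print(d0, unlinked, tight)
```

Loci that recombine freely lose most of their LD within a few generations, while closely linked loci keep it much longer, which matches the first bullet above.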
204
Q

MT HG 5/6: Population Genetics

what is Non-Random Mating/Inbreeding

A
  • Inbreeding: individuals mate with relatives more often than by chance
    1. Affects all genes of an organism, increases homozygosity
    2. May be adaptive: e.g. self-fertilisation in plants improves survival of isolated individuals
  • Positive assortative mating: occurs among individuals with similar phenotype
    1. Affects a subset of genes (linked to phenotype)
    2. Increases homozygosity and decreases heterozygosity
205
Q

MT HG 5/6: Population Genetics

what is the effect of inbreeding on population genetics

A
  • Inbreeding and positive assortative mating do not change allele frequencies but they increase homozygosity (vs H-W).
    1. Homozygosity increases because offspring are more likely to inherit the same allele from both parents
    2. This is known as Identity by Descent (IBD)
206
Q

MT HG 5/6: Population Genetics

What is the Inbreeding Co-efficient and how to calculate it?

A

Inbreeding Coefficient (F): is a measure of the likelihood that an individual has received two identical alleles of a gene from an ancestor common to both its parents, i.e. the probability that the two alleles at a locus are identical by descent (and therefore homozygous)

F=0: This indicates random mating, with no preference or aversion to mating with relatives, which is in line with the expectations of the Hardy-Weinberg equilibrium. There’s no inbreeding, so the observed heterozygosity = expected heterozygosity

F=1: This represents complete inbreeding (approached, for example, after many generations of self-fertilization in plants or prolonged close inbreeding in animals), where an individual’s two alleles at every locus are identical by descent. This scenario predicts no heterozygotes in the population because every individual is homozygous for every gene, having inherited copies of the same ancestral allele from both parents.

F=0.25: This is the inbreeding coefficient expected for the offspring of full siblings (parents who share both of their own parents). Each of the four alleles carried by the two shared grandparents has a (1/2)^4 = 1/16 chance of being passed down both sides of the pedigree to the same offspring, and 4 * 1/16 = 0.25

Calculating Inbreeding coefficient (F): is used to measure the level of recent inbreeding
1. F = 0, random mating (H-W)
2. F = 1, complete inbreeding, no heterozygotes
3. F = 0.25, for offspring of full siblings (parents share the same two parents)
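
One common way to estimate F from data, assumed here because the card's own formula did not survive, is to compare observed heterozygosity with the Hardy-Weinberg expectation: F = 1 - H_obs / H_exp, with H_exp = 2pq for a two-allele locus. The function name and numbers are hypothetical.

```python
def inbreeding_coefficient(h_observed, p):
    """F = 1 - H_obs / H_exp, where H_exp = 2pq under Hardy-Weinberg."""
    h_expected = 2 * p * (1.0 - p)
    return 1.0 - h_observed / h_expected


f_random = inbreeding_coefficient(0.5, p=0.5)      # observed = expected
f_partial = inbreeding_coefficient(0.375, p=0.5)   # 25% fewer heterozygotes
f_full = inbreeding_coefficient(0.0, p=0.5)        # no heterozygotes at all
print(f_random, f_partial, f_full)
```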

207
Q

MT HG 5/6: Population Genetics

what is inbreeding depression?

A

Inbreeding depression is where individuals that are the result of inbreeding exhibit reduced biological fitness.

Inbreeding involves the breeding of closely related individuals and often leads to an increase in homozygosity, which can bring recessive alleles, including deleterious or harmful ones, together.

This results in an increased expression of harmful genetic traits and a decrease in the overall genetic diversity within a population.

208
Q

MT HG 5/6: Population Genetics

what is identity by descent

A
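  • two alleles are identical by descent (IBD) when both are copies of the same allele carried by a common ancestor; inbreeding raises the chance of this, so homozygosity appears more often in the population without allele frequencies changing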
  • homozygosity increases because there is an increased chance that the offspring inherits the same allele from both parents (more likely that parents, as they are related, have the same allele)
209
Q

MT HG 5/6: Population Genetics

what are the 2 consequences of inbreeding, what is an example of this

A

i. Reduced fitness: often results from homozygosity for recessive deleterious alleles
ii. Any further inbreeding leads to decline in populations/lowered survival rates

Example: Australian Shepherds
- inbreeding caused a decreased lifespan and an increase in percentage mortality as inbreeding increased over time. This eventually leads to a decline in population

210
Q

MT HG 5/6: Population Genetics

what is genetic drift?

A
  • a random process and mechanism of evolution, in which allele frequencies in a population change over time by chance rather than by selection
  • its effects are especially strong after a founder effect or bottleneck effect (e.g. after an epidemic or migration event)
  • leads to the fixation or elimination of certain alleles by chance, and is more significant in small populations
  • agrees with the neutral theory of molecular evolution: molecular evolutionary changes are mostly random.
211
Q

MT HG 5/6: Population Genetics

what are 2 key concepts that are related to genetic drift?

A
  1. founder effect: This occurs when a new population is established by a very small number of individuals from a larger population. This small group may have different allele frequencies than the original population, leading to a shift in allele frequencies in the new population.
  2. Bottleneck Effect: This happens when a large population is drastically and randomly reduced in size due to an event like a natural disaster.
212
Q

MT HG 5/6: Population Genetics

what are outcomes of genetic drift?

A
  • Drift causes substantive changes in allele frequencies
    * Increased homozygosity is a typical outcome of drift
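
Drift can be illustrated with a small simulation. This is a sketch of the Wright-Fisher model, a standard idealisation assumed here and not taken from the card: each of the 2N gene copies in the next generation is drawn at random from the current gene pool, with no selection, mutation or migration. Population sizes, seeds and generation counts are hypothetical.

```python
import random


def wright_fisher(n_individuals, p0, max_generations, seed=0):
    """Track one allele's frequency under drift alone in a diploid population."""
    rng = random.Random(seed)
    copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(max_generations):
        if p in (0.0, 1.0):    # allele lost or fixed: nothing left to drift
            break
        # Binomial sampling: each new gene copy is the allele with probability p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


small = wright_fisher(n_individuals=10, p0=0.5, max_generations=500, seed=1)
large = wright_fisher(n_individuals=1000, p0=0.5, max_generations=50, seed=1)
print(len(small) - 1, small[-1])   # generations simulated, final frequency
print(len(large) - 1, large[-1])
```

Runs with small populations tend to wander further from the starting frequency and reach loss or fixation sooner, which is the increase in homozygosity described above.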
213
Q

MT HG 5/6: Population Genetics

what is fixation in simple terms?

A

Fixation is when an allele’s frequency reaches 100% in the population (every individual is then homozygous for it)

214
Q

MT HG 5/6: Population Genetics

what can cause founder’s effect?

A
  • A founder effect is observed as a result of genetic drift and inbreeding in a subpopulation
  • a founding event produces a small new population, where genetic drift is strong and inbreeding occurs
  • this leads to an increase in homozygosity
  • consequences: high occurrence of otherwise rare diseases in certain human populations; very low heterozygosity in some wild felid populations
215
Q

MT HG 5/6: Population Genetics

what is effective population size (Ne)? What difference is it from population size (N)

A
  • effective population size (Ne) is a theoretical concept: the size of an idealised population that would experience the same amount of genetic drift as the actual population
  • Ne is used in place of census population size, because the census (actual) population size doesn’t accurately reflect genetic drift’s effect
  • It is meant to represent the population which actually contributes to genetic drift/reproduction from the census population
  • Ne is often smaller than N
  • specific to population genetics, where we are interested in reproduction in population
216
Q

MT HG 5/6: Population Genetics

why is effective population size (Ne) used instead of N

A
  • specific to population genetics, where we are interested in reproduction in population
  • When trying to calculate and predict the effects of genetic drift, the actual population size (N) does not accurately reflect the strength of drift, because not all individuals in all generations have an equal propensity to reproduce
  • Reasons include:
    1. some of the population do not reproduce (no reproductive success), and so do not contribute genes to the next generation
    2. an imbalanced sex ratio, so not all individuals contribute evenly to reproduction
    3. age structure: not all age classes can contribute to reproduction
    4. Fluctuating Population Size: Populations that undergo frequent changes in size (e.g., due to seasonal variations, natural disasters, etc.) often have a smaller effective population size compared to their average census size over time.
  • Ne is thought to represent the part of the population which actually contributes to reproduction and hence to genetic drift
  • The amount of (neutral) diversity in a population will depend on Ne, not N
217
Q

MT HG 5/6: Population Genetics

what is effective population size Ne used for and why?

A
  • after calculating Ne (which depends on various factors and scenarios, hence uses different calculations):
    1. Ne can be used to predict rates of genetic drift. By knowing Ne, we can estimate how quickly genetic variability might be lost due to drift.
    2. Ne is crucial for estimating the rate of inbreeding in small populations, and how long it takes for harmful mutations to spread in population with inbreeding
    3. Ne can be used in conservation biology to determine minimum viable population size and to design conservation strategies for endangered species, e.g. outcrossing and managed captive breeding (to prevent inbreeding depression).
    4. Ne helps in identifying past population bottlenecks and founder effects by comparing it to the actual population size. Significant discrepancies can indicate historical events that have reduced genetic diversity.
218
Q

MT HG 5/6: Population Genetics

why is Ne often smaller than N

A
  1. Because some individuals don’t contribute reproductively
219
Q

MT HG 5/6: Population Genetics

what are some ways to calculate Ne given different population structures?

A
  • Many approaches have been developed to calculate Ne, to account for the different factors that affect population size:
  1. Scenario: population size fluctuates. If population size varies through time (t generations), Ne is the harmonic mean of the population sizes: $\frac{1}{N_e} = \frac{1}{t}\sum_{i=1}^{t}\frac{1}{N_i}$
  2. Scenario: unequal sex ratio. The rarer sex will contribute more offspring per capita than the common one
[Figure: equation for fluctuating population size (top); equation for unequal sex ratio (bottom)]
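
Both scenarios can be sketched as below. The harmonic mean follows directly from the description in scenario 1. For scenario 2 the card's equation did not survive, so the sketch assumes the standard textbook formula Ne = 4 * N_males * N_females / (N_males + N_females). Function names and numbers are hypothetical.

```python
def ne_fluctuating(sizes):
    """Harmonic mean of the population sizes over t generations."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


def ne_sex_ratio(n_males, n_females):
    """Assumed textbook formula for unequal numbers of breeding males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)


ne_bottleneck = ne_fluctuating([1000, 10, 1000])   # one generation at N = 10
ne_skewed = ne_sex_ratio(n_males=10, n_females=90)
ne_even = ne_sex_ratio(n_males=50, n_females=50)
print(ne_bottleneck, ne_skewed, ne_even)
```

A single bottleneck generation pulls the harmonic mean far below the arithmetic mean of 670, and a skewed sex ratio gives an Ne well below the census size of 100.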
220
Q

MT HG 5/6: Population Genetics

what are demes

A

demes = sub-populations: distinct groups within a larger population that often exhibit some degree of separation or distinctiveness, either geographically, genetically, ecologically or behaviorally.

  • still have gene flow (or migration) and inbreeding potential between sub-populations. Not completely isolated
221
Q

MT HG 5/6: Population Genetics

what is gene flow/migration

A
  • Gene flow, aka genetic migration, is the transfer of alleles from one population to another. It occurs when individuals from one population migrate to another and breed there.
  • introduces new alleles and increases genetic diversity within a sub-population, but decreases the overall genetic differences between the 2 groups.
  • Leads to genetic homogeneity between the 2 populations; while the populations remain subdivided, overall heterozygosity is reduced (compared to HW equilibrium)
222
Q

MT HG 5/6: Population Genetics

what is the island model, what does it suggest. Please explain

A
  • island model used to illustrate migration/gene flow
  • Migration can increase diversity within sub-populations but decrease diversity among them
  • Population subdivision with limited migration results in a reduction in overall heterozygosity (compared to H-W)
  • Over time, with sufficient migration between subpopulations (like an “island migration model,” where multiple subpopulations or “islands” exchange individuals at equal rates), the allele frequencies in all subpopulations would be expected to converge to a common/mean frequency. This means the genetic differences between the subpopulations would decrease due to the mixing of their gene pools.
[Figure: island migration model]
223
Q

MT HG 5/6: Population Genetics

how would we test and calculate to see if sub-population migrations and gene flows would lead to a decrease in heterozygosity?

A

This equation: (2pAqA + 2pBqB)/2
finds the average heterozygosity of 2 sub-populations under H-W equilibrium, where pA, qA are the allele frequencies in sub-pop A, and pB, qB are the allele frequencies in sub-pop B
* this average heterozygosity will be lower than that of an equivalent mixed population whenever allele frequencies differ between the sub-populations. This is because the mixing of populations allows for more combinations of alleles and therefore a higher likelihood of heterozygotes.
* ultimately over time, the allele frequencies will average out and converge between the 2 demes
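
A worked sketch of this comparison, assuming two demes of equal size and one locus with two alleles. The allele frequencies are hypothetical.

```python
def het(p):
    """Expected heterozygosity 2pq at a two-allele locus under H-W."""
    return 2 * p * (1.0 - p)


p_a, p_b = 0.9, 0.1                          # hypothetical frequencies in demes A and B
h_subdivided = (het(p_a) + het(p_b)) / 2     # (2pAqA + 2pBqB) / 2
p_mixed = (p_a + p_b) / 2                    # pooled frequency, equal deme sizes assumed
h_mixed = het(p_mixed)
print(h_subdivided, h_mixed)
```

Here the subdivided average (about 0.18) is well below the value for the fully mixed population (0.5).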

224
Q

MT HG 5/6: Population Genetics

what is the fixation index/Fst

A
  1. The Fixation Index (FST) is the fraction of total genetic diversity that is due to differences among demes (sub-populations)
    i. FST = 1, complete divergence among demes, i.e. no migration
    ii. FST = 0, no divergence among demes, i.e. frequent migration, complete mix of gene pool, genetically identical
225
Q

MT HG 5/6: Population Genetics

how do we calculate Fst using heterozygosity?

A
  • We can calculate FST based on heterozygosity levels
  • HT: is the expected heterozygosity of the total population (considering all subpopulations as one population).
  • HS: is the average expected heterozygosity within subpopulations (calculate heterozygosity for each subpopulation, then average).
  1. to calculate HT: treat all sub-populations as 1 population, and calculate the overall frequency of each allele.
  2. HT − HS: gives the difference between the expected heterozygosity if it was all mixed into 1 population and the actual average heterozygosity across all demes
  3. $F_{ST} = (H_T - H_S) / H_T$: dividing by HT expresses this difference as a fraction of the total diversity
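
A minimal sketch of FST = (HT - HS) / HT for one locus with two alleles, assuming demes of equal size so that the pooled allele frequency is the simple mean. The function name and frequencies are hypothetical.

```python
def fst_from_heterozygosity(deme_freqs):
    """FST = (HT - HS) / HT; deme_freqs holds the frequency of one allele per deme.

    Undefined (division by zero) if every deme is fixed for the same allele.
    """
    def het(p):
        return 2 * p * (1.0 - p)

    h_s = sum(het(p) for p in deme_freqs) / len(deme_freqs)   # mean within-deme h
    p_total = sum(deme_freqs) / len(deme_freqs)               # pooled frequency
    h_t = het(p_total)                                        # h of the pooled population
    return (h_t - h_s) / h_t


fst_diverged = fst_from_heterozygosity([0.9, 0.1])
fst_identical = fst_from_heterozygosity([0.5, 0.5])
fst_fixed = fst_from_heterozygosity([1.0, 0.0])
print(fst_diverged, fst_identical, fst_fixed)
```

The three cases run from no divergence (FST = 0) to complete divergence (FST = 1), with the intermediate case at about 0.64.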
226
Q

MT HG 5/6: Population Genetics

what are 2 ways to calculate Fst

A
  1. We can calculate FST based on heterozygosity levels
  2. We can use FST to calculate the rate of migration (Nm) between demes
227
Q

MT HG 5/6: Population Genetics

How do we calculate Fst using rate of migration (Nm)

A

FST = 1 / (4Nm + 1)
* vice versa: if we know FST we can then use it to solve for Nm
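
Both directions of the equation, as a short sketch. The rearranged form Nm = (1/FST - 1) / 4 follows algebraically from FST = 1 / (4Nm + 1). Function names and example values are hypothetical.

```python
def fst_from_nm(nm):
    """FST = 1 / (4Nm + 1)."""
    return 1.0 / (4.0 * nm + 1.0)


def nm_from_fst(fst):
    """Rearranged: Nm = (1/FST - 1) / 4. Requires FST > 0."""
    return (1.0 / fst - 1.0) / 4.0


fst_one_migrant = fst_from_nm(1.0)          # one migrant per generation
nm_back = nm_from_fst(fst_one_migrant)      # recovers Nm = 1
fst_no_migration = fst_from_nm(0.0)         # no migration: complete divergence
print(fst_one_migrant, nm_back, fst_no_migration)
```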

228
Q

MT HG 5/6: Population Genetics

what are some things that Fst can tell us?

A
  • Genetic Differentiation: tells us about difference in gene flow, and migration. Can be used to explore speciation events or ecological changes
  • Historical Events: if Fst/genetic difference is high, isolation with genetic drift or a founder/bottleneck event may have occurred.
229
Q

MT HG 5/6: Population Genetics

how to study population structures in the wild?

A
  1. Clustering algorithms can identify sub-groups within a species using genetic marker data from multiple loci
  2. The best known of these programs is “STRUCTURE” (Pritchard et al. 2000)
  3. Use of STRUCTURE is illustrated in a study of the endangered maple, Acer miaotaiense
230
Q

MT HG 5/6: Population Genetics

summarise in table how evolutionary forces impact genetic variation? (within pop-variation and among-pop variation)

A
231
Q

MT HG 5/6: Population Genetics

in simple terms, what is population genetics?

A

Population genetics is the study of genetic diversity in biological populations of a species, and the processes which can cause change in genetic diversity.

The development of molecular methods has allowed us to better track and understand population genetics in species.

Population genetics helps understand the fundamentals of evolutionary changes.

Population genetics aims to understand diversity of phenotypes and genotypes using mathematical models and molecular data

232
Q

MT HG 5/6: Population Genetics

how does natural selection affect genetic diversity and population genetics?

A

Natural selection: affects diversity by directly affecting phenotype diversity and hence causing genetic change

233
Q

MT HG 5/6: Population Genetics

what do quantitative methods, and genetic markers do in population genetics?

A
  • Quantitative genetics uses numerical models and statistics to understand genetic changes on phenotypes (looking more at multigenic traits).
  • Genetic markers such as SNPs or other specific regions of the genome are used to measure genetic diversity in populations.
  • Counting polymorphic sites and SNPs is one way to measure genetic diversity in a population, e.g. the proportion of variable/segregating/polymorphic sites, or the average pairwise difference
234
Q

MT HG 5/6: Population Genetics

if population heterozygosity doesn’t match with HW equilibrium, what does this suggest?

A

If the observed genotype frequencies don’t match those expected under H-W, then at least one evolutionary force is acting

235
Q

MT HG 5/6: Population Genetics

what level does natural selection act on?

A
  • natural selection operates and acts on the individual level, but its effects influence and can be observed on the molecular and species levels
  • Individual Level: The process of natural selection operates on the phenotypes of individuals, not directly on their genes
  • molecular level: depending on the phenotype it favors, it will affect the chances that the corresponding gene/genes are passed on, and whether over time the allele becomes fixed (or the allele freq changes)
  • species level: over time, effects of natural selection and phenotypic traits can cause species/population level effects like speciation
236
Q

MT HG 5/6: Population Genetics

what are the key points which affect genetic diversity in population genetics?

A
  • Molecular level:
    Mutations:
    Recombination and genetic linkage:
  • Population level:
    Non-random mating (inbreeding):
    Random genetic drift:
    Founder’s effect:
    Effective population size vs census population size
    Migration and Island models:
    Fixation Index:
  • Natural selection:
237
Q

MT HG 5/6: Population Genetics

How does population structure influence our understanding of population genetics and evolutionary forces?

A
  • population structure can influence the reproductive success, and hence genetic drift
  • therefore Ne effective population size needs to be used instead of census population size, in certain population structures such as unequal sex ratio, or fluctuating population sizes
  • migration and gene flow (such as in the island model of population structure) also affect genetic diversity, which ultimately affects population genetic measures such as heterozygosity and allele frequencies
  • FST is used to infer population heterozygosity and genetic diversity, as well as the migration rate (Nm) between demes
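A minimal sketch of how FST and Nm relate, assuming one biallelic locus, equal-sized demes, and Wright's island-model approximation FST ≈ 1 / (4Nm + 1). Neither formula is spelled out on the card, so treat both, and the allele frequencies used, as illustrative assumptions.

```python
def fst(sub_freqs):
    """F_ST = (H_T - H_S) / H_T from allele frequencies in equal-sized demes."""
    h_s = sum(2 * p * (1 - p) for p in sub_freqs) / len(sub_freqs)  # mean within-deme heterozygosity
    p_bar = sum(sub_freqs) / len(sub_freqs)
    h_t = 2 * p_bar * (1 - p_bar)                                   # heterozygosity of the pooled population
    return (h_t - h_s) / h_t

def migrants_per_generation(f):
    """Nm implied by F_ST under the island-model approximation (an assumption here)."""
    return (1 / f - 1) / 4

f = fst([0.2, 0.8])   # two hypothetical demes with very different allele frequencies
print(f"F_ST = {f:.2f}, Nm = {migrants_per_generation(f):.2f}")
```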
238
Q

MT HG 5/6: Population Genetics

How does natural selection occur? What are the essential factors for it to act on?

A
  • organisms within a species/population differ in their ability to survive and reproduce, due to their difference in genotypes
  • fitness = ability to survive and reproduce
  • those alleles which enhance survival and reproduction will contribute disproportionately more to the next generation’s gene pool, hence changing allele frequencies over time

Selection acts on the whole organism/individual

239
Q

MT HG 5/6: Population Genetics

what does selection act on

A

Selection acts at the individual level, on phenotypes that are associated with fitness, and only indirectly on genotypes (allele frequencies).

240
Q

MT HG 5/6: Population Genetics

what are the key features of natural selection

A
  • Variation in genotype: mutations exist in the population, so individuals have differing phenotypes (and correspondingly different genotypes)
  • Heritability: phenotypes can be inherited through their corresponding genotypes/alleles
  • Differential survival rates: alleles for certain phenotypes confer different survival/reproductive advantages
  • change in allele frequency over time
  • non-directional process, which depends on the environment and is ongoing
241
Q

MT HG 5/6: Population Genetics

how can natural selection lead to fixation of alleles

A
  • over time, under constant environmental selection pressure, positive selection for a beneficial allele will lead to fixation of that allele in the population.
  • because the advantageous allele will increase in frequency in the gene pool until it reaches 1
242
Q

MT HG 5/6: Population Genetics

key terminology: focus of population genetics explores…

A
  • to understand genotypic and phenotypic variation
  • models diversity using mathematical models along with empirical data (molecular markers, phenotype, environmental)
243
Q

MT HG 5/6: Population Genetics

key terminology: natural selection

A
  • affects diversity by acting on phenotype and thus indirectly results in genetic change
  • Natural (and artificial) selection specifically acts on the heritable (genetic) component of phenotypic variation
  • Phenotypes controlled by a single locus (gene) are useful to understand natural selection but in nature many phenotypes are under multigenic control
244
Q

MT HG 5/6: Population Genetics

what is a con about how we study population genetics and evolution

A
  • Phenotypes controlled by a single locus (gene) are useful to understand natural selection but in nature many phenotypes are under multigenic control
245
Q

MT HG 5/6: Population Genetics

key terminology: what is quantitative genetics, what does it explore?

A
  • seeks to understand the heritable basis of phenotypic variation
  • It uses numerical models and statistical analyses to measure genetic effects on phenotype
  • ** phenotypic data alone or along with genomic data **
246
Q

MT HG 5/6: Population Genetics

what is another word for phenotype

A

traits

247
Q

MT HG 5/6: Population Genetics

what are models used for single locus or multi locus selection?

A
  • single locus population genetics
  • quantitative genetics - multiple locus controlling traits
248
Q

MT HG 5/6: Population Genetics

for single locus population genetics models, what types of selection can result

A
  • positive selection
  • negative selection
  • balancing selection - favors co-existence of both alleles in the population
249
Q

MT HG 5/6: Population Genetics

for multilocus quantitative genetics models, what types of selection can result

A
  1. directional selection (average trait value increases or decreases)
  2. disruptive selection (extreme trait values selected for)
  3. stabilising selection (average trait value selected for)
250
Q

MT HG 5/6: Population Genetics

what does directional selection include

A
  • both negative and positive selection are directional selection
251
Q

MT HG 5/6: Population Genetics

what does balancing selection include

A
  • both disruptive and stabilising selection are balancing selection
252
Q

MT HG 5/6: Population Genetics

define fitness

A
  • the reproductive success of an individual with a particular genotype or phenotype.
  • a measure of an individual’s genetic contribution to the next generation compared to other genotypes or phenotypes within the population
253
Q

MT HG 5/6: Population Genetics

what is relative fitness?

A
  • Relative fitness is the average number of offspring produced by the individuals with a particular genotype compared with other genotypes
254
Q

MT HG 5/6: Population Genetics

what are some components of fitness?

A

survival to reproductive age, mating success, and fecundity (the number of offspring produced).

255
Q

MT HG 5/6: Population Genetics

what is selection coefficient

A
  • fitness of an allele is expressed as a selection coefficient (s)
  • represents the increase or decrease in fitness of that allele compared to others
  • The selection coefficient (s) is related to fitness and is a measure of the strength of natural selection against a genotype
256
Q

MT HG 5/6: Population Genetics

how to calculate selection coefficient

A
  • e.g. given that the fitness of a reference allele is 1.0, a new allele has fitness 1 + s
  • The value of s lies in the range: -1 < s < 1
  • How do we interpret the value of s?
  • s = +0.30 means a new allele has 30% greater fitness than the previous or reference allele
  • s = -0.60 means a new allele has 60% lower fitness than the previous or reference allele
  • s = 0 means a new allele has the same fitness as the previous or reference allele
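One way to see where such values come from is to scale a new allele's fitness against the reference allele (fitness 1.0) and subtract 1. The sketch below does this; the function name and the offspring numbers are hypothetical.

```python
def selection_coefficient(w_new, w_ref):
    """s such that the new allele's relative fitness is 1 + s (reference = 1.0)."""
    return w_new / w_ref - 1

# Hypothetical mean offspring numbers for carriers of each allele.
for w_new in (2.6, 0.8, 2.0):
    s = selection_coefficient(w_new, w_ref=2.0)
    print(f"w_new = {w_new}: s = {s:+.2f}")
```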
257
Q

MT HG 5/6: Population Genetics

what is direct fitness

A
  • Direct fitness is the component of fitness from an individual’s own offspring.
  • It measures the number of offspring an individual produces until they are capable of reproducing themselves.
258
Q

MT HG 5/6: Population Genetics

how to calculate direct fitness

A

Direct fitness can be calculated by counting the number of offspring an individual produces that survive to reproductive age.

259
Q

MT HG 5/6: Population Genetics

what is Indirect Fitness:

A
  • The component of an individual’s genetic success due to the reproduction of its non-descendant relatives.
  • an individual can contribute genes to the next generation not only by producing its own offspring but also by aiding relatives, which share some proportion of their genes.
  • kin selection, altruism
260
Q

MT HG 5/6: Population Genetics

How to calculate indirect fitness?

A
  • Indirect fitness is typically calculated by measuring the reproductive success of the individual’s relatives, multiplied by the degree of relatedness to those relatives.
  • For example, if an individual helps a sibling (with whom they share, on average, 50% of their genes) to produce extra offspring, the indirect fitness gain would be those additional offspring multiplied by 0.5.
261
Q

MT HG 5/6: Population Genetics

what is inclusive fitness?

A
  • Inclusive fitness combines direct and indirect fitness. It represents an individual’s total genetic contribution to the next generation, including genes passed on directly as well as genes passed on by relatives due in part to the individual’s assistance or influence.
262
Q

MT HG 5/6: Population Genetics

how to calculate inclusive fitness?

A

Sum of direct fitness and weighted indirect fitness contributions. The weights are the coefficients of relatedness between the individual and the relatives they support.
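A minimal sketch of this sum, assuming indirect fitness is counted as the extra offspring a relative gains through the individual's help, weighted by relatedness r. The function name and the numbers are hypothetical.

```python
def inclusive_fitness(own_offspring, help_given):
    """Direct fitness plus relatedness-weighted indirect fitness.

    help_given: list of (extra_offspring_of_relative, relatedness) pairs.
    """
    indirect = sum(extra * r for extra, r in help_given)
    return own_offspring + indirect

# Two own offspring, plus helping a full sibling (r = 0.5) raise 3 extra offspring.
print(inclusive_fitness(own_offspring=2, help_given=[(3, 0.5)]))
```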

263
Q

MT HG 5/6: Population Genetics

why is it important to understand inclusive fitness

A

because then we can understand Hamilton’s rule of kin selection: altruism is favoured
when C < B * r
where C = cost to the altruist, B = benefit to the recipient and r = relatedness

264
Q

MT HG 5/6: Population Genetics

what happens to change in allele frequency in haploids vs diploids. Which one is faster why?

A
  • Changes in allele frequency (∆q) can occur more rapidly in haploids (n) than diploids (2n)
  • This is because the relationship between genotype and phenotype is simpler in haploids
  • Diploids can be heterozygote and ‘hide’ some allele’s traits
265
Q

MT HG 5/6: Population Genetics

what is the change in allele frequency from one generation to the next expressed as?

A
  • ∆q = spq
  • p, q are the frequencies of the alleles P and Q, respectively
  • The Q allele has a fitness of 1+s
  • ∆q is the change in q from one generation to the next
266
Q

MT HG 5/6: Population Genetics

In the haploid allele frequency change equation, what are some possible outcomes of the equation and why?

A
  • ∆q = spq
  • q increases when s is positive (fitness is higher than average), and q decreases when s is negative (fitness is below average for this allele)
  • Allele frequency change is slower when either p or q is extreme, and fastest when p = q = 0.5
267
Q

MT HG 5/6: Population Genetics

why, in the haploid allele frequency change equation,
is allele frequency change slower when either p or q is extreme, and fastest when p = q = 0.5?

A
  • When p and q are both at 0.5, it means both alleles are equally frequent in the population. Under such conditions, the product pq will be at its maximum (since 0.5 x 0.5 = 0.25), leading to the fastest change in allele frequency as per the equation ∆q = spq.
  • Conversely, when either p or q is near 0 or 1 (meaning one allele is very rare or very common), the product pq will be closer to 0, resulting in a slower change in allele frequency.

(p + q = 1, so if p is 0.7 and q is 0.3, i.e. more extreme values, then 0.7 x 0.3 = 0.21, which is less than 0.25; pq reaches its maximum of 0.25 only when p = q = 0.5)
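The same point can be checked with a short derivation, sketched here using only p + q = 1:

```latex
\begin{aligned}
pq &= p(1-p) = p - p^{2}, \\
\frac{d}{dp}\left(p - p^{2}\right) &= 1 - 2p = 0 \;\Rightarrow\; p = \tfrac{1}{2}, \\
pq\,\big|_{p=0.5} &= 0.25, \qquad pq\,\big|_{p=0.7} = 0.21 < 0.25 .
\end{aligned}
```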

268
Q

MT HG 5/6: Population Genetics

when plotting allele frequency against time for haploids, what does the graph look like

A

sigmoidal curve,
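To see where the sigmoidal shape comes from, the sketch below simply iterates ∆q = spq from a rare starting frequency. The starting frequency, s, and the number of generations are hypothetical values, and the update is only a reasonable approximation when s is small.

```python
def haploid_trajectory(q0, s, generations):
    """Iterate dq = s * p * q, returning the allele frequency in each generation."""
    q = q0
    trajectory = [q]
    for _ in range(generations):
        q += s * (1 - q) * q
        trajectory.append(q)
    return trajectory

traj = haploid_trajectory(q0=0.01, s=0.1, generations=150)
steps = [later - earlier for earlier, later in zip(traj, traj[1:])]
fastest = steps.index(max(steps))   # generation with the largest change
print(f"largest per-generation change occurs near q = {traj[fastest]:.2f}")
print(f"final frequency = {traj[-1]:.4f}")
```

Plotting `traj` against generation number should give the S-shaped curve: slow change while Q is rare, fastest change near q = 0.5, and slow change again close to fixation.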

269
Q

MT HG 5/6: Population Genetics

what is an example of fast allele frequency changes/fast selection in haploids

A

rapid selection and fixation in influenza virus

270
Q

MT HG 5/6: Population Genetics

in diploids, what is fitness influenced by

A
  • In diploids, fitness is influenced by the degree of dominance (h) of an allele, as follows:
  • PP = 1 ;
  • PQ = 1 + hs ;
  • QQ = 1 + s
  • h ranges from 0 to 1
271
Q

MT HG 5/6: Population Genetics

the degree of dominance of an allele (h)

A
272
Q

MT HG 5/6: Population Genetics

what are the equations denoting for diploid fitness? What do they mean?

A
  • h = degree of dominance (0-1), 0 is completely recessive, 1 is completely dominant
  • Alleles P and Q exist
  • PP = 1: the baseline (reference) fitness. Individuals with the PP genotype are not affected by selection on the Q allele, since they do not carry it.
  • PQ = 1 + hs: heterozygous. If h = 0, Q is completely recessive and the fitness of PQ equals that of PP; if h = 1, Q is completely dominant and the fitness of PQ equals that of QQ. If 0 < h < 1, dominance is incomplete
  • QQ = 1 + s: the baseline fitness plus the full selection coefficient for this genotype. h is not considered as the genotype is homozygous
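The three fitness expressions can be tabulated for different values of h, as in this small sketch. The function name and the value s = 0.2 are hypothetical.

```python
def genotype_fitness(s, h):
    """Fitness of PP, PQ and QQ, with PP as the reference genotype."""
    return {"PP": 1.0, "PQ": 1.0 + h * s, "QQ": 1.0 + s}

for h in (0.0, 0.5, 1.0):   # recessive, additive, dominant Q allele
    print(h, genotype_fitness(s=0.2, h=h))
```

With h = 0 the heterozygote matches PP, with h = 1 it matches QQ, and with h = 0.5 it falls halfway between them.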
273
Q

MT HG 5/6: Population Genetics

How to quantify the effects of selection pressure against an allele (s) and the degree dominance of an allele (h) affect fitness/reproductive success for genotypes for diploids?

A

Using the equations:
* PP = 1
* PQ = 1+hs
* QQ = 1+s

274
Q

MT HG 5/6: Population Genetics

what affects the frequency of phenotypes in a diploid population

A

level of dominance

275
Q

MT HG 5/6: Population Genetics

what is additive alleles

A

The alleles are co-dominant

276
Q

MT HG 5/6: Population Genetics

what is the quantitative way/equation to express change in allele frequency in a diploid population from one generation to the next? Please explain

A
  • selection and dominance in diploid population
  • p, q are allele frequencies for 2 alleles
  • ∆q = spq [ph + q(1-h)]
  • ∆q is the change in the frequency q of allele Q over 1 generation
  • [ph + q(1-h)] weights the strength of selection by dominance, i.e. how much of the Q allele's fitness effect is expressed in heterozygotes. If h=0.5, the alleles are additive and the heterozygote's fitness is exactly intermediate between the two homozygotes. If h>0.5, the Q allele is more dominant, and if h<0.5, the Q allele is more recessive.
  • pq is proportional to the frequency of PQ heterozygotes (2pq) in the population
  • s multiplies the whole expression, so its sign sets the direction of change: with QQ = 1 + s, q increases when s is positive and decreases when s is negative
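A worked example may help. The sketch below evaluates ∆q = spq [ph + q(1-h)] for one generation under the fitness scheme PP = 1, PQ = 1 + hs, QQ = 1 + s, so a positive s favours Q. The values q = 0.1 and s = 0.1 are hypothetical.

```python
def delta_q(q, s, h):
    """One-generation change in the frequency of Q (weak-selection approximation)."""
    p = 1 - q
    return s * p * q * (p * h + q * (1 - h))

for label, h in (("recessive", 0.0), ("additive", 0.5), ("dominant", 1.0)):
    print(f"{label:9s} h = {h}: dq = {delta_q(q=0.1, s=0.1, h=h):.4f}")
```

While Q is rare, the same s produces a much larger change when Q is dominant than when it is recessive, because most copies of Q sit in heterozygotes.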
277
Q

MT HG 5/6: Population Genetics

∆q = spq [ph + q(1-h)]
what does this equation actually measure:

A

∆q = spq [ph + q(1-h)]
* change in Q allele frequency (q) in a diploid population from one generation to the next
* How much the frequency of the Q allele is expected to increase (if s is positive) or decrease (if s is negative) in the next generation, given QQ = 1 + s, taking into account the current allele frequencies, the degree of dominance and the strength of selection on the Q allele.
* models the impact of selection on allele frequencies, giving us an estimate of the evolutionary dynamics within a population.

278
Q

MT HG 5/6: Population Genetics

what does the degree of dominance largely affect in a diploid population?

A

the rate of allele change

279
Q

MT HG 5/6: Population Genetics

If a new, rare, selected allele is dominant in a diploid population, what would happen to the rate of allele change/fixation over time? Why

A
  • If the selected allele is dominant, its frequency increases rapidly at first and through intermediate frequencies, but change becomes very slow as it nears fixation
  • It spreads very quickly at first because even a rare dominant allele is expressed in heterozygotes, so selection can act on it immediately
  • but it is hard to fix, because the remaining unfavoured recessive alleles are hard to remove completely due to their ‘hidden’ nature in heterozygotes
280
Q

MT HG 5/6: Population Genetics

If a new, rare, selected allele is recessive in a diploid population, what would happen to the rate of allele change/fixation over time? Why

A
  • change is very slow initially but accelerates near fixation
  • very slow initially, because a rare recessive allele sits almost entirely in heterozygotes, where it is not expressed; two heterozygotes rarely come together to produce the selectively advantageous recessive homozygote, so it takes a long time
  • but once the population has a lot of recessive homozygotes, the dominant, unselected allele can’t hide and is removed quickly
281
Q

MT HG 5/6: Population Genetics

If a new, rare, selected allele is additive in a diploid population, what would happen to the rate of allele change/fixation over time? Why

A
  • Change is initially rapid and fixation is reached very rapidly
  • This is because less-fit alleles are more effectively selected against (they cannot hide)
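The dominant, recessive and additive cases can be compared by iterating ∆q = spq [ph + q(1-h)] and counting generations for an early phase (q from 0.01 to 0.5) and a late phase (q from 0.5 to 0.99). This is a rough sketch: the thresholds, s = 0.1, and the function name are all hypothetical choices.

```python
def generations_between(q_start, q_end, s, h, max_gen=1_000_000):
    """Generations needed for q to rise from q_start to q_end under dq = spq[ph + q(1-h)]."""
    q, gen = q_start, 0
    while q < q_end and gen < max_gen:
        p = 1 - q
        q += s * p * q * (p * h + q * (1 - h))
        gen += 1
    return gen

for label, h in (("dominant", 1.0), ("additive", 0.5), ("recessive", 0.0)):
    early = generations_between(0.01, 0.5, s=0.1, h=h)
    late = generations_between(0.5, 0.99, s=0.1, h=h)
    print(f"{label:9s} early = {early:5d}  late = {late:5d}  total = {early + late:5d}")
```

Under these assumptions the dominant allele should be quickest in the early phase but slowest near fixation, the recessive allele the reverse, and the additive allele should have no very slow phase.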
282
Q

MT HG 5/6: Population Genetics

what is balancing selection represented by?

A
  • when heterozygous advantage occurs
  • where heterozygotes (PQ) are in excess of HW equilibrium expectations
  • Both alleles will stably coexist with frequency that is proportional to the relative fitnesses of the two homozygotes
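The stable coexistence can be illustrated numerically. The sketch below uses a common textbook parametrisation that is not the one on these cards: heterozygote fitness 1, and homozygote fitnesses 1 - s_pp and 1 - s_qq, for which the expected equilibrium is q = s_pp / (s_pp + s_qq). The selection values are hypothetical.

```python
def equilibrium_q(s_pp, s_qq):
    """Predicted stable frequency of Q under heterozygote advantage."""
    return s_pp / (s_pp + s_qq)

def next_q(q, s_pp, s_qq):
    """Frequency of Q after one generation of selection with random mating."""
    p = 1.0 - q
    w_pp, w_pq, w_qq = 1.0 - s_pp, 1.0, 1.0 - s_qq
    w_bar = p * p * w_pp + 2 * p * q * w_pq + q * q * w_qq   # mean fitness
    return (p * q * w_pq + q * q * w_qq) / w_bar

q = 0.01
for _ in range(500):
    q = next_q(q, s_pp=0.1, s_qq=0.8)
print(f"simulated q = {q:.4f}, predicted q = {equilibrium_q(0.1, 0.8):.4f}")
```

Even though QQ homozygotes are strongly selected against here, Q is not lost: it settles at a frequency set by the relative fitness costs of the two homozygotes.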
283
Q

MT HG 5/6: Population Genetics

what is a classic example of balancing selection?

A
  • Sickle cell anaemia: homozygotes suffer from anaemia but heterozygotes have protection against malaria
284
Q

MT HG 5/6: Population Genetics

Two other types of selection can maintain genetic variation in a population

A
  • Frequency dependent selection: the larger the frequency of a genotype, the lower its fitness, and vice versa
  • Example: Self-incompatibility genes in plants control the germination of pollen on the female stigma and discriminate between self and nonself. Successful pollination occurs only when pollen and stigma are of opposite types. In such cases, a population would theoretically maintain multiple self-incompatibility alleles.
  • Example 2: in warning coloration of plants to avoid bird predation.
  • Fluctuating selection: allele fitness depends on an aspect of the environment that is rapidly and constantly changing
285
Q

MT HG 5/6: Population Genetics

what does it mean when a trait varies quantitatively?

A
  • means not a simple on/off, but more on a scale. Varies in degrees/levels
  • often polygenic, and controlled by multiple loci
  • often represented by a normal distribution (whereas single-locus traits have a skewed distribution)
  • quantitative traits: e.g. hair color, height, eye color, skin color (vary and are controlled by a lot of factors)
286
Q

MT HG 5/6: Population Genetics

what is quantitative trait loci (QTL)?

A
  • specific regions of the genome that correlate with the variation in a quantitative trait.
  • Mapping these loci can be complex because each locus may have a very small effect on the trait.
  • requires specific statistical approaches
287
Q

MT HG 5/6: Population Genetics

what are some fields which are important application of genetic dominance/structure?

A
  • Crop and animal breeding: picking advantageous alleles
  • Conservation genetics: ensuring conserved and variation in genetics to avoid certain diseases
  • Ecology: conservation of genetic variations
  • Forensic sciences: paternity tests
  • Disease epidemiology: mapping spread of disease in population
  • Anthropology and archaeology
  • Medical genetics
288
Q

MT HG 7: Coalescent Theory

How does coalescent theory link with molecular phylogenetics?

A
  • after constructing a tree, it can be used to interpret macroevolutionary models AT SPECIES LEVEL (such as co-evolution, adaptive radiation, etc.)
  • It can also be used to interpret POPULATION LEVEL PROCESSES, such as birth, death, etc.
  • and population level processes are interpreted using coalescent theory
289
Q

MT HG 7: Coalescent Theory

what is the coalescent theory?

A
  • a model of how alleles sampled from a population may have originated from a common ancestor.
  • In the simplest case, coalescent theory assumes no recombination, no natural selection, and no gene flow or population structure, meaning that each variant is equally likely to have been passed from one generation to the next. (variations exist which include more factors like gene flow)
  • The model looks backward in time, merging alleles into a single ancestral copy according to a random process in coalescence events.
  • Coalesce = to come together as one
  • Consider a single gene locus sampled from two haploid individuals in a population. The ancestry of this sample is traced backwards in time to the point where these two lineages coalesce in their most recent common ancestor (MRCA). Coalescent theory seeks to estimate the expectation of this time period and its variance.
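As a rough sketch of the expectation mentioned above, assume a neutral haploid Wright-Fisher population of N gene copies, where any pair of lineages coalesces with probability 1/N per generation (for diploids, N would be replaced by 2N). While k lineages remain, the expected wait for the next coalescence is then N divided by the number of pairs. These are standard results but are not stated on the card, and the function name is hypothetical.

```python
from math import comb

def expected_tmrca(n_samples, n_copies):
    """Expected number of generations back to the MRCA of n_samples lineages."""
    # Sum the expected waiting time while k lineages remain, for k = n_samples .. 2.
    return sum(n_copies / comb(k, 2) for k in range(2, n_samples + 1))

print(expected_tmrca(n_samples=2, n_copies=1000))    # two sampled lineages
print(expected_tmrca(n_samples=10, n_copies=1000))   # ten sampled lineages
```

Adding more samples increases the expected time to the MRCA only slightly, because most lineages coalesce quickly and the final pair dominates the total.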
290
Q

MT HG 7: Coalescent Theory

what are some population-level processes? How can they be inferred by coalescent theory?

A
  • population-level processes: structures/changes to a population
  • coalescent theory can be used to understand certain population-level processes such as
  • disease gene mapping: how a certain disease gene like cystic fibrosis has been inherited through the family,
  • genomic distribution of heterozygosity through tracking SNPs
  • population bottlenecks
  • population growth
  • source-sink dynamics
  • introgression (hybridisation via repeated back-crossing)
291
Q

MT HG 7: Coalescent Theory

applications of population genetics

A
  • disease gene mapping: how a certain disease gene like cystic fibrosis has been inherited through the family,
  • genomic distribution of heterozygosity through tracking SNPs
  • population bottlenecks
  • population growth
  • source-sink dynamics
  • introgression (hybridisation via repeated back-crossing)
292
Q

MT HG 7: Coalescent Theory

how is coalescent theory linked to population genetics

A
  • extension of population genetics and normally assumes neutrality, and
  • so sequences from more neutrally evolving portions of genomes are selected for analysis
293
Q

how to detect whether genes have undergone selection

A
  1. differences in genetic diversity
  2. compare silent and replacement changes within a gene
  3. McDonald-Kreitman (MK) test
  4. look for parallel/convergent evolutionary changes
294
Q

what are some possible outcomes of selection?

A

dN/dS = 1: neutral evolution
dN/dS < 1: purifying (negative) selection; nonsynonymous changes occur at a lower rate = conservation of protein sequence and function
dN/dS > 1: positive selection; at least some nonsynonymous changes are beneficial

295
Q

describe hard sweep, multiple mutation soft sweep, and single mutation soft sweep

A
  1. hard sweep: a single new beneficial mutation sweeps through the population, dragging closely linked neutral mutations with it (genetic hitch-hiking)
  2. multiple mutation soft sweep: multiple beneficial alleles sweep through, bringing their linked genes
  3. single mutation soft sweep: a single mutation sweeps through due to a shift in selection. The allele previously existed as neutral variation, but because of the shift in selection it became beneficial
296
Q

what is the effect of selective sweeps on linked genetic variation? Example

A
  • reduction in genetic diversity around the dhfr locus in malaria parasites
  • genetic diversity increases as you move away from locus
297
Q

what are the fates of mutations?

A
  • dependent on selection coefficient: the relative fitness between 2 genotypes or alleles
  • s>0 : positive selection, hence fixation of a gene
  • s~0: the mutation is effectively neutral, so its fate is determined by genetic drift
  • hence it can be fixed or eliminated
  • s<0: negative selection leads to elimination
298
Q

what is the equation for probability of fixation?

A

Ns is a composite parameter measuring the strength of selection at the population level. P = probability of fixation, u = mutation rate, k = substitution rate = NuP
* For neutral mutations (Ns = 0): P = 1/N, so k = NuP = u
* For strongly advantageous mutations (Ns > 1): P ~ 2s, so k = NuP ~ 2Nus
* For strongly disadvantageous mutations (Ns < -1): P ~ 0, so k = NuP ~ 0
* The stronger the positive selection, the faster the rate of substitution
* Negative selection = effectively no fixation, so the substitution rate is close to zero
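The three regimes above are limiting cases of a single curve. As a sketch, the code below uses Kimura's diffusion approximation for one new copy among N haploid gene copies, P = (1 - e^(-2s)) / (1 - e^(-2Ns)). That formula is an assumption brought in for illustration and is not given on the card. It is chosen because it tends to 1/N as s approaches 0 and to roughly 2s for strongly advantageous mutations. The parameter values are hypothetical.

```python
import math

def fixation_probability(s, n):
    """Fixation probability of a single new copy among n haploid gene copies."""
    if s == 0:
        return 1.0 / n                 # neutral case: P = 1/N
    if -2.0 * n * s > 700:             # strongly deleterious: avoid float overflow
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * n * s)

def substitution_rate(s, n, u):
    """k = N * u * P, with u the mutation rate per copy per generation."""
    return n * u * fixation_probability(s, n)

for s in (0.0, 0.01, -0.01):
    print(f"s = {s:+.2f}: P = {fixation_probability(s, 10_000):.3g}")
```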

299
Q

how do you use the dN/dS method

A
  • If all replacement mutations are neutral, then dN/dS = 1
  • If all replacement mutations are deleterious, then dN/dS = 0
  • dN/dS > 1 only if at least some replacement fixations are beneficial
  • When applied to whole genes, dN/dS is usually much less than 1.
  • because only a few codons are positively selected.
  • Most codons are selectively constrained (i.e. strong negative selection) to conserve the functions of important codons and therefore have a dN/dS close to zero.
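The interpretation rules above can be written as a small helper. This is only a sketch: the function name and the `tolerance` band around 1 are hypothetical, and a real analysis would use a statistical test rather than a fixed cut-off.

```python
def interpret_dn_ds(dn, ds, tolerance=0.05):
    """Return (omega, label) for a dN/dS ratio; tolerance is an arbitrary cut-off."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega > 1 + tolerance:
        label = "positive selection"
    elif omega < 1 - tolerance:
        label = "purifying (negative) selection"
    else:
        label = "approximately neutral"
    return omega, label

print(interpret_dn_ds(dn=0.02, ds=0.20))   # typical whole-gene pattern
print(interpret_dn_ds(dn=0.30, ds=0.10))   # e.g. codons under diversifying pressure
```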
300
Q

how does dN/dS vary among genes

A
  1. high dN/dS in codons that form active sites of genes
  2. due to host pathogen arms race
  3. high positive selection for non synonymous mutations which allow diversification to detect pathogen antigens
301
Q

when are silent changes not neutral?

A
  • Overlapping genes and alternate reading frames.
  • Very common in viruses with small genomes (e.g. HIV)
  • Regulatory sequence elements
  • Promoters, enhancers, splicing elements, miRNAs
  • Affect stability of RNA / mRNA / DNA structure
  • Codons for the same amino acid differ in fitness
  • May result from
    (i) different tRNA abundances,
    (ii) different translational efficiency / accuracy.
    Changes to processivity of translation may also affect protein folding.
  • Gives rise to codon bias and codon pair bias.
  • Biases in the process of mutation may also cause some codons to be used more often than others.
302
Q

what is the MK test?

A
  • MK test is a stats test used to detect natural selection in molecular sequences
  • The MK test compares the number of non-synonymous mutations to synonymous (silent) mutations within a species (polymorphic sites) and between species (fixed sites).
  • If there is no selection, site differences are purely due to genetic drift and neutral evolution, so the ratio of silent:replacement changes at fixed sites should equal the ratio of silent:replacement changes at polymorphic sites
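The comparison can be laid out as a 2 x 2 table of counts. The sketch below computes the two replacement:silent ratios and their quotient, the neutrality index, which is a standard summary not mentioned on the card. The counts are hypothetical, and a real MK test would also apply a significance test such as Fisher's exact test.

```python
def mk_summary(dn, ds, pn, ps):
    """Compare replacement:silent ratios for fixed (D) and polymorphic (P) sites."""
    fixed_ratio = dn / ds                     # between species
    poly_ratio = pn / ps                      # within species
    neutrality_index = poly_ratio / fixed_ratio
    return fixed_ratio, poly_ratio, neutrality_index

# Hypothetical counts: an excess of replacement changes among fixed differences.
fixed, poly, ni = mk_summary(dn=20, ds=40, pn=5, ps=50)
print(f"fixed = {fixed:.2f}, polymorphic = {poly:.2f}, NI = {ni:.2f}")
```

Under neutrality the two ratios should be about equal (NI near 1). An excess of replacement changes among fixed differences (NI below 1) points towards positive selection.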
303
Q

definition of virus

A
  • small infectious particles/agents that replicate inside living cells
  • high mutation rates
  • measurable in evolving populations
  • evolution occurs between hosts and within hosts
304
Q

HIV-1 characteristics

A
  • Single genome
  • New diversity generated by mutation and recombination
  • Gradual evolution
305
Q

Influenza-A characteristics

A
  • Comprises 8 genome segments,
  • each segment encoding 1 or more genes
  • New diversity generated by mutation and reassortment
  • Reassortment between segments can also occur
  • Contrast w HIV:
  • Little or no recombination in Influenza (new combinations arise by reassortment of segments instead), whereas HIV has lots of recombination
306
Q

what are some examples of viruses in the Baltimore classification?

A
  • reverse transcriptase = very error prone
  • Retroviruses like HIV are group VI viruses; use reverse transcriptase to replicate
  • Error prone and high mutation rate; Generally faster evolution and adaptation
  • Hepatitis B Virus (VII), Use reverse transcriptase to replicate; Error prone; High mutation rate
  • Coronavirus family (type IV); Uses RNA-dependent RNA polymerase (RdRp) to replicate; Error prone
  • Influenza (Type V); Uses RdRp; Also error prone
307
Q

how do viruses evolve?

A
  • within host scale
  • and between host scale
  • often evolution within a host is detrimental for transmission between hosts, so there is a trade-off between adapting at one scale or the other
  • Within host :
  • Multiple sequences from the same individual at different times
  • Virus evolves during its infection period when it is in host
  • Between host/population level
  • This approach looks at the evolution of the virus across different individuals.
  • Consensus sequences are derived from different individuals, possibly at different times or in different locations, to understand how the virus is evolving at the population level.
  • This can reveal patterns of transmission, how the virus adapts to different populations, and the emergence and spread of new variants
308
Q

what is an example of between host transmission tracking?

A
  • For SARS-CoV-2, the virus responsible for COVID-19, between-host genomics has been crucial for tracking the spread of the virus and identifying the emergence of new variants that may be more transmissible or evade vaccine-induced immunity.
309
Q

what are the 3 classifications of viral infections?

A

1) acute infection:
* influenza/SARS
* RNA viruses
* selection for transmission
* can cause chronic infections in immunocompromised patients

2) latent persistent infections
* herpes simplex virus
* DNA virus - lower mutation rate
* short bursts of high viral load, then dormancy before the next burst
* selection for transmission is important

3) chronic persistent
* HIV-1
* high mutation
* ongoing rapid evolution
* within host selection more important

310
Q

what is some evidence showing within-host evolution in HIV-1

A
  • Data from 11 HIV-1 infected individuals serially sampled for about 8 years and using whole-genome sequencing
  • Selected mutations typically involve evasion from host immunity
  • Mutations that are selected for in some individuals are selected against in others
  • Divergence, showing HIV-1 adapted to specific host environment
311
Q

what is virus toggling/adapt and revert?

A
  • Adapt and revert can explain why we see faster rates of evolution within individuals
  • Viruses adapt within a host, are then transmitted to another individual, and there revert to their previous/original state if the conditions change again
  • a virus might ‘adapt’ by mutating in a way that benefits its survival or transmission within a specific host, but then those mutations may ‘revert’ to a previous state if that adaptation is no longer advantageous in a new environment or host. This can be a way for viruses to maintain fitness across diverse environments and hosts, and is part of the reason why viral evolution can be so rapid and difficult to predict or control
312
Q

what is an example of adapt and revert?

A
  • HIV viruses
  • reversion of CTL escape mutations after transmission
  • The mother infected with HIV is HLA B57/5801 positive, an HLA type that affects her immune response. The virus in the mother has evolved the CTL escape mutation T242N, adapted to evade the mother’s immune response, giving the virus a fitness advantage
  • Child viral strain: after HIV transmission to the child, who is HLA B57/5801 negative, the T242N mutation is no longer adapted to the host immune response and carries a fitness disadvantage,
  • Hence virus reverts back to wild-type without the mutation
313
Q

how does toggling impact rate of evolution in viruses?

A
  • slow rate of evolution between host
  • due to constant adapting and reverting, driven by different selection pressures in different hosts
314
Q

how can acute infections become chronic?

A
  • i.e in COVID-19
  • long branches leading to variants of concerns
  • long branches are due to within host evolution during a chronic infection
  • causing within host and between host evolutions
315
Q

what is an example of viruses selection at population level

A
  • Monkeypox is a dsDNA virus (related to smallpox)
  • Naturally low evolutionary rate: 9 x 10^-6 substitutions per site per year (1-2 nucleotide changes per year)
  • But in human populations the substitution rate has been much higher
  • Likely a consequence of the action of APOBEC3 in humans:
  • APOBEC3 protein in humans can induce hypermutation in viral DNA: which can lead to a higher mutation rate in viruses that infect humans.
  • Hence a higher substitution rate
  • This may drive the monkeypox to adapt faster to infect human population
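The "1-2 nucleotide changes per year" figure follows from multiplying the per-site rate by the genome length. The sketch below assumes a genome of roughly 197,000 bp, an approximate value that is not given on the card.

```python
RATE_PER_SITE_PER_YEAR = 9e-6   # substitution rate quoted on the card
GENOME_LENGTH_BP = 197_000      # assumption: approximate monkeypox genome length

def expected_changes_per_year(rate=RATE_PER_SITE_PER_YEAR, length=GENOME_LENGTH_BP):
    """Expected substitutions per genome per year."""
    return rate * length

print(f"{expected_changes_per_year():.2f} substitutions per genome per year")
```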
316
Q

what is the C value paradox?

A
  • the lack of correlation between organism complexity and genome size
317
Q

what are 4 reasons for the 95% non-coding DNA in eukaryotes

A
  1. It has essential global regulatory functions for gene expression.
  2. It is “junk” DNA with no use, carried passively by chromosome simply because it is linked/hitchhiked to functional genes
  3. It serves a structural role in the genome, related to chromosome structure or nucleoskeletal function (i.e related to cell volume) but not to carry information.
  4. It is “selfish” DNA that exists only to replicate itself, offering no benefit to the host.
318
Q

what are 4 theories arguing that genome size is adaptive

A

1: Skeletal function of genome size and cell volume: larger cells require more DNA, to ensure a constant nucleus-to-cell volume ratio for balanced RNA and protein synthesis
backed by evidence in algae: DNA content is correlated with nuclear volume

319
Q

arguments for non-adaptive genome sizes

A

effective population sizes are too small for natural selection to effectively remove non-coding DNA from eukaryotic genomes
- Nes < 1 (effective population size x selection coefficient), so genetic drift dominates evolutionary dynamics
- Hence we see fixation of slightly deleterious/non-functional genes
- In contrast, large effective population sizes in bacteria could allow removal of non-coding DNA due to Natural Selection

320
Q

difference between bacterial and eukaryotic DNA

A
  • there is little non-coding DNA in bacteria, because it would take far too long to replicate from a single origin
  • the generation time would be too long and incredibly costly
321
Q

what are the types of non-coding DNA in eukaryotes?

A
  1. satellite DNA
  2. minisatellites
  3. TE and endogenous retroviruses
  4. spacer DNA
322
Q

satellite DNA vs micro/minisatellites

A
  • both are repetitive
  • satellite DNA is more repetitive
  • satellites are located in heterochromatin
  • mini and microsatellites have a G-rich core
323
Q

what is the evolution of mini/microsatellites

A
  • Both have extremely high mutation rates
  • mostly caused by point mutation, unequal crossing over and DNA slippage (DNA mispair during replication and recombination)
324
Q

what are transposable elements

A
  • Selfish DNA sequences that can increase their copy number by jumping (transposing) around the genome and making additional copies of themselves as they do so.
  • Make up ~50% of the genome in some eukaryotes, including humans (42.5%)
  • Major component of non-coding DNA in humans
325
Q

how to classify transposable elements in eukaryotes?

A
  1. Class I elements (retroelements): further split into LTR retrotransposons and non-LTR retrotransposons
  2. Class II elements (DNA elements)
  3. MITEs
326
Q

LTR retrotransposons?

A

a) Have long terminal repeats (hundreds of bp repeats)
b) Include retrotransposons, endogenous retroviruses
c) Move within the genome by reverse transcribing their RNA into cDNA, which is then inserted into the genome
d) The LTRs are regulatory sequences for retrotransposons and play a critical role in reverse transcription
i. Act as promoter and terminator, and help stabilize retrotransposons
e) Present in the human genome, but most are inactive, apart from the HERVs, which are still somewhat active

327
Q

what are non-LTR retrotransposons

A

a) More active in the human genome, causing mutations and genetic variation
b) No LTRs
i. But the transcribed RNA encodes a reverse transcriptase that uses target-primed reverse transcription (TPRT), in which the target DNA acts as a primer for the reverse transcriptase to copy the RNA into DNA

328
Q

what are class II DNA elements?

A
  1. Ac-like (Activator-like) elements
  2. Encode a transposase
  3. The transposase mediates movement within the genome
  4. Terminal inverted repeats (TIRs): the transposase recognizes this class of TE because the elements are flanked by TIRs
  5. Cut-and-paste mechanism: transposase cuts the element from one location in the genome and pastes/integrates it into another location.
  6. Non-replicative: transposition does not involve replication of the element, so it is excised from its original location when it is inserted into a new one.
  7. Differs from class I, where the element is copied and the copy is integrated
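
The contrast between cut-and-paste (class II) and copy-and-paste (class I) movement can be sketched on a toy genome. This is a minimal illustration only: the genome is modelled as a Python list of labelled segments, and the function names and segment labels are hypothetical.

```python
def cut_and_paste(genome, src, dst):
    """Class II style: excise the element at index src, then reinsert it.

    dst is the index in the genome after the element has been excised.
    """
    new_genome = list(genome)
    element = new_genome.pop(src)    # the element leaves its original site
    new_genome.insert(dst, element)
    return new_genome

def copy_and_paste(genome, src, dst):
    """Class I style: the original stays in place and a copy is inserted at dst."""
    new_genome = list(genome)
    new_genome.insert(dst, new_genome[src])
    return new_genome

genome = ["geneA", "TE", "geneB", "geneC"]
moved = cut_and_paste(genome, 1, 3)
copied = copy_and_paste(genome, 1, 3)
print(moved, moved.count("TE"))    # one copy, at a new position
print(copied, copied.count("TE"))  # two copies
```

Copy number is unchanged by the cut-and-paste move and increases by one with each copy-and-paste event.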
329
Q

what are MITES

A
  1. Don’t make any proteins; rely on other transposons’ proteins to copy themselves
  2. Also have flanking TIR regions
  3. Non-autonomous: don’t encode the enzymes necessary for movement
  4. Rely on transposases encoded by other, autonomous elements
  5. High copy numbers in the genome
  6. Their insertion at a target site is associated with short direct repeats
330
Q

what are the functions of these TEs

A
  1. Regulate gene expression: MITEs can affect the expression of nearby genes and can alter chromatin structure
  2. Genomic diversity: created by inserting and integrating into various sites
  3. Evolution: contribute to genomic rearrangement and the speciation process
  4. Mutation: integration of a TE can disrupt gene function and lead to mutations
331
Q

what are the major groups of non-LTR retrotransposons

A
  • LINEs: very common in eukaryotes
    1. encode their own reverse transcriptase
    2. insertion of retroelements into genes can cause deleterious mutations. For example, insertion of L1 into the gene for factor VIII can cause haemophilia
  • SINEs:
    1. do not encode their own reverse transcriptase
    2. parasitize the reverse transcriptase from LINEs
332
Q

how do TEs cause DNA gain and loss

A
  • TEs accumulate rapidly in genomes
  • up to 100 copies in 1 generation
  • maize genome increased 50% in 6 million years due to insertion of 23 retrotransposons
  • may be difficult to remove, but they accumulate fast
333
Q

what is a significant example of P elements in Drosophila

A
  1. Wild flies carry P elements, while lab strains do not
  2. In D. melanogaster, the insertion of P elements can lead to hybrid dysgenesis (increased infertility due to chromosome breakage)
  3. Hybrid dysgenesis only occurs in the offspring of crosses between females from laboratory strains and males from wild strains.
  4. Transposable elements like P elements might have created reproductive isolation
    * Causing speciation??
    * Possibility of TEs causing or contributing to speciation
  • The P element is a 3 kb class II element
334
Q

describe the retrovirus life cycle?

A
  • this type of life cycle - endogenous life style
  • Retroviral infection of germline
  • Fixation
  • Amplification
  • Inactivation through mutations
  • Loss through recombinational deletion
  • Decay into junk
  • Co-option (changes in function, or new function arises)
335
Q

describe the activity of ERV and Human ERV

A
  • ERVs were acquired through the endogenous life style
  • all 3 classes of ERVs are active in mice, but only Class 2 is still somewhat active in humans
  • HERV-K(HML2) is still active in humans, can still express envelope proteins, and can cause cancer
  • retrotransposition of HERVs can lead to mutagenesis and genetic disorders, but this is rare
  • HERV-K(HML2) encodes the Rec protein; when injected into immunocompromised mice, it induced tumour formation
336
Q

what is the consequence of ERV activity leading to co-option

A
  1. Co-option, e.g. the salivary amylase gene, where LTRs act as promoters that regulate the gene
337
Q

consequence of ERV leading to recombination?

A
  1. ERVs and TEs can lead to chromosomal rearrangement through homologous recombination between distant loci
    - In the HERV-K(HML2) family, 16% of elements may have been involved in large scale rearrangements of the human genome.
  2. Most large scale rearrangements will be highly deleterious.
    - However, the ectopic exchange hypothesis predicts that TEs will be preferentially found in regions of low recombination.
    - mammals undergo lower levels of ectopic exchange maybe due to high levels of retroelement activity in early evolution
338
Q

what is the ectopic exchange model?

A
  • Ectopic exchange: recombination between TEs and other positions in the genome. It happens between sequences at different locations, causing chromosomal rearrangement
  • Selection against TEs that cause ectopic exchange is the major force limiting TE copy numbers in genomes, because TE ectopic exchange is deleterious
  • Prediction: TEs are more likely to be found in regions with lower levels of meiotic recombination, because insertions in high-recombination regions carry a higher risk of ectopic exchange and would have been selected against
  • significant test results in humans and flies, which outcross
  • less significant in HERVs (for unknown reasons) and in self-fertilising species
339
Q

what are functional non-coding DNA?

A
  • They include:
    1. Regulatory elements: promoters, enhancers, silencers, etc.
    2. Non-protein-coding genes: rRNA, tRNA, microRNA
    3. Chromosomal structural elements: centromeres and telomeres, plus other elements that help chromosome folding and nuclear organization
340
Q

how to identify functions of non coding DNA

A
  • measure its rate of evolution (or fixation rate) across species; if it is evolving very slowly, it is likely to be highly conserved and therefore functional
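
One simple way to picture this is to compare two aligned sequences window by window and look for windows with few differences. The sketch below is a toy illustration: the function name, the window size and the two short sequences are made up, and real analyses use multi-species alignments and substitution models rather than raw mismatch counts.

```python
def window_divergence(seq_a, seq_b, window):
    """Fraction of mismatched sites in consecutive, non-overlapping windows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    fractions = []
    for start in range(0, len(seq_a) - window + 1, window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        mismatches = sum(1 for x, y in zip(a, b) if x != y)
        fractions.append(mismatches / window)
    return fractions

# Toy alignment: the first 10 sites are identical, the last 10 have diverged.
species_1 = "ACGTACGTAC" + "GGCTAAGCTT"
species_2 = "ACGTACGTAC" + "GACTTAGGTA"
print(window_divergence(species_1, species_2, 10))  # [0.0, 0.4]
```

The first window, with no differences, is the kind of slowly evolving region that would be flagged as a candidate functional element.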
341
Q

what is an example of accelerated regions showing species specific adaptation?

A
  • human accelerated regions
  • HAR1, the most rapidly diverging of the HARs, is part of a gene that produces RNA (not protein)
    1. This RNA gene is involved in the development of the human cerebral cortex (part of brain for complex cognitive functions)
    2. Evolution of HAR1 may have contributed to the unique aspects of human brain development and function
    3. HAR1 is expressed during development of the human neocortex (7-19 weeks of gestation)
  • Maybe HAR1 is responsible for the unique cognitive abilities of humans
342
Q

first-generation sequencing / Sanger sequencing

A

How It Works:
1. DNA Preparation: DNA is first cloned into vectors and amplified.
2. Primer Annealing: A primer is annealed to a single-stranded DNA template.
3. Chain Termination: Incorporation of chain-terminating dideoxynucleotides (ddNTPs) during the DNA synthesis process. Each ddNTP is labeled with a different fluorescent dye.
4. Electrophoresis and Detection: The resulting DNA fragments are separated by size using capillary electrophoresis, and the ddNTPs are detected based on their fluorescent labels.
* Key Features:
1. Produces long read lengths (up to 900 bases).
2. High accuracy but relatively low throughput and higher cost per base compared to newer methods.
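
The chain-termination logic above can be sketched as a toy model. It is idealized on purpose: exactly one terminated fragment is produced per position, the template is assumed to be written 3'→5' so the new strand reads left to right, and the function names and example template are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def terminated_fragments(template_3_to_5, seed=0):
    """Return one chain-terminated fragment per position, in mixed order.

    Each fragment is a prefix of the newly synthesized strand; its last
    base stands for the labelled ddNTP that stopped synthesis.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3_to_5)
    fragments = [new_strand[:i] for i in range(1, len(new_strand) + 1)]
    random.Random(seed).shuffle(fragments)  # the tube holds an unordered mixture
    return fragments

def read_by_size(fragments):
    """Electrophoresis step: order fragments by length and read each terminal base."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)

template = "TACGGATC"
print(read_by_size(terminated_fragments(template)))  # ATGCCTAG
```

Sorting by size recovers the synthesized strand because each fragment length corresponds to exactly one position and each terminal label to the base at that position.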

343
Q

Illumina sequencing

A

How It Works:
1. Library Preparation: DNA is randomly fragmented, and adapters are ligated to both ends of the fragments.
2. Cluster Generation (Illumina-specific): DNA fragments are attached to a solid surface and amplified locally to form dense clusters of identical DNA molecules.
3. Sequencing by Synthesis (Illumina): Incorporates fluorescently labeled nucleotides one at a time, imaging each incorporation event.
* Key Features:
1. Provides shorter read lengths than Sanger sequencing (typically 100-300 bases) but much higher throughput.
2. Lower cost per base and capable of sequencing millions of fragments simultaneously.
3. Ideal for whole-genome sequencing, resequencing, and various genomic applications.

344
Q

Next Generation sequencing

A
  • allows for massive parallel sequencing of DNA and RNA
    1. Library Preparation: Similar to other second-generation methods, NGS involves preparing a library of DNA fragments with adapters added to each end. The library can be tailored for different applications, such as whole-genome sequencing, exome sequencing, or targeted sequencing.
    2. Cluster Generation: On platforms like Illumina, once the library is prepared, DNA fragments are attached to a flow cell where each fragment is clonally amplified.
    3. Sequencing:
      • Sequencing by Synthesis (e.g., Illumina): This involves synthesizing the complementary DNA strand and capturing images after each nucleotide addition. Each of the four nucleotides is tagged with a different fluorescent marker.
      • Nanopore Sequencing (e.g., Oxford Nanopore): Involves passing a DNA strand through a nanopore and measuring changes in electrical conductivity caused by different bases.
    4. Data Analysis: The raw data generated by these technologies are then processed using bioinformatics tools to align reads and identify genetic variants.
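
The data analysis step (aligning reads, then identifying variants) can be sketched in a very reduced form. This is only an illustration under strong assumptions: reads are placed without gaps at the offset with the fewest mismatches, and a variant is called when the most common base at a position differs from the reference and has at least `min_support` reads. The function names, threshold, reference and reads are all made up. Real pipelines use dedicated aligners and variant callers.

```python
from collections import Counter, defaultdict

def best_offset(reference, read):
    """Ungapped placement: the reference offset with the fewest mismatches."""
    best, best_mismatches = 0, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        mismatches = sum(1 for a, b in zip(reference[offset:], read) if a != b)
        if mismatches < best_mismatches:
            best, best_mismatches = offset, mismatches
    return best

def call_variants(reference, reads, min_support=2):
    """Pile up aligned reads and report positions whose top base differs from the reference."""
    pileup = defaultdict(Counter)
    for read in reads:
        offset = best_offset(reference, read)
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1
    variants = []
    for pos in sorted(pileup):
        base, count = pileup[pos].most_common(1)[0]
        if base != reference[pos] and count >= min_support:
            variants.append((pos, reference[pos], base))
    return variants

reference = "ACGTTGCAAGCTTACG"
# Toy reads from a sample carrying G instead of A at position 7 (0-based).
reads = ["GTTGCGAG", "TGCGAGCT", "CGAGCTTA"]
print(call_variants(reference, reads))  # [(7, 'A', 'G')]
```

Requiring support from several reads is a simple guard against calling a sequencing error in a single read as a variant.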
345
Q

what causes large plant genomes

A
  • whole genome duplication events
  • arises from polyploidization events followed by chromosome reshaping
  • WGD can become fixed if chromosome reshaping is involved
346
Q

key points for WGD

A
  • WGD doubles genome size and gene number, initially
  • Acts over hundreds of millions of years
  • Has profound effects on the evolution of genome architecture
  • Underpins innovations and adaptation in flowering plants (angiosperms)
  • Not sufficient to account for very large genome sizes (monocots, gymnosperms)
347
Q

what are plant TE?

A
  • Plant LTR-retrotransposons are classified into two super-families, Ty1/copia and Ty3/gypsy.
  • Enormous number of families with highly diverse DNA sequences;
  • these are usually specific to a single or a group of closely related species
348
Q

retrotransposons in monocots

A
  • LTR-RT comprises 30-70% of monocot genomes
  • Maize: 65% of the genome is LTR-RT sequence, representing 1.5 Gigabases
  • LTR-RT tend to be highly nested; this makes genome sequence analysis very challenging
  • Whilst most LTR-RT are degenerate and inactive, stress tends to activate the movement of intact copies
349
Q

key points on retrotransposons for plant genomes

A
  • LTR-RT are the most abundant TEs in nearly all plants
  • their abundance is generally what causes large genomes
350
Q

how does polyploidization arise?

A

Two main types of interest
1. Autopolyploidy: Multiple chromosome sets derived from a single taxon. Results from non-disjunction of chromosomes during meiosis, producing abnormal gametes, OR from spontaneous somatic genome doubling, which has been observed in plants (e.g. fruit trees).

2. Allopolyploidy: Multiple chromosome sets derived from two or more diverged taxa.
351
Q

what causes genome architecture

A
  • Chromosome reshaping following WGD accounts for much of the diversity in the makeup and structure of plant genomes
  • The fate of duplicated genes is influenced by major evolutionary forces including adaptation
352
Q

what are summaries of polyploidization in plants

A
  • Can cause rapid and short term phenotypic change. Multiple (e.g. gene expression) and complex molecular effects result from combining different genomes
  • Polyploids are common across Angiosperms but infrequent in Gymnosperms
  • Phenotypic effects can be large, and several of the “beneficial” changes have been captured in
    plant domestication and the development of plant varieties used in agriculture