Genetics Flashcards
Briefly outline previous findings on genetic evidence for human diversity
- Evidence for Out of Africa through mtDNA (Cann, Stoneking, Wilson 1987)
- Dating of human/chimpanzee divergence time (first Sarich and Wilson 1967)
- Identifying new ancient hominins and resolving our relationships with them (e.g. Denisovan full genome, Meyer et al 2012)
- ‘Seeing’ genetic evolution through selection studies
Outline Darwin’s contributions to genetics
- Evolution by natural selection, in The Origin of Species (1859)
- Requires heritable variation in reproductive success
Outline historical models on how inheritance occurs
- Preformationism, 17th-18th century: ovists vs spermatists
- Blending, 19th century – mix of maternal and paternal
- Darwin’s pangenesis (1868): body parts continuously generate ‘gemmules’, which migrate to the gonads and later develop into the cells/organs they came from
Outline Mendel’s contribution to genetics
- Experimental, breeding peas, from 1856-1863 (>30k plants)
- Published in 1866
- Suggested discrete units of inheritance. ‘Particulate inheritance’ where discrete units (genes) control for discrete traits
- Genes come in different forms (alleles), so inheritance is not blended (blending would give semi-wrinkled peas rather than wrinkled or smooth)
- Principle of segregation: one pair of genes per individual (diploid); the two gene copies in each parent segregate during reproduction, with one random copy being passed on by each parent
- Idea of dominant and recessive traits
- Law of independent assortment: traits are inherited independently of one another (except when genes are linked)
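The principle of segregation and dominance can be illustrated with a small enumeration of a single-gene cross. This is a minimal sketch, not Mendel's own procedure: the allele labels `A`/`a` and the function names `cross` and `phenotype_counts` are made up for illustration.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Enumerate offspring genotypes for a single-gene cross.

    Each parent is a two-letter genotype such as "Aa" and passes on one of
    its two alleles with equal probability (principle of segregation).
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so "aA" and "Aa" count as the same genotype
        # (uppercase letters sort before lowercase ones).
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += 1
    return offspring


def phenotype_counts(genotypes: Counter, dominant: str = "A") -> Counter:
    """Collapse genotypes to phenotypes: one dominant allele is enough."""
    phenotypes = Counter()
    for genotype, count in genotypes.items():
        label = "dominant" if dominant in genotype else "recessive"
        phenotypes[label] += count
    return phenotypes


f2 = cross("Aa", "Aa")
print(dict(f2))                    # {'AA': 1, 'Aa': 2, 'aa': 1}
print(dict(phenotype_counts(f2)))  # {'dominant': 3, 'recessive': 1}
```

Crossing two heterozygotes gives the 1:2:1 genotype ratio and the 3:1 phenotype ratio; no intermediate "blended" class appears.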
Outline Galton’s/Fisher’s contributions to genetics
- Apparent conflict between Mendel’s discrete units and ‘complex’ traits like height, where offspring tend towards the average parental height, as if inheritance were blending
- Fisher (1918) showed that Mendelian inheritance is consistent with biometric observations if many genes each contribute a small effect
List ‘classic’ results in population genetics
- Hardy-Weinberg equilibrium (1908)
- R. A. Fisher on Mendelian traits (1918)
- J. B. S. Haldane (A Mathematical Theory of Natural and Artificial Selection, 1924-1934)
- S. Wright (e.g. Evolution in Mendelian Populations, 1931)
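The Hardy-Weinberg result can be stated compactly: for one gene with two alleles at frequencies p and q = 1 - p, random mating gives genotype frequencies p², 2pq and q². A minimal sketch, assuming a single biallelic locus and no selection, mutation, migration or drift; the function name `hardy_weinberg` and the example frequency 0.7 are illustrative only.

```python
def hardy_weinberg(p: float) -> dict:
    """Expected genotype frequencies at a biallelic locus under random mating.

    p is the frequency of allele A; the other allele a has frequency 1 - p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


freqs = hardy_weinberg(0.7)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.2f}")
```

The three frequencies always sum to 1, and they stay constant from one generation to the next as long as the assumptions hold.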
Outline the catch up of cell biology with Mendelian genetics
- 1869: Miescher named a chemical found in the nuclei of cells ‘nuclein’, later known as nucleic acid
- 1880s: Weismann proposed that bodies in cell nuclei called chromosomes were the basis of heredity
- 1953: Watson, Crick, Franklin, Wilkins elucidated structure of one of Miescher’s nucleic acids, DNA
Outline the basic elements of DNA
- DNA: deoxyribonucleic acid
- Nucleotides: A, C, G, T
- Base pairing = purine (adenine or guanine) + pyrimidine (thymine or cytosine): A pairs with T, G pairs with C
- Semi-conservative replication – after replication, each new double helix is formed of one original strand and one new one
- Mistakes can happen during replication, leading to variation
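Base pairing and semi-conservative replication can be sketched in a few lines. This is a toy model under stated assumptions: strands are plain strings, replication is error-free, and the names `complement_strand` and `replicate` are invented for this example.

```python
# Watson-Crick pairs: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand: str) -> str:
    """Return the partner strand, written in the opposite (antiparallel) direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def replicate(helix: tuple) -> tuple:
    """Semi-conservative replication of a (strand, partner) pair.

    Each daughter helix keeps one original strand and gains one newly
    synthesised complementary strand.
    """
    old_a, old_b = helix
    daughter1 = (old_a, complement_strand(old_a))  # old_a kept, partner is new
    daughter2 = (complement_strand(old_b), old_b)  # old_b kept, partner is new
    return daughter1, daughter2


parent = ("ATGC", complement_strand("ATGC"))
d1, d2 = replicate(parent)
print(parent, d1, d2)
```

With error-free copying both daughters are identical to the parent helix; a copying mistake in `complement_strand` would be the source of new variation.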
Outline the genome
- definition- total DNA content of a cell
- ~3.2bn base pairs per genome copy
- Two genome copies in most human cells (‘diploid’ = two copies of each chromosome, ‘haploid’ = one copy). Exceptions: gametes (sperm, egg) are haploid, and some somatic cells, e.g. hepatocytes in the liver, can be polyploid (>2 genome copies/cell)
- Split into 22 pairs of autosomes and 1 pair of sex chromosomes
- Haploid n = 23, diploid 2n = 46.
- Also many mitochondria in cells (1-2.5k per cell, ‘power plants’ of the cell), each with several copies of mtDNA (mitochondrial DNA)
Outline the chromosomes
Autosomal chromosomes (22 pairs):
* Vary in length (47-249 Mb)
* Inherit one copy from each parent
Sex chromosomes (1 pair):
* X and Y, genetic sex determination
* XX (genetically female) and XY (genetically male); rarely, other karyotypes, often with health impacts
* Y = paternal inheritance, passed from father to son
* X = female-biased inheritance: mothers pass one X to every child, fathers pass their X only to daughters (XX)
Outline mitochondrial DNA
- Short, 16,569 bp circular chromosome
- 1000s of near-identical copies per cell
- Almost exclusively maternal inheritance
Outline the function of the genome
- Carries ~20,000 protein-coding genes
- These are transcribed into mRNA and then translated into proteins, with some intermediate steps (e.g. splicing, which determines which exons are translated and can lead to alternative protein isoforms; protein folding)
- The Central Dogma (Crick 1958) states that information (here, the sequence code) cannot be passed on by proteins. Simplified to DNA -> RNA -> proteins
Outline translation/coding
- Translation occurs amino-acid by amino-acid, with each DNA triplet (codon) coding for an amino acid
- The coding is redundant – multiple codons code for the same amino acid
- The near-universality of this code among living organisms is strong evidence for the common ancestry of all life
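The redundancy of the code is easy to see by building the standard codon table and counting codons per amino acid. A minimal sketch: it uses DNA letters (T rather than U), the standard nuclear genetic code, and one-letter amino-acid symbols with `*` for stop; the names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a coding sequence codon by codon (trailing bases are ignored)."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        protein.append(CODON_TABLE[coding_dna[i:i + 3]])
    return "".join(protein)


# Redundancy: several codons map to the same amino acid, e.g. leucine (L).
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")
print(translate("ATGCTGTAA"))
print(leucine_codons)
```

There are 64 codons but only 20 amino acids plus stop, so most amino acids have more than one codon; leucine has six.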
Outline the composition of the human genome
- Only 1.5% of the human genome is protein-coding (exons)
- The rest is non-coding (introns and intergenic sequence); part of it is thought to be under evolutionary constraint (implying a biological role), and some can affect gene regulation
- E.g. the ENCODE 2012 project estimated that 80% of the genome is biochemically active and therefore might have some impact on gene regulation
Outline the link between genome size/complexity and organisms
- Humans do not have particularly big genomes, but there are broader trends – e.g. eukaryotes, and especially vertebrates, have larger genomes.
- more ‘complex’ organisms have relatively more non-coding DNA- complex regulation may be especially important
Outline genetic mutations, including types
- ‘Random’, but the rate of mutation varies along the genome (average ~1.25e-8 / bp / generation)
- Somatic (not carried in germ cells; risk varies depending on when in development they occur; cancer is a risk) or germline (present in all tissues of the offspring and in half of its gametes)
- Allows novelty in evolution, divergence between species, variation among individuals
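The per-base rate translates into a rough count of new mutations per person. A back-of-envelope sketch, assuming the average rate of 1.25e-8 applies uniformly and using the ~3.2bn bp genome size given elsewhere in these notes; the function name is illustrative.

```python
MUTATION_RATE = 1.25e-8    # mutations per bp per generation (average)
HAPLOID_GENOME_BP = 3.2e9  # base pairs per genome copy


def expected_new_mutations(rate: float, genome_bp: float, copies: int = 2) -> float:
    """Expected number of new germline mutations inherited by one child.

    A child receives two genome copies (one per parent), each of which
    can carry new mutations.
    """
    return rate * genome_bp * copies


print(round(expected_new_mutations(MUTATION_RATE, HAPLOID_GENOME_BP)))
```

This gives roughly 80 new mutations per child, about 40 per inherited genome copy, under the uniform-rate assumption.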
Outline types of genetic mutations
- indel- insertion/deletion
- Point: a single-base change, creating a SNP (single nucleotide polymorphism); can be synonymous, missense, or nonsense (stop)
Outline a particular area of new diversity due to mutations
- Short tandem repeats, STRs (microsatellites): short DNA motifs repeated sequentially
- These mutate quickly by adding or losing repeats due to ‘slippage’ during meiosis
- Due to the high mutation rate they are very diverse, and are used extensively in forensics and for paternity testing
Outline structural genetic mutations
- can remove, or duplicate, many genes at once
- can have profound phenotypic impacts
Outline 2 examples of effects of genetic mutations
- MC1R gene: switch from dark to red melanin. Recessive inheritance; several different non-synonymous variants in Europe cause varied degrees of loss of function
- Caspase 12 gene: various immune functions, e.g. the truncated form increases risk of mild bacterial infections but decreases risk of sepsis; the truncated form is at high frequency, especially outside Africa
Outline regulation (mutations)
- Changes can also affect how much protein you make; when the protein is made (in utero, childhood, adulthood; certain times of day or year); and where the protein is made (e.g. brain and gut but not lungs)
- Cis regulation: variation impacting regulation in nearby genes
- Trans regulation: variation impacting expression/function of protein products that in turn regulate other usually distant genes
Outline recombination
- refers to when a segment of a chromosome is swapped between the two chromosome copies of an individual
- Occurs during meiosis
- Quite frequent (e.g. over the genome, an average of 41.1 crossovers per meiosis in mothers and 26.4 in fathers, Chowdhury et al 2009)
- Average rate ~1 crossover per 100 Mb per meiosis, but huge variation creates ‘recombination hotspots’, sometimes with 1000x the average genome-wide rate
- mtDNA and the majority of the Y chromosome don’t recombine
- less frequent between genes that are nearby on a chromosome (linked)
- Distant genes are inherited more independently (more like genes on different chromosomes)
- Non-random association of alleles at different loci is called ‘linkage disequilibrium’ (LD). High LD suggests genes are close to each other on a chromosome
- Recombination mixes up the code on copies of a chromosome creating new combinations of variants – new haplotypes. But not new variants (in itself)
- considered a major evolutionary advantage of sex because it breaks up associations e.g. if a chromosome has one gene version that is very bad for survival and a different gene that is very advantageous, recombination breaks up the association so selection can act independently
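Linkage disequilibrium can be quantified from haplotype counts. A minimal sketch using two common summary statistics, D = P(AB) - P(A)P(B) and r² = D² / (pA(1 - pA) pB(1 - pB)), for two biallelic loci; the haplotype labels (`AB`, `Ab`, `aB`, `ab`) and the example counts are made up for illustration.

```python
from collections import Counter


def linkage_disequilibrium(haplotypes: list) -> tuple:
    """Return (D, r_squared) for two biallelic loci.

    Each haplotype is a two-letter string: first letter A/a is the allele
    at locus 1, second letter B/b is the allele at locus 2.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = (counts["AB"] + counts["Ab"]) / n   # frequency of allele A
    p_b = (counts["AB"] + counts["aB"]) / n   # frequency of allele B
    p_ab = counts["AB"] / n                   # frequency of haplotype AB
    d = p_ab - p_a * p_b                      # zero if alleles associate at random
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Hypothetical sample where A tends to travel with B (little recombination).
linked = ["AB"] * 40 + ["Ab"] * 10 + ["aB"] * 10 + ["ab"] * 40
# Hypothetical sample where recombination has fully shuffled the alleles.
shuffled = ["AB"] * 25 + ["Ab"] * 25 + ["aB"] * 25 + ["ab"] * 25

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(shuffled))
```

The allele frequencies are identical in both samples; only the way alleles are combined into haplotypes differs, which is exactly what recombination changes.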
Outline diversity in the human genome
- Approximately 20 million base pairs differ between one’s two genome copies
- These differences are caused by 3.5-4.5 million single nucleotide polymorphisms (SNPs), which each impact one base pair; 500-600k indels; and significant numbers of larger deletions
- Each person carries more than 100 ‘dead gene’ copies (loss-of-function variants, LoFs). Most LoFs are heterozygous, i.e. with one good copy intact, but some that impact non-essential genes are homozygous
When is a gene called essential, and how many such genes are there in the human genome?
- when the loss of its function compromises viability of the individual (for example, embryonic lethality) or results in profound loss of fitness
- ~ 3,000 human genes cannot tolerate loss of even one of the two gene copies (haploinsufficiency) (Bartha et al. 2018)
Outline the nature/history of data regarding human genome sequencing
- Human Genome Project: started sequencing in 1990, main publications in 2001, ran until 2003; cost approx. $3bn for one genome
- Cost then decreased rapidly: in 2022, an average of $525
- Led to a massive, fast increase in data
- ENA (European Nucleotide Archive): 21.3 trillion bases, equivalent to ~6.7k human genomes
- SRA (Sequence Read Archive): ~73,700 trillion bases, equivalent to >23m human genomes
- Storing genomes for the whole UK population would take ~13 exabytes
- Huge challenges – physical storage, data sharing, bioinformatic processing, data analysis.
- also issue with Privacy and ethics: who can access, for what? Who ‘owns’ the data? Security?
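The "genome equivalent" figures for the archives follow from dividing total bases by the ~3.2bn bp genome size. A quick arithmetic sketch; the UK population of about 67 million is an assumption added here (it is not stated in the notes), so the per-person storage figure is only indicative.

```python
HAPLOID_GENOME_BP = 3.2e9  # base pairs per genome copy


def genome_equivalents(total_bases: float) -> float:
    """How many human genome copies a given number of bases corresponds to."""
    return total_bases / HAPLOID_GENOME_BP


ena_genomes = genome_equivalents(21.3e12)    # ENA: 21.3 trillion bases
sra_genomes = genome_equivalents(73_700e12)  # SRA: ~73,700 trillion bases

UK_POPULATION = 67e6                         # ASSUMPTION: roughly 67 million people
bytes_per_person = 13e18 / UK_POPULATION     # 13 exabytes spread over the population

print(f"ENA: ~{ena_genomes:,.0f} genome equivalents")
print(f"SRA: ~{sra_genomes:,.0f} genome equivalents")
print(f"Implied storage per person: ~{bytes_per_person / 1e9:,.0f} GB")
```

The implied per-person figure is far larger than 3.2bn bases stored as plain text, which suggests the 13 exabyte estimate refers to raw sequencing data rather than a finished genome sequence.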
What is a haplotype?
a set of linked variants
diagram of what genome data looks like
Ape phylogeny diagram
outline genetic divergence in the great ape phylogeny
- Humans and chimpanzees/bonobos: 98.8% sequence identity = 120 differences/10,000 base pairs
- Humans and Neanderthals: 99.87% = 13 differences/10,000 base pairs
- Two Yoruba individuals (West Africa): 99.9% = 10 differences/10,000 base pairs
- Two French individuals (Western Europe): 99.93% = 7 differences/10,000 base pairs
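The conversion from percent identity to differences per 10,000 base pairs is simple arithmetic. A small sketch, with an illustrative function name:

```python
def differences_per_10k(identity_percent: float) -> float:
    """Convert percent sequence identity into differences per 10,000 bp."""
    return (100.0 - identity_percent) / 100.0 * 10_000


comparisons = {
    "human vs chimpanzee/bonobo": 98.8,
    "human vs Neanderthal": 99.87,
    "two Yoruba individuals": 99.9,
    "two French individuals": 99.93,
}
for label, identity in comparisons.items():
    print(f"{label}: {round(differences_per_10k(identity))} differences / 10,000 bp")
```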
Outline genetic divergence in the great ape phylogeny in relation to time
- Mutations build up over time, so greater average genetic divergence suggests more time has passed
- Split dates are estimated from genetic divergence
- However, this relies on the ‘molecular clock’ assumption, which is not always valid: it assumes that the mutation rate, the generation time, and selection (e.g. removing damaging variants) are the same across lineages
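Under a strict molecular clock, divergence time can be approximated as t = d / (2μ), where d is the fraction of sites that differ and μ is the mutation rate per site per generation; the factor 2 is there because mutations accumulate on both lineages. A crude sketch using figures from these notes (1.2% human-chimpanzee divergence, 1.25e-8 mutations/bp/generation). The 25-year generation time is an assumption added here, and the result dates the average sequence divergence, which is older than the population split itself.

```python
def divergence_time_generations(divergence: float, mutation_rate: float) -> float:
    """Generations since two sequences shared an ancestor, assuming a strict clock.

    divergence: fraction of sites that differ between the two lineages.
    mutation_rate: mutations per site per generation (same on both lineages).
    """
    return divergence / (2 * mutation_rate)


GENERATION_TIME_YEARS = 25  # ASSUMPTION: not given in the notes

generations = divergence_time_generations(0.012, 1.25e-8)
years = generations * GENERATION_TIME_YEARS
print(f"~{generations:,.0f} generations, ~{years / 1e6:.0f} million years")
```

Halving the assumed mutation rate or generation time changes the answer proportionally, which is why the clock assumptions matter so much.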
Outline the link between genetic diversity and genetic divergence
- Genetic diversity and genetic divergence are connected: greater average divergence between individuals within a group means the group has greater diversity
- Average divergence between individuals from the same human population: e.g. individuals of French ancestry differ at ~0.07% of the genome, while individuals of Yoruba ancestry (from Nigeria) differ at ~0.10%
Outline human-chimp functional divergence
- Humans and chimpanzees show ~98.8% sequence identity
- share >99% of our ~20,000 genes.
- Humans and chimps have ~35m SNP differences
- Only 100k exome differences
- Only 40k amino-acid-changing differences
- Another ~3% of sequence differs through structural variation (deletions, insertions, inversions)
- About half of the differences arose on the human lineage; the lineage is identified using an outgroup (orang-utan or gorilla)
- Most of the genome can be aligned (coloured regions on right), but some is hard to sequence (white) or align (colour mismatches)
- Human chromosome 2 is the result of the fusion of two ancestral chromosomes that remain separate in chimpanzees (usually called 2A and 2B); the fusion happened millions of years ago
Outline a study into the effects of Human-chimpanzee functional genetic divergence
Nielsen et al, 2005:
- Comparison of 13,731 human genes with their chimpanzee orthologs (genes with common ancestry through speciation); 35 showed an excess of non-synonymous changes
- Many mutations don’t change the amino acid (synonymous); calculate the proportion of non-synonymous (n.s.) and synonymous (s.) differences and ask whether the non-synonymous proportion is high relative to expectations
- Biological process categories (e.g. Gene Ontology/GO Term enrichment) with an excess of putatively positively selected genes were immunity and defence, Gametogenesis / fertilization / sperm motility, and Chemosensory perception / olfaction
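The idea of comparing non-synonymous and synonymous differences can be sketched as a simple dN/dS ratio. This is a simplified count-based illustration, not the likelihood-based method used in the study; the function name and all numbers in the example are hypothetical.

```python
def dn_ds(nonsyn_diffs: int, nonsyn_sites: float,
          syn_diffs: int, syn_sites: float) -> float:
    """Ratio of non-synonymous to synonymous differences, each per available site.

    Values well above 1 hint at positive selection, values near 1 at neutral
    evolution, and values well below 1 at purifying selection.
    """
    if syn_diffs == 0:
        raise ValueError("no synonymous differences: ratio is undefined")
    dn = nonsyn_diffs / nonsyn_sites  # non-synonymous differences per n.s. site
    ds = syn_diffs / syn_sites        # synonymous differences per s. site
    return dn / ds


# Hypothetical gene: 700 non-synonymous sites, 300 synonymous sites.
ratio = dn_ds(nonsyn_diffs=6, nonsyn_sites=700, syn_diffs=2, syn_sites=300)
print(f"dN/dS = {ratio:.2f}")
```

With so few differences per gene between humans and chimpanzees, a ratio like this is very noisy for any single gene, which is one reason formal statistical tests are used.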
Outline suggestions for explanations of Human-chimpanzee functional genetic divergence
- may be that more protein changes suggest positive selection e.g. in testes for fertility/sperm competition
- or, could just indicate less purifying selection- e.g. sex-specific genes only expressed in one sex, so invisible to evolution half the time and less selection?
Outline a common location of human-specific mutations in the genome
Pollard et al (2006):
- many short DNA regions that are conserved in other animals with many derived mutations
- human accelerated regions (HARs)
- most HARs are not in exons, which suggests their functional roles may be in gene regulation
- many in genomic regions with many genes involved in neurodevelopment.
- Positively selected regulatory roles?
- But many HARs are in regions with a high recombination rate and show a bias toward A/T to G/C mutations; this could suggest runaway feedback in a specific mutational process (‘GC-biased gene conversion’), causing many mutations
Outline recent work on HARs
Keough et al, 2023:
- ~30% of HARs are nearby human-specific structural variants that change local gene regulation interactions
- pressure for these HARs to adapt to new regulatory interactions may drive rapid divergence.
Note: HARs are by definition ‘unusual’ genomic regions, and different HARs may have different explanations (adaptive regulatory effects vs neutral mutation properties)
Outline examples of Genes involved in human-chimp phenotypic divergence, aside from language and neurodevelopment related ones
- HACNS1 (a HAR)- involved in limb and digit development, precise phenotypic impacts not fully clear
- Human-specific growth hormone receptor 3rd exon deletion (GHRd3): associated with birth weight and life history traits; hypothesized to have also enabled ancestors to survive extreme malnutrition
Name an example of a gene involved in human-chimp phenotypic divergence: language
FOXP2
Outline research into FOXP2
- Lai et al, 2001- FOXP2 mutations in a family with language disorders
- Enard et al, 2002- FOXP2 has human-specific non-synonymous mutations (but it is a long gene, 603 kb vs a median of 24 kb, so mutations are more likely by chance)
- Enard et al, 2009- FOXP2-humanized mice show specific changes in dopamine levels, neuronal morphology, synaptic plasticity in the striatum, and pup vocalizations
- Fontenot, 2014- accelerated FOXP2 evolution in the human lineage; hub gene in a human coexpression module
- involved in regulation of hundreds of other genes, including in foetal brain development (Spiteri et al. 2007) and the lungs (Shu et al, 2007)
Outline dating of FOXP2 variants
- variants are mostly old- Neanderthals had ‘human’ derived amino acid changing variants
- limited evidence of recent selection in different human populations, despite possible recent additional intronic regulatory changes (Atkinson et al 2018)
Outline other language-gene associations
- Specific language impairment loci SLI1 and SLI2 related to a child’s ability to repeat nonsense words
- CYP19A1 mutations in humans have been found in association with dyslexia, while in other vertebrates (fish and birds) orthologs are known to be involved in sexual differentiation of the brain and the regulation of vocalization
Summarise findings on language related genes involved in human-chimp phenotypic divergence
Human language may have qualitative differences from nonhuman primate communication, but genetic and biological basis is complex, including genes with continuity with other animals
List evidence for neurological-developmental Genes involved in human-chimp phenotypic divergence
- microcephaly (MCPH)
- Brain size – DUF1220 domain
- Brain size – NOTCH2
- SRGAP2 and slower development
- Neoteny and human cranial development
- differences in expression of synaptic genes in the prefrontal cortex
Outline microcephaly
- Small (~430 cc v ~1,400 cc) but otherwise ~normal brain, some mental impairment
- Due to loss of activity of the ASPM gene or MCPH1 gene (among others)
- Doesn’t imply that these genes were involved in our evolution; but genetic disorders demonstrate breadth of potential genetic impacts
Outline the DUF1220 domain
- Many more copies in humans than other apes, association with brain size and various mental health disorders (Dumas et al, 2007)
- Neanderthals have the most copies (~350) and the biggest brains
Outline NOTCH2
Suzuki et al, 2018:
- Human/Denisovan/Neanderthals – duplications in NOTCH2 gene
* 1q21.1 distal deletion/duplication syndrome- micro/macrocephaly
* Evolutionary duplications
* Cellular mechanisms