Demography and Genetic Variation (first 4) Flashcards
Define demography
The study of the age structure and growth of populations. Encompasses current stock of human population (age/sex structure) and the flow of vital events (births/deaths/migrations).
Define crude birth rate. What is the world average? What is the UK average?
Number of births per unit of person-time.
World: 20 births per 1000 person-years
UK: 10.7 births per 1000 person-years
Define crude death rate. What is the world average?
Number of deaths per unit of person-time. World: 9 deaths per 1000 person-years
Define crude rate of natural increase/decrease. What is the world average?
Crude birth rate - crude death rate. World: 11 per 1000 person-years.
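The arithmetic behind the crude rates can be sketched as follows. The event counts and person-years are hypothetical, chosen only to reproduce the world averages quoted in these cards, and the function names are illustrative.

```python
def crude_rate(events, person_years, per=1000):
    """Events per `per` person-years of observation."""
    return per * events / person_years


def natural_increase(crude_birth_rate, crude_death_rate):
    """Crude rate of natural increase (negative means natural decrease)."""
    return crude_birth_rate - crude_death_rate


# Hypothetical counts: 2000 births and 900 deaths over 100,000 person-years.
birth_rate = crude_rate(2000, 100_000)
death_rate = crude_rate(900, 100_000)
increase = natural_increase(birth_rate, death_rate)
print(birth_rate, death_rate, increase)
```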
Define total fertility rate. What is it in the UK?
Total number of babies per average reproductive lifetime. 2 per woman.
Define gross reproductive rate. What is its relationship to total fertility rate? What is it in the UK?
Total number of daughters per average reproductive lifetime.
Gross reproductive rate = proportion of female births * total fertility rate.
Sex ratio at birth: 105 boys to 100 girls
100/(100+105) * total fertility rate = 0.98 daughters per woman’s reproductive lifetime.
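A minimal sketch of the gross reproductive rate calculation, taking the proportion of female births as 100/(100+105) from the sex ratio at birth and the UK total fertility rate as 2 per woman. The function and parameter names are illustrative.

```python
def gross_reproductive_rate(total_fertility_rate, boys_per_100_girls=105):
    """Daughters per average reproductive lifetime."""
    # Proportion of births that are girls, from the sex ratio at birth.
    proportion_female = 100 / (100 + boys_per_100_girls)
    return proportion_female * total_fertility_rate


uk_grr = gross_reproductive_rate(2.0)
print(round(uk_grr, 2))
```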
What is the infant mortality rate? What is it in western vs low income countries?
Probability of dying before the first birthday.
Western less than 10 per 1000 live births.
Low income greater than 100 per 1000 live births.
What is the child mortality rate?
Probability of dying before the age of 5
What is the adult mortality rate?
Probability of dying between 15 and 60 years.
What do you need to do to compare birth and death rates?
Age standardise
What is the age-standardised death rate?
Overall death rate of the population of interest after removing the effect of age. Estimate by calculating a weighted average of the age-specific death rates in the population of interest, where the weights represent the contribution of each age stratum in a standard population.
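The weighted average described above (direct standardisation) can be sketched like this. The three age strata, their death rates and the standard population are made-up numbers for illustration only.

```python
def age_standardised_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates.

    Weights are each age stratum's share of the standard population.
    """
    if len(age_specific_rates) != len(standard_population):
        raise ValueError("need one rate per standard-population stratum")
    total = sum(standard_population)
    weighted = sum(r * n for r, n in zip(age_specific_rates, standard_population))
    return weighted / total


# Hypothetical data: deaths per 1000 person-years in young, middle and old strata,
# and the number of people in each stratum of a standard population.
rates = [1.0, 5.0, 50.0]
standard = [500, 300, 200]
standardised = age_standardised_rate(rates, standard)
print(standardised)
```

Two populations standardised against the same standard population can then be compared without differences in age structure distorting the result.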
In a life table, how do you calculate number dying during age interval?
Number living at beginning of age interval * proportion dying during age interval
In a life table, how do you calculate number of person years of life lived in age interval?
Number living at beginning of age interval - 0.5(number dying during age interval).
Assumes deaths occur evenly throughout each one-year age interval. This assumption does not hold for ages 0-1.
In a life table, how do you calculate the number of person years of life lived in this and all subsequent age intervals?
Sum of number of person-years of life (L(x)) lived in age interval from age x to final row.
How do you calculate the life expectancy at the beginning of an age interval in a life table?
e(x)= T(x) / l(x)
T(x) = number of person-years of life lived in this and all subsequent age intervals
l(x) = number living at beginning of age interval
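The life table cards can be pulled together in one short sketch: from the proportion dying in each interval it derives the number living l(x), the number dying d(x), person-years L(x), cumulative person-years T(x) and life expectancy e(x). The three-interval table and the starting cohort of 1000 are hypothetical, and for simplicity the sketch applies the "deaths spread evenly" assumption to every interval, including the first.

```python
def life_table(proportion_dying, radix=100_000):
    """Build life table rows from the proportion dying in each age interval.

    Assumes only the final interval has a proportion dying of 1.0.
    """
    rows = []
    living = float(radix)
    for q in proportion_dying:
        dying = living * q                   # d(x) = l(x) * q(x)
        person_years = living - 0.5 * dying  # L(x) = l(x) - 0.5 * d(x)
        rows.append({"l": living, "d": dying, "L": person_years})
        living -= dying
    remaining = 0.0
    for row in reversed(rows):
        remaining += row["L"]                # T(x) = sum of L from x to the last row
        row["T"] = remaining
        row["e"] = row["T"] / row["l"]       # e(x) = T(x) / l(x)
    return rows


table = life_table([0.1, 0.5, 1.0], radix=1000)
for row in table:
    print(row)
```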
Define risk
Probability of an event occurring over a specific amount of time
Define rate
Ratio - measure of frequency per unit time.
Define net reproductive rate
Average number of daughters per mother expected to survive to reproductive age.
(Birth cohort of girls - expected number of deaths)
What information do you need to be given to calculate a life table?
Proportion dying during different age intervals
What are the limitations of life tables?
Usually based on estimated age-specific mortality rates in recent past. Projections not representative due to current advances in medicine, public health and safety standards which did not exist in the early years of the cohort.
What is health expectancy?
Remaining number of years a person can expect to live in a specific health state
What are Omran’s three typical phases of transition?
- Age of pestilence and famine: high mortality, low life expectancy (20-40), increase in infectious diseases, lots of dietary deficiencies.
- Age of receding pandemics: declining mortality, rising life expectancy (30-50). Shift from infectious to chronic diseases. Transition period.
- Age of degenerative and man-made diseases: low mortality, high life expectancy (50+). Late 19th-20th century. Chronic disease replaces infectious disease as the main burden.
When were the biggest improvements in mortality risk?
Children = early 20th century (reduced by 50%)
Adults = late 20th century
Why did the rapid improvement in child survival in early C20 England and US happen?
General diffusion of useful knowledge including domestic medicine
Why did food improvements not contribute to the rapid improvement of child survival?
In England, laws were already in place (the workhouse system) to make sure people did not go without food.
Which other factor was mentioned as not the reason child survival improved?
Professional medicine
What is the name of the graph showing the changing relationship between life expectancy and income during C20? What is the major observation?
Preston curve
Rising income doesn’t account for major gains in life expectancy at birth
What did the census at the beginning of the C20 show?
- Not a strong relationship between social rank and child survival (under 5 mortality rate) at the beginning of the 20th century
- Mortality more related to residence than social class
Why did education not help reduce the child mortality rate?
Education only helps if it enables you to make better use of what is known.
Give an example of new knowledge around the turn of the 20th century?
In 1889, diarrhoea was thought to be caused by sour milk, unripe fruit, inhalation of sewer gas or emanations from the soil. By 1899 there was ‘no doubt that the immediate cause is an infection of the alimentary canal, by bacteria contained in milk or other forms of food’.
What were the two major transitions in mortality risks in adults?
The decline in TB
The decline in tobacco-caused disease (the single most important contributor)
Why did TB decline?
Developed streptomycin
Why did tobacco-caused disease fall?
- New scientific evidence
- Number of articles on smoking and health rose
- Percentage of population that believed smoking caused lung cancer rose
- People expected doctors and the government to give lifestyle advice
What are the current failures in life expectancy?
- Unequal success across social strata
- Unequal success between countries
- Less access to healthcare = less knowledge
What are the main contributors to health inequalities currently?
Vascular diseases
Tobacco-caused disease
How many nucleotides in the human genome?
3 billion
How many genes in the human genome, and thus what proportion of the genome are genes?
20,000, so 2%
Which groups of organisms have the largest genome size?
Protists, plants (over 100Gb)
What proportion of genetic content do humans share? What proportion do they share excluding structural variants?
95%
99.9%
Define allele
Different versions of a genetic locus
Describe multiple genome alignment
Multiple genomes
Arbitrarily take one sequence as reference sequence
Count the differences between one genome and another
How can you represent multiple genome alignment?
Make a genetic distance matrix (table according to the pairwise comparisons)
Triangle with lengths proportional to genetic distance for 3 people
3D shape for 4x4 matrix or higher. Can do with principal component analysis.
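A minimal sketch of building a genetic distance matrix by counting pairwise differences between aligned sequences. The three short sequences are invented for illustration, and the sketch assumes the sequences are already aligned and of equal length.

```python
def pairwise_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def distance_matrix(sequences):
    """Table of pairwise difference counts between every pair of sequences."""
    n = len(sequences)
    return [
        [pairwise_differences(sequences[i], sequences[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy aligned sequences for three hypothetical individuals.
genomes = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
matrix = distance_matrix(genomes)
for row in matrix:
    print(row)
```

For three individuals the off-diagonal entries are the side lengths of the triangle mentioned above.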
Where in the world is there most genetic diversity?
More within Africa than within all of the non-Africans
What are genetic trees?
A representation of an evolutionary timescale with differences between individuals corresponding to mutations which have occurred since the most recent common ancestor.
How can you use genetic trees?
If you assume mutation rate = 0.5 x 10^-9 per bp per year
Count number of mutations
Work out how far back two species diverged
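The dating step can be sketched as below, using the assumed mutation rate of 0.5 x 10^-9 per bp per year. The factor of 2 reflects mutations accumulating on both lineages since the common ancestor. The difference count and alignment length are hypothetical.

```python
def divergence_time_years(differences, aligned_bp, mutation_rate_per_bp_per_year):
    """Estimate time since two lineages split from their sequence divergence."""
    divergence = differences / aligned_bp  # differences per base pair
    # Both lineages accumulate mutations, so divergence = 2 * rate * time.
    return divergence / (2 * mutation_rate_per_bp_per_year)


# Hypothetical comparison: 12,000 differences across 1,000,000 aligned bp.
years = divergence_time_years(12_000, 1_000_000, 0.5e-9)
print(f"{years:.3g} years")
```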
What does fossil evidence show?
Humans colonised the rest of the world from Africa
Mated with archaic ancestors of humans
Means that there isn’t really a tree of human populations but a tangled web of interactions
Describe the processes of human evolution
- Natural selection: positive, negative or balancing
- Germline mutation
- Genetic drift
- Demography (influences strength of drift and efficacy of selection)
What is negative selection otherwise known as?
Purifying selection
Which model do we use to look at effects of population size on genetic drift? Explain
Wright-Fisher model
Plot generation x vs allele freq y
With no mutation and no selection pressure, over time a small population will often fix at 100% or 0% allele freq (genetic drift larger)
With a larger population, most stay about 50%.
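A small simulation sketch of the Wright-Fisher model with no mutation and no selection. Population sizes, generation counts and the random seed are arbitrary choices for illustration, and the model here assumes a diploid population so that N individuals carry 2N allele copies.

```python
import random


def wright_fisher(pop_size, start_freq, generations, rng):
    """Return the allele-frequency trajectory for one population."""
    copies = 2 * pop_size  # allele copies in a diploid population
    freq = start_freq
    trajectory = [freq]
    for _generation in range(generations):
        # Each copy in the next generation is sampled at random from the
        # current one, so the new allele count is a binomial draw.
        count = sum(1 for _copy in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory


rng = random.Random(1)  # fixed seed so repeated runs give the same trajectories

# Small populations (10 individuals) followed for 2000 generations.
small_runs = [wright_fisher(10, 0.5, 2000, rng) for _ in range(20)]
small_fixed = sum(1 for run in small_runs if run[-1] in (0.0, 1.0))

# One large population (1000 individuals) followed for 50 generations.
large_run = wright_fisher(1000, 0.5, 50, rng)

print(small_fixed, "of 20 small populations reached 0% or 100%")
print("large population final frequency:", large_run[-1])
```

Plotting each trajectory with generation on the x axis and allele frequency on the y axis gives the picture described above.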
Define genetic drift
Random changes in allele frequency from one generation to the next.
Stronger in smaller populations, leading to a loss of genetic diversity
Describe the founder effect
Founder population is a much smaller gene pool than the original population. Can alter allele proportions
Give 4 examples of deleterious pathogenic alleles persisting due to small and isolated populations
- Amish, Hutterite and Mennonite communities in N America. Increased prevalence of microcephaly and other genetic diseases.
- Tay-Sachs disease in Ashkenazi Jews
- Finnish Disease Heritage of 36 monogenic diseases
- Ireland - Coeliac disease
What is Tay-Sachs disease?
An inherited metabolic disorder in which certain lipids accumulate in the brain, causing spasticity and death in childhood
Give 3 categories of disease and examples
- Mendelian e.g. CF, Down, Sickle-Cell, Charcot-Marie-Tooth
- Complex, influenced by both environment and ancestry e.g. Alzheimer’s, cardiovascular disease, type II diabetes, Parkinson’s
- Environmental diseases. Often infections, genetic component with regards to susceptibility e.g. flu, hepatitis, measles
What is the risk odds ratio?
Allele effect on risk of developing a disease
OR = [P(disease | carrying allele) / P(no disease | carrying allele)] / [P(disease | not carrying allele) / P(no disease | not carrying allele)]
OR > 1 increased risk
OR < 1 reduced risk (protective)
OR = 1 no effect
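A sketch of the odds ratio computed from a 2x2 table of counts: the odds of disease among allele carriers divided by the odds of disease among non-carriers. The counts are hypothetical.

```python
def odds_ratio(carriers_with, carriers_without, noncarriers_with, noncarriers_without):
    """Odds of disease in carriers divided by odds of disease in non-carriers."""
    odds_carriers = carriers_with / carriers_without
    odds_noncarriers = noncarriers_with / noncarriers_without
    return odds_carriers / odds_noncarriers


# Hypothetical counts: 30 of 100 carriers and 10 of 100 non-carriers have the disease.
example_or = odds_ratio(30, 70, 10, 90)
print(round(example_or, 2))
```

Here the result is above 1, so in this made-up data the allele is associated with increased risk.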
How do you identify low frequency alleles with large effects?
Linkage analysis - should find family carrying the same allele
How do you identify high frequency alleles with small effects?
GWAS
Describe GWAS
Genome wide association studies
Collect controls and cases
Sequence whole genome
Look for variants that tend to be more associated with cases than controls
Which test can we use to check if there is an association between a disease and each allele in GWAS?
Chi squared test
What is the chi squared equation?
Chi squared = sum of (O-E)^2/E
O = observed count, E = expected count = number of people * allele freq
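A sketch of the chi squared statistic for one locus, using a table of allele counts in cases and controls. Each expected count is the group total multiplied by the overall frequency of that allele, which is what "no association" predicts. The counts are hypothetical.

```python
def chi_squared(observed):
    """Chi squared statistic for a table of observed counts.

    Rows are groups (cases, controls); columns are alleles.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected = group size * overall allele frequency.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Hypothetical allele counts: [allele A, allele a] in cases, then in controls.
counts = [[60, 40], [40, 60]]
stat = chi_squared(counts)
print(stat)
```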
What are the two approaches to using chi squared information?
- Ranking approach: chi squared values calculated for many loci across the genome. Loci ranked in order of chi squared and top loci considered for further investigation.
- Significance threshold approach: convert chi squared values to p values, set a genome-wide significance threshold (typically 10^-8) and consider all loci with p values below it.
What does P value mean?
Probability of seeing a particular (chi squared) value or greater by chance even if no association.
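Converting a chi squared value to a p value can be sketched with the standard library alone, assuming one degree of freedom (a 2x2 allele-by-status table). In that case the p value equals erfc(sqrt(chi squared / 2)). The threshold of 10^-8 follows these cards; the example statistic is hypothetical.

```python
import math


def p_value_one_df(chi_squared_value):
    """Upper-tail probability of a chi squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_squared_value / 2))


GENOME_WIDE_THRESHOLD = 1e-8  # threshold quoted in these cards

p = p_value_one_df(8.0)  # hypothetical chi squared value for one locus
is_hit = p < GENOME_WIDE_THRESHOLD
print(p, is_hit)
```

A chi squared of 8 would look convincing for a single test, but it falls far short of the genome-wide threshold because so many loci are tested at once.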
What are the caveats of GWAS?
Computationally intensive - require large sample sizes to achieve sufficient statistical power.
Population structure can lead to false signals of association, i.e. if cases and controls have different genetic ancestry.
Why are there lots of GWAS hits on the short arm of chromosome 6?
The immune system genes are located here (the MHC/HLA region).
What were the hoped for benefits of GWAS?
Identification of susceptibility variants
- New biological insights –> clinical advances in therapeutic targets/biomarkers/prevention
- Improved measures of individual aetiological processes –> personalised medicine –> diagnostics/prognostics/therapeutic optimisation
Why may GWAS not fully deliver?
Most common diseases and traits are highly polygenic, so much of heritability may lie in genetic loci not deemed statistically significant.
Natural selection limits the frequency of variants with large effect size.
Why may genes with small effect size be promising targets for drug development?
A drug targeting the associated gene may perturb it more than the GWAS variant does, and hence may have a larger effect on the gene.
How do you estimate mutation rate?
- Dominant disease loci: consider a dominant disease-causing gene. Count the number of cases x in N families where a child is affected but neither parent is. This gives an estimate of the per-generation mutation rate at that locus (mu_locus = x/2N).
- Phylogenetic comparison: compare sequence divergence D between species against the estimated time T since speciation (using fossils). mu = D/2T gives the mutation rate per year.
- Direct sequencing: observe de novo mutations as differences between the genome sequences of parents and offspring.
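The three estimators can be written out as small functions. All input numbers are hypothetical. For direct sequencing, the sketch assumes the per-bp rate is the count of de novo mutations divided by twice the haploid genome size, since a child inherits two genome copies; that formula is an illustration, not something stated in these cards.

```python
def rate_from_dominant_disease(new_cases, families):
    """Per-generation mutation rate at one locus: mu_locus = x / 2N."""
    return new_cases / (2 * families)


def rate_from_phylogeny(divergence_per_bp, years_since_split):
    """Per-bp, per-year mutation rate: mu = D / 2T."""
    return divergence_per_bp / (2 * years_since_split)


def rate_from_trio(de_novo_mutations, haploid_genome_bp):
    """Per-bp, per-generation rate from parent-offspring sequencing (assumed formula)."""
    return de_novo_mutations / (2 * haploid_genome_bp)


# Hypothetical inputs for each method.
mu_locus = rate_from_dominant_disease(new_cases=4, families=100_000)
mu_year = rate_from_phylogeny(divergence_per_bp=0.012, years_since_split=6e6)
mu_trio = rate_from_trio(de_novo_mutations=60, haploid_genome_bp=3e9)
print(mu_locus, mu_year, mu_trio)
```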
What is the directly measured rate of mutation?
1.75 x 10^-8 per bp per generation, half that estimated from phylogenetic comparison
What is the paternal age effect?
Older fathers contribute more de novo mutations