Lecture 16 - human genetics Flashcards

1
Q

What are the different possible genetic architectures of complex diseases?

A
  • a small number of dominant alleles confer a large increase in risk
  • common disease, common variant model - many alleles each confer a small increase in risk
  • intermediate - one major allele exerts a large effect, alongside numerous other lower-risk alleles
2
Q

What are single nucleotide polymorphisms (SNPs)?

A
  • between 11 and 15 million common SNPs (minor allele frequency > 5%)
  • uneven distribution of SNPs in the genome
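
As a rough illustration of the 5% cut-off, the minor allele frequency (MAF) at a biallelic SNP can be worked out from genotype counts. This is only a sketch: the function name and the example counts are hypothetical and do not come from the lecture.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Return the minor allele frequency at a biallelic SNP.

    Each individual carries two alleles, so the total allele count
    is twice the number of genotyped individuals.
    """
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    alt_freq = (2 * n_alt_hom + n_het) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


# Hypothetical counts: 810 AA, 180 AG and 10 GG individuals.
maf = minor_allele_frequency(810, 180, 10)
is_common = maf > 0.05  # "common" SNP under the >5% definition
```

With these made-up counts the G allele makes up 200 of 2000 alleles, so the SNP would count as common.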
3
Q

Where do SNPs occur?

A
  • coding regions
  • non-coding regions
4
Q

What types of SNPs occur in coding regions?

A
  • synonymous (no change in encoded amino acid)
  • non-synonymous (e.g. missense or nonsense mutation)
5
Q

What effects can SNPs in non-coding regions have?

A
  • can affect expression/regulation of associated genes
  • complex diseases arise from combinations of multiple SNPs
6
Q

What is the Exome Aggregation Consortium (ExAC)?

A
  • exomes from unrelated individuals, sequenced as part of various disease-specific and population genetic studies
  • 7.4 million variants mapped
  • records frequency of alleles in a population
  • documents rare mutations
  • highly pathogenic variants are seen at a lower frequency in the general population
  • gnomAD aggregates over 125,000 exome and 15,000 genome datasets
7
Q

What are genome-wide association studies (GWAS)?

A

Population-based studies that compare individuals with a condition against a control population

  • examine a panel of SNPs in the genome for association with the disease phenotype
  • search for alleles that occur more frequently in disease cases than in matched controls
  • requires many participants
  • GWAS have been performed for most common diseases
  • many risk loci have yet to be identified
  • missing loci contribute to the ‘missing heritability problem’
8
Q

What do genome-wide association studies do?

A
  • GWAS compare allele frequencies across the entire genome in case and control populations
  • a significant difference in allele frequency constitutes an association with disease
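
For a single SNP, the case/control comparison can be sketched as a chi-square test on a 2x2 table of allele counts. This is a simplified illustration, not the method used in any particular study: the function name and counts are hypothetical, all row and column totals are assumed to be non-zero, and real analyses usually also adjust for factors such as ancestry.

```python
import math


def allelic_chi_square(case_risk, case_other, ctrl_risk, ctrl_other):
    """Pearson chi-square test (1 degree of freedom) on allele counts."""
    n = case_risk + case_other + ctrl_risk + ctrl_other
    row_case = case_risk + case_other
    row_ctrl = ctrl_risk + ctrl_other
    col_risk = case_risk + ctrl_risk
    col_other = case_other + ctrl_other
    observed = [case_risk, case_other, ctrl_risk, ctrl_other]
    # Expected counts if allele frequency were the same in both groups.
    expected = [
        row_case * col_risk / n,
        row_case * col_other / n,
        row_ctrl * col_risk / n,
        row_ctrl * col_other / n,
    ]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one SNP (2 alleles per person).
chi2, p = allelic_chi_square(case_risk=600, case_other=1400,
                             ctrl_risk=500, ctrl_other=1500)
```

In this made-up example the allele is at 30% in cases and 25% in controls. The resulting p-value is below 0.05 but nowhere near the much stricter thresholds used genome-wide.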
9
Q

How are SNPs associated with disease?

A
  • association studies can tell us if an allele is associated with a disease
  • the associated SNP may itself be the risk allele
  • or the SNP may correlate with the risk allele due to linkage disequilibrium
10
Q

What is linkage disequilibrium?

A

the non-random association of alleles at different genomic sites
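
Two commonly used summary measures of this non-random association are D and r². Neither is named in the lecture, so the sketch below is an assumption about how the definition might be quantified. The function name and example frequencies are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    # D is the gap between the observed haplotype frequency and the
    # frequency expected if the alleles were associated at random.
    d = p_ab - p_a * p_b
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Hypothetical frequencies: alleles A and B are each at 50%,
# but are found together on 40% of haplotypes.
d, r2 = linkage_disequilibrium(p_ab=0.40, p_a=0.5, p_b=0.5)
```

If the alleles were associated at random, the AB haplotype would be expected at 25%. A D of zero therefore corresponds to linkage equilibrium.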

11
Q

What does linkage disequilibrium depend on?

A
  • distance between alleles
  • recombination rate
12
Q

What can patterns of linkage disequilibrium be summarised as?

A

Haplotype blocks

13
Q

What are haplotype blocks?

A

regions of high linkage disequilibrium that are separated from other haplotype blocks by many historical recombination events

14
Q

What occurs in haplotype mapping?

A

groups of alleles are clustered so a single SNP can identify the cluster of alleles (Tag SNP)
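
One way to picture tag SNP selection is a greedy pass over pairwise r² values. This is a toy sketch, not the procedure used by any specific haplotype-mapping project: the function name, the SNP labels, the r² values and the 0.8 cut-off are all assumptions made for illustration.

```python
def pick_tag_snps(r2, threshold=0.8):
    """Greedily choose tag SNPs from a dict of pairwise r^2 values.

    r2 maps frozenset({snp_a, snp_b}) -> r^2. A SNP is "covered" by a
    tag when their r^2 is at least the threshold.
    """
    snps = sorted({s for pair in r2 for s in pair})

    def covers(tag):
        return {s for s in snps
                if s == tag or r2.get(frozenset({tag, s}), 0.0) >= threshold}

    uncovered, tags = set(snps), []
    while uncovered:
        # Pick the SNP that covers the most still-uncovered SNPs.
        best = max(snps, key=lambda s: len(covers(s) & uncovered))
        tags.append(best)
        uncovered -= covers(best)
    return tags


# Hypothetical r^2 values: snp1-snp3 form one cluster, snp4-snp5 another.
pairwise_r2 = {
    frozenset({"snp1", "snp2"}): 0.95,
    frozenset({"snp2", "snp3"}): 0.90,
    frozenset({"snp1", "snp3"}): 0.85,
    frozenset({"snp3", "snp4"}): 0.10,
    frozenset({"snp4", "snp5"}): 0.92,
}
tags = pick_tag_snps(pairwise_r2)
```

Here five SNPs are represented by two tags, one per cluster, so fewer SNPs need to be genotyped.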

15
Q

How does identification of risk alleles occur?

A
  • GWAS identify SNPs associated with disease, not necessarily the risk alleles themselves
  • need integration with functional data on candidate regions to identify causality
16
Q

How is the association between an allele and a disease measured?

A
  • the strength of the association between a SNP and a disease is measured as an odds ratio (OR)
17
Q

What is an odds ratio (OR)?

A

A statistic that quantifies the strength of the association between 2 events:

OR = 1: events are independent
OR > 1: events are positively correlated
OR < 1: events are negatively correlated

Common disease common variant (CDCV) model of complex disease - multiple alleles with OR<1.2 showing weak association to the disease phenotype
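
As a minimal sketch of the calculation, an odds ratio can be computed from counts of allele carriers among cases and controls. The function name and counts below are hypothetical, and the sketch assumes no cell of the table is zero.

```python
def odds_ratio(case_carriers, case_non_carriers,
               control_carriers, control_non_carriers):
    """Odds ratio for carrying an allele in cases versus controls."""
    odds_in_cases = case_carriers / case_non_carriers
    odds_in_controls = control_carriers / control_non_carriers
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 300 of 1000 cases and 250 of 1000 controls
# carry the allele.
or_value = odds_ratio(300, 700, 250, 750)
```

With these made-up counts the OR is a little under 1.3. That is the kind of modest effect size the CDCV model describes.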

18
Q

Explain how GWAS relies on statistical analysis & large cohorts

A
  • statistical significance is needed to differentiate true positives from false positives
  • genome-wide significance is defined as a p-value < 5 x 10^-8
  • at a nominal significance threshold of 0.05, 1 in 20 tests of SNPs with no true association is expected to appear significant by chance
  • for 1 million SNPs, this means about 50,000 expected false positives
  • a very large number of participants is required
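
The arithmetic behind these numbers can be sketched as follows. The lecture does not say how the genome-wide threshold is derived. Reading it as a Bonferroni-style correction for about 1 million tests is one common explanation and is an assumption here.

```python
n_tests = 1_000_000        # SNPs tested
nominal_alpha = 0.05       # nominal significance threshold

# If no SNP is truly associated, each test still has a 5% chance of
# passing the nominal threshold by chance alone.
expected_false_positives = n_tests * nominal_alpha

# Bonferroni-style correction: divide the nominal threshold by the
# number of tests to keep the overall false-positive rate near 5%.
genome_wide_alpha = nominal_alpha / n_tests
```

Dividing 0.05 by 1 million tests gives 5 x 10^-8. A threshold this strict is the reason large cohorts are needed to detect small effects.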
19
Q

How are risk variants defined?

A
  • results are shown on a Manhattan plot (-log10 p-value of each SNP plotted against its genomic position)
  • the threshold for significance is shown by the red horizontal line
  • 44 risk loci defined with significant p-values (green stacks)
  • GWAS are susceptible to a high number of false negatives
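
The height of a point on a Manhattan plot is the negative log10 of its p-value, so smaller p-values sit higher. A minimal sketch follows. The SNP labels and p-values are hypothetical, and the 5 x 10^-8 threshold is assumed to be the one drawn as the horizontal line.

```python
import math

GENOME_WIDE_P = 5e-8  # assumed genome-wide significance threshold


def manhattan_height(p_value):
    """Height of a SNP on a Manhattan plot: -log10 of its p-value."""
    return -math.log10(p_value)


# Hypothetical p-values for three SNPs.
p_values = {"snp_a": 3e-12, "snp_b": 4e-6, "snp_c": 0.2}
threshold_height = manhattan_height(GENOME_WIDE_P)
significant = [snp for snp, p in p_values.items()
               if manhattan_height(p) > threshold_height]
```

Only SNPs above the threshold line are reported. A truly associated SNP with a p-value like that of `snp_b` would be missed, which is one source of false negatives.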
20
Q

Describe type 2 diabetes

A
  • common chronic condition caused by an inability to take up sugar
  • characterised by high blood sugar, insulin resistance and a lack of insulin production
  • diabetes is multifactorial (genetic & environmental)
  • familial
  • geography & ethnicity
  • age, weight, diet, level of physical activity
21
Q

Describe the GWAS of type 2 diabetes

A
  • previously identified common risk alleles (red SNPs)
  • novel association loci determined due to increased statistical power (green SNPs)
  • these novel loci have low odds ratios (1.06-1.27), with each causing only a small increase in risk
22
Q

What are 3 risk loci identified by the GWAS of type 2 diabetes?

A
  • TCF7L2
  • FTO
  • CDKN2A/B
23
Q

What does TCF7L2 do?

A
  • the allele conferring the greatest risk of type 2 diabetes
  • intronic variant
  • transcription factor required for pancreatic development
24
Q

What does FTO do?

A
  • intronic variant
  • involved in body weight regulation
25
Q

What does CDKN2A/B do?

A
  • non-coding regulatory variant
26
Q

Describe the features of breast cancer

A
  • lifetime risk of breast cancer is 8-12% in females
  • the risk of disease increases if first-degree relatives have the condition
  • rare coding mutations and common non-coding variants increase risk of disease
  • rare coding mutations e.g. BRCA1 & BRCA2 greatly increase the risk
  • common variants in polygenes contribute small increase in risk
  • 5% of breast cancer cases are due to BRCA1/BRCA2 autosomal dominant alleles
  • BRCA1 & BRCA2 were mapped by linkage analysis
27
Q

What did a GWAS of breast cancer identify?

A
  • identified known high-risk factors that occur infrequently in the population
  • identified 66 common low-risk alleles, many of which are in non-coding regions of the genome
28
Q

What is heritability?

A

the proportion of variance in a particular phenotype in a population that is due to genetic variation
- we can identify risk alleles for complex diseases, but heritability is not fully explained by these alleles
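
In symbols, this definition is often written as the ratio below. The notation ($H^2$ for heritability, $V_G$ for genetic variance, $V_E$ for environmental variance, $V_P$ for total phenotypic variance) is assumed here rather than taken from the lecture. The simple sum ignores gene-environment interaction and covariance.

```latex
H^2 = \frac{V_G}{V_P}, \qquad V_P = V_G + V_E
```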

29
Q

What can missing risk arise from?

A
  • false negatives in GWAS studies
  • rare variant alleles with a Minor Allele Frequency (MAF) of 1-5%
  • structural alteration of the genome
  • epigenetics
  • 3D genome organisation