complex diseases Flashcards

1
Q

monogenic diseases

A

those where there is a direct relationship between the disease gene and the disease status
Genotype and phenotype closely correlate (high penetrance)
Variants CAUSE the disease (1 disease, 1 gene)
The traits presented so far are qualitative
= white eyed or red eyed flies
= cystic fibrosis or no cystic fibrosis

2
Q

Quantitative traits

A

Traits with variation showing a
continuous range of phenotypes
e.g. human height, weight, colour, metabolic rate, behaviour

3
Q

polygenic

A

Varying phenotypes result from input of many genes

4
Q

Multifactorial or complex traits

A

result of a combination of several genes and environmental factors

Complex (polygenic) diseases often show genetic predisposition, but individual genes only marginally affect disease status

Genotype and phenotype poorly correlate (low penetrance)

Variants PREDISPOSE to the disease (1 disease, many genes)

5
Q

example of multifactorial inheritance

A
skin colour
additive effect
complex trait
- many genes
- environment
6
Q

single gene vs multifactorial

A

single gene

  • risk remains the same regardless of the number of affected children
  • e.g. for a dominant variant carried by one parent, the risk to each child is 1/2
  • if 1 child has the disease, the risk for another child is still 1/2

multifactorial

  • recurrence risk increases with each affected child, because this shows the couple are high risk
  • if 1 child is affected, the recurrence risk is 1 in 25
  • if 2 children are affected, the recurrence risk is now 1 in 12
7
Q

Multifactorial disorders display familial clustering with no recognised pattern of Mendelian inheritance

A
  1. Most common cause of congenital malformations
  2. Cause of many common acquired diseases
  3. More prevalent than single gene disorders
  4. Harder to find the genetic factors / causes
8
Q

not all polygenic traits show continuous variation

A

in a large sample the data will follow a normal distribution
instead of using intervals on the x-axis (groups such as age), we use the number of predisposing alleles in the genotype

above a certain number of predisposing alleles (the threshold) the frequency of disease rises sharply, so the observed phenotype moves away from a continuous normal distribution

9
Q

3 types of polygenic traits

A

continuous traits

meristic traits
- phenotype can be recorded by counting integers

threshold traits

  • polygenic and often multifactorial
  • small number of discrete phenotypic classes
  • increasing number of diseases show this pattern
10
Q

most common multifactorial diseases with a threshold

A
cleft lip
neural tube defect
congenital heart defect
asthma
diabetes
autism
11
Q

multi-gene hypothesis

A
  1. A quantitative trait has continuous variation that can be quantified (measured)
  2. Two or more loci scattered in the genome account for the hereditary influence on the trait in an additive way
  3. Each gene locus is occupied by either an additive allele or a non-additive allele
  4. The contribution of each additive allele is approximately equal
  5. Together, the additive alleles contributing to a single quantitative character produce substantial phenotypic variation
12
Q

calculating number of polygenes

A

The number of polygenes (n) contributing to a quantitative trait is estimated from the ratio of F2 individuals resembling either of the two extreme P phenotypes

  • 1/4^n = ratio of F2 individuals expressing either extreme phenotype
  • For a low number of polygenes: (2n + 1) = number of distinct phenotypic categories observed

i.e. 1 gene = 3 classes (1/4, 1/2, 1/4)
2 genes = 5 classes (1/16, 4/16, 6/16, 4/16, 1/16)
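
A minimal Python sketch of the arithmetic on this card. It assumes every gene has two alleles with equal, purely additive effects, the loci assort independently, and the F1 is heterozygous at every locus. The function names are illustrative and not from the source.

```python
from fractions import Fraction
from math import comb


def f2_class_frequencies(n_genes):
    """Expected F2 frequency of each phenotypic class.

    A class is defined by the number of additive alleles carried (0..2n),
    so there are 2n + 1 classes. Each of the 2n allele slots is additive
    with probability 1/2, which gives binomial proportions out of 4^n.
    """
    slots = 2 * n_genes
    return [Fraction(comb(slots, k), 4 ** n_genes) for k in range(slots + 1)]


def extreme_class_fraction(n_genes):
    """Fraction of F2 individuals resembling one extreme parent: 1/4^n."""
    return Fraction(1, 4 ** n_genes)


for n in (1, 2, 3):
    classes = f2_class_frequencies(n)
    print(n, "genes:", len(classes), "classes,",
          "extreme =", extreme_class_fraction(n),
          [str(f) for f in classes])
```

For 2 genes this gives 5 classes in the proportions 1/16, 1/4, 3/8, 1/4, 1/16 (that is, 1 : 4 : 6 : 4 : 1 out of 16).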

13
Q

Heritability (H^2)

A

the proportion of the total phenotypic variance (VP) within a certain population that is due to genetic variance (VG)
H^2 = VG / VP

Different in different environments

A mean heritability estimate of 0.65 for human height does not mean that your height is 65% due to your genes, but rather that in the population sampled, on average, 65% of the overall variation in height could be explained by genotypic differences among individuals in that population.

14
Q

Familial

A

a trait shared by a family; members may not share the same genotype, e.g. an adopted child speaks the same language as the rest of the family. This is not heritable, because it is not genetic.

15
Q

Heritable

A

a trait shared by people with the same genotype

16
Q

If an environmental change affects all individuals in a population equally

A

the mean changes but the variance, and therefore the heritability, stays the same

if the variance changes, the heritability changes
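
A small Python sketch of this point, using heritability as genetic variance divided by total phenotypic variance (VG / VP). The heights and the genetic variance are made-up numbers chosen only for illustration, and `broad_sense_heritability` is a hypothetical helper name.

```python
from statistics import mean, pvariance


def broad_sense_heritability(v_g, v_p):
    """Heritability as the share of phenotypic variance due to genetic variance."""
    return v_g / v_p


# Hypothetical heights (cm) and an assumed genetic variance, for illustration only.
heights = [160.0, 165.0, 170.0, 175.0, 180.0]
assumed_v_g = 32.5

# The same environmental change applied equally to every individual (+5 cm).
shifted = [h + 5.0 for h in heights]

print("means:", mean(heights), mean(shifted))                # the mean moves
print("variances:", pvariance(heights), pvariance(shifted))  # the variance does not
print("heritability before:", broad_sense_heritability(assumed_v_g, pvariance(heights)))
print("heritability after: ", broad_sense_heritability(assumed_v_g, pvariance(shifted)))
```

Because every value moves by the same amount, the spread around the mean is unchanged, so VP and the heritability estimate are unchanged too.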

17
Q

Gene-environment (G x E) interactions

A

interaction between genes and environment can play an important role in quantitative traits

18
Q

broad-sense heritability (H^2)

A

Measures the proportion of the variance in a population within a single generation that is due to genetic factors
Gives an estimate from 0 to 1

Low heritability = variation is due mainly to environmental effects
High heritability = variation is due mainly to genotypic effects
Ignores genotype-by-environment interactions

Includes genetic values due to dominance and epistasis

19
Q

additive gene action vs dominant gene action

A

for additive gene action, the two homozygotes are the two extremes and the heterozygote is intermediate

for dominant gene action, the two homozygotes are the two extremes and the heterozygote has the same phenotype as the dominant homozygote

20
Q

Narrow-sense heritability (h^2)

A

only takes into account the additive genetic variance; with fully additive gene action, all plants or animals with the desired (extreme) trait are homozygous

with dominant gene action the heterozygote also shows the desired trait, so selective breeding takes longer

h^2 = VA / VP
VA = additive genetic variance
VP = total phenotypic variance
21
Q

How to quantify and interpret heritability

A

A common way to assess if a trait is heritable is to look for a correlation between the parents and the offspring.
Narrow-sense heritability (h^2) = a measure of how heritable a trait is, using family data
This measurement is used in animal and plant breeding to determine if a population can be changed by selective breeding.
Estimate narrow heritability by comparing the offspring value against the averaged value for the two parents (midparent value).
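
A hedged Python sketch of the midparent comparison described above: the slope of the regression of offspring value on midparent value is commonly used as an estimate of narrow-sense heritability. The family values are invented for illustration and `regression_slope` is a hypothetical helper.

```python
def regression_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    return covariance / variance_x


# Hypothetical trait values (e.g. height in cm) for three families.
fathers = [165.0, 175.0, 185.0]
mothers = [155.0, 165.0, 175.0]
offspring = [164.0, 170.0, 176.0]

# Midparent value = average of the two parents.
midparents = [(f + m) / 2 for f, m in zip(fathers, mothers)]

h2_estimate = regression_slope(midparents, offspring)
print("estimated narrow-sense heritability:", round(h2_estimate, 2))
```

With these made-up numbers the slope is 0.6. A slope near 1 would mean offspring closely track their parents, and a slope near 0 would mean they do not.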

22
Q

How do we determine if a family has a higher risk of disease?

A
  • Family members share a greater number of identical genetic variants than unrelated individuals
  • The degree of family clustering of a disease can be expressed by the relative risk ratio (λR)
  • λR compares the risk in relatives (R) of an affected proband with the risk in the general population

relative risk ratio = disease prevalence in relatives R of probands / disease prevalence in population
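
The ratio above as a minimal Python sketch. The prevalence figures are hypothetical and `relative_risk_ratio` is an illustrative name.

```python
def relative_risk_ratio(prevalence_in_relatives, prevalence_in_population):
    """Relative risk ratio: prevalence in relatives of probands / population prevalence."""
    if prevalence_in_population <= 0:
        raise ValueError("population prevalence must be positive")
    return prevalence_in_relatives / prevalence_in_population


# Hypothetical example: 4% of siblings of probands are affected,
# against 0.1% of the general population.
lambda_sibling = relative_risk_ratio(0.04, 0.001)
print("sibling relative risk ratio is about", round(lambda_sibling))
```

A value of 1 would mean relatives are at no higher risk than the general population.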

23
Q

Relative risk ratio interpretation

A

Higher λR values indicate greater proportion of risk in family compared to the population

Usually it increases with
• Increasing genetic contribution
• Decreasing population prevalence

24
Q

Familial clustering: the role of environment

A

Familial clustering confounded by shared environment

If familial aggregation is detected, it does not necessarily mean that genetics is the only explanation

25
Q

Twin studies

A

DZ = dizygotic (fraternal, non-identical) twins, genetically the same as ordinary siblings
MZ = monozygotic (identical) twins

if a trait is entirely genetic, it should always be the same in MZ twins

26
Q

twin studies - concordance and discordance

A

Concordant twins
Both affected (+ / +) or both unaffected (- / -)
Discordant twins
1 affected, 1 unaffected (+ / -)

concordance ratio (r) = concordance in MZ / concordance in DZ
r > 1: genetics plays a role
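
A sketch of the concordance ratio in Python. It assumes concordance is measured as the share of twin pairs that are concordant among the pairs studied. The pair counts are hypothetical.

```python
from fractions import Fraction


def concordance(concordant_pairs, discordant_pairs):
    """Share of twin pairs that are concordant."""
    return Fraction(concordant_pairs, concordant_pairs + discordant_pairs)


def concordance_ratio(mz_concordance, dz_concordance):
    """r = concordance in MZ twins / concordance in DZ twins."""
    return mz_concordance / dz_concordance


# Hypothetical counts: 30 of 100 MZ pairs and 10 of 100 DZ pairs are concordant.
mz = concordance(30, 70)
dz = concordance(10, 90)
r = concordance_ratio(mz, dz)
print("MZ:", mz, "DZ:", dz, "ratio:", r)
```

Here r is 3, which is greater than 1, so these made-up data would point to a genetic contribution.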
27
Q

High concordance does not prove that a trait has a genetic component

A

Limitations of twin studies: DZ twins can be of different sex, MZ twins may share more environmental factors than DZ twins, and MZ twins can still differ through epigenetic changes over the lifetime, X-chromosome inactivation, post-zygotic somatic mutations, etc.

28
Q

Adoption studies

A

Two approaches:
• Find adopted people who suffer from a particular disease known to run in families and ask whether it runs in their biological or adoptive family
• Find affected parents whose children have been adopted away from the family and ask whether being adopted saved the children from the family disease

Main obstacles: lack of information about the biological family, when adoption happened, intrauterine factors, and selective placement

29
Q

linkage

A

property of loci
to identify biological mechanism for transmission of a trait
requires family pedigree
use polymorphic markers

30
Q

association

A

Association is a property of alleles

To identify an association between an allele and a phenotype

Fine mapping (<1cM)

Case-control or family approach

Usually bi-allelic SNPs

31
Q

linkage analysis in complex disease

A

affected sibling pair

When affected siblings share a chromosome region more or less often than expected by chance, then that region is likely involved in causing the disease

32
Q

limitations of linkage

A

even for a risk ratio of 4 (high), you need a lot of affected sibling pairs (families) to do a linkage analysis
for anything less than 4, the number of families needed increases drastically

33
Q

successful linkage study - Alzheimer's

A

1991: Linkage analysis identified the proximal long arm of chromosome 19
• Apolipoprotein E (APOE)
• ε2 decreases risk
• ε4 increases risk
• 15-25% of the population carry 1 copy of ε4, 2-4% carry 2 copies
• ε4 drives earlier and more abundant amyloid pathology in the brains of carriers

34
Q

Most SNPs in a population are

A

rare

35
Q

Most SNPs in an individual are

A

common

36
Q

Why do most SNPs have a neutral effect on phenotype?

A
  1. Functionally important DNA sequences are the minority of our genome.
  2. Genetic redundancy: nucleotide substitutions that don’t change amino acid, or gene duplication.
  3. Functionally unimportant amino acid or nucleotide positions within proteins or within functionally important noncoding sequences.
37
Q

Linkage disequilibrium

A

Chromosomal segments can exist as a block that is only rarely broken up by recombination.
- because the loci are so close together, they rarely recombine

• Linkage disequilibrium (LD): the nonrandom association of alleles at different loci.
some combinations of alleles occur together more often than expected by chance

38
Q

calculating LD

A

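D = frequency of a haplotype (AB, Ab, aB or ab) - the product of the frequencies of its individual alleles, e.g. D = freq(AB) - freq(A) x freq(B)

if there is no LD: frequency of the haplotype = frequencies of the individual alleles multiplied together (D = 0)

D' = D scaled by its maximum possible value
if D' = 1: complete LD (no recombination between the loci)
D' > 0.33: threshold used to determine LD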
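
A minimal Python sketch of the LD calculation for two bi-allelic loci. It uses the standard definitions D = freq(AB) - freq(A) x freq(B), D' = |D| / Dmax (Lewontin's normalisation) and r^2 = D^2 / (pA(1-pA)pB(1-pB)). The function name and example frequencies are illustrative assumptions.

```python
def ld_stats(freq_ab, freq_a, freq_b):
    """Return (D, D', r^2) for haplotype AB given allele frequencies of A and B."""
    if not (0 < freq_a < 1 and 0 < freq_b < 1):
        raise ValueError("both loci must be polymorphic")

    # D: observed haplotype frequency minus the frequency expected with no LD.
    d = freq_ab - freq_a * freq_b

    # Dmax depends on the sign of D.
    if d >= 0:
        d_max = min(freq_a * (1 - freq_b), (1 - freq_a) * freq_b)
    else:
        d_max = min(freq_a * freq_b, (1 - freq_a) * (1 - freq_b))

    d_prime = abs(d) / d_max
    r_squared = d * d / (freq_a * (1 - freq_a) * freq_b * (1 - freq_b))
    return d, d_prime, r_squared


# A and B always travel together: complete LD.
print(ld_stats(freq_ab=0.5, freq_a=0.5, freq_b=0.5))
# Haplotype frequency equals the product of allele frequencies: no LD.
print(ld_stats(freq_ab=0.25, freq_a=0.5, freq_b=0.5))
```

The first call gives D' = 1 and the second gives D = 0, matching the two cases on the card.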

39
Q

Haplotype

A

sets of nearby SNPs on the same chromosome that are inherited as a block.

Haplotype blocks represent ancestral chromosome segments that have been transmitted intact through many generations
- on an LD plot, the darker the block, the stronger the LD

the longer ago the SNPs arose and the more generations they have been transmitted together, the more consistent the haplotype blocks are going to be

40
Q

Haplotypes are population-specific

A

populations share similar ancestry, but early differences in mutations gave rise to different haplotypes, so the frequency of each haplotype depends on the population

41
Q

recombination hotspots

A

recombination is concentrated in 1-2 kb hotspots
we have ~30,000 hotspots, roughly one every 50-100 kb

regions of low LD between haplotype blocks correspond to recombination hotspots

hotspots are associated with an epigenetic histone methylation mark

42
Q

tag-SNPs

A

reduce the number of SNPs required to examine the entire genome for association with a phenotype
if the SNPs in a block are in LD, a tag SNP represents all the SNPs in that block

by genotyping a few tag SNPs we can infer the genotypes of the other SNPs around them

43
Q

determining if genotypes are phased cis and trans

A

Phasing: the process of inferring haplotypes from genotype data, assigning alleles to maternal or paternal chromosomes
alleles on the same chromosome are in cis; alleles on different homologous chromosomes are in trans

44
Q

Tag-SNPs: imputing

A

Using knowledge of linkage disequilibrium to fill in genotypes at loci that were not part of the original experiment.

45
Q

Tag-SNP imputation in practice

A

let's say you have 6 SNPs
- assume 1 and 2 are linked (i.e. D' = 1)
- 3 and 5 are linked
- 6 and 4 are linked
we can just use 1, 3 and 6 as tag SNPs for single-SNP tests

  • if, say, A at SNP 1 and G at SNP 3 always go together with a given allele at SNP 6, we can infer 6 as well
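
A toy Python sketch of the idea on this card, assuming perfect LD between each tag SNP and the SNP it represents. The SNP numbers follow the card; the alleles and their pairings are invented purely for illustration, and real imputation tools work probabilistically from reference panels.

```python
# Hypothetical LD map: tag SNP -> (represented SNP, allele at tag -> allele at that SNP).
LD_PROXIES = {
    1: (2, {"A": "C", "G": "T"}),
    3: (5, {"G": "A", "C": "T"}),
    6: (4, {"T": "G", "C": "A"}),
}


def impute_haplotype(tag_alleles):
    """Fill in the untyped SNPs on one haplotype from its typed tag SNPs."""
    full = dict(tag_alleles)
    for tag_snp, allele in tag_alleles.items():
        represented_snp, allele_map = LD_PROXIES[tag_snp]
        full[represented_snp] = allele_map[allele]
    return full


# Only SNPs 1, 3 and 6 were genotyped; 2, 4 and 5 are inferred.
typed = {1: "A", 3: "G", 6: "T"}
print(sorted(impute_haplotype(typed).items()))
```

Three genotyped SNPs are enough to recover all six positions under these assumptions.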
46
Q

Association analyses in complex diseases

A

Looks for co-occurrence (association) of alleles and phenotypes

we use candidate gene studies (individual genes, require biological insight) and GWAS

47
Q

Candidate gene and association analysis in complex diseases

A

Looks for co-occurrence (association) of alleles and phenotypes, comparing cases and controls

e.g. we have two alleles T and C
in cases 62% have allele C and 38% have allele T
in control 49% have C and 51% have T

using the odds ratio, (a x d) / (b x c),
calculate the association
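
The example above worked through in Python. The 2x2 cell labels are an assumption about how the card's a, b, c, d are laid out: a = cases with allele C, b = cases with allele T, c = controls with allele C, d = controls with allele T, counted per 100 alleles.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a x d) / (b x c)."""
    return (a * d) / (b * c)


# Per 100 alleles: cases 62 C / 38 T, controls 49 C / 51 T.
or_allele_c = odds_ratio(a=62, b=38, c=49, d=51)
print("odds ratio for allele C:", round(or_allele_c, 2))
```

The odds ratio comes out at about 1.7, so allele C is more common in cases than in controls. Whether that difference is statistically significant would need a separate test.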

48
Q

Case-study: Identification of NARC1/PCSK9

candidate gene study

A

rare mutations in this gene correlate strongly with high cardiovascular disease risk.
identified using linkage analysis followed by animal studies

when mutated, PCSK9 binds to the LDL receptor, leading to lysosomal degradation of the receptor
fewer receptors are left to bind LDL -> high LDL -> clogging of arteries

trials aim to lower LDL cholesterol with siRNA that targets the PCSK9 mRNA for degradation

49
Q

Not all candidate gene studies were successful: Limitations

A
  • Inadequate matching of controls (not accounting for other factors)
  • Insufficient correction for multiple testing (Bonferroni)
  • Underpowered studies leading to lack of replication
50
Q

Reasons for an association

A
  • Direct causation
  • Epistatic effect
  • Population stratification
  • Linkage disequilibrium
51
Q

benefits in identification of susceptibility variants

A

new biological insights -> clinical advances

  • therapeutic targets
  • biomarkers
  • prevention
52
Q

candidate gene vs whole genome

A

candidate gene: few SNPs + hypothesis

whole genome (GWAS): millions of SNPs and no hypothesis

53
Q

GWAS

A
• A hypothesis-free method
• Uses large sample sizes, or cases versus controls
• Identifies regions of the human genome that are associated with a phenotype
• Based on allele frequencies at hundreds of thousands of tag-SNPs
• Association is usually confirmed through replication in independent datasets and/or GWAS meta-analyses
• Requires fine mapping through linkage disequilibrium to identify specific variants
54
Q

Methods to generate genetic information for GWAS

A
SNP arrays vs WGS
- genotype tag-SNPs vs sequence the whole genome
- inexpensive vs expensive
- reliable vs less accurate
55
Q

GWAS major steps

A
  • data collection
  • genotyping (via SNP arrays and NGS)
  • quality control (look into different populations)
  • imputation (tag SNPs)
  • association testing (Manhattan plot)
  • meta-analysis or replication
56
Q

GWAS major steps dependent on

A

It is dependent on a number of important factors, such as:
• (un)relatedness of individuals (if they share DNA there will be an unwanted association)
• genetic architecture (quality control)
• population stratification (quality control)
• genetic model
57
Q

P-value threshold for GWAS

A

If we assume P < 0.05 is significant:
In 100 comparisons, about 5 associations are expected to be false positives
• Need to use a multiple comparison adjustment (e.g. Bonferroni)
• In GWAS, we do 1 million tests (or more!)
1,000,000 x 0.05 = 50,000 false positives

Estimated that P (for most GWAS) should be < 5 x 10^-8 for common variants with MAF > 5% and LD r^2 = 0.8

58
Q

bonferroni

A
0.05 divided by the number of comparisons made.

i.e. 1 million tests = 0.05 / 1,000,000 = 5 x 10^-8
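
The same arithmetic as a short Python sketch. The function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of false positives if every test used the uncorrected alpha."""
    return alpha * n_tests


n_tests = 1_000_000
print("corrected threshold:", bonferroni_threshold(0.05, n_tests))
print("false positives without correction:", expected_false_positives(0.05, n_tests))
```

For 1 million tests the corrected threshold is 5 x 10^-8, compared with about 50,000 expected false positives if 0.05 were used for every test.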

59
Q

Visualising GWAS results: Manhattan plot

A

threshold: red line (normally 5 x 10^-8)
y-axis: -log10 of the p-value for association
x-axis: position along the genome, ordered by chromosome number

each dot represents a SNP, plotted according to its p-value for association

the higher the dot on the plot (the smaller the p-value), the more significant the association

every dot above the threshold is a SNP on a chromosome associated with the disease of interest
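
Manhattan plots conventionally show -log10(p) on the y-axis, so smaller p-values sit higher up. A small Python sketch of that transform; the SNP names and p-values are hypothetical.

```python
from math import log10

GENOME_WIDE_THRESHOLD = 5e-8


def manhattan_height(p_value):
    """Height of a SNP on a Manhattan plot: -log10 of its p-value."""
    return -log10(p_value)


# Hypothetical association p-values for three SNPs.
p_values = {"snp_a": 0.03, "snp_b": 1e-5, "snp_c": 2e-9}

for name, p in p_values.items():
    print(name, round(manhattan_height(p), 2))

print("threshold line at", round(manhattan_height(GENOME_WIDE_THRESHOLD), 2))

# Only SNPs with p below the threshold rise above the line.
significant = [name for name, p in p_values.items() if p < GENOME_WIDE_THRESHOLD]
print("genome-wide significant:", significant)
```

The threshold of 5 x 10^-8 sits at a height of about 7.3, and only the third SNP clears it.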

60
Q

in the past what chromosomes were not seen on GWAS

A

sex chromosomes

this is now starting to improve

61
Q

case- inflammatory bowel disease

A

monogenic forms: few alleles, each with a large impact

more complex forms: a greater number of alleles, each with a smaller effect

62
Q

where do we get the sample size

A

UK Biobank - 500,000 participants

there are many biobanks in Europe, America and Asia, but few in Africa and other regions: a demographic (representation) problem

63
Q

case - height

A

top 697 variants explain 20% of heritability

top 10k variants explain 30% of Vp

64
Q

case- blood pressure

A

heritability estimated at 30-70%

65
Q

Where is the missing heritability?

A
  1. Due to rare variants with BIG effect
  2. Due to gene-gene and gene-environment interactions
  3. Due to epigenetic effects
  4. No missing heritability; family studies overestimate heritability
  5. GWAS underestimates heritability because tag-SNPs do not reliably detect the variants
  6. Much heritability is due to common variants with very small effects
66
Q

What's next for complex disease

A
  • Whole-genome sequencing of large cohorts for rare and uncommon variants
  • Interpreting the role of risk SNPs
67
Q

Genome-wide polygenic risk score

A

can identify individuals at risk of common complex diseases

68
Q

Polygenic risk score (PRS)

A
  • Single value estimate of an individual's genetic liability to a phenotype
  • Sum of the genome-wide genotypes, weighted by genotype effect size (odds ratio) derived from GWAS summary statistics
69
Q

penetrance in complex diseases

A

GWAS - many variants with small effects - low penetrance
Mendelian - high penetrance - few variants large effect
the missing variants could be those with intermediate penetrance

70
Q

GWAS identified SNPs associated with X, now what?

A

we identify SNPs with GWAS associated with disease

estimate SNP based heritability and build candidate predictors

build polygenic risk scores

composite score for personalised risk prediction

71
Q

example of PRS

A
4 alleles at 4 loci were identified, each with a different effect
locus 1: A = +1.5
locus 2: C = -0.5
locus 3: T = +2.0
locus 4: A = -1.5

individual 1 has genotypes AT CG TT CC
PRS = 1.5 (1 x A) - 0.5 (1 x C) + 4.0 (2 x T) - 0.0 (0 x A) = 5.0
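
The same score computed in Python, as a sketch of the weighted sum. It assumes the four effect alleles belong to loci 1 to 4 in the order listed and that each copy of an effect allele adds its weight once. `polygenic_risk_score` is an illustrative name.

```python
# (effect allele, weight) for loci 1 to 4, taken from the card.
EFFECTS = [("A", 1.5), ("C", -0.5), ("T", 2.0), ("A", -1.5)]


def polygenic_risk_score(genotypes, effects=EFFECTS):
    """Sum over loci of (copies of the effect allele) x (effect size)."""
    return sum(weight * genotype.count(allele)
               for (allele, weight), genotype in zip(effects, genotypes))


individual_1 = ["AT", "CG", "TT", "CC"]
print("PRS for individual 1:", polygenic_risk_score(individual_1))
```

Individual 1 carries one A at locus 1, one C at locus 2, two T at locus 3 and no A at locus 4, giving 1.5 - 0.5 + 4.0 - 0.0 = 5.0.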

72
Q

When are PRS beneficial?

A

• For risk calculation in European populations (a LIMITATION)
• For conditions with proven preventative measures
• When the risk of disease outweighs the psychological impact of knowing you are at high genetic risk of disease

73
Q

GWAS downstream analyses: Interpretation

A

causal variant genotyped = direct association

causal variant in LD with other genotyped variants = indirect association

74
Q

Moving from association to causation

A

Variants are merely associated with a trait
We can use further genomic analysis tools to determine:
• Coding vs regulatory variants
• Fine mapping
• Gene expression
Future in vitro, animal studies, and clinical trials