complex diseases Flashcards

Question 1

Q

monogenic diseases

Answer

A

those where there is a direct relationship between the disease gene and the disease status
Genotype and phenotype closely correlate (high penetrance)
Variants CAUSE the disease (1 disease, 1 gene)
The traits presented so far are qualitative
= white eyed or red eyed flies
= cystic fibrosis or no cystic fibrosis

Question 2

Q

Quantitative traits

Answer

A

Traits with variation showing a continuous range of phenotypes
e.g. human height, weight, colour, metabolic rate, behaviour

Question 3

Q

polygenic

Answer

A

Varying phenotypes result from input of many genes

Question 4

Q

Multifactorial or complex traits

Answer

A

result of a combination of several genes and environmental factors

Complex (polygenic) diseases often show genetic predisposition, but individual genes only marginally affect disease status

Genotype and phenotype poorly correlate (low penetrance)

Variants PREDISPOSE to the disease (1 disease, many genes)

Question 5

Q

example of multifactorial inheritance

Answer

A

skin colour
additive effect
complex trait
- many genes
- environment

Question 6

Q

single gene vs multifactorial

Answer

A

single gene

risk remains the same regardless of the number of affected children
e.g. if a parent carries a dominant disease allele, the risk to each child is 1/2
if 1 child has the disease, the risk to another child is still 1/2

multifactorial

recurrence risk increases with the number of affected children, because each affected child indicates that the couple are high risk
if 1 child is affected, the recurrence risk is 1 in 25
if 2 children are affected, the recurrence risk is now 1 in 12

Question 7

Q

Multifactorial disorders display familial clustering with no recognised pattern of Mendelian inheritance

Answer

A

1. Most common cause of congenital malformations
2. Cause of many common acquired diseases
More prevalent than single gene disorders
Harder to find the genetic factors / causes

Question 8

Q

not all polygenic traits show continuous variation

Answer

A

in a large sample, the data will follow a normal distribution
instead of plotting intervals on the x-axis (groups such as age ranges), we plot the number of predisposing alleles in the genotype

there is a certain point (the threshold) above which the frequency of disease is higher, so the phenotype moves away from a normal distribution

Question 9

Q

3 types of polygenic traits

Answer

A

continuous traits

meristic traits
- phenotype can be recorded by counting integers

threshold traits

polygenic and often multifactorial
small number of discrete phenotypic classes
increasing number of diseases show this pattern

Question 10

Q

most common multifactorial diseases with a threshold

Answer

A

cleft lip
neural tube defect
congenital heart defect
asthma
diabetes
autism

Question 11

Q

multi-gene hypothesis

Answer

A

A quantitative trait has continuous variation that can be quantified (measured)
Two or more loci scattered in the genome account for the hereditary influence on the trait in an additive way
Each gene locus is occupied by either an additive allele or a non-additive allele
The contribution of each additive allele is approximately equal
Together, the additive alleles contributing to a single quantitative character produce substantial phenotypic variation

Question 12

Q

calculating number of polygenes

Answer

A

Number of polygenes (n) contributing to quantitative trait is estimated based on ratio of F2 individuals resembling either of two extreme P phenotypes

1/4^n = ratio of F2 individuals expressing either extreme phenotype
For a low number of polygenes: (2n + 1) = number of distinct phenotypic categories observed

i.e. 1 gene = 3 classes (1/4, 1/2, 1/4)
2 genes = 5 classes (1/16, 4/16, 6/16, 4/16, 1/16)
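
As a rough check on these ratios, the sketch below assumes the additive model: every F1 parent is heterozygous at all n loci, each additive allele contributes equally, and the loci assort independently. Under those assumptions the F2 classes follow a binomial distribution. The function name `f2_class_ratios` is illustrative only.

```python
from fractions import Fraction
from math import comb

def f2_class_ratios(n):
    """F2 phenotypic class frequencies for n additive loci (F1 x F1 cross)."""
    total = 4 ** n  # number of equally likely F2 allele combinations
    # Class k = individuals carrying k additive alleles out of 2n.
    return [Fraction(comb(2 * n, k), total) for k in range(2 * n + 1)]

for n in (1, 2, 3):
    ratios = f2_class_ratios(n)
    print(n, "gene(s):", len(ratios), "classes,", [str(r) for r in ratios])
```

The first and last entries of each list are the two extreme phenotypes, each at a frequency of 1/4^n, and the list always has 2n + 1 entries.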

Question 13

Q

Heritability (H²)

Answer

A

the proportion of the total phenotypic variance (VP) within a certain population that is due to genetic variance (VG)
H² = VG / VP

Different in different environments

A mean heritability estimate of 0.65 for human height does not mean that your height is 65% due to your genes, but rather that in the population sampled, on average, 65% of the overall variation in height could be explained by genotypic differences among individuals in that population.

Question 14

Q

Familial

Answer

A

a trait shared by a family; they may not share the same genotype
e.g. an adopted child speaks the same language as the rest of the family. This is not heritable, because it is not genetic.

Question 15

Q

Heritable

Answer

A

a trait shared by people with the same genotype

Question 16

Q

If an environmental change affects all individuals in a population equally

Answer

A

the mean changes but the variance, and therefore the heritability, stays the same

if the variance changes, the heritability changes
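
A small numerical illustration of this point, using made-up phenotype values: adding the same amount to every individual shifts the mean but leaves the variance untouched, so a ratio of variances such as VG / VP is unaffected. The numbers are hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical phenotype values (e.g. heights in cm) for one population.
before = [160.0, 165.0, 170.0, 175.0, 180.0]

# An environmental change that adds the same amount to every individual.
after = [x + 5.0 for x in before]

print("mean:", mean(before), "->", mean(after))
print("variance:", pvariance(before), "->", pvariance(after))
```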

Question 17

Q

Gene-environment (G x E) interactions

Answer

A

interaction between genes and environment can play an important role in quantitative traits

Question 18

Q

broad-sense heritability H²

Answer

A

Measures the proportion of the variance in a population within a single generation that is due to genetic factors
Gives an estimate of 0 to 1

Low heritability = variation is due mainly to environmental effects
High heritability = variation is due mainly to genotypic effects
Ignores genotype-by-environment interactions

Includes genetic values due to dominance and epistasis

Question 19

Q

additive gene action vs dominant gene action

Answer

A

for additive gene action, the two homozygotes are the two extremes and the heterozygote is intermediate

for dominant gene action, the two homozygotes are the two extremes and the heterozygote has the same phenotype as the dominant homozygote

Question 20

Q

Narrow-sense heritability h²

Answer

A

only takes into account the additive genetic variance; with fully additive variants, all plants or animals with the desired trait are homozygous

with dominant genetic variants the heterozygote also shows the desired trait, so selective breeding takes longer

h² = VA / VP
VA = additive genetic variance
VP = total phenotypic variance

Question 21

Q

How to quantify and interpret heritability

Answer

A

A common way to assess if a trait is heritable is to look for a correlation between the parents and the offspring.
Narrow-sense heritability (h²) = a measure of how heritable a trait is, using family data
This measurement is used in animal and plant breeding to determine if a population can be changed by selective breeding.
Estimate narrow heritability by comparing the offspring value against the averaged value for the two parents (midparent value).
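
One common way to do this comparison is to regress offspring values on midparent values and take the slope as an estimate of h². The sketch below uses hypothetical trait values and a plain least-squares slope; the function name `midparent_regression_slope` is illustrative only.

```python
from statistics import mean

def midparent_regression_slope(parents, offspring):
    """Least-squares slope of offspring value on midparent value."""
    midparents = [(a + b) / 2 for a, b in parents]
    mx, my = mean(midparents), mean(offspring)
    cov = sum((x - mx) * (y - my) for x, y in zip(midparents, offspring))
    var = sum((x - mx) ** 2 for x in midparents)
    return cov / var

# Hypothetical data: (parent 1, parent 2) values and one offspring per family.
parents = [(10, 12), (14, 16), (18, 20), (22, 24)]
offspring = [14, 16, 18, 20]

h2 = midparent_regression_slope(parents, offspring)
print(round(h2, 2))
```

A slope near 1 would suggest that offspring closely track their parents (mostly additive genetic variance); a slope near 0 would suggest little response to selective breeding.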

Question 22

Q

How do we determine if a family has a higher risk of disease?

Answer

A

Family members share a greater number of identical genetic variants than unrelated individuals
The degree of family clustering of a disease can be expressed by the relative risk ratio (λR)
It compares the risk to relatives (R) of an affected proband with the risk in the general population

relative risk ratio (λR) = disease prevalence in relatives (R) of probands / disease prevalence in the general population
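
The ratio can be sketched in a couple of lines. The prevalence figures below are hypothetical and only show how the number is read.

```python
def relative_risk_ratio(prevalence_in_relatives, population_prevalence):
    """lambda_R: risk to relatives of a proband relative to the population risk."""
    return prevalence_in_relatives / population_prevalence

# Hypothetical example: 4% of siblings of probands are affected,
# compared with 1% of the general population.
lambda_s = relative_risk_ratio(0.04, 0.01)
print(round(lambda_s, 2))
```

A value of 1 would mean relatives are at no greater risk than anyone else in the population.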

Question 23

Q

Relative risk ratio interpretation

Answer

A

Higher λR values indicate greater proportion of risk in family compared to the population

Usually it increases with
• Increasing genetic contribution
• Decreasing population prevalence

Question 24

Q

Familial clustering: the role of environment

Answer

A

Familial clustering confounded by shared environment

If familial aggregation is detected, it does not always and only mean genetics is the explanation

Question 25

Q

Twin studies

Answer

A

DZ = dizygotic twins (fraternal, non-identical; genetically as similar as ordinary siblings)
MZ = monozygotic (identical) twins

if a trait is entirely genetic, it should always be the same in MZ twins

Question 26

Q

twin studies - concordance and discordance

Answer

A

Concordant twins
Both affected (+ / +) or unaffected ( - / - )
Discordant twins
1 affected, 1 unaffected (+ / -)

concordance ratio (r) = concordance in MZ / concordance in DZ
r > 1: genetics plays a role
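
A minimal sketch of the ratio, using hypothetical counts of twin pairs. The helper name `concordance_rate` is illustrative only.

```python
def concordance_rate(concordant_pairs, total_pairs):
    """Fraction of twin pairs in which both twins have the same status."""
    return concordant_pairs / total_pairs

# Hypothetical study: 100 MZ pairs and 100 DZ pairs.
mz = concordance_rate(40, 100)
dz = concordance_rate(10, 100)

r = mz / dz
print(round(r, 2), "genetic contribution suggested" if r > 1 else "no excess in MZ twins")
```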

Question 27

Q

High concordance does not prove that a trait has a genetic component

Answer

A

Limitations of twin studies: DZ twins can be of different sexes, MZ twins may share more environmental factors than DZ twins, and MZ twins can still differ through epigenetic changes over the lifetime, X-chromosome inactivation, post-zygotic somatic mutations, etc.

Question 28

Q

Adoption studies

Answer

A

Two approaches:
• Find adopted people who suffer from a particular disease known to run in families and ask whether it runs in their biological or adoptive family
• Find affected parents whose children have been adopted away from the family and ask whether being adopted saved the children from the family disease

Main obstacles: lack of information about the biological family, when adoption happened, intrauterine factors, and selective placement

Question 29

Q

linkage

Answer

A

property of loci
to identify biological mechanism for transmission of a trait
requires family pedigree
use polymorphic markers

Question 30

Q

association

Answer

A

Association is a property of alleles

To identify an association between an allele and a phenotype

Fine mapping (<1cM)

Case-control or family approach

Usually bi-allelic SNPs

Question 31

Q

linkage analysis in complex disease

Answer

A

affected sibling pair

When affected siblings share a chromosome region more often than expected by chance, that region is likely involved in causing the disease
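
By chance alone, two siblings share 0, 1 or 2 parental haplotypes at a locus with probabilities 1/4, 1/2 and 1/4. One simple way to look for a departure from this is a chi-square goodness-of-fit statistic, sketched below with hypothetical counts. The function name `ibd_sharing_chi_square` is illustrative, and real analyses use more refined statistics.

```python
def ibd_sharing_chi_square(n0, n1, n2):
    """Chi-square statistic for sib pairs sharing 0, 1 or 2 haplotypes vs 1/4 : 1/2 : 1/4."""
    total = n0 + n1 + n2
    expected = (total * 0.25, total * 0.5, total * 0.25)
    observed = (n0, n1, n2)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 100 affected sibling pairs at one marker.
chi2 = ibd_sharing_chi_square(10, 50, 40)
print(chi2)
```

With 2 degrees of freedom, a statistic above about 5.99 corresponds to P < 0.05 for a single test.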

Question 32

Q

limitations of linkage

Answer

A

even for a high risk ratio of 4, a large number of affected sibling pairs (families) is needed for a linkage analysis
for any risk ratio below 4, the number of families required increases drastically

Question 33

Q

successful linkage study - Alzheimer's disease

Answer

A

1991: Linkage analysis identified the proximal long arm of chromosome 19
• Apolipoprotein E (APOE)
• ε2 decreases risk
• ε4 increases risk
• 15-25% of the population carry 1 copy of ε4, 2-4% carry 2 copies
• ε4 drives earlier and more abundant amyloid pathology in the brains of carriers

Question 34

Q

Most SNPs in a population are

Question 35

Q

Most SNPs in an individual are

Question 36

Q

Why do most SNPs have a neutral effect on phenotype?

Answer

A

Functionally important DNA sequences are the minority of our genome.
Genetic redundancy: nucleotide substitutions that don’t change the amino acid, or gene duplication.
Functionally unimportant amino acid or nucleotide positions within proteins or within functionally important noncoding sequences.

Question 37

Q

Linkage disequilibrium

Answer

A

Chromosomal segments can exist as a block that is only rarely broken up by recombination.
- because the loci are so close together, recombination between them is rare

• Linkage disequilibrium (LD): the nonrandom association of alleles at different loci.
some combinations of alleles occur together more often than expected by chance

Question 38

Q

calculating LD

Answer

A

D = frequency of a haplotype (AB, Ab, aB or ab) - the product of the frequencies of its individual alleles, e.g. D = f(AB) - f(A) x f(B)

if no LD: frequency of the haplotype = frequencies of the individual alleles multiplied together (D = 0)

if D′ = 1: complete linkage disequilibrium (no recombination between the loci)
D′ > 0.33: threshold used to determine LD
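
The calculation can be sketched as follows for two bi-allelic loci. D is the haplotype frequency minus the product of the allele frequencies, D′ rescales |D| by its maximum possible value, and r² is the other common LD measure. The input frequencies are hypothetical and the function name `ld_stats` is illustrative; both loci are assumed to be polymorphic.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two bi-allelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    # Largest |D| possible given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# A and B always travel together: complete LD.
print(ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5))
# Haplotype frequency equals the product of allele frequencies: no LD.
print(ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5))
```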

Question 39

Q

Haplotype

Answer

A

sets of nearby SNPs on the same chromosome that are inherited as a block.

Haplotype blocks represent ancestral chromosome segments that have been transmitted intact through many generations
- the darker the blocks, the stronger the LD

the more generations a set of SNPs has been transmitted together since they arose, the more consistent the haplotype blocks are going to be

Question 40

Q

Haplotypes are population-specific

Answer

A

populations share similar ancestry, but mutations arising early on in each led to different haplotypes; the frequency of each haplotype therefore depends on the population

Question 41

Q

recombination hotspots

Answer

A

recombination is concentrated in 1-2 kb hotspots
we have ~30,000 hotspots, roughly one every 50-100 kb

recombination hotspots lie between haplotype blocks, where LD is low

hotspot positions are linked to an epigenetic histone methylation mark

Question 42

Q

tag-SNPs

Answer

A

reduce the number of SNPs required to examine the entire genome for association with a phenotype
if SNPs are in LD, one tag SNP can represent all the SNPs in that block

by genotyping a few tag SNPs we can infer the genotypes of the other SNPs around them

Question 43

Q

determining if genotypes are phased cis and trans

Answer

A

Phasing: the process of inferring haplotypes from genotype data, assigning alleles to maternal or paternal chromosomes
alleles on the same chromosome are in cis; alleles on different (homologous) chromosomes are in trans; unphased data cannot tell the two apart

Question 44

Q

Tag-SNPs: imputing

Answer

A

Using knowledge of linkage disequilibrium to fill in genotypes at loci that were not part of the original experiment.

Question 45

Q

Tag-SNP imputation in practice

Answer

A

let's say you have 6 SNPs
- let's assume 1 and 2 are linked (i.e. D′ = 1)
- 3 and 5 are linked
- 6 and 4 are linked
we can just use 1, 3 and 6 for single-SNP tests

if, say, A at SNP 1 and G at SNP 3 always go together, we can infer 6
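
The grouping step in this example can be sketched as below: SNPs in complete LD are merged into groups and one SNP per group is kept as the tag. The function name `pick_tag_snps` is hypothetical. Within a group any member can serve as the tag, so this sketch simply keeps the first one listed (it returns 4 where the card uses 6, which is equivalent).

```python
def pick_tag_snps(snps, linked_pairs):
    """Group SNPs that are in complete LD and return one tag SNP per group."""
    group_of = {snp: {snp} for snp in snps}
    for a, b in linked_pairs:
        merged = group_of[a] | group_of[b]
        for snp in merged:
            group_of[snp] = merged  # every member points at the merged group
    tags, seen = [], set()
    for snp in snps:
        if snp not in seen:
            tags.append(snp)       # first SNP listed in each group acts as the tag
            seen |= group_of[snp]
    return tags

tags = pick_tag_snps([1, 2, 3, 4, 5, 6], [(1, 2), (3, 5), (6, 4)])
print(tags)
```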

Question 46

Q

Association analyses in complex diseases

Answer

A

Looks for co-occurrence (association) of alleles and phenotypes

we use candidate gene studies (individual genes, require biological insight) and GWAS

Question 47

Q

Candidate gene and association analysis in complex diseases

Answer

A

Looks for co-occurrence (association) of alleles and phenotypes, comparing cases and controls

e.g. we have two alleles T and C
in cases 62% have allele C and 38% have allele T
in control 49% have C and 51% have T

calculate the association using the odds ratio: (a x d) / (b x c)
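
Applied to the percentages above, the odds ratio can be sketched as follows. The card does not define a, b, c and d, so the 2x2 layout in the docstring is an assumption, and the percentages are treated as counts per 100 alleles.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a*d)/(b*c) for a 2x2 table laid out as:

                 allele C   allele T
    cases            a          b
    controls         c          d
    """
    return (a * d) / (b * c)

# Cases: 62% C, 38% T. Controls: 49% C, 51% T.
or_c = odds_ratio(62, 38, 49, 51)
print(round(or_c, 2))
```

An odds ratio above 1 means allele C is more common in cases than in controls; whether the difference is significant still needs a statistical test.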

Question 48

Q

Case-study: Identification of NARC1/PCSK9

candidate gene study

Answer

A

rare mutations in this gene correlate strongly with a high risk of cardiovascular disease.
identified using linkage analysis followed by animal studies

the mutant protein binds to the LDL receptor, leading to lysosomal degradation of the receptor
with fewer receptors available to bind LDL –> high LDL, which leads to clogging of arteries

trials aim to lower LDL cholesterol by targeting PCSK9 with siRNA, which leads to degradation of its mRNA

Question 49

Q

Not all candidate gene studies were successful: Limitations

Answer

A

• Inadequate matching of controls (not accounting for other factors)
• Insufficient correction for multiple testing (e.g. Bonferroni)
• Underpowered studies leading to lack of replication

Question 50

Q

Reasons for an association

Answer

A

Direct causation
Epistatic effect
Population stratification
Linkage disequilibrium

Question 51

Q

benefits in identification of susceptibility variants

Answer

A

new biological insights -> clinical advances

therapeutic targets
biomarkers
prevention

Question 52

Q

candidate genes vs whole genome

Answer

A

candidate gene: few SNPs + hypothesis

whole genome: millions of SNPs and no hypothesis

Question 53

Q

GWAS

Answer

A

• A hypothesis-free method
• Uses large sample sizes of cases versus controls
• Identifies regions of the human genome that are associated with a phenotype
• Based on allele frequencies at hundreds of thousands of tag-SNPs
• Association is usually confirmed through replication in independent datasets and/or GWAS meta-analyses
• Requires fine mapping through linkage disequilibrium to identify specific variants

Question 54

Q

Methods to generate genetic information for GWAS

Answer

A

SNP arrays vs WGS
- looks at tag-SNPs vs looks at the sequence of the whole genome
- inexpensive vs expensive
- reliable vs less accurate

Question 55

Q

GWAS major steps

Answer

A

data collection
genotyping (via SNP arrays and NGS)
quality control (look into different populations)
imputation (tag SNPs)
association testing (manhattan plot)
meta-analysis or replication

Question 56

Q

GWAS major steps dependent on

Answer

A

It is dependent on a number of important factors, such as:
• (un)relatedness of individuals (if they share DNA there will be an unwanted association)
• genetic architecture (quality control)
• population stratification (quality control)
• genetic model

Question 57

Q

P-value threshold for GWAS

Answer

A

If we assume P < 0.05 is significant:
In 100 comparisons, about 5 associations are expected to be false positives
• Need to use a multiple comparison adjustment (e.g. Bonferroni)
• In GWAS, we do 1 million tests (or more!)
1,000,000 x 0.05 = 50,000 false positives

Estimated that P (for most GWAS) should be < 5 x 10^-8 for common variants with MAF > 5% and LD r² = 0.8

Question 58

Q

bonferroni

Answer

A

0.05 divided by the number of comparisons made.

i.e. 1 million tests = 0.05/1,000,000 = 5 x 10^-8
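
A minimal sketch of the correction applied to a set of association results. The SNP names and p-values are made up for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical p-values from single-SNP association tests.
p_values = {"rs_hypothetical_1": 3e-9, "rs_hypothetical_2": 2e-6}
significant = [snp for snp, p in p_values.items() if p < threshold]

print(threshold, significant)
```

Note that the second SNP would pass an uncorrected P < 0.05 cut-off but fails the corrected one.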

Question 59

Q

Visualising GWAS results: Manhattan plot

Answer

A

threshold: red line (normally 5 x 10^-8)
y-axis: -log10 of the p-value for association
x-axis: position in the genome, ordered by chromosome number

each dot represents a SNP, plotted according to its p-value for association

the higher the dot on the plot (i.e. the smaller the p-value), the greater the significance

every dot above the threshold line is a SNP on a chromosome that is associated with the disease of interest
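
Manhattan plots conventionally show -log10(p) on the y-axis, so smaller p-values sit higher on the plot. The sketch below shows that transformation with hypothetical p-values; the function name `manhattan_height` is illustrative only.

```python
import math

def manhattan_height(p_value):
    """Height of a SNP on a Manhattan plot: -log10 of its p-value."""
    return -math.log10(p_value)

line = manhattan_height(5e-8)  # height of the genome-wide significance line

# Hypothetical p-values for three SNPs.
for p in (1e-10, 5e-8, 0.01):
    height = manhattan_height(p)
    print(p, round(height, 2), "above line" if height > line else "not above line")
```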

Question 60

Q

in the past what chromosomes were not seen on GWAS

Answer

A

sex chromosomes

this is now starting to improve

Question 61

Q

case- inflammatory bowel disease

Answer

A

the monogenic forms involve few alleles, each with a large impact

the more complex the form, the greater the number of alleles and the smaller the effect of each

Question 62

Q

where do we get the sample size

Answer

A

UK Biobank - 500,000 participants

there are many biobanks in Europe, America and Asia, but few in Africa and other regions, so some populations are under-represented (a demographic problem)

Question 63

Q

case - height

Answer

A

top 697 variants explain 20% of heritability

top 10k variants explain 30% of Vp

Question 64

Q

case- blood pressure

Answer

A

heritability estimated at 30-70%

Answer 63

A

1. Due to rare variants with BIG effect
2. Due to gene-gene and gene-environment interactions
3. Due to epigenetic effects
4. No missing heritability; family studies overestimate heritability
5. GWAS underestimates heritability because tag-SNPs do not reliably detect all variants
6. Much heritability is due to common variants with very small effects

Answer 64

A

Whole-genome sequencing of large cohorts for rare and uncommon variants
Interpreting the role of risk SNPs

Answer 65

A

can identify individuals at risk of common complex diseases

Answer 66

A

Single value estimate of an individual's genetic liability to a phenotype
- Sum of the genome-wide genotypes, weighted by genotype effect size (odds ratio) derived from GWAS summary statistic data

Answer 67

A

GWAS - many variants with small effects - low penetrance
Mendelian - high penetrance - few variants large effect
the missing alleles could be those of intermediate penetrance

Answer 68

A

we identify SNPs with GWAS associated with disease

estimate SNP based heritability and build candidate predictors

build polygenic risk scores

composite score for personalised risk prediction

Answer 69

A

they identified 4 alleles at 4 loci with different effects
A: +1.5
C: -0.5
T: +2.0
A: -1.5

individual 1 has AT CG TT CC
1.5 (1 x A) - 0.5 (1 x C) + 4.0 (2 x T) - 0.0 (0 x A) = 5.0
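
The same calculation as a short sketch: the score is the sum, over loci, of the effect size multiplied by the number of copies of the effect allele the individual carries. The data come from the worked example; the function name `polygenic_score` is illustrative only.

```python
# (effect allele, effect size) at each of the four loci, in order.
effects = [("A", 1.5), ("C", -0.5), ("T", 2.0), ("A", -1.5)]

# Genotypes of individual 1 at the same four loci.
genotypes = ["AT", "CG", "TT", "CC"]

def polygenic_score(effects, genotypes):
    """Sum of effect size x number of copies of the effect allele at each locus."""
    return sum(weight * genotype.count(allele)
               for (allele, weight), genotype in zip(effects, genotypes))

score = polygenic_score(effects, genotypes)
print(score)
```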

Answer 70

A

For risk calculation in European populations = LIMITATION
• Conditions with proven preventative measures
• The risk of disease outweighs the psychological impact of knowing you are at high genetic risk of disease

Answer 71

A

causal variant genotyped = direct association

causal variant in LD with other genotyped variants = indirect association

Answer 72

A

Variants are merely associated with a trait
We can use further genomic analysis tools to determine:
• Coding vs regulatory variants
• Fine mapping
• Gene expression
Future in vitro, animal studies, and clinical trials