Unit 2 Day 5 Flashcards
what is the reality of “simple Mendelian” disease characteristics?
- variable disease progression depending on other factors common
- different alleles in same gene associated with varying levels of severity
- 1:1 relationship btw. variant and disease (e.g.: cystic fibrosis, Huntington's)
multifactorial inheritance
- result of interactions between multiple variants and non-genetic factors
- majority is a combo of genetic variation and non-genetic factors
complex traits of multifactorial inheritance
- complex traits aggregate in families
- don’t follow Mendelian inheritance
- need to distinguish between familial clustering and shared environmental factors
twin studies
- monozygotic vs. dizygotic twins
- if twins raised together: same degree of similar environment, so differences = genetic
- if twins raised apart: different environments, so similarities = genetic
adoption studies
- compare similarity btw. biological siblings raised apart and adopted siblings
- if biological siblings are more concordant than adoptive=genetic as opposed to environment
risk of disease in relatives
- compare frequency of disease in relatives of patients to see if higher than in general population
- risk of disease in siblings of affected/risk of disease in gen. pop
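The ratio above can be worked through with numbers. This is a minimal sketch, not a method from the lecture: the function name `sibling_relative_risk` and all counts are hypothetical.

```python
def sibling_relative_risk(affected_sibs, total_sibs, affected_pop, total_pop):
    """Risk in siblings of affected individuals divided by general-population risk."""
    sib_risk = affected_sibs / total_sibs
    pop_risk = affected_pop / total_pop
    return sib_risk / pop_risk

# Hypothetical counts: 30 of 500 siblings affected vs. 100 of 10,000 in the population.
ratio = sibling_relative_risk(30, 500, 100, 10_000)
print(f"sibling relative risk = {ratio:.1f}")
```

A ratio well above 1 indicates familial aggregation, but it cannot by itself separate shared genes from shared environment.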
heritability
proportion of variance in trait that is due to genetic variation
- e.g.: diabetes: 20% of pop. has high-risk haplotype, but disease incidence is 4%
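The diabetes figures above put a ceiling on how often carriers can get the disease. Even if every case carried the haplotype, at most incidence / carrier frequency of carriers would be affected. A minimal sketch of that arithmetic follows; the function name `max_penetrance` is hypothetical, and the bound assumes the two percentages describe the same population.

```python
def max_penetrance(carrier_freq, incidence):
    """Upper bound on P(disease | carrier); reached only if every case is a carrier."""
    return min(1.0, incidence / carrier_freq)

# Figures from the flashcard: 20% carry the haplotype, 4% develop the disease.
bound = max_penetrance(carrier_freq=0.20, incidence=0.04)
print(f"at most {bound:.0%} of carriers develop the disease")
```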
incomplete penetrance
some with genotype will not get the trait (reality for complex traits)
penetrance
relationship btw. trait and genotype
probability that an individual develops trait if have genotype
complete penetrance
everyone with pre-disposing genotype will get trait
variable expressivity
individuals w/ same variant don’t show precisely the same disease or quantitative phenotype characteristics
allelic heterogeneity
- different alleles in same gene result in same trait (CF)
- different alleles in same gene result in diff. traits (diff organ involvement w/in CF)
- many alleles appear to have similar clinical progression
- grouped into classes
CFTR genotype
2 copies=severe pancreatic insufficiency
1/0 copies=mild, mild severe insufficiency
locus heterogeneity
- variants in different genes result in very similar presentations
- e.g.: early onset AD (mutations in 3 genes: PS1, PS2, APP; all result in early onset AD)
phenocopy
- environmentally caused phenotype, mimics genetic version of trait
- e.g.: thalidomide-induced limb malformation
why find disease genes?
- genes/environment play role in all diseases
- no systematic way to discover environmental risk factors, but disease genes can be found systematically
- provides clues on pathogenesis (may allow new treatment)
- enable genetic testing/screening/surveillance
personalized medicine program
- discover risk genes for common diseases
- create DNA based predictive diagnostics
- apply optimized treatments based on genetics
problems w/ personalized medicine
- genes and environment play role
- genes for Mendelian (single gene) disorders = deterministic
- most disease genes confer only small risk
- predictive genetic testing may be difficult/impossible
odds ratio
odds of disease if carrying gene variant/odds of disease if not carrying gene variant
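A ratio of risks is strictly the relative risk. The odds ratio divides odds (affected / unaffected) instead, and the two are close when the disease is rare. A minimal sketch comparing them on a 2x2 table follows; the function names and the cohort counts are hypothetical.

```python
def odds_ratio(carrier_aff, carrier_unaff, noncarrier_aff, noncarrier_unaff):
    """Odds of disease in carriers divided by odds of disease in non-carriers."""
    return (carrier_aff / carrier_unaff) / (noncarrier_aff / noncarrier_unaff)

def relative_risk(carrier_aff, carrier_unaff, noncarrier_aff, noncarrier_unaff):
    """Risk of disease in carriers divided by risk of disease in non-carriers."""
    carrier_risk = carrier_aff / (carrier_aff + carrier_unaff)
    noncarrier_risk = noncarrier_aff / (noncarrier_aff + noncarrier_unaff)
    return carrier_risk / noncarrier_risk

# Hypothetical cohort: 20 of 1,000 carriers affected, 90 of 9,000 non-carriers affected.
table = (20, 980, 90, 8910)
or_value = odds_ratio(*table)
rr_value = relative_risk(*table)
print(f"odds ratio = {or_value:.2f}, relative risk = {rr_value:.2f}")
```

In case-control studies only the odds ratio can be estimated directly, because the proportion of cases is fixed by the study design.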
DNA markers and mapping
- too expensive to sequence genome
- “genotype” DNA “markers” (scorable differences) at known positions
surrogates for disease mutations
some polymorphisms cause disease. most don’t
commonly used marker types
microsatellites
SNP
CNV
microsatellites
SSLPs, STRPs, SSRs
simple sequence repeats
multi-allelic
~1/30,000 bp
used for forensics
SNP
single nucleotide polymorphism
bi-allelic
~1/50-300 bp
used for association
occurrence/allele frequency differs based on pop./ethnic group
each occurs in local context/haplotype of surrounding SNPs
SNP Haplotype
- recombination breaks micro-patterns of polymorphic genes into haplotypes
- recombination is not random; it clusters between blocks of ~10-50 kb
- linkage disequilibrium blocks are 2x smaller in African than in Caucasian populations
- genotype enough SNPs, you can impute variation that wasn’t genotyped
linkage disequilibrium
marker alleles within blocks tend to be co-inherited b/c recombination in blocks is uncommon
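Co-inheritance of marker alleles is commonly quantified with D and r^2. These measures are not named in these notes, so treat this as a supplementary sketch. It assumes two bi-allelic markers that are both polymorphic and made-up haplotype counts; the function name is hypothetical.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) for two bi-allelic markers from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / total    # frequency of allele B at marker 2
    d = p_AB - p_A * p_B           # deviation from independent assortment
    r_squared = d ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r_squared

# Hypothetical counts: alleles mostly travel together (strong LD) vs. independently (no LD).
d_strong, r2_strong = linkage_disequilibrium(45, 5, 5, 45)
d_none, r2_none = linkage_disequilibrium(25, 25, 25, 25)
print(f"strong LD: D = {d_strong:.2f}, r^2 = {r2_strong:.2f}")
print(f"no LD:     D = {d_none:.2f}, r^2 = {r2_none:.2f}")
```

An r^2 near 1 means one marker predicts the other almost perfectly, which is what allows ungenotyped variation to be imputed.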
CNVs
copy number variants
bi-allelic, multi-allelic, unique
common genomic deletions 100s-1000s nt in size
detected by SNP patterns
most = not causal for human diseases
Disease genes
Mendelian: 1 gene is sufficient to cause most of phenotype
polygenic/multifactorial: no one gene is sufficient for causing phenotype
candidate gene DNA sequencing
studies gene directly
depends on biological candidate gene/positional candidate gene (hit from GWAS or other mapping)
sometimes successful for Mendelian disorders
most hypotheses = wrong!
candidate gene association studies
markers used to test gene/causal variant indirectly
most common type of genetic study
depends on “a priori” biological hypothesis or positional hypothesis
powerful for common risk alleles
most a priori biological hypotheses are wrong
fatal flaw = false positives
candidate gene association study
- causal disease variation in candidate gene tagged by haplotype of polymorphic DNA markers
- Depends on LD
- genotype marker in candidate gene in cases and controls, compare allele frequencies in cases vs controls
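The case vs. control comparison in the last step can be sketched as a chi-square test on a 2x2 table of allele counts. This is one common choice and an assumption here, since the notes do not name a specific test. The counts are made up, no continuity correction is applied, and every cell is assumed to have a non-zero expected count.

```python
import math

def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square (1 df) on allele counts; returns (chi2, p_value)."""
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical study: 200 cases and 200 controls, so 400 alleles per group.
chi2, p_value = allele_association(120, 280, 80, 320)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A small p-value here says only that the marker allele frequency differs between groups; the marker may simply be in LD with the causal variant.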
genetic association studies
done with a reasonably sized number of cases/controls (100s)
simple statistics
if using multiple variants, apply multiple-testing correction
real association doesn’t imply causation, implies LD with causal mutation
almost always yields false positives
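The multiple-testing correction mentioned above can be illustrated with the Bonferroni method. This is one common correction chosen for illustration; the notes do not say which one is intended. The p-values and the function name are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that stay significant after dividing alpha by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

# Hypothetical p-values from testing five variants in one study.
p_values = [0.001, 0.012, 0.03, 0.2, 0.6]
uncorrected = [p <= 0.05 for p in p_values]
corrected = bonferroni_significant(p_values)
print(f"significant without correction: {sum(uncorrected)}")
print(f"significant with Bonferroni:    {sum(corrected)}")
```

The correction only works if the count includes every test actually run, including ones that were never reported.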
why don’t gene association studies work?
- multiple testing correction must include all tests
- must ethnically match cases and controls (impossible)
how many reported genetic associations are false positives?
> 96%
hypothesis-free approaches to finding disease genes
- genetic linkage analysis
- genome wide association studies (GWAS)
- deep re-sequencing
- exome/genome sequencing
genetic linkage analysis
- search genome for segments co-inherited in “multiplex families”
- assumes affected relatives share susceptibility alleles identical by descent
- best for Mendelian traits
unit of genetic distance/recombination
centimorgan
cM; 1cM=1% recombination btw 2 loci
LOD
log of odds score
log10 of (likelihood of data if loci linked at recombination fraction theta/likelihood of data if loci unlinked)
significant if LOD >= 3
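The likelihood ratio behind the LOD score can be worked through for the simplest case. This sketch assumes phase-known meioses in which each one is scored as recombinant or non-recombinant, and that unlinked loci recombine 50% of the time. The counts and the function name are hypothetical.

```python
import math

def lod_score(recombinants, meioses, theta):
    """log10 of L(data | linked at theta) / L(data | unlinked, theta = 0.5)."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    nonrecombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + nonrecombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked

# Hypothetical family data: 1 recombinant in 20 informative meioses.
# theta = 0.05 corresponds to roughly 5 cM (1 cM = 1% recombination).
lod = lod_score(recombinants=1, meioses=20, theta=0.05)
print(f"LOD = {lod:.2f}, significant: {lod >= 3}")
```

With these counts the LOD comes out at roughly 4.3, above the threshold of 3, which corresponds to odds of 1000:1 in favor of linkage.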
GWAS
case control association study
100s of thousands of markers tested across whole genome
search for SNP w/ diff allele frequencies (case vs control)
match ethnically
can accurately measure/correct pop stratification
associations must be confirmed w/ independent replication
deep re-sequencing
high throughput DNA sequencing
uses GWAS signals
full genome or exome sequencing
difficult to distinguish causal from non-pathological
exome/genome sequencing
Mendelian diseases
costs $1,000-3,000
data interpretation is difficult
can find principally single gene Mendelian cause for disorder