W8L2 Gene mapping: linkage and association mapping Flashcards
QTL in the context of quantitative traits
QTL mapping consists of identifying the genetic positions of loci involved in quantitative trait variation and estimating their effects.
- QTL mapping allowed the fusion of molecular and quantitative genetics.
Method to identify the molecular basis of quantitative trait variation
- inbred lines (cross) through Linkage mapping
- outbred lines (natural population) through Association mapping
How to generate a genetic map
After genotyping a (high) number of markers in a (high) number of individuals from a progeny:
-Calculating linkage between markers
-Assembling markers into linkage groups
Development of saturated maps with all the linkage groups covered with a genetic distance between markers of less than 1cM.
The limiting factor is more the number of genotypes generated in the cross than the number of markers.
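As a minimal sketch of the "calculating linkage between markers" step, the Python below estimates the recombination fraction between two markers and converts it to a map distance. It assumes a backcross-like progeny with genotypes coded 0/1, and it uses the Haldane mapping function, which the card does not name. `recombination_fraction` and `haldane_cm` are hypothetical helper names.

```python
import math

def recombination_fraction(marker_a, marker_b):
    """Fraction of progeny whose genotype code differs between two markers.

    Assumes a backcross-like design: genotypes are coded 0 or 1 and the
    parental (non-recombinant) phase is 'same code at both markers'.
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("markers must be non-empty and of equal length")
    recombinants = sum(a != b for a, b in zip(marker_a, marker_b))
    return recombinants / len(marker_a)

def haldane_cm(r):
    """Convert a recombination fraction (0 <= r < 0.5) to centimorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)

# Toy data: 10 progeny, 1 recombinant between the two markers.
m1 = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
m2 = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
r = recombination_fraction(m1, m2)
distance = haldane_cm(r)
print(f"r = {r:.2f}, distance = {distance:.1f} cM")
```

With only 10 progeny the smallest non-zero recombination fraction that can be observed is 0.1, which illustrates why the number of genotyped individuals, rather than the number of markers, limits the map.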
Testing marker effects
- Identification of the marker most tightly linked to the trait.
For a given effect:
the QTL may be close to the marker, in which case the genetic effect is well estimated by the marker,
BUT the QTL may be far from the marker, in which case the effect is going to be underestimated.
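The underestimation can be made concrete with a small sketch. It assumes a backcross with a single QTL, where the expected difference between marker genotype classes is (1 - 2r) times the true QTL effect, r being the recombination fraction between marker and QTL. That relation is not stated in the card, and `apparent_marker_effect` is a hypothetical name.

```python
def apparent_marker_effect(true_effect, r):
    """Expected effect seen at a marker located r away from the QTL.

    Assumes a backcross and a single QTL: recombinants between marker and
    QTL carry the 'wrong' QTL allele and dilute the marker contrast.
    """
    if not 0 <= r <= 0.5:
        raise ValueError("r must be in [0, 0.5]")
    return (1.0 - 2.0 * r) * true_effect

# The same true effect looks smaller as the marker gets further from the QTL.
for r in (0.0, 0.1, 0.3, 0.5):
    print(f"r = {r:.1f} -> apparent effect = {apparent_marker_effect(10.0, r):.1f}")
```

At r = 0.5 the marker is unlinked to the QTL and the apparent effect drops to zero.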
Important information on the marker
A marker is by definition: NEUTRAL.
It is NOT the gene/allele responsible for the genetic effect.
A marker potentially co-segregates with the gene(s) underlying the QTL
QTL mapping determines the position(s) on the genetic map where the markers show the strongest linkage with the phenotype.
Interval mapping
Interval mapping estimates effects for every position along the genetic map (≠ genome!) and compares them to a null threshold generated by permutation (shuffling the phenotypes to create dummy datasets). A position is declared significant when its statistic exceeds that of, typically, 95% of the permuted datasets.
Interval mapping leads to the joint estimates of:
1- the genetic effect and
2- the position of the QTL (+confidence interval).
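The permutation threshold can be sketched as follows. This is a simplified stand-in, not interval mapping itself: it scans only the marker positions (no positions between markers), uses the absolute difference in genotype-class means instead of a LOD score, and assumes backcross genotypes coded 0/1. All function names and the toy data are hypothetical.

```python
import random
import statistics

def marker_stat(genotypes, phenotypes):
    """Absolute difference in mean phenotype between the two genotype classes."""
    g0 = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    g1 = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    if not g0 or not g1:
        return 0.0
    return abs(statistics.fmean(g1) - statistics.fmean(g0))

def scan(markers, phenotypes):
    """Test statistic at every marker position along the map."""
    return [marker_stat(m, phenotypes) for m in markers]

def permutation_threshold(markers, phenotypes, n_perm=200, quantile=0.95, seed=0):
    """Quantile of the map-wide maximum statistic over shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        maxima.append(max(scan(markers, shuffled)))
    maxima.sort()
    return maxima[min(int(quantile * n_perm), n_perm - 1)]

# Toy data: 5 independent markers, 100 progeny, a QTL sitting at marker 2.
rng = random.Random(1)
n = 100
markers = [[rng.randint(0, 1) for _ in range(n)] for _ in range(5)]
phenotypes = [2.0 * g + rng.gauss(0, 1) for g in markers[2]]

observed = scan(markers, phenotypes)
threshold = permutation_threshold(markers, phenotypes)
significant = [i for i, s in enumerate(observed) if s > threshold]
print(significant)
```

Taking the maximum over the whole map in each permutation is what makes the threshold map-wide rather than per-position.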
What is statistical power, and which factors does the power to detect a QTL depend on?
The probability of detecting a statistical effect based on a given experimental design is called statistical power.
The power to detect QTL depends:
- on the sample size, to gain precise estimates of the variance components
- on the size of the effect / penetrance of the genetic effect, the bigger, the easier to detect.
* The number of markers doesn’t play any role
Increasing the resolution of the QTL mapping
Increasing the number of markers can improve resolution, but only until the map is saturated.
Resolution can be improved by increasing the overall number of recombination events in the progeny:
- Increase the population size.
- Increase the number of generations.
Association mapping in outbred populations: 1- The model
From a wild population, we use the genetic resources to obtain:
- phenotype measurements
- genome sequences
- genome-wide markers
Association mapping in outbred populations: 2- The output
This very straightforward type of model allows all the genotyped SNPs to be tested in parallel.
The p-value associated with each SNP is often transformed as -log10(p-value) and presented in the form of a Manhattan plot.
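A short sketch of the transformation used for the Manhattan plot y-axis. The SNP names and p-values are made up for illustration.

```python
import math

def manhattan_scores(p_values):
    """Map each SNP's p-value to -log10(p); small p-values become tall peaks."""
    return {snp: -math.log10(p) for snp, p in p_values.items()}

# Hypothetical p-values for three SNPs.
p_values = {"snp1": 0.5, "snp2": 1e-3, "snp3": 5e-8}
scores = manhattan_scores(p_values)
for snp, score in scores.items():
    print(f"{snp}: {score:.2f}")
```

On this scale each unit corresponds to a tenfold drop in the p-value, so the most strongly associated SNPs stand out as the highest points.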
Association mapping in outbred populations: 3- Multiple testing
GWAS consists in performing the same statistical genetic test millions of times on the same trait, so some SNPs will appear associated purely by chance.
This issue warrants a correction of the nominal p-value threshold (often α = 5%) by:
- Using a Bonferroni correction: new p-value threshold = α / number of tested SNPs, or
- Modelling explicitly the False Discovery Rate (FDR) by random permutation of the data.
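The Bonferroni rule above can be sketched in a few lines. The one-million-SNP figure and the toy p-values are illustrative assumptions, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Corrected per-test p-value threshold: alpha divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

def significant_snps(p_values, alpha=0.05):
    """SNPs whose p-value passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [snp for snp, p in p_values.items() if p < threshold]

# With one million tested SNPs, alpha = 5% becomes a threshold near 5e-8.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Toy example with 4 SNPs: the corrected threshold is 0.05 / 4 = 0.0125.
toy = {"snp1": 0.04, "snp2": 0.001, "snp3": 0.2, "snp4": 0.03}
hits = significant_snps(toy)
print(genome_wide, hits)
```

Note that snp1 and snp4 would pass the nominal 5% threshold but fail after correction.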
Association mapping in outbred populations: 4- The confounding effect of population structure
Human populations, like most natural populations, are genetically structured, i.e. allele frequencies among SNPs are correlated due to shared evolutionary history.
- These correlations are likely to confound true genetic effects with SNPs that are “just” explaining the level of relatedness or the differences among populations.
- Population structure leads to a higher rate of false positives (spurious associations).
Problem with association mapping in outbred populations
- Despite increasing the association threshold and correcting for population structure, GWAS still tries to explain the same phenotypic information several million times.
- As a consequence, the resulting associated SNPs are not independent: there may be some degree of linkage between SNPs, resulting in correlated/non-independent information.
- One way of testing each SNP effect independently is to include the SNPs together in a multivariate / multilocus model where each SNP is tested while controlling for the variation at the other SNP loci.
Association mapping in outbred populations: 6- Heterogeneity between SNPs and QTL
- SNPs used as markers may not “tag” the same causal allele; this phenomenon is called allelic heterogeneity.
- Conversely, multiple mutations of similar effect may have occurred independently, so that no SNP differentiates functional from non-functional alleles.
- These masking effects mostly appear when individuals are sourced from multiple populations.
Pros and cons of QTL mapping
robust
Low power
low resolution
relies only on a genetic map