Linkage analysis Flashcards
What is genetic variation?
Differences in the DNA sequence between individuals in a population
Variation can be
Inherited or due to environmental factors (e.g. drugs, exposure to radiation)
4 effects of genetic variation
- Alteration of the amino acid sequence (protein) that is encoded by a gene
- Changes in gene regulation (where and when a gene is expressed)
- Physical appearance of an individual (e.g. eye colour, genetic disease risk)
- Silent or no apparent effect
Linkage analysis
Utilize common (most likely silent) variation to inform us about protein-altering variation
Why is genetic variation important?
- Genetic variation underlies phenotypic differences among different individuals
- Genetic variation determines our predisposition to complex diseases and our responses to drugs and environmental factors
- Genetic variation reveals clues of ancestral human migration history
Mutation/polymorphism
Errors in DNA replication. This may affect single nucleotides or larger portions of DNA
Germline mutations
Passed on to descendants
Somatic mutations
Not transmitted to descendants
De novo mutations
New mutation not inherited from either parent
Homologous recombination
Shuffling of chromosomal segments between partner (homologous) chromosomes of a pair
Gene flow
The movement of genes from one population to another (e.g. migration) is an important source of genetic variation
Polymorphism
DNA sequence variant that is common in the population. In this case no single allele is regarded as the ‘normal’ allele. Instead there are two or more equally acceptable alternatives
Mutation
Rare change in the DNA sequence that is different from the normal (reference) sequence. The ‘normal’ allele is prevalent in the population and the mutation changes this to a rare ‘abnormal’ variant
Difference between mutation and polymorphism
For a polymorphism, the least common (minor) allele must be present in ≥1% of the population; a rarer variant is classed as a mutation
MAF
Minor Allele Frequency - how common the least common allele is in the general population.
Variant and polymorphism
A variant with a MAF of 1% or greater is classed as a polymorphism
Variants and mutations
If the minor allele is seen in less than 1% of the population (e.g. MAF of 0.001, equivalent to 1 in 1000), the variant may be classed as a mutation
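The 1% convention can be illustrated with a minimal Python sketch. The function names and the sample of 1000 chromosomes are hypothetical, chosen only to reproduce the 1 in 1000 example on the card.

```python
from collections import Counter

def minor_allele_frequency(alleles):
    """Return the frequency of the least common allele among the observed alleles."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0  # only one allele observed, so there is no minor allele
    return min(counts.values()) / len(alleles)

def classify_variant(maf, threshold=0.01):
    """Apply the 1% convention: polymorphism if MAF >= threshold, otherwise mutation."""
    return "polymorphism" if maf >= threshold else "mutation"

# Hypothetical sample: 1000 chromosomes, one of which carries the alternative allele "T"
sample = ["C"] * 999 + ["T"]
maf = minor_allele_frequency(sample)
print(maf, classify_variant(maf))
```

In this made-up sample the minor allele is seen once in 1000 chromosomes, so it falls below the 1% threshold.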
Homologous recombination
Crossing over: reciprocal breaking and re-joining of the homologous chromosomes during meiosis
Results in exchange of chromosome segments and new allele combinations
New allele combination
A, B, c and a, b, C
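A single cross-over can be sketched in Python as below. This assumes the parental chromosomes carried A, B, C and a, b, c (inferred from the new combinations on the card) and that the break falls between the second and third loci; `cross_over` is a hypothetical helper name.

```python
def cross_over(chrom1, chrom2, breakpoint):
    """Swap the segments after `breakpoint` between two homologous chromosomes."""
    recombinant1 = chrom1[:breakpoint] + chrom2[breakpoint:]
    recombinant2 = chrom2[:breakpoint] + chrom1[breakpoint:]
    return recombinant1, recombinant2

# Assumed parental chromosomes (not stated on the card)
parental1 = ["A", "B", "C"]
parental2 = ["a", "b", "c"]

new_combinations = cross_over(parental1, parental2, breakpoint=2)
print(new_combinations)  # (['A', 'B', 'c'], ['a', 'b', 'C'])
```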
Genotype
Genetic makeup of an individual
Phenotype
Physical expression of the genetic makeup
Genes
Found in alternative versions called alleles
Homozygous
Identical alleles
Heterozygous
Two different alleles
Haplotype
Group of alleles that are inherited together from a single parent
Homozygosity
Represented here by the identical colours at locus 1 – both the paternal and maternal copies of the gene are the same (i.e. blue)
Heterozygosity
Represented by the different colours at loci 2 and 3 – the paternal and maternal copies of each gene are different (locus 2: red and green; locus 3: blue and purple)
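The two definitions can be checked with a short Python sketch that uses the colours from the cards as stand-ins for alleles. Which colour is paternal and which is maternal at loci 2 and 3 is an arbitrary assumption here.

```python
def zygosity(paternal, maternal):
    """Homozygous if both copies are the same allele, heterozygous otherwise."""
    return "homozygous" if paternal == maternal else "heterozygous"

# (paternal, maternal) alleles per locus; the parental assignment is assumed
genotypes = {
    1: ("blue", "blue"),
    2: ("red", "green"),
    3: ("blue", "purple"),
}

calls = {locus: zygosity(*alleles) for locus, alleles in genotypes.items()}
print(calls)
```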
Mendelian/monogenic
Disease that is caused by a single gene, with little or no impact from the environment (e.g. PKD)
Non-Mendelian/Polygenic
Diseases or traits caused by the impact of many different genes, each having only a small individual impact on the final condition (e.g. psoriasis)
Multifactorial
Diseases or traits resulting from an interaction between multiple genes and often multiple environmental factors (e.g. heart disease)
Mendelian
One gene = one disease
Complex
Many genes = one disease
Linkage analysis
A method used to map the location of a disease gene in the genome
Linkage
Refers to the assumption of two things being physically linked to each other
Genetic mapping
Look at the information in blocks or regions
Physical maps
Provide information on the physical distances between landmarks (e.g. stations on a tube map) based on their exact location
Principles of genetic linkage
The tendency for alleles at neighbouring loci to segregate together at meiosis
Cross-overs
More likely to occur between loci separated by some distance than between loci close together on the chromosome
Recombinant
The recombinant chromosomes break up the haplotypes, creating new allele combinations of the ‘1’ and ‘2’ alleles (i.e. allele A1 with B2, and allele A2 with B1)
Non-recombinant
Chromosomes are the same as the original (i.e. alleles A1 and B1 segregate together, A2 and B2 segregate together)
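Counting recombinant and non-recombinant chromosomes gives the recombination fraction (θ). The sketch below assumes parental haplotypes A1-B1 and A2-B2, as on the cards; the ten gametes are made-up illustrative data.

```python
def classify_gamete(gamete, parental_haplotypes):
    """A gamete matching a parental haplotype is non-recombinant; otherwise recombinant."""
    return "non-recombinant" if gamete in parental_haplotypes else "recombinant"

def recombination_fraction(gametes, parental_haplotypes):
    """Proportion of gametes that are recombinant."""
    recombinants = sum(
        classify_gamete(g, parental_haplotypes) == "recombinant" for g in gametes
    )
    return recombinants / len(gametes)

parental = {("A1", "B1"), ("A2", "B2")}

# Hypothetical gametes: 8 non-recombinant, 2 recombinant
gametes = [("A1", "B1")] * 4 + [("A2", "B2")] * 4 + [("A1", "B2"), ("A2", "B1")]

theta = recombination_fraction(gametes, parental)
print(theta)  # 0.2
```

Loci that lie close together would give a θ near 0, while unlinked loci would give a θ near 0.5.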
Genetic markers
Microsatellites, single nucleotide polymorphisms (SNPs)
Single nucleotide polymorphism (SNP)
Single base substitutions
Typically biallelic (i.e. 2 alleles)
SNPs are not the disease-causing variant.
They are polymorphisms that are used in linkage analysis to identify the likely location of the disease gene
How many SNPs are required for Genome wide linkage analysis?
Approx. 6,000 SNPs
Uses of microsatellite genotyping
- DNA fingerprinting from very small amounts of material
- The standard test uses 13 core loci, making the likelihood of a chance match 1 in 3 trillion
- Paternity testing
- Linkage analysis for disease gene identification
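To get a feel for the 1 in 3 trillion figure, the arithmetic below works out the average per-locus chance-match probability it implies. This assumes the 13 loci are independent and equally informative, which is a simplification made only for illustration.

```python
n_loci = 13
overall_match_probability = 1 / 3e12  # 1 in 3 trillion, from the card

# If every locus contributed equally and independently, each would need
# this chance-match probability for the product to reach 1 in 3 trillion.
per_locus_probability = overall_match_probability ** (1 / n_loci)

print(round(per_locus_probability, 2))
```

Under these assumptions, a chance match at any single locus is fairly likely (roughly 1 in 9); the discriminating power comes from multiplying across all 13 loci.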
Microsatellite genotyping
PCR-based method used to amplify highly repetitive regions of the genome
Fluorescent genotyping
- Uses fluorescently tagged PCR primers
- Allows for multiplexing of PCR products with different colours and fragment lengths
- Fragment sizes separated down to 1bp resolution
- Look at the 3 microsatellite genotypes on the left
SNP genotyping microarrays
- Provides genome-wide coverage of SNP markers
- SNPs are proxy markers; NOT the causal disease variants
- Can genotype thousands of markers in a single experiment
- Alleles are identified by relative fluorescence
- homozygous for allele 1 = green signal
- homozygous for allele 2 = red signal
- heterozygous (1/2) = yellow signal
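Calling a genotype from relative fluorescence can be sketched as a comparison of the two channel intensities. The function name, the 0.2 and 0.8 cut-offs and the intensity values are all hypothetical; real array software uses clustering across many samples rather than fixed thresholds.

```python
def call_genotype(green, red, low=0.2, high=0.8):
    """Call a SNP genotype from the green (allele 1) and red (allele 2) signal intensities."""
    total = green + red
    if total == 0:
        return "no call"  # no signal in either channel
    green_fraction = green / total
    if green_fraction >= high:
        return "1/1"  # mostly green: homozygous for allele 1
    if green_fraction <= low:
        return "2/2"  # mostly red: homozygous for allele 2
    return "1/2"      # mixed signal (yellow): heterozygous

# Made-up intensities for three samples
for green, red in [(1000, 20), (15, 900), (500, 480)]:
    print(green, red, call_genotype(green, red))
```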
LOD
Logarithm of the ODds score
Statistical analysis of linkage
- The probability of linkage can be assessed using a LOD score
- Compares the probability of obtaining the test data if the two loci are linked with the probability of observing the same data purely by chance (no linkage)
- i.e. calculates a likelihood ratio of observed vs. expected (no linkage, θ=0.5)
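Written out, the likelihood ratio takes the standard form below. The symbols Z (the LOD score), R (number of recombinants) and N (number of informative meioses) are not on the original cards, and the second expression holds only for the simplest case, where phase is known and every meiosis can be scored directly.

```latex
\[
Z(\theta) = \log_{10} \frac{L(\theta)}{L(\theta = 0.5)},
\qquad
L(\theta) = \theta^{R} \, (1 - \theta)^{N - R}
\]
```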
LOD scores
- A higher LOD score means a higher likelihood of linkage
- LOD scores are additive – different families linked to the same disease locus will increase the overall score
- A LOD score ≥ 3 is considered evidence for linkage
- Equivalent to odds of 1000:1 in favour of linkage (i.e. that the observed linkage did not occur by chance)
- Translates to a genome-wide p-value of approximately 0.05
- A LOD score ≤ -2 is considered evidence against linkage
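The thresholds and the additivity of LOD scores can be tried out with a minimal Python sketch. It assumes the simplest two-point, phase-known case, where the likelihood of R recombinants in N meioses is θ^R (1 − θ)^(N − R); the two families and their counts are invented for illustration, and dedicated linkage software handles far more general pedigrees.

```python
import math

def lod_score(recombinants, meioses, theta):
    """Two-point LOD for phase-known meioses: log10 of L(theta) / L(0.5)."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)  # no linkage, theta = 0.5
    return log_l_theta - log_l_null

def interpret(lod):
    """Apply the conventional thresholds from the cards."""
    if lod >= 3:
        return "evidence for linkage"
    if lod <= -2:
        return "evidence against linkage"
    return "inconclusive"

# Hypothetical families: (recombinants, informative meioses)
families = [(1, 10), (0, 8)]
theta = 0.1

family_lods = [lod_score(r, n, theta) for r, n in families]
total_lod = sum(family_lods)  # LOD scores from different families add up

for lod in family_lods:
    print(round(lod, 2), interpret(lod))
print(round(total_lod, 2), interpret(total_lod))
```

In this made-up example neither family reaches a LOD of 3 alone, but the summed score does. Because the score is a base-10 logarithm, a LOD of 3 corresponds to a likelihood ratio of 10^3 = 1000.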
Different linkage software
- Vitesse
- PLINK
- MERLIN
- Alohomora
- Fastlink etc…
Parametric analysis
- Specifies analysis parameters (e.g. inheritance pattern, disease allele frequency, penetrance)
Non-parametric analysis
- No parameters specified
- Looks for allele sharing between affected individuals
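The idea of allele sharing can be illustrated by counting how many alleles two affected relatives have in common at a marker. This sketch counts identity by state only, which is a simplification: non-parametric linkage programs estimate sharing identical by descent and compare it with what is expected by chance. The function name and genotypes are hypothetical.

```python
from collections import Counter

def shared_alleles(genotype1, genotype2):
    """Number of alleles (0, 1 or 2) that two individuals share at one marker."""
    overlap = Counter(genotype1) & Counter(genotype2)  # minimum count per allele
    return sum(overlap.values())

# Made-up genotypes of two affected relatives at three markers
affected_1 = [("A", "G"), ("C", "C"), ("T", "T")]
affected_2 = [("A", "G"), ("C", "T"), ("G", "G")]

sharing = [shared_alleles(g1, g2) for g1, g2 in zip(affected_1, affected_2)]
print(sharing)  # [2, 1, 0]
```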
Adams-Oliver syndrome
Adams-Oliver syndrome (AOS) is a rare developmental disorder characterised by birth defects of the limbs and scalp. The photographs highlight the wide range of severity (clinical heterogeneity). A proportion of patients also have associated features, including neurological, cardiac or vascular defects.
It is believed to be caused by a problem during vascular development in the womb.
Linkage analysis and AOS
Linkage analysis in two autosomal dominant families detected a statistically significant locus on chromosome 3
Analysis of two large autosomal dominant AOS families was conducted using ~6,000 SNP markers.
MERLIN software was used for linkage analysis, under a model of autosomal dominant inheritance with reduced penetrance (85%).
Process of identifying the gene for AOS
- Refinement of minimal linkage interval using microsatellite and SNP markers across the region
- Maximum LOD score of 4.93 at marker rs1464311
- Maximum linkage interval defined by markers D3S3670 and rs1127030
We used a program called GeneDistiller to identify which genes were located within our linkage peak
We decided to focus on our critical (smallest) linkage interval, between 115 Mb and 121 Mb.
- Within this region, there were 26 genes
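Listing the genes inside a linkage interval amounts to a positional filter, sketched below in Python. This is not how GeneDistiller works internally; the gene names and coordinates are placeholders, and only the 115 Mb to 121 Mb boundaries come from the cards.

```python
def genes_in_interval(genes, start_mb, end_mb):
    """Return the names of genes whose span overlaps the interval [start_mb, end_mb]."""
    return sorted(
        name
        for name, (gene_start, gene_end) in genes.items()
        if gene_start <= end_mb and gene_end >= start_mb
    )

# Hypothetical genes with (start, end) positions in Mb
hypothetical_genes = {
    "GENE_A": (114.2, 114.6),  # lies before the interval
    "GENE_B": (114.9, 115.3),  # overlaps the start of the interval
    "GENE_C": (118.0, 118.4),  # lies inside the interval
    "GENE_D": (121.5, 121.9),  # lies after the interval
}

candidates = genes_in_interval(hypothetical_genes, start_mb=115, end_mb=121)
print(candidates)  # ['GENE_B', 'GENE_C']
```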