20.02.06 SNP Arrays Flashcards
What is a SNP
- Single nucleotide polymorphism
- A DNA variation occurring commonly within a population (>1%) where there is a single nucleotide change.
Is the distribution of SNPs homogeneous
- No. SNPs occur more frequently in non-coding regions
What can be used as a predictor of SNP density
- Microsatellites.
- AT microsatellites are found in regions of significantly reduced SNP density
What two types of SNPs are there in coding regions
- Synonymous (do not affect protein sequence, can still affect function)
- Non-synonymous (change amino acid sequence)
What two types of non-synonymous changes are there
- Missense
- Nonsense
What is gnomAD
- Genome Aggregation Database
- 123,136 exomes (v2.0 release figures)
- 15,496 genomes (v2.0 release figures)
- From unrelated individuals
3 main components of a SNP array
- Immobilised allele-specific oligonucleotide probes
- Fragmented nucleic acid sequences of target labelled with fluorescent dye
- Detection system (converts probe signal intensity to genotype)
What does SNP array signal intensity depend on
- Amount of target DNA in the sample
- Affinity between target DNA and probe (SNP mismatch will reduce binding efficiency)
What are the two leading technologies
Illumina and Affymetrix
How does the Illumina SNP array work
- Illumina beads have a 50-nucleotide sequence attached, which is complementary to the sequence adjacent to the SNP site
- Single base extension, complementary to the allele carried by the target DNA, produces an appropriately coloured signal
How does the Affymetrix SNP array work
- 25-nucleotide probes, for both alleles
- Location of SNP varies from probe to probe
- Target DNA binds to both probes regardless of allele present
- When the target sequence is complementary to all 25 nucleotides of the probe, the signal strength is strong
- Partial homology will produce a weaker signal
How are SNPs chosen for the assay
- Variability in a population
Theory of B allele chart
- BB homozygotes have data value of 1.
- AA homozygotes have data value of 0
- AB hets are 0.5
- SNP data is plotted as the B allele frequency
- B allele frequency for each SNP is calculated as B/(A+B)
- e.g. B hom = 2/(0+2) = 1.0
- A hom = 0/(2+0) = 0.0
- Hets = 1/(1+1) = 0.5
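The B/(A+B) arithmetic on this card can be checked with a minimal Python sketch. The function name `b_allele_frequency` is made up for illustration, and real arrays estimate this value from normalised probe intensities rather than from known copy counts.

```python
def b_allele_frequency(a_copies, b_copies):
    """Return B / (A + B) for the given allele copy numbers."""
    total = a_copies + b_copies
    if total == 0:
        # No copies of either allele (nullisomy): the ratio is undefined.
        raise ValueError("BAF is undefined when no alleles are present")
    return b_copies / total


# The three diploid genotypes from the card.
for genotype, a, b in [("AA", 2, 0), ("AB", 1, 1), ("BB", 0, 2)]:
    print(genotype, b_allele_frequency(a, b))
```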
Using B allele chart. How is a duplication shown
- B duplicated (ABB) = B/(A+B) = 2/(1+2) = 0.667
- A duplicated (AAB) = 1/(2+1) = 0.333
Using B allele chart. How is a deletion shown
- B deleted= B/(A+B) = 0/(1+0)= 0.0
- A deleted= 1/(0+1)= 1.0
Using B allele chart. How is a mosaic deletion shown
- B deleted in 20% of cells = B/(A+B) = 0.8/(1+0.8) = 0.444
- A deleted in 20% of cells = 1/(0.8+1) = 0.556
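The mosaic values come from averaging allele copies across the normal and abnormal cell populations before taking B/(A+B). A minimal sketch, assuming a simple two-population mixture; the function name `mosaic_baf` and its arguments are hypothetical.

```python
def mosaic_baf(normal, abnormal, fraction):
    """Expected B allele frequency for a mixture of two cell populations.

    normal, abnormal: (A copies, B copies) at a SNP in each population.
    fraction: proportion of cells carrying the abnormal genotype (0 to 1).
    """
    a_normal, b_normal = normal
    a_abnormal, b_abnormal = abnormal
    # Average copy number of each allele across all cells in the sample.
    a_mean = (1 - fraction) * a_normal + fraction * a_abnormal
    b_mean = (1 - fraction) * b_normal + fraction * b_abnormal
    if a_mean + b_mean == 0:
        raise ValueError("BAF is undefined when no alleles are present")
    return b_mean / (a_mean + b_mean)


# Heterozygous SNP with the B allele deleted in 20% of cells.
print(round(mosaic_baf((1, 1), (1, 0), 0.2), 3))
# Heterozygous SNP with the A allele deleted in 20% of cells.
print(round(mosaic_baf((1, 1), (0, 1), 0.2), 3))
```

Setting `fraction` to 1.0 gives the non-mosaic case, so the same function also covers a deletion or duplication present in every cell.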
Factors affecting LogR Ratio
- Amplification (larger LogR ratio). Amplification of one homologue dilutes the heterozygous (AB) signal.
- MCC (maternal cell contamination). The greater the MCC, the greater the divergence from the expected B allele frequencies of 1.0, 0.5 and 0.0.
- Nullisomy = homozygous loss, which leads to a drop in LogR ratio.
- Homozygous UPD (isodisomy). Both copies of the chromosome are identical and inherited from one parent, so every SNP will be homozygous.
- Copy number neutral LOH. No gains or losses, but significant loss of heterozygous SNPs. An indicator of consanguinity, or of LOH in malignancy.
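The direction of each LogR change above can be illustrated with a short sketch. It takes the LogR ratio as log2(observed intensity / expected intensity) and assumes, as an idealisation, that intensity scales linearly with copy number; real arrays give compressed values. The function name `expected_log_r_ratio` is hypothetical.

```python
import math


def expected_log_r_ratio(copy_number, reference_copies=2):
    """Idealised LogR ratio: log2(observed copies / expected copies)."""
    if copy_number == 0:
        # Nullisomy: no signal in theory, so the ratio falls without bound.
        return float("-inf")
    return math.log2(copy_number / reference_copies)


for label, copies in [("nullisomy", 0), ("deletion", 1),
                      ("normal or copy neutral LOH", 2), ("duplication", 3)]:
    print(label, expected_log_r_ratio(copies))
```

Copy number neutral LOH and UPD leave the LogR ratio at 0 under this model, which is why they are only visible on the B allele frequency plot.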
Postnatal applications of SNP arrays
- First-line test for patients with developmental delay and/or dysmorphism.
- Long stretches of homozygosity can unmask potential recessive diseases where the clinician suspects a particular gene, although further testing is needed to confirm the homozygous variant.
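Flagging long stretches of homozygosity can be sketched as a scan over B allele frequencies ordered along a chromosome. This is an illustrative sketch only: the function name, the heterozygous band (0.2 to 0.8) and the minimum run length are assumptions, not values taken from any array software.

```python
def homozygous_runs(bafs, min_snps=5, het_low=0.2, het_high=0.8):
    """Return (start, end) index pairs of runs with no heterozygous SNPs.

    bafs: B allele frequencies ordered by genomic position.
    A SNP counts as homozygous if its BAF lies outside (het_low, het_high).
    The end index is exclusive; runs shorter than min_snps are ignored.
    """
    runs = []
    start = None
    for i, baf in enumerate(bafs):
        is_hom = baf <= het_low or baf >= het_high
        if is_hom and start is None:
            start = i                      # a new homozygous run begins
        elif not is_hom and start is not None:
            if i - start >= min_snps:      # keep only sufficiently long runs
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and len(bafs) - start >= min_snps:
        runs.append((start, len(bafs)))
    return runs


example = [0.5, 0.0, 1.0, 1.0, 0.0, 0.02, 0.48, 0.99]
print(homozygous_runs(example))
```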
Considerations for SNP arrays in prenatal testing
- Do not require sex-matched controls, therefore the sex of the fetus does not need to be known before set up.
SNP arrays in oncology
- LOH and allelic imbalance: to differentiate between cancer subtypes
- CNV: copy gain of 8q, where TPD52 is located (overexpressed in prostate cancer)
- LOH and CNV analysis: looks for UPD, e.g. 20% of acute myeloid leukemia cases have UPD.
- Methylation: compare presence and absence of DNA sequences between the methylation sensitive and non-sensitive enzymes.
- Allele-specific gene expression: use cDNA instead of genomic DNA.
Limitations of SNP arrays
- Can’t detect balanced rearrangements, gene fusions, whole genome ploidy changes
- Mosaicism under 20-30% is not reliably detected, so SNP arrays can’t be used for minimal residual disease detection