Lecture 2: Genetic Variations Flashcards
Genetic difference between individuals
about 0.1%
Types of Genetic Variation
- SNPs
- Copy number variations (CNVs)
- Variable number tandem repeats (VNTRs)
- Chromosomal number variations
SNPs
a single nucleotide polymorphism (SNP) is a change in a single nucleotide at a specific position in the DNA sequence
- 90% of genetic variations
- 10 million SNPs in the human genome (one every 300 base pairs approximately)
- by convention, a given SNP is present in at least approx 1% of the population
- most SNPs have NO major consequence
SNPs in coding vs non coding regions
SNPs in coding regions may affect the protein produced while SNPs in non-coding regions have been associated with changes in gene expression, new traits and altered risk for disease (changes in regulatory regions).
SNPs and blue eyes
Blue eyes resulted from an SNP in a non-coding region of the HERC2 gene which regulates levels of OCA2 (determinant of iris colour)
Inheritance of SNPs
SNPs are often inherited in groups: ones that are close together on the chromosome tend to be inherited together because meiotic recombination rarely separates them.
meiosis
production of gametes which contain half the genetic information of the full organism
recombination hotspots
sites on a chromosome where recombination is more frequent. The hotspots form the borders of haplotype blocks.
Haplotype
common patterns of human inheritance: a block of SNPs on a chromosome that tend to be inherited together
CNVs
copy number variations
- during replication, large DNA segments (1,000 to 5 million base pairs) are duplicated or deleted
- rare event
- at a given locus, copy number can vary from tens to hundreds
- occur in certain areas of the genome (5-10%) aka we have CNV hotspots
Abnormal CNV numbers are associated with
autism, schizophrenia (SZ), and learning disabilities
List mutations by size of genome affected
chromosomal number variations > copy number variants > VNTRs > SNPs
VNTRs
variable number tandem repeats
- a short DNA sequence repeated at a specific locus
(NNN-ATAT-NNN allele 1, NNN-ATATAT-NNN allele 2) just a couple of base pairs
- used for DNA fingerprinting as a kind of genetic barcode (forensics) (looks at the VNTRs at certain loci)
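As a rough illustration of how two VNTR alleles differ only in repeat count, here is a minimal Python sketch. The flanking bases (`GGC`, `CCA`) and the function name `count_tandem_repeats` are made up for the example and stand in for the `NNN` placeholders above.

```python
import re


def count_tandem_repeats(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    # Match one or more consecutive copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    longest = max((m.group(0) for m in pattern.finditer(sequence)), key=len, default="")
    return len(longest) // len(unit)


# Hypothetical alleles: same flanking sequence, different number of "AT" repeats.
allele_1 = "GGCATATCCA"    # flank + ATAT + flank
allele_2 = "GGCATATATCCA"  # flank + ATATAT + flank

print(count_tandem_repeats(allele_1, "AT"))  # 2
print(count_tandem_repeats(allele_2, "AT"))  # 3
```

A DNA fingerprint is then, in effect, the list of repeat counts observed at several such loci.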
VNTRs are associated with what diseases
Huntington disease, fragile X syndrome
chromosomal number variations are caused by
non-disjunction during meiosis (causes either monosomy or trisomy)
Trisomy 18 and 13
Edwards syndrome (trisomy 18) and Patau syndrome (trisomy 13)
Only non lethal monosomy
Turner syndrome, where one sex chromosome is lost (45,X)
Methods for understanding SNPs
- GWAS
- PS and GPS (built on GWAS)
- GCTA
Tag SNPs
Representative SNPs that are used to represent a block of SNPs (as they often occur together)
- identifying a relatively small set of tag SNPs (300 000 to 50 000) allows you to predict the remaining SNP composition (around 10 million)
DNA Microchip array
- what is it
- how is it used
- thin plate covered in millions of tiles, each tile contains a ssDNA sequence that is complementary to a Tag SNP sequence
- apply your fragmented and fluorescently labeled DNA sample to the chip, see what anneals
- used to identify tag SNPs
GWAS
genome wide association studies: examines relationship between SNPs and particular traits
- collect DNA samples from individuals with the trait and individuals without the trait
- in both samples (case and control), SNPs are genotyped with a microarray chip, and the frequency of each SNP is worked out in each group
- then determine which of the SNPs are significantly associated with the trait using the odds ratio (OR)
calculating the odds ratio
1. odds of SNP x in cases: # cases with SNP x / # cases without SNP x
2. odds of SNP x in controls: # controls with SNP x / # controls without SNP x
OR is the ratio of the two, i.e. (1) divided by (2)
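A minimal Python sketch of the calculation above. The counts are invented purely for illustration, and the function names are hypothetical.

```python
def odds(with_snp: int, without_snp: int) -> float:
    """Odds of carrying the SNP within one group."""
    return with_snp / without_snp


def odds_ratio(case_with: int, case_without: int,
               control_with: int, control_without: int) -> float:
    """Odds of the SNP in cases divided by odds of the SNP in controls."""
    return odds(case_with, case_without) / odds(control_with, control_without)


# Hypothetical counts: 60 of 100 cases and 40 of 100 controls carry SNP x.
result = odds_ratio(case_with=60, case_without=40,
                    control_with=40, control_without=60)
print(round(result, 2))  # (60/40) / (40/60), roughly 2.25
```

An OR above 1 means the SNP is more common among cases than controls; an OR of 1 means no association.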
How many SNPs have been associated with autism by GWAS?
16
Problems with GWAS
- due to the high volume of statistical tests, GWAS are associated with a high likelihood of false positives (type 1 error)
- often difficult to replicate
- most allele-trait associations are very weak (each explains only a very small proportion of variability in a given trait, less than 1%)
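To see why the number of tests drives false positives, here is a rough Python sketch. The number of SNPs tested is a hypothetical figure, and the Bonferroni correction shown is one common way to set a stringent threshold; the flashcards do not say which correction the lecture used.

```python
def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Expected number of false positives if no SNP is truly associated."""
    return num_tests * alpha


def bonferroni_threshold(num_tests: int, alpha: float) -> float:
    """Per-test p-value threshold that keeps the overall error rate near alpha."""
    return alpha / num_tests


num_snps_tested = 1_000_000  # hypothetical number of SNPs on the chip
alpha = 0.05

# About 50,000 SNPs would pass p < 0.05 by chance alone.
print(expected_false_positives(num_snps_tested, alpha))

# Assumed correction (Bonferroni): each SNP must reach about p < 5e-8.
print(bonferroni_threshold(num_snps_tested, alpha))
```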
GWAS usually identifies what kind of variants?
common variants with small effects
Polygenic score
- how does it work
builds off of GWAS
- uses a group of SNPs together (because usually an allele has an extremely small effect on a trait)
- run a GWAS in a large population as a discovery sample and find SNP-trait associations using very stringent threshold
- then use SNP-trait associations in the discovery sample to predict a trait in another independent target sample
- genotype everyone in the target sample and identify which alleles they carry at each SNP, then count each person's trait-associated alleles, weight each by its effect size, and plug these into the PS equation to get a PS for each individual
- finally, determine the association between the PS and the trait in the target sample, e.g. using regression (i.e. see whether the PS can predict the trait in the target sample)
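The weighted count in the steps above can be sketched in Python as a sum of effect size times allele count. The SNP names, effect sizes, and genotypes below are invented for illustration; the exact PS equation used in the lecture is not given, so this weighted sum is an assumption.

```python
def polygenic_score(allele_counts: dict, effect_sizes: dict) -> float:
    """Sum of (effect size x number of trait-associated alleles) over SNPs.

    allele_counts: SNP id -> 0, 1 or 2 copies of the trait-associated allele.
    effect_sizes:  SNP id -> effect size estimated in the discovery GWAS.
    """
    return sum(effect_sizes[snp] * allele_counts.get(snp, 0) for snp in effect_sizes)


# Hypothetical effect sizes from a discovery sample.
effect_sizes = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": -0.02}

# Hypothetical genotypes for two people in the target sample.
target_sample = {
    "person_1": {"snp_a": 2, "snp_b": 1, "snp_c": 0},
    "person_2": {"snp_a": 0, "snp_b": 1, "snp_c": 2},
}

scores = {name: polygenic_score(g, effect_sizes) for name, g in target_sample.items()}
print(scores)  # person_1 scores higher than person_2
```

The last step would then regress the measured trait on these scores across the target sample.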
example of PS with a disease
PS was used to find multiple genes associated with schizophrenia (SZ) and was able to predict it in an independent target sample. It was also used to predict depression and alcoholism.
PS vs. GWAS
A polygenic score explains much more variance in a trait than any single GWAS association does, but still not much variance overall. For both:
- low predictive power, not clinically useful
- can be used for dichotomous or continuous traits (have it or not vs. scaled, e.g. schizophrenia vs. IQ)
- can be applied across traits
GPS
Genome-wide PS
- polygenic score uses ONLY the best gene-trait associations but GPS uses ALL the gene-trait associations found in the discovery sample
- can explain more variability in the trait than PS but still not clinically useful - can only explain up to 10% of variability
GCTA
Genome-wide Complex Trait Analysis
- compares chance similarity in all common SNPs between biologically unrelated individuals
- builds on the fact that some chance genetic similarity exists between any two people, so if a trait is driven by SNPs, individuals who share more of those SNPs should be more similar on the trait
- using common SNPs we can explain 10-20% of variation in traits which is better but still not great.
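One way to put a number on "chance similarity" between two unrelated people is the standardised genotype product commonly used to build genomic relationship matrices. The Python sketch below assumes that formula (the flashcards do not state it), and the genotypes and allele frequencies are invented.

```python
def genetic_relatedness(genotypes_j, genotypes_k, allele_freqs) -> float:
    """Average standardised genotype product across SNPs for two individuals.

    genotypes_*:  allele counts (0, 1 or 2) at each SNP.
    allele_freqs: population frequency p of the counted allele at each SNP.
    """
    total = 0.0
    for x_j, x_k, p in zip(genotypes_j, genotypes_k, allele_freqs):
        # Centre each genotype on its expected value 2p, scale by variance 2p(1-p).
        total += (x_j - 2 * p) * (x_k - 2 * p) / (2 * p * (1 - p))
    return total / len(allele_freqs)


# Hypothetical data: three SNPs, each with allele frequency 0.5.
freqs = [0.5, 0.5, 0.5]
person_a = [2, 0, 1]
person_b = [2, 0, 1]  # happens to match person_a at every SNP
person_c = [0, 2, 1]  # differs from person_a at the first two SNPs

print(genetic_relatedness(person_a, person_b, freqs))  # positive: more similar than chance
print(genetic_relatedness(person_a, person_c, freqs))  # negative: less similar than chance
```

GCTA-style analyses then ask whether pairs with higher values are also more alike on the trait; that estimation step is not shown here.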
Example of selection pressure
food supply (body weight is strongly heritable, suggesting a strong genetic basis)
If obesity is such a big problem, why are genes for weight gain still so common?
- might have other beneficial functions
- might not kick in with bad effects until after reproductive age
- might not have selection pressure against them
Sickle cell anemia
part of what makes a gene advantageous is the environment (another example: genes for anxiety might help in some situations, such as staying alert to the environment)
- the allele that causes sickle cell anemia when an individual is homozygous recessive provides malaria resistance in heterozygotes