Association Analysis Flashcards
What is association analysis applied to?
Common diseases
Complex traits
What is genetic association?
Presence of variant allele
In affected individuals
At a higher frequency
Than people who do not have the trait
What is the genotype referring to?
Both alleles
Single locus
What is the haplotype?
Order of alleles along one chromosome (not the other in the pair)
What are ‘cases’ in a case-control study?
Subjects
That have disease of interest
What features must a control have?
Equal for non-disease traits
Definition - consistent
How is a Case-Control Association Study done?
2 groups - control and cases
Find those with rare variants at loci of interest
Apply statistics to see if effect size for those with variant is significant
Identify genomic region associated with disease via markers
What does a good case-control study have?
- Big sample
- Equal control vs affected
- Many markers - technology
- PLINK
- Replicative
Why should we use millions of genetic markers?
To capture huge diversity in population
What is the ideal genetic marker?
Polymorphic, random but fixed position, frequent, stably inherited, easy to genotype
How common are SNPs in the genome and how are they generated?
About 1 in every 300 bases
Faulty mismatch repair during mitosis
Where are SNPs found?
Where are the majority of them?
Coding regions (synonymous, non-synonymous, missense)
Non-coding regions (promoter, terminator (marks end of transcription), splice site (altered mRNA))
Intergenic region
Most are in non-coding/ intergenic regions - bulk of genome
What is dbSNP?
What is the rs number?
What is either side of the SNPs?
SNP database
Unique SNP number
Flanking sequences
What are the 2 types of alleles of an SNP?
How many copies of the C and G alleles are there in a population of 1000 people if their frequencies are C 0.567 and G 0.433?
Major, minor - less common
Frequencies sum to 1
1134 C and 866 G (each person has 2 alleles)
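The arithmetic in this card can be checked with a short Python sketch. The function name `allele_counts` is made up for illustration; the only assumption is a diploid population, so each person carries two alleles.

```python
def allele_counts(n_people, freqs):
    """Expected allele counts in a diploid population (2 alleles per person)."""
    # Allele frequencies at one locus should sum to 1.
    assert abs(sum(freqs.values()) - 1.0) < 1e-9, "frequencies must sum to 1"
    total_alleles = 2 * n_people
    # round() tidies floating-point noise such as 1134.0000000000002.
    return {allele: round(total_alleles * f) for allele, f in freqs.items()}


counts = allele_counts(1000, {"C": 0.567, "G": 0.433})
print(counts)  # {'C': 1134, 'G': 866}
```

C is the major allele here because it has the larger count; G is the minor allele.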
What is GWAS?
Whole genome study
See if set of genetic variants in different individuals
Are associated with a trait
Markers in whole of genome (i.e. know their seq)
Placed on microarray beads
Add the sample DNA
Detect which SNPs are present as the sample binds to the markers
Look for association between SNP, disease and marker via chi-squared test
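As a rough illustration of the chi-squared step, the sketch below runs a Pearson chi-squared test on a 2x2 table of allele counts for one SNP (cases vs controls). The counts are invented for the example, and the function name `allelic_chi_squared` is hypothetical; real studies would normally use dedicated software such as PLINK rather than hand-rolled code.

```python
import math


def allelic_chi_squared(case_counts, control_counts):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 allele table.

    Each argument is (minor_allele_count, major_allele_count).
    Returns (chi2_statistic, p_value).
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele and disease status were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: 500 cases and 500 controls, 2 alleles each.
chi2, p = allelic_chi_squared(case_counts=(480, 520), control_counts=(400, 600))
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

A GWAS repeats a test like this for every marker, which is why the significance threshold has to be far stricter than 0.05.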
How is GWAS data presented?
What do the peaks show?
Manhattan
X - Marker position and thus SNP pos
Y - minus log10 of p value of association with disease (i.e. higher the number, stronger the association)
If a peak is present
Indicates significant results and that an SNP is associated with disease
Does not find the gene associated with the disease, it finds the genomic region associated i.e. via the markers
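The Y-axis transform can be sketched in a few lines of Python. The SNP names and p values below are made up for illustration, and the cut-off used is the genome-wide threshold of 5 x 10^-8.

```python
import math

GENOME_WIDE_P = 5e-8  # accepted genome-wide significance threshold


def manhattan_y(p_value):
    """Height of a marker on a Manhattan plot: minus log10 of its p value."""
    return -math.log10(p_value)


# Hypothetical association results (marker name -> p value).
results = {
    "rs_example_1": 0.03,
    "rs_example_2": 2e-9,
    "rs_example_3": 4e-6,
}

threshold_y = manhattan_y(GENOME_WIDE_P)

# Markers that rise above the threshold line form the peaks.
peaks = [snp for snp, p in results.items() if manhattan_y(p) > threshold_y]
print(peaks)  # ['rs_example_2']
```

Smaller p values give taller points, so the strongest associations stand out as peaks above the threshold line.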
What was the WTCCC?
1st big GWAS
Where is the GWAS peak for QT interval (cardiovascular health)?
What is the SNP region called?
Why is this misleading?
Chromosome 1, the NOS1AP region
Many of the associated SNPs are in the adjacent gene OLFML2B - the GWAS showed the region but not the exact location of the disease-causing variant
What is a meta-analysis, and what are its benefit and drawback?
Very large study that analyses the pooled data from several completed studies
Improves strength of results
Difficult for one group to do - easier for individuals/companies to combine their smaller studies and resources
Known as a consortium
What is the GWAS catalog?
Store of published SNP-trait associations and the genes they map to
What link has been established with obesity over the past few years?
Cancer
Strongly genetic
What kind of studies have proven obesity is strongly genetic and by what %?
Twin studies - body shape
Adoption studies - BMI with siblings
Family studies - inheritance
What associated gene with obesity does everyone find and how is this proven to be significant?
FTO gene - p value below the accepted genome-wide threshold (< 5 x 10^-8)
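One common way to arrive at a threshold of this size is a Bonferroni correction. The sketch below assumes roughly one million independent tests across the genome; that figure is a widely used convention and is not stated in these cards.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Assumption: about 1,000,000 independent common-variant tests genome-wide.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(f"{threshold:.0e}")  # 5e-08
```

A p value for FTO below this threshold counts as genome-wide significant rather than a chance finding among millions of tests.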
How can obesity be defined?
waist size, fat mass, BMI adjusted according to waist hip ratio etc.
Problems with GWAS?
- Low contribution of associations to overall genetic component of the disease (obesity is 30% but usually <5%)
- Why?
- Not enough studies done
- Many common SNPs (minor allele frequency greater than 5%) of small effect
- Rare SNPs may be contributing
- Copy Number Variation and epigenetic variation not looked at using the method
- Heritability is overestimated (particularly in twin studies)