GWAS in prokaryotes Flashcards
GWAS
- define
association analysis performed w/ panel of polymorphic markers adequately spaced to capture most of linkage disequilibrium info in entire genome
Study designs
Family based
OR
Case control
Linkage disequilibrium
Non-random association of alleles at different loci
- closer the loci are in the genome = higher the linkage
= more likely to be inherited together
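A minimal sketch of how linkage disequilibrium between two biallelic sites could be quantified from haplotype counts. The function name, the A/a and B/b allele labels and the counts are illustrative assumptions, not taken from the cards. D is the difference between the observed haplotype frequency and the frequency expected if the sites were independent; r^2 rescales it to lie between 0 and 1.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) for two biallelic loci from counts of the four haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total              # observed frequency of haplotype A-B
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total      # frequency of allele B at locus 2
    D = p_AB - p_A * p_B             # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = (D * D) / denom if denom > 0 else 0.0
    return D, r2


# Alleles always travel together: complete LD
print(linkage_disequilibrium(50, 0, 0, 50))    # (0.25, 1.0)
# All four haplotypes equally common: no LD
print(linkage_disequilibrium(25, 25, 25, 25))  # (0.0, 0.0)
```

High r^2 between a genotyped marker and an untyped variant is what lets a spaced marker panel capture most of the genome's variation.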
Study design
- cases
- controls
> cases = those with the phenotype of interest e.g. disease
presumed to have high prevalence of susceptibility alleles
> controls = those w/out phenotype
presumed to have lower prevalence of such susceptibility alleles
cases + controls ideally similar in majority of other factors
Misclassification bias
study participant categorised into incorrect category
-> alters the observed association or research outcome
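A small worked sketch of how misclassification can alter an observed association. It assumes a simple scenario that the cards do not specify: a fixed fraction of cases is wrongly recorded as controls and vice versa, regardless of genotype. All counts and error rates are made up for illustration.

```python
def odds_ratio(case_carrier, case_non, ctrl_carrier, ctrl_non):
    """Odds ratio for carrying the variant in cases vs controls."""
    return (case_carrier * ctrl_non) / (case_non * ctrl_carrier)


def misclassify(case_carrier, case_non, ctrl_carrier, ctrl_non,
                case_error, ctrl_error):
    """Expected observed table when a fraction of each group is mislabelled."""
    obs_case_carrier = case_carrier * (1 - case_error) + ctrl_carrier * ctrl_error
    obs_case_non = case_non * (1 - case_error) + ctrl_non * ctrl_error
    obs_ctrl_carrier = ctrl_carrier * (1 - ctrl_error) + case_carrier * case_error
    obs_ctrl_non = ctrl_non * (1 - ctrl_error) + case_non * case_error
    return obs_case_carrier, obs_case_non, obs_ctrl_carrier, obs_ctrl_non


true_table = (60, 40, 30, 70)                  # hypothetical true counts
observed = misclassify(*true_table, case_error=0.2, ctrl_error=0.2)

print(odds_ratio(*true_table))                 # 3.5
print(round(odds_ratio(*observed), 2))         # 2.09
```

In this scenario the observed odds ratio moves towards 1, so a real association looks weaker than it is.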
what can identification of susceptibility variants lead to?
*novel biological insights
-> clinical advances
-> therapeutic targets
OR
biomarkers
OR prevention
*improved measure of individual aetiological processes
-> personalised medicine
-> diagnostics
OR
prognostics
OR
therapeutic optimisation
Genotyping
- what is it?
Looking for a nucleotide (nt) variation associated w/ a given phenotype
e.g. a G/C at a given position is associated with a disease
an A/T at that position means you don’t have the disease
Genotyping
- process
- Extract, amplify + fragment DNA
- Either:
Microarray
OR
Sequencer
- Genotype calling
- SNP genotype
Genotype calling
Determining genotype for each individual
- typically only done for positions in which a SNP or a ‘variant’ has already been called (=estimated)
Significance of hits
- Contingency tables (Fisher’s Exact Test)
Gives a p-value for the significance of the SNP being associated w/ disease
Sum probabilities of the observed table + all more extreme tables with the same marginal totals = probability of seeing an association at least this strong if the null hypothesis is true
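The summation described above can be sketched for a single SNP as a 2x2 table (cases/controls by allele). This is an illustrative pure-Python version, assuming the common two-sided convention in which "more extreme" means any table with the same margins that is no more probable than the observed one. The function name and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be cases/controls and columns variant/reference allele.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the observed table plus every table at least as unlikely
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical SNP: variant in 4/4 cases and 0/4 controls
print(round(fisher_exact_two_sided(4, 0, 0, 4), 4))  # 0.0286
```

In a genome-wide scan this is repeated for every SNP, so the resulting p-values need correcting for multiple testing.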
Does the affected or control group exhibit Population Stratification?
- what is this?
- what can it cause?
- how is this controlled?
When subpopulations exhibit allelic variation because of ancestry
Can cause false +ves if there are SNP differences in the case + control population structures
Control for this by testing control SNPs for general elevation in the χ² (chi-squared) distribution between cases + controls
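One way the "general elevation" check could look in practice is an inflation factor: the median χ² statistic across the control SNPs divided by the median expected under the null (about 0.455 for 1 degree of freedom). This resembles what is often called genomic control. The cards do not name the method, so treat the sketch and its function names as assumptions.

```python
from statistics import median

# Median of a chi-squared distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def inflation_factor(chi2_stats):
    """Ratio of the observed median statistic to the expected null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def corrected(chi2_stats):
    """Divide every statistic by the inflation factor (only if it exceeds 1)."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [x / lam for x in chi2_stats]


# Hypothetical control-SNP tables: (case alt, case ref, ctrl alt, ctrl ref)
tables = [(30, 70, 20, 80), (55, 45, 40, 60), (12, 88, 10, 90)]
stats = [chi2_2x2(*t) for t in tables]
print(inflation_factor(stats))
```

A factor near 1 suggests little stratification; a value well above 1 suggests the statistics are inflated across the board rather than at one true hit.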
Associated haploblocks
Linkage disequilibrium organises genome into haplotype blocks
Haplotype block
region of genome where there’s little evidence of a history of genetic recombination
contain only a small number of distinct haplotypes (group of alleles inherited together from 1 parent)
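A short sketch of what "a small number of distinct haplotypes" means in practice: collect the alleles each genome carries at the variable sites of a block and count the distinct combinations. The toy sequences and site positions are invented. Because prokaryote genomes are haploid, each genome is assumed to give one haplotype directly.

```python
from collections import Counter


def haplotype_counts(genomes, positions):
    """Count the distinct allele combinations seen at the given sites."""
    return Counter("".join(g[p] for p in positions) for g in genomes)


# Five toy aligned genomes; sites 0 and 3 are the variable positions
genomes = ["ACGT", "ACGT", "TCGA", "ACGT", "TCGA"]
counts = haplotype_counts(genomes, [0, 3])

print(counts)       # Counter({'AT': 3, 'TA': 2})
print(len(counts))  # 2 distinct haplotypes out of 16 possible
```

With little historical recombination only a few of the possible combinations appear, which is why one marker can stand in for the rest of the block.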
Bottom-up approach
Starts w/ DNA sequence
-> tests effect on phenotype
Top-down approach
Starts w/ phenotype + associates it w/ particular genomic elements
(by 2010 had large bacterial collections)