Database Analysis Flashcards
Population databases purpose
provide genotype and allele frequencies for estimating random match probabilities
Creating databases: gathering samples
samples come from local blood banks or hospitals; no personal identification; hopefully unrelated and random; donors self-identify racial/ethnic profile; usually >100 per group; need IRB (institutional review board) approval
Allele frequency
allele count / (2 × people sampled)
Genotype frequency
genotype count/people sampled
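As a minimal sketch of the two formulas above, assuming each sampled person is recorded as a pair of alleles at one locus (the function names and the toy sample are hypothetical, not taken from any package named in these cards):

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele count / (2 * people sampled); each person carries two alleles."""
    n_people = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    return {allele: count / (2 * n_people) for allele, count in counts.items()}

def genotype_frequencies(genotypes):
    """Genotype count / people sampled; (12, 13) and (13, 12) are the same genotype."""
    n_people = len(genotypes)
    counts = Counter(tuple(sorted(genotype)) for genotype in genotypes)
    return {genotype: count / n_people for genotype, count in counts.items()}

# Toy sample of four people typed at one locus (made-up values).
sample = [(12, 13), (13, 12), (12, 12), (13, 14)]
print(allele_frequencies(sample))    # {12: 0.5, 13: 0.375, 14: 0.125}
print(genotype_frequencies(sample))  # {(12, 13): 0.5, (12, 12): 0.25, (13, 14): 0.25}
```

Allele 12 appears 4 times among 8 allele copies, hence 0.5; the heterozygote 12,13 appears in 2 of 4 people, hence 0.5.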
Minimum allele frequency
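an allele should be observed at least 5 times for a reliable frequency estimate; if it is observed fewer than 5 times, an inflated (minimum) frequency is used in calculations to be conservative
5/(2N), where N is the number of people sampled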
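The minimum-frequency rule can be sketched as a floor applied to the observed frequency. This is an illustrative helper under the assumption that N counts people (so 2N is the number of allele copies); the function name and default are hypothetical:

```python
def adjusted_allele_frequency(allele_count, n_people, min_count=5):
    """Return the observed frequency, raised to min_count / (2N) when the allele is rare."""
    observed = allele_count / (2 * n_people)
    floor = min_count / (2 * n_people)
    return max(observed, floor)

# With N = 100 people, an allele seen only twice is reported at 5/200.
print(adjusted_allele_frequency(2, 100))   # 0.025
# An allele seen 30 times keeps its observed frequency of 30/200.
print(adjusted_allele_frequency(30, 100))  # 0.15
```

Raising a rare allele's frequency makes the resulting match probability larger, which is the conservative direction.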
Allele/genotype frequencies
not greatly affected by increasing sample size above a few hundred
broad racial/ethnic categories are usually sufficient
unless working with a small, isolated group, CODIS PopStats allele frequencies are likely sufficient
HWE and LE tests
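before calculating probabilities, check for deviations from HWE (Hardy-Weinberg equilibrium) and LE (linkage equilibrium)
deviations can indicate non-independence of alleles (HWE) or loci (LE)
or the presence of null alleles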
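One simple way to look for a deviation from HWE at a single locus is a chi-square comparison of observed genotype counts with the counts expected from the allele frequencies (p² for homozygotes, 2pq for heterozygotes). The cards do not say which test is meant, so the sketch below is only an assumption for illustration; with many rare alleles the expected counts become small and exact tests are often preferred. The function name is hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

def hwe_chi_square(genotypes):
    """Return (chi-square statistic, degrees of freedom) for one locus."""
    n = len(genotypes)
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    alleles = sorted(freqs)

    chi2 = 0.0
    for a, b in combinations_with_replacement(alleles, 2):
        # HWE expectation: p^2 for homozygotes, 2pq for heterozygotes.
        p_expected = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
        expected = p_expected * n
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected

    k = len(alleles)
    dof = k * (k - 1) // 2  # genotype classes minus 1 minus (k - 1) estimated frequencies
    return chi2, dof

# Counts that match HWE exactly (25 AA, 50 AB, 25 BB) give a statistic of 0.
in_hwe = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
print(hwe_chi_square(in_hwe))        # (0.0, 1)

# A sample with no heterozygotes at all deviates strongly.
no_hets = [("A", "A")] * 50 + [("B", "B")] * 50
print(hwe_chi_square(no_hets))       # (100.0, 1)
```

A missing-heterozygote pattern like the second sample is the kind of signal that a null allele (one that fails to amplify) can produce, since true heterozygotes are then scored as homozygotes.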
Statistical packages
GENEPOP, DNATYPE, Arlequin, PowerMarker, PopStats (CODIS)
Summary statistics
h: expected homozygosity; H: expected heterozygosity; ne: effective number of alleles; PD: power of discrimination; PE: power of exclusion
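Four of these statistics can be sketched from their commonly used definitions: h as the sum of squared allele frequencies, H as 1 - h, ne as 1/h, and PD as 1 minus the sum of squared genotype frequencies. The cards give no formulas, so these definitions are assumptions stated here for illustration; small-sample corrections are ignored, and PE is omitted because its formula varies between sources. The function name is hypothetical.

```python
def summary_statistics(allele_freqs, genotype_freqs):
    """Compute h, H, ne and PD from allele and genotype frequency dicts."""
    h = sum(p ** 2 for p in allele_freqs.values())             # expected homozygosity
    H = 1 - h                                                  # expected heterozygosity
    ne = 1 / h                                                 # effective number of alleles
    PD = 1 - sum(g ** 2 for g in genotype_freqs.values())      # power of discrimination
    return {"h": h, "H": H, "ne": ne, "PD": PD}

# Two equally common alleles with genotype frequencies at HWE proportions.
alleles = {12: 0.5, 13: 0.5}
genos = {(12, 12): 0.25, (12, 13): 0.5, (13, 13): 0.25}
print(summary_statistics(alleles, genos))
# {'h': 0.5, 'H': 0.5, 'ne': 2.0, 'PD': 0.625}
```

Two equally frequent alleles give ne = 2, matching the actual allele count; unequal frequencies push ne below the number of alleles observed.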