Statistics Flashcards
what is allele frequency
- the proportion of times an allele is observed in a population (its count divided by the total number of alleles typed)
- obtained by typing a random group of individuals and observing the genotypes
- used to calculate locus probabilities
- derived from population databases
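A minimal sketch of how allele frequencies could be tallied from typed genotypes. The function name and the five sample genotypes are hypothetical, made up for illustration; real frequencies come from validated population databases.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of (allele, allele) genotypes."""
    # Each typed individual contributes two alleles to the count.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}

# Hypothetical genotypes at one STR locus for five typed individuals
sample = [(12, 14), (12, 12), (14, 15), (12, 15), (14, 14)]
freqs = allele_frequencies(sample)
print(freqs)
```

Here allele 12 is seen 4 times out of 10 alleles, so its frequency is 0.4.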
what is in commercial STR kits
contain loci from non-coding regions that were chosen to remain close to hardy-weinberg equilibrium
what is the correction factor/coancestry coefficient
- uses theta correction
- often applied to probability calculations for homozygous loci
explain the defendant’s fallacy
- if you have an RMP of 1 in 1 million, the defense will say that LA has a population of 5 million people, so there are four other people in LA who have the same profile
- this is wrong because the RMP represents the chance (likelihood) that something will happen
- not based on literal numbers or populations
what is hardy-weinberg equilibrium
- depends on Mendel’s laws of independent assortment of genes during sex cell formation
- an unachievable mathematical relationship between allele frequencies and genotype frequencies that assumes a perfectly balanced population with constant genetic variation
- hardy-weinberg formula
- p^2 + 2pq + q^2 = 1
- used to calculate locus probabilities
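A quick sketch of the formula for a locus with two alleles, assuming allele frequencies p and q = 1 - p. The function name and the value p = 0.7 are illustrative only.

```python
def hwe_genotype_frequencies(p):
    """Expected genotype frequencies under HWE for a two-allele locus."""
    q = 1 - p
    return {
        "homozygote_p": p ** 2,      # p^2
        "heterozygote": 2 * p * q,   # 2pq
        "homozygote_q": q ** 2,      # q^2
    }

expected = hwe_genotype_frequencies(0.7)  # hypothetical allele frequency
print(expected, sum(expected.values()))
```

With p = 0.7 the three terms are about 0.49, 0.42 and 0.09, which sum to 1.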
what are the hardy-weinberg requirements
- random mating
- no inbreeding or population substructure
- large population
- so allele frequency doesn’t change through genetic drift
- no mutation
- to avoid introducing new alleles
- no natural selection
- so no alleles are favored over others, changing allele frequencies
- no gene flow
- so variability doesn’t increase
explain heterozygotes
- less common in real-life than what would be expected with HWE
- should not be corrected
explain homozygotes
- more common in real-life than what would be expected with HWE
- corrected when calculating probabilities
- uses theta correction
explain inheritance
- STR alleles are inherited traits
- inherited in pairs
what is likelihood
- conditional probability
- the chance an event will occur given knowledge that another event already occurred
what is the likelihood ratio
- ratio of two probabilities of the same event under different and mutually exclusive hypotheses
- specific to the observed evidence and related individuals
what are the likelihood ratio hypotheses
- hypothesis 1 (prosecution)
- probability of the evidence given a presumed individual is a contributor to the evidence
- hypothesis 2 (defense)
- probability of the evidence given a presumed individual is not a contributor to the evidence
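A rough sketch of the ratio of the two hypotheses above. It assumes the simplest single-source case, where the probability of the evidence under hypothesis 1 is taken as 1 and under hypothesis 2 as the random match probability; that simplification, the function name, and the RMP value are assumptions for illustration.

```python
def likelihood_ratio(p_evidence_given_h1, p_evidence_given_h2):
    """LR = P(evidence | H1) / P(evidence | H2)."""
    if p_evidence_given_h2 <= 0:
        raise ValueError("P(evidence | H2) must be greater than 0")
    return p_evidence_given_h1 / p_evidence_given_h2

# Assumed single-source case: P(E | H1) = 1, P(E | H2) = RMP (hypothetical value)
rmp = 1e-6
lr = likelihood_ratio(1.0, rmp)
print(f"LR is about {lr:,.0f}")
```

An LR above 1 supports hypothesis 1, and an LR below 1 supports hypothesis 2.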
what is linkage equilibrium (LE)
- a genetic system with a stable, independent and random flow of alleles within a population
- allows for the use of the product rule to combine locus probabilities for autosomal STR typing
what is the minimum allele frequency and how is it used
- minimum allowable frequency (MAF) within a population group
- used for unobserved alleles and for raising frequencies that fall below the MAF
- the most common equation for MAF is 5/2n
- 5 is the minimum number of times that an allele should be seen for a reliable frequency
- 2n is 2 times the number of individuals in the database (two alleles per person)
- the MAF allows for a conservative estimate
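A small sketch of the 5/2n rule, assuming n is the number of individuals in the database. The function names and the database size of 200 are hypothetical.

```python
def minimum_allele_frequency(n_individuals):
    """MAF = 5 / (2n), where n is the number of individuals in the database."""
    return 5 / (2 * n_individuals)

def apply_maf(observed_frequency, n_individuals):
    """Raise unobserved or rare allele frequencies up to the MAF."""
    return max(observed_frequency, minimum_allele_frequency(n_individuals))

n = 200                          # hypothetical database size
maf = minimum_allele_frequency(n)
rare = apply_maf(0.004, n)       # below the MAF, so raised to the MAF
unobserved = apply_maf(0.0, n)   # never observed, so set to the MAF
common = apply_maf(0.2, n)       # above the MAF, so left unchanged
print(maf, rare, unobserved, common)
```

For 200 individuals the MAF is 5/400 = 0.0125, so only the common allele keeps its observed frequency.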
what is the modified homozygote equation
- p^2 + p(1-p)(theta)
- accounts for statistical uncertainty and does not require the assumption of a population to be in HWE
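A short sketch comparing plain p^2 with the modified equation. The function name and the allele frequency p = 0.1 are illustrative assumptions; the theta values 0.01 and 0.03 are the commonly used ones listed on the theta correction card.

```python
def modified_homozygote_frequency(p, theta):
    """Homozygote frequency with theta correction: p^2 + p(1 - p)(theta)."""
    return p ** 2 + p * (1 - p) * theta

p = 0.1  # hypothetical allele frequency
uncorrected = p ** 2
corrected_01 = modified_homozygote_frequency(p, 0.01)
corrected_03 = modified_homozygote_frequency(p, 0.03)
print(uncorrected, corrected_01, corrected_03)
```

The corrected values (about 0.0109 and 0.0127) are larger than p^2 = 0.01, which makes the homozygote estimate more conservative.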
what is a population database
refers to a collection of observed alleles and specific frequencies for tested populations
what is population genetics
principles that are combined with statistics to calculate the statistical significance of STR data
explain probability
- DNA statistics are used to determine the probability of a profile in the population
- used to assign statistical significance
- probability is the ratio of the number of times an event is observed to the total number of possible events
- 0 = event will not occur
- 1 = event will occur
- between 0 and 1 = may or may not occur
what is the product rule
- used to combine independent events
- ex: individual locus probabilities in a DNA profile
- a statistical principle allowing for independent or unlinked events to be combined through multiplication
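A minimal sketch of the product rule, assuming the locus probabilities are independent. The function name and the three values are hypothetical.

```python
import math

def product_rule(locus_probabilities):
    """Combine independent locus probabilities by multiplication."""
    return math.prod(locus_probabilities)

locus_probabilities = [0.1, 0.05, 0.2]  # hypothetical values for three loci
profile_probability = product_rule(locus_probabilities)
print(profile_probability)
```

Multiplying 0.1, 0.05 and 0.2 gives about 0.001, far rarer than any single locus.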
what is the prosecutor’s fallacy
- if you have an RMP of 1 in 1 quadrillion, the prosecutor will point out that the world has a population of 7 billion people and ask how many worlds it would take to see that profile again
- an RMP is an outcome probability, not an expected probability after multiple trials
explain random match probability
- estimated frequency at which a particular STR profile would be expected to occur in a population as determined by the allele frequencies from that population group
- probability of randomly selecting an unrelated individual from the population who could be a potential contributor to an evidentiary profile
- only the frequency of the random match is reported, as 1 in X where X = 1/RMP
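A sketch of an RMP calculation for a two-locus profile, assuming HWE genotype frequencies (2pq for heterozygotes, p^2 for homozygotes, no theta correction) and independent loci. The locus names, allele frequencies, and function names are all hypothetical.

```python
import math

def locus_frequency(genotype, allele_freqs):
    """Genotype frequency at one locus: p^2 if homozygous, 2pq if heterozygous."""
    a, b = genotype
    if a == b:
        return allele_freqs[a] ** 2
    return 2 * allele_freqs[a] * allele_freqs[b]

def random_match_probability(profile, database):
    """Multiply the locus frequencies across the whole profile."""
    return math.prod(
        locus_frequency(genotype, database[locus])
        for locus, genotype in profile.items()
    )

# Hypothetical allele frequencies and evidentiary profile
database = {"LocusA": {12: 0.2, 14: 0.1}, "LocusB": {8: 0.3}}
profile = {"LocusA": (12, 14), "LocusB": (8, 8)}

rmp = random_match_probability(profile, database)
print(f"RMP = {rmp:.4f}, reported as 1 in {round(1 / rmp):,}")
```

Here LocusA contributes 2(0.2)(0.1) = 0.04 and LocusB contributes 0.3^2 = 0.09, so the RMP is about 0.0036, or roughly 1 in 278.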
what is statistical analysis
- purpose is to perform statistical calculations, with proper interpretation, on evidentiary DNA profiles to provide an assessment of the significance of an inclusion
- it quantifies the evidentiary value of the match
explain why we use the theta correction
- p^2 alone does not best represent a homozygous locus
- addresses the issue of increased homozygosity and accounts for the difference between population frequency and true HWE
- should be based upon the relative size and substructure of a population
- commonly used values are 0.01 (general populations) and 0.03 (inbred or small, isolated populations)