Statistics Flashcards

1
Q

what is allele frequency

A
  • the proportion of times an allele is observed in a population (its count divided by the total number of alleles typed)
    • obtained by typing a random group of individuals and observing the genotypes
  • used to calculate locus probabilities
    • derived from population databases
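A minimal sketch of the counting described above, assuming a hypothetical sample of typed genotypes at a single STR locus (the allele names and genotypes are invented for illustration). Each individual contributes two alleles, so an allele's frequency is its count divided by twice the number of individuals typed.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of (allele, allele) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # two alleles per typed individual
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical genotypes for five individuals at one locus (illustrative only).
sample = [("12", "13"), ("12", "12"), ("13", "14"), ("12", "14"), ("13", "13")]
freqs = allele_frequencies(sample)
```

Real population databases type far more individuals; five are used here only to keep the counting easy to follow.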
2
Q

what is in commercial STR kits

A

contain loci from non-coding regions that were chosen to remain close to Hardy-Weinberg equilibrium

3
Q

what is the correction factor/coancestry coefficient

A
  • theta (θ), the value used in the theta correction
  • most often applied to probability calculations for homozygous loci
4
Q

explain the defendant’s fallacy

A
  • if you have an RMP of 1 in 1 million, the defense will say that LA has a population of 5 million people, so there are four other people in LA who have the same profile
  • this is wrong because the RMP represents the chance/likelihood that something will happen
    • not based on literal numbers or populations
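The arithmetic behind the defense's claim can be sketched as follows, using the RMP and population figures from the card. The function name is an assumption made for illustration. The product is only an expected count under random sampling; it does not show that this many matching people actually exist.

```python
def expected_matches(rmp, population):
    """Expected number of people in the population who share the profile."""
    return rmp * population

rmp = 1 / 1_000_000       # RMP of 1 in 1 million (from the card)
population = 5_000_000    # population figure used in the card
# About 5 expected profiles, which the defense reads as the source plus four others.
expected = expected_matches(rmp, population)
```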
5
Q

what is Hardy-Weinberg equilibrium

A
  • depends on Mendel’s laws of independent assortment of genes during sex cell formation
  • an idealized (in practice unachievable) mathematical relationship between allele frequencies and genotype frequencies that assumes a perfectly balanced population with constant genetic variation
  • Hardy-Weinberg formula
    • p^2 + 2pq + q^2 = 1
    • used to calculate locus probabilities
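As a worked sketch of the formula, assume a locus with only two alleles whose frequencies are p and q = 1 - p (the value 0.3 below is hypothetical). The three terms give the expected frequencies of the two homozygotes and the heterozygote.

```python
def hwe_genotype_frequencies(p):
    """Expected genotype frequencies for a two-allele locus under HWE."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Hypothetical allele frequency p = 0.3, so q = 0.7.
freqs = hwe_genotype_frequencies(0.3)
# The three genotype frequencies sum to 1, as the formula requires.
total = sum(freqs.values())
```

For a locus probability, a homozygous genotype uses the p^2 term and a heterozygous genotype uses the 2pq term.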
6
Q

what are the Hardy-Weinberg requirements

A
  • random mating
    • no inbreeding or population substructure
  • large population
    • so allele frequency doesn’t change through genetic drift
  • no mutation
    • to avoid introducing new alleles
  • no natural selection
    • so no alleles are favored over others, changing allele frequencies
  • no gene flow
    • so migration doesn’t introduce new alleles or change allele frequencies
7
Q

explain heterozygotes

A
  • less common in real-life than what would be expected with HWE
  • should not be corrected
8
Q

explain homozygotes

A
  • more common in real-life than what would be expected with HWE
  • corrected when calculating probabilities
    • uses theta correction
9
Q

explain inheritance

A
  • STR alleles are inherited traits
  • inherited in pairs, one allele from each parent
10
Q

what is likelihood

A
  • conditional probability
  • the chance an event will occur given knowledge that another event already occurred
11
Q

what is the likelihood ratio

A
  • ratio of two probabilities of the same event (the evidence) under different and mutually exclusive hypotheses
    • specific to observed evidence and related individuals
12
Q

what are the likelihood ratio hypotheses

A
  • hypothesis 1 (prosecution)
    • probability of the evidence given a presumed individual is a contributor to the evidence
  • hypothesis 2 (defense)
    • probability of the evidence given a presumed individual is not a contributor to the evidence
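As a minimal sketch, the two hypotheses supply the numerator and denominator of a ratio of conditional probabilities. The numbers below are hypothetical. The example also assumes the simplest case, a single-source profile that matches the person of interest, where the numerator is commonly taken as 1 and the denominator as the RMP.

```python
def likelihood_ratio(p_evidence_given_hp, p_evidence_given_hd):
    """LR = P(E | prosecution hypothesis) / P(E | defense hypothesis)."""
    if p_evidence_given_hd <= 0:
        raise ValueError("P(E | Hd) must be greater than 0")
    return p_evidence_given_hp / p_evidence_given_hd

# Hypothetical single-source case: P(E | Hp) = 1, P(E | Hd) = RMP of 1 in 1 million.
lr = likelihood_ratio(1.0, 1e-6)
```

An LR above 1 supports the prosecution hypothesis, and an LR below 1 supports the defense hypothesis.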
13
Q

what is linkage equilibrium (LE)

A
  • a genetic system with a stable, independent and random flow of alleles within a population
  • allows for the use of the product rule to combine locus probabilities for autosomal STR typing
14
Q

what is the minimum allele frequency and how is it used

A
  • minimum allowable frequency (MAF) within a population group
    • used for unobserved alleles and for raising frequencies that fall below the MAF
  • the most common equation for MAF is 5/2n
    • 5 is the minimum number of times that an allele should be seen for a reliable frequency
    • 2n is 2 times the number of individuals in the database
  • the MAF allows for a conservative estimate
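The 5/2n rule can be sketched as follows, assuming n is the number of individuals in the database. The database size and observed frequencies are hypothetical, and the function names are chosen only for illustration.

```python
def minimum_allele_frequency(n_individuals):
    """MAF = 5 / (2n), where n is the number of individuals in the database."""
    return 5 / (2 * n_individuals)

def adjusted_frequency(observed, n_individuals):
    """Raise an observed frequency to the MAF if it falls below it."""
    return max(observed, minimum_allele_frequency(n_individuals))

# Hypothetical database of 200 individuals: MAF = 5 / 400 = 0.0125.
maf = minimum_allele_frequency(200)
rare = adjusted_frequency(0.005, 200)    # below the MAF, so raised to it
common = adjusted_frequency(0.20, 200)   # above the MAF, so left unchanged
```

Raising a rare allele's frequency makes the resulting profile look more common, which is why the estimate is considered conservative.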
15
Q

what is the modified homozygote equation

A
  • p^2 + p(1-p)(theta)
  • accounts for statistical uncertainty and does not require assuming that the population is in HWE
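A short sketch of the modified equation next to the plain p^2 estimate. The allele frequency of 0.1 is hypothetical, and the default theta of 0.01 is an assumed value chosen for illustration.

```python
def homozygote_probability(p, theta=0.01):
    """Modified homozygote formula: p^2 + p(1 - p) * theta."""
    return p * p + p * (1 - p) * theta

# Hypothetical allele frequency p = 0.1.
uncorrected = 0.1 ** 2                     # plain HWE estimate, p^2
corrected = homozygote_probability(0.1)    # 0.01 + 0.1 * 0.9 * 0.01
```

The corrected value is slightly larger than p^2, so the homozygous genotype is treated as more common than HWE alone would predict.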
16
Q

what is a population database

A

refers to a collection of observed alleles and specific frequencies for tested populations

17
Q

what is population genetics

A

principles that are combined with statistics to calculate the statistical significance of STR data

18
Q

explain probability

A
  • DNA statistics are used to determine the probability of a profile in the population
    • used to assign statistical significance
  • probability is a mathematical relationship between the number of times an event is observed compared to the total number of events possible
    • 0 = event will not occur
    • 1 = event will occur
    • between 0 and 1 = may or may not occur
19
Q

what is the product rule

A
  • used to combine independent events
    • ex: individual locus probabilities in a DNA profile
  • a statistical principle allowing for independent or unlinked events to be combined through multiplication
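The multiplication can be sketched as follows. The three locus probabilities are hypothetical, and the loci are assumed to be independent (unlinked), which is the condition the rule depends on.

```python
from math import prod

def profile_probability(locus_probabilities):
    """Multiply independent locus probabilities into one profile probability."""
    return prod(locus_probabilities)

# Hypothetical genotype probabilities at three unlinked loci.
loci = [0.10, 0.05, 0.02]
combined = profile_probability(loci)   # about 0.0001, roughly 1 in 10,000
```

Each added locus multiplies the estimate by a number below 1, so the combined profile probability shrinks quickly as more loci are typed.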
20
Q

what is the prosecutor’s fallacy

A
  • if you have an RMP of 1 in 1 quadrillion, the prosecutor will say that the world has a population of 7 billion people, and ask how many worlds it would take to see that profile again
  • an RMP is an outcome probability, not an expected probability after multiple trials
21
Q

explain random match probability

A
  • estimated frequency at which a particular STR profile would be expected to occur in a population as determined by the allele frequencies from that population group
    • probability of randomly selecting an unrelated individual from the population who could be a potential contributor to an evidentiary profile
    • only the frequency of the random match is reported (1/RMP)
22
Q

what is statistical analysis

A
  • purpose is to perform statistical calculations, with proper interpretation, on evidentiary DNA profiles to provide an assessment of the significance of an inclusion
  • it quantifies the evidentiary value of the match
23
Q

explain why we use the theta correction

A
  • p^2 does not best represent the homozygous locus
  • addresses the issue of increased homozygosity and accounts for the difference between population frequency and true HWE
  • should be based upon the relative size and substructure of a population
    • commonly used values are 0.01 (normal) and 0.03 (incest, small populations)