Midterm 2 Flashcards
Signal Detection theory:
- hit
- miss
- false alarm
- correct reject
hit: correct answer >> when signal is present and decision is yes
miss: wrong answer >> signal is present and decision is no
false alarm: wrong answer >> signal is absent and decision is yes
correct reject: correct answer >> signal is absent and decision is no
internal response
variable/value that forms basis of observer’s decision (x axis)
criterion on a signal present/absent graph
- false alarm? correct reject? hit? miss?
- left of the criterion line is no, right of the criterion line is yes
- any area under the signal absent curve (left) that is after the line is a false alarm, and any area before it is a correct reject
- any area under the signal present curve (right) that is after the line is a hit, and any area before it is a miss.
accuracy equation
accuracy = (#present * %hits + #absent * %CR) / total
- need present/absent percentages/numbers in order to calculate accuracy.
- total = #present + #absent
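A minimal sketch of this accuracy calculation in Python (the counts and rates below are made up, not course data):

    # hypothetical example: 20 signal-present and 80 signal-absent trials
    n_present, n_absent = 20, 80
    hit_rate, cr_rate = 0.90, 0.85          # assumed %hits and %CR

    total = n_present + n_absent
    accuracy = (n_present * hit_rate + n_absent * cr_rate) / total
    print(accuracy)                          # 0.86 >> proportion correct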
how to increase accuracy (2)
- how good is your accuracy? comparison
- information acquisition: increases correct responses (hits and CRs)
- criterion change: leads to trade off btwn hits and CRs
- if your accuracy is worse than what would occur by chance, it is very poor accuracy
Why could peak accuracy be greater in a 20% present, 80% absent case compared to a 50/50 case?
- since it’s 80% absent, it is good to maximize correct rejects, so you would move the criterion more to the right (more conservative >> say no more often >> maximizing tumor-absent correctness). Since it’s only 20% present, chances of misses are low.
- for 50/50, moving it in either direction would have trade-offs
why would you change the criterion? (3)
- when maximizing accuracy >> difference in signal present/absent
- special case: when 50/50 present/absent >> optimal criterion = where graphs intersect
- when optimizing a parameter other than accuracy (eg. cost) >> balance (where they intersect) between cost of FAs vs cost of misses to minimize total cost
calculating total cost (money wasted by incorrect responses)
total cost = #present * %miss * miss cost + #absent * %FA * FA cost
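A sketch of the total-cost formula with assumed (hypothetical) costs and error rates:

    # a miss costs $1000, a false alarm costs $100 (made-up numbers)
    n_present, n_absent = 20, 80
    miss_rate, fa_rate = 0.10, 0.15
    miss_cost, fa_cost = 1000, 100

    total_cost = n_present * miss_rate * miss_cost + n_absent * fa_rate * fa_cost
    print(total_cost)                        # 2000 + 1200 = 3200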
Discriminability + reducing errors (2)
- being able to distinguish btwn stimuli >> errors due to overlap
2 ways to decrease the overlap
- increase separation
- reduce spread
Cohen’s d (d’)
- what does it represent?
- how to increase?
- what if you don’t have sigma?
- worst case?
- represents magnitude of effect of IV on DV (interval/ratio); expressed in units of SD
d’ = separation/spread = (u2-u1)/sigma
- if you don’t have sigma you can use the pooled SD: sqrt((SD1^2 + SD2^2)/2)
- inc d’ by increasing separation or decreasing spread
worst case scenario: d’ = 0 >> no separation = no information
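A sketch of d’ with the pooled-SD fallback (the group means and SDs are made up):

    import math

    mean1, sd1 = 100.0, 15.0                       # hypothetical group summaries
    mean2, sd2 = 112.0, 18.0

    pooled_sd = math.sqrt((sd1**2 + sd2**2) / 2)   # used when sigma is unknown
    d_prime = (mean2 - mean1) / pooled_sd          # separation / spread
    print(round(d_prime, 2))                       # ~0.72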
parameter vs statistic
parameter: true value of quantity in popn
statistic: value of the same quantity based on a sample (statistic used to estimate parameter)
u vs M
- accuracy or precision?
u = population mean
M = sample mean >> unbiased estimator of u
- unbiased = accuracy, not precision
sigma^2 vs s^2
- why squared?
- SD relation?
sigma^2 = popn variance
s^2 = sample variance >> unbiased estimator of sigma^2
- s means standard deviation (SD = sqrt of variance)
- SD (s) >> unbiased estimator of sigma
Gaussian Distribution
- characteristics
- probability density + total area under curve
Characteristics:
- normal distribution/bell curve
- typically used for weight/height/IQ scores/exam scores
- unimodal
- symmetric
- goes from -inf to inf (No max/min)
- probability of any single value is zero if it is a probability density graph (probability = area under the curve)
- total area under the curve = 1
Gaussian Distribution:
- 1 SD, 2 SD, 3 SD >> chance of value occurring?
- SDT warning!
within:
1 SD: ~68%
2 SD: ~95%
3 SD: ~99.7%
- when calculating SDT, the percentages exclude the tails >> beware!
- sampling distribution is based on the assumption that H0 is TRUE!
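The within-1/2/3-SD percentages can be checked with the normal CDF (assuming scipy is available):

    from scipy.stats import norm

    for k in (1, 2, 3):
        # area between -k and +k standard deviations
        print(k, round(norm.cdf(k) - norm.cdf(-k), 4))
    # 1 0.6827, 2 0.9545, 3 0.9973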
Uniform Distribution
- each event is equally likely (eg. throwing a fair die = 1/6 probability) >> discrete
Poisson Distribution
- few events vs many events
- usually positively skewed
- used when random events occur at a certain rate over a fixed time period (eg. hourly # of customers at a bank)
- if expecting few events, it will be positively skewed
- if there are more events, distribution will become more symmetric
z-score
- difference of a score from the mean as a proportion of the SD (with respect to the population!) >> units of SD (basically, how many SD units away from the mean you are)
- if you are ranking one score within the popn: z = (x - u)/sigma >> similar to d’
- if you are finding the sampling distribution of the sample mean: z = (Xavg - u)/(sigma/sqrt(n))
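A sketch of both z-score formulas, using assumed population values:

    import math

    mu, sigma = 100.0, 15.0                 # assumed population mean and SD

    # ranking one score within the popn
    x = 130.0
    z_single = (x - mu) / sigma              # 2.0 >> 2 SDs above the mean

    # sampling distribution of the sample mean
    x_bar, n = 104.0, 36
    z_mean = (x_bar - mu) / (sigma / math.sqrt(n))   # 4 / 2.5 = 1.6
    print(z_single, z_mean)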
“the standard normal”
distribution of z scores
- M = 0 and SD = 1 (the t distribution is also centered at 0, but its SD is slightly above 1 for small samples)
percentile rank
- how is it similar to z-score?
- what if it’s gaussian?
the percentage of measurements in the distribution that fall below that score value (eg. a score in the 99th percentile = 99% of all scores are below)
- both z-score and percentile rank look at relative standing
- if it’s gaussian we can calculate based on z-score (eg. z = 1 >> one SD above the mean >> start at the 50th percentile and add 34 >> 84th percentile)
how to calculate percentile from standard normal distribution table
- what if z score is negative?
- what does percentile represent?
- find first 2 digits in the first column, find third digit in the first row
- if z score is negative, do 1 - (positive percentile)
- percentile represents area before (left of) the z-score
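Instead of the printed table, the same lookup can be sketched with norm.cdf, which already returns the area to the left (so negative z needs no extra step):

    from scipy.stats import norm

    print(round(norm.cdf(1.0) * 100, 1))    # 84.1 >> z = +1 is ~84th percentile
    print(round(norm.cdf(-1.0) * 100, 1))   # 15.9 >> same as 1 - 0.841 from the table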
Sir Francis Galton: CLT
- rule of thumb
central limit theorem: if x is the sum of independent, identically distributed variables (eg. uniform), with a non-zero SD, then the distribution of x will approach gaussian
- rule of thumb: mean distribution will be gaussian if n>30
- even if the original data is skewed, the avg will be gaussian
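A small simulation sketch of the CLT (the settings are my own): sample means of a skewed exponential variable look roughly gaussian once n > 30.

    import numpy as np

    rng = np.random.default_rng(0)
    n, reps = 40, 10_000                     # n > 30 rule of thumb

    # skewed raw data: exponential with mean 1
    means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

    print(round(means.mean(), 3))            # ~1.0 (population mean)
    print(round(means.std(ddof=1), 3))       # ~1/sqrt(40) ~ 0.158 (SE)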
effect size
- small, med, large
describes relationship among variables in terms of size/amt/strength >> descriptive >> shows extent to which results are meaningful
effect size based on Cohen’s d:
small: 0.2
med: 0.5
large: 0.8
purposes of inferential stats (2)
parameter estimation: estimate value of population parameter based on random sample
hypothesis testing: whether effect occurred by chance or not (probability)
sampling distribution of Xi vs Xavg
- which has smaller SD?
Xi ~ N(u, sigma); Xavg ~ N(u, sigma/sqrt(n))
- N = normal (gaussian) distribution
- u is always the same (population)
- Xavg has smaller SD because it is divided by sqrt(n) >> gets smaller as sample size increases
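A quick numeric check of the smaller SD of Xavg, with an assumed sigma and n:

    import math

    sigma, n = 10.0, 25
    sd_single = sigma                        # SD of one observation Xi
    sd_mean = sigma / math.sqrt(n)           # SD (standard error) of Xavg
    print(sd_single, sd_mean)                # 10.0 vs 2.0 >> shrinks as n grows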
confidence intervals
- inc n? sample mean?
- 95%?
- formula?
as n increases, our confidence in the sample mean increases (the CI gets narrower)
- sample mean gets closer to popn mean w increased sample size
- typically the CI is 95% >> a randomly drawn sample mean has a 95% chance of landing within ~2 SE of u, so centering that same interval on the value you got means there is a 95% chance that u is within ~2 SE of that value = CI
- CI = Xavg +/- z*sigma/sqrt(n)
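A sketch of the z-based CI, assuming sigma is known (the numbers are made up):

    import math
    from scipy.stats import norm

    x_bar, sigma, n = 52.0, 8.0, 64           # assumed sample mean, known sigma, n
    z = norm.ppf(0.975)                       # ~1.96 for a 95% CI

    half_width = z * sigma / math.sqrt(n)     # ~1.96 * 8/8
    print(x_bar - half_width, x_bar + half_width)   # ~ (50.04, 53.96)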
finding z-score from CI
- 95% vs 99% CI z score difference?
- pick CI (eg. 95%)
- add in one of the tails (95% covers the middle, so you need the cumulative area up to the start of the upper tail): 1 - 0.95 = 0.05 (both tails) >> one tail = 0.025
- find 0.975 on the z-score table >> find the z score
- the more confidence you want (eg. 95 vs 99%) the bigger your z score will be
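The table lookup above is just an inverse-CDF step; a sketch (assuming scipy):

    from scipy.stats import norm

    for conf in (0.95, 0.99):
        tail = (1 - conf) / 2                # split the leftover area across both tails
        print(conf, round(norm.ppf(1 - tail), 3))   # 0.95 -> 1.96, 0.99 -> 2.576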
student’s t distribution
- when do you use t?
- is z or t wider?
- effect of sample size?
- df?
- CI?
- looks similar to the gaussian but has heavier tails (not gaussian!)
- used when you don’t have the population SD (sigma), and you only have a sample SD
- in general, t will have a wider distribution because it is less exact >> more uncertainty
- unlike z, t distribution depends on sample size: as you increase n, t gets closer to gaussian (approximates normal distribution)
- defined by degrees of freedom (n-1)
CI = Xavg +/- t*s/sqrt(n)
s/sqrt(n) is standard error of the sample mean!
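A sketch of the t-based CI when only the sample SD is available (the data values are made up):

    import numpy as np
    from scipy import stats

    data = np.array([4.1, 5.0, 5.6, 4.8, 5.3, 4.4, 5.9, 4.7])   # hypothetical sample
    x_bar = data.mean()
    se = data.std(ddof=1) / np.sqrt(len(data))      # s / sqrt(n) = standard error
    t_crit = stats.t.ppf(0.975, df=len(data) - 1)   # two-tailed 95%, df = n - 1

    print(x_bar - t_crit * se, x_bar + t_crit * se)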
finding t score from CI
- what does increasing sample size do?
- the t table is indexed by a single tail’s probability, while the z table gives everything other than one tail (the cumulative area) >> find the value of one tail based on the CI (eg. 95% >> one tail = 0.025)
- find degrees of freedom (n-1)
- find corresponding t value
- increasing df (inc sample size) will decrease the t value >> approximates the gaussian (eg. as df approaches inf, t approaches 1.96, which is the z score for a gaussian 95% CI)
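A quick check that tcrit shrinks toward the gaussian 1.96 as df grows:

    from scipy import stats

    for df in (5, 30, 1000):
        print(df, round(stats.t.ppf(0.975, df), 3))   # 2.571, 2.042, 1.962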
error bars
- narrow vs wide?
- common error bars?
narrow error bar = increased confidence
wide error bar = lots of noise, less confidence
Common error bars:
- range
- SD >> will not change, no matter the size of the sample, if it represents the population SD
- SE >> most common in BNS (SD/sqrt(n)) >> decreases as n increases
- CI >> recommended >> like a stat test >> Xavg +/- t*SE >> based on t and SE, so increasing sample size, which affects both of these, will decrease the CI by a lot.
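A sketch computing these common error bars for one made-up sample:

    import numpy as np
    from scipy import stats

    data = np.array([12.0, 15.5, 9.8, 14.2, 13.1, 11.7, 16.0, 10.9])  # hypothetical
    sd = data.std(ddof=1)
    se = sd / np.sqrt(len(data))
    ci_half = stats.t.ppf(0.975, df=len(data) - 1) * se   # 95% CI half-width

    print(data.max() - data.min(), sd, se, ci_half)       # range, SD, SE, CI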
H0 vs H1
H0: null hypothesis
H1: research hypothesis
p > a vs. p < a
p > a: retain H0 >> fail to reject >> results are not significant
p < a: reject H0 >> results are significant
probability density graph for p value vs alpha level
- if the value (observation) is greater than the criterion, reject H0 >> greater obs value = less area under the curve beyond it = smaller p value
- area under the curve beyond the observation line = p value
- area under the curve beyond the criterion = alpha level (typically 0.05 >> cannot be greater/more liberal)
- criterion will change based on alpha
Standard error
- t value?
- if you used the sample SD and not the population SD, you will have a t distribution >> SE will be the standard dev of the sampling distribution of the sample mean Xavg
- since the Xavg dist is gaussian, SE can be used to approx a CI (eg. +/- 2 SE >> ~95% CI)
- t value = (Xavg - u)/SE
- find the t value of the data (obs) and compare to tcrit from the table
- if t is bigger than tcrit then you reject H0
- the t distribution is for the null hypothesis >> a big t is unlikely, therefore we would reject H0
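A sketch of the one-sample t computation, by hand and with scipy's ttest_1samp (the data and H0 mean are made up):

    import numpy as np
    from scipy import stats

    data = np.array([101, 106, 99, 110, 103, 108, 97, 105], dtype=float)  # hypothetical
    mu0 = 100.0                                    # H0 value

    se = data.std(ddof=1) / np.sqrt(len(data))
    t_obs = (data.mean() - mu0) / se               # (Xavg - u)/SE

    t_scipy, p = stats.ttest_1samp(data, mu0)      # two-tailed by default
    print(round(t_obs, 3), round(t_scipy, 3), round(p, 3))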
one sample t test:
- 1 tail
- 2 tail
- is it harder to reject 1 tail or 2 tail?
1 tail: directionality specified, all of probability a is in one tail only
two tails: directionality not specified (can be bigger or smaller than value you are comparing to); must divide a by 2 because it can be either of the tails
- harder to reject 2 tails because there is possibility for error on either side, so tcrit (dependent on alpha) will be further from the mean; less likely to get data that is further from the mean
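A sketch of why the two-tailed test is harder to reject: same alpha, larger tcrit (the alpha and df below are assumed):

    from scipy import stats

    alpha, df = 0.05, 15
    t_one = stats.t.ppf(1 - alpha, df)             # one-tailed cutoff, ~1.753
    t_two = stats.t.ppf(1 - alpha / 2, df)         # two-tailed cutoff, ~2.131
    print(round(t_one, 3), round(t_two, 3))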
95% CI from t value (2 tails)
- if you’re finding a 95% CI based on t, you must use the NON-DIRECTIONAL t value because it takes both tails into account
- a 2 tailed one sample t test is equivalent to asking if a value is within the 95% CI
- CI = Xavg +/- t*SE >> calculate the interval
does failing to reject the H0 prove it’s correct?
NO. It only shows that the data is consistent w the H0. Study could just be underpowered and not show the effect because it couldn’t detect the difference
power
- correct decisions (2)
- type 1 error
- type 2 error
probability of making a correct decision of rejecting an incorrect H0 (assuming it is actually false)
- correct decision:
- H0 is incorrect and you reject H0 (hit)
- H0 is correct and you retain H0 (correct reject)
- type 1 error (a): H0 is correct and you reject H0 (false alarm); occurs when you get a large t value just by chance
- type 2 error (B): H0 is incorrect and you retain H0 (miss)
- B depends on alpha level (if a is too low it increases chance of type 2 error >> miss), sample size (inc sample size dec chance of type 2), and effect size (a small effect would go undetected >> inc chance of type 2 error >> depends on cohen’s d)
(reject H0 = yes, retain H0 = no; H0 incorrect = signal present, H0 correct = signal absent)
- Power = 1-B
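A simulation sketch of power = 1 - B (all settings are my own, not course numbers): draw many samples where H0 is actually false and count how often the two-tailed t test rejects.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mu0, mu_true, sigma, n, reps = 0.0, 0.5, 1.0, 30, 5000   # true effect d = 0.5

    rejections = 0
    for _ in range(reps):
        sample = rng.normal(mu_true, sigma, size=n)
        _, p = stats.ttest_1samp(sample, mu0)
        rejections += p < 0.05

    print(rejections / reps)     # estimated power, roughly 0.75 for these settings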
How can we increase power? (3)
- increase P(type 1 error) >> would increase the false alarm rate, and misses would go down
- not helpful: type 1 error (a) cannot exceed 5%
- increase the separation between the sample mean and the population mean (H0 value)
- not helpful: this is not possible; you can’t change the data
- increase the sample size
- this is helpful! increasing n will decrease SE (spread) >> tcrit is moved closer to the population mean >> easier to reject the H0 >> increases power
*even if you reject H0, you are not proving H1 correct >> other hypotheses that align w the data could also be correct
Choice of statistical test
- scale of measurement: ratio/interval or categorical
- # of groups/levels of IV: 1, 2, 3+
- experimental design: within or between subjects
- scale of measurement
1. ratio/interval: t test/ANOVA
2. categorical: chi-square
- # of groups/levels of IV
- 1 group = one sample t test
- 2 groups = 2 sample t test
- 3+ groups = ANOVA
- experimental design
1. within sub: paired t test, repeated measures ANOVA
2. between sub: 2 sample t test, one way ANOVA
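A hedged lookup sketch of this decision chart (the function name and string labels are my own, not from the course):

    def choose_test(scale, n_groups, design):
        # scale: 'interval/ratio' or 'categorical'; design: 'within' or 'between'
        if scale == "categorical":
            return "chi-square"
        if n_groups == 1:
            return "one sample t test"
        if n_groups == 2:
            return "paired t test" if design == "within" else "2 sample t test"
        return "repeated measures ANOVA" if design == "within" else "one way ANOVA"

    print(choose_test("interval/ratio", 2, "between"))   # 2 sample t test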