Midterm 2 Flashcards

1
Q

Signal Detection theory:

  • hit
  • miss
  • false alarm
  • correct reject
A

hit: correct answer >> signal is present and decision is yes
miss: wrong answer >> signal is present and decision is no
false alarm: wrong answer >> signal is absent and decision is yes
correct reject: correct answer >> signal is absent and decision is no

2
Q

internal response

A

variable/value that forms basis of observer’s decision (x axis)

3
Q

criterion on a signal present/absent graph

- false alarm? correct reject? hit? miss?

A
  • left of the criterion line is no, right of the criterion line is yes
  • any area under the signal absent curve (left) that is right of the line is a false alarm, and any area left of it is a correct reject
  • any area under the signal present curve (right) that is right of the line is a hit, and any area left of it is a miss
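A minimal Python sketch of the four areas, assuming both curves are gaussian with SD 1, a signal absent mean of 0, a signal present mean of 1.5, and a criterion at 0.75 (all illustrative values, not from the course):

```python
from statistics import NormalDist

def outcome_rates(criterion, absent_mean=0.0, present_mean=1.5, sd=1.0):
    """Return the four outcome proportions for one criterion placement."""
    absent = NormalDist(absent_mean, sd)     # signal absent curve (left)
    present = NormalDist(present_mean, sd)   # signal present curve (right)
    return {
        "correct_reject": absent.cdf(criterion),    # absent curve, left of line
        "false_alarm": 1 - absent.cdf(criterion),   # absent curve, right of line
        "miss": present.cdf(criterion),             # present curve, left of line
        "hit": 1 - present.cdf(criterion),          # present curve, right of line
    }

rates = outcome_rates(criterion=0.75)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Moving `criterion` to the right in this sketch lowers both hits and false alarms; moving it left raises both.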
4
Q

accuracy equation

A

(#present * %hits + #absent * %CR) / total**

  • need present/absent percentages/numbers in order to calculate accuracy.

**total = present + absent
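A short Python sketch of the accuracy equation. The function name and the example counts and rates (20 present, 80 absent, 70% hits, 90% CRs) are made up for illustration:

```python
def accuracy(n_present, hit_rate, n_absent, cr_rate):
    """(#present * %hits + #absent * %CR) / total, with rates given as proportions."""
    total = n_present + n_absent
    return (n_present * hit_rate + n_absent * cr_rate) / total

# Hypothetical case: 20 present trials with 70% hits, 80 absent trials with 90% CRs
acc = accuracy(n_present=20, hit_rate=0.70, n_absent=80, cr_rate=0.90)
print(f"accuracy = {acc:.2f}")
```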

5
Q

how to increase accuracy (2)

- how good is your accuracy? comparison

A
  • information acquisition: increases correct responses (hits and CRs)
  • criterion change: leads to trade off btwn hits and CRs
  • if your accuracy is worse than what would occur by chance, it is poor accuracy (chance level is the comparison)
6
Q

Why could peak accuracy be greater in a 20% present, 80% absent case compared to a 50/50 case?

A
  • since it’s 80% absent, it is good to maximize correct rejects, so you would move the criterion more to the right (more conservative >> say no more often >> maximizing tumor absent correctness). Since only 20% of cases are present, the extra misses cost little overall accuracy.
  • for 50/50, moving it in either direction would have trade-offs
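A rough Python sketch of this comparison, assuming gaussian curves with SD 1 and means 0 (absent) and 1 (present). These numbers and the simple grid search are illustrative assumptions only:

```python
from statistics import NormalDist

ABSENT = NormalDist(0.0, 1.0)    # assumed signal absent curve
PRESENT = NormalDist(1.0, 1.0)   # assumed signal present curve

def accuracy_at(criterion, p_present):
    """Overall accuracy = P(present) * hit rate + P(absent) * CR rate."""
    hit_rate = 1 - PRESENT.cdf(criterion)
    cr_rate = ABSENT.cdf(criterion)
    return p_present * hit_rate + (1 - p_present) * cr_rate

def best_criterion(p_present):
    """Try criteria from -3.00 to 5.00 in steps of 0.01 and keep the most accurate."""
    candidates = [i / 100 for i in range(-300, 501)]
    return max(candidates, key=lambda c: accuracy_at(c, p_present))

c_50, c_20 = best_criterion(0.5), best_criterion(0.2)
peak_50, peak_20 = accuracy_at(c_50, 0.5), accuracy_at(c_20, 0.2)
print(f"50/50: criterion {c_50:.2f}, peak accuracy {peak_50:.3f}")
print(f"20/80: criterion {c_20:.2f}, peak accuracy {peak_20:.3f}")
```

Under these assumptions the 50/50 optimum sits where the two curves cross, while the 20/80 optimum sits further right and reaches a higher peak accuracy.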
7
Q

why would you change the criterion? (3)

A
  • when maximizing accuracy >> difference in signal present/absent
  • special case: when 50/50 present/absent >> optimal criterion = where graphs intersect
  • when optimizing a parameter other than accuracy (eg. cost) >> balance (where they intersect) between cost of FAs vs cost of misses to minimize total cost
8
Q

calculating total cost (money wasted by incorrect responses)

A

(#present * %miss * cost per miss) + (#absent * %FA * cost per FA)
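A small Python sketch of the total cost calculation, reading the formula as count * error rate * cost per error for each error type. The counts, rates and dollar amounts are hypothetical:

```python
def total_cost(n_present, miss_rate, miss_cost, n_absent, fa_rate, fa_cost):
    """Money wasted by misses plus money wasted by false alarms."""
    cost_of_misses = n_present * miss_rate * miss_cost
    cost_of_fas = n_absent * fa_rate * fa_cost
    return cost_of_misses + cost_of_fas

# Hypothetical: 20 present (30% missed, $1000 each), 80 absent (10% FAs, $200 each)
cost = total_cost(20, 0.30, 1000, 80, 0.10, 200)
print(f"total cost = ${cost:.0f}")
```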

9
Q

Discriminability + reducing errors (2)

A
  • being able to distinguish btwn stimuli >> errors due to overlap
    2 ways to decrease the overlap
  1. increase separation
  2. reduce spread
10
Q

Cohen’s d (d’)

  • what does it represent?
  • how to increase?
  • what if you don’t have sigma?
  • worst case?
A
  • represents magnitude of effect of IV on DV (interval/ratio); expressed in units of SD

d’ = separation/spread = (u2-u1)/sigma

  • if you don’t have sigma you can use the pooled SD: sqrt((SD1^2 + SD2^2)/2)
  • inc d’ by increasing separation or decreasing spread

worst case scenario: d’ = 0 >> no separation = no information
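A minimal Python sketch of d’ = separation/spread using the pooled SD from two samples. The two groups of scores are invented for illustration:

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(group1, group2):
    """(M2 - M1) / pooled SD, where pooled SD = sqrt((SD1^2 + SD2^2) / 2)."""
    pooled_sd = sqrt((stdev(group1) ** 2 + stdev(group2) ** 2) / 2)
    return (mean(group2) - mean(group1)) / pooled_sd

control = [1, 2, 3, 4, 5]      # hypothetical scores
treatment = [3, 4, 5, 6, 7]    # same spread, shifted up by 2
d = cohens_d(control, treatment)
print(f"d = {d:.2f}")
```

With identical groups the separation is 0, so d = 0, which is the worst case on the card.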

11
Q

parameter vs statistic

A

parameter: true value of quantity in popn
statistic: value of the same quantity based on a sample (statistic used to estimate parameter)

12
Q

u vs M

- accuracy or precision?

A

u = population mean
M = sample mean >> unbiased estimator of u
- unbiased = accuracy, not precision

13
Q

sigma^2 vs s^2

  • why squared?
  • SD relation?
A
sigma^2 = popn variance
s^2 = sample variance >> unbiased estimator of sigma^2
  • s means standard deviation (SD = sqrt of variance)
  • SD (s) >> estimator of sigma (slightly biased for small samples, even though s^2 is unbiased)
14
Q

Gaussian Distribution

  • characteristics
  • probability density + total area under curve
A

Characteristics:

  • normal distribution/bell curve
  • typically used for weight/height/IQ scores/exam scores
  • unimodal
  • symmetric
  • goes from -inf to inf (No max/min)
  • probability of any single value is zero if it is a probability density graph (probability = area under the curve)
  • total area under the curve = 1
15
Q

Gaussian Distribution:

  • 1 SD, 2 SD, 3 SD >> chance of value occurring?
  • SDT warning!
A

within:
1 SD: 68%
2 SD: 95%
3 SD: 99.7%

  • when calculating SDT, the percentages exclude the tails >> beware!
  • sampling distribution is based on the assumption that H0 is TRUE!
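A quick Python check of the within-k-SD areas and of what is left in each tail, using the standard library's `NormalDist`. The helper names are made up for this sketch:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def area_within(k_sd):
    """Area under the gaussian between -k and +k SDs of the mean."""
    return STD_NORMAL.cdf(k_sd) - STD_NORMAL.cdf(-k_sd)

def one_tail_beyond(k_sd):
    """Area in a single tail beyond +k SDs (the part 'within' leaves out)."""
    return 1 - STD_NORMAL.cdf(k_sd)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}, each tail: {one_tail_beyond(k):.4f}")
```

The leftover area is split evenly between the two tails, which is why a one-sided question (like a hit or false alarm rate) needs the within-SD percentage plus one tail.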
16
Q

Uniform Distribution

A
  • each event is equally likely (eg. throwing a fair die = 1/6 probability) >> discrete
17
Q

Poisson Distribution

- few events vs many events

A
  • usually positively skewed
  • used when random events occur at a certain rate over a fixed time period (eg. hourly # of customers at a bank)
  • if expecting few events, it will be positively skewed
  • if there are more events, distribution will become more symmetric
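A rough Python sketch of the few-vs-many-events point. It computes the skewness of a Poisson distribution directly from its probabilities; the rates 2 and 50 are arbitrary example values:

```python
from math import exp, lgamma, log

def poisson_pmf(k, rate):
    """P(k events) for a Poisson with the given expected rate (computed in log space)."""
    return exp(-rate + k * log(rate) - lgamma(k + 1))

def poisson_skewness(rate, max_k=400):
    """Skewness from the pmf, summing over k = 0..max_k (enough for moderate rates)."""
    ks = range(max_k + 1)
    probs = [poisson_pmf(k, rate) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5

few = poisson_skewness(2)     # few expected events per interval
many = poisson_skewness(50)   # many expected events per interval
print(f"skewness with rate 2: {few:.3f}, with rate 50: {many:.3f}")
```

Skewness stays positive but shrinks toward 0 as the expected count grows, so the distribution looks more symmetric.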
18
Q

z-score

A
  • difference in score as a proportion of the SD (with respect to the population!) >> units of SD
    (basically, how many SD units away from the mean you are)
  • if you are ranking one score within popn:
    (x-u)/sigma
  • similar to d’
  • if you are ranking a sample mean within the sampling distribution of the sample mean:
    (Xavg - u)/(sigma/sqrt(n))
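Both z-score formulas as a small Python sketch. The population values (u = 100, sigma = 15) and the scores are hypothetical:

```python
from math import sqrt

def z_score(x, mu, sigma):
    """How many population SDs a single score is from the population mean."""
    return (x - mu) / sigma

def z_score_of_mean(x_avg, mu, sigma, n):
    """Same idea for a sample mean: the spread is sigma / sqrt(n) instead of sigma."""
    return (x_avg - mu) / (sigma / sqrt(n))

z_single = z_score(130, mu=100, sigma=15)                  # one score of 130
z_mean = z_score_of_mean(106, mu=100, sigma=15, n=25)      # mean of 25 scores
print(z_single, z_mean)
```

A sample mean only 6 points above u is as extreme here as a single score 30 points above u, because the sampling distribution of the mean is narrower.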
19
Q

“the standard normal”

A

distribution of z scores

- M = 0 and SD = 1 (t-scores are also centered on 0, but the t distribution’s SD is slightly larger than 1)

20
Q

percentile rank

  • how is it similar to z-score?
  • what if it’s gaussian?
A

percentage of scores in the distribution that fall below a given score (eg. a score in the 99th percentile = 99% of all scores are below)

  • z-score and percentile rank both look at relative standing

if it’s gaussian we can calculate based on z-score (eg. z=1 >> one SD above the mean (mean = 50th percentile) >> add 34 to 50 = 84th percentile)

21
Q

how to calculate percentile from standard normal distribution table

  • what if z score is negative?
  • what does percentile represent?
A
  • find the first 2 digits of z (ones and tenths) in the first column, find the third digit (hundredths) in the first row
  • if z score is negative, do 1 - (percentile for the positive z)
  • percentile represents area before (left of) the z-score
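A short Python stand-in for the table lookup, using `NormalDist` from the standard library in place of a printed table. The function name is an assumption of this sketch:

```python
from statistics import NormalDist

def percentile_from_z(z):
    """Percent of the standard normal that lies to the left of z."""
    return NormalDist().cdf(z) * 100

above = percentile_from_z(1.0)    # one SD above the mean
below = percentile_from_z(-1.0)   # one SD below the mean
print(f"z = 1: {above:.1f}th percentile, z = -1: {below:.1f}th percentile")
```

The negative case matches the card's shortcut: the percentile for z = -1 equals 100 minus the percentile for z = +1.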
22
Q

Sir Francis Galton: CLT

- rule of thumb

A

central limit theorem: if x is the sum of many independent, identically distributed variables (of any shape, eg. uniform) with a finite, non-zero SD, then the distribution of x will approach gaussian

  • rule of thumb: the distribution of the sample mean will be approximately gaussian if n>30
  • even if the original data is skewed, the distribution of the avg will be approximately gaussian
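A small simulation sketch of the CLT in Python. It assumes strongly skewed raw data (exponential with mean 1 and SD 1, chosen only for illustration) and looks at means of samples of n = 30:

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score: 0 for symmetric data, positive for a right skew."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

random.seed(0)   # fixed seed so the sketch is repeatable
n = 30           # sample size from the rule of thumb
raw = [random.expovariate(1.0) for _ in range(2000 * n)]            # skewed raw scores
sample_means = [mean(raw[i:i + n]) for i in range(0, len(raw), n)]  # 2000 sample means

print("skew of raw scores:  ", round(skewness(raw), 2))
print("skew of sample means:", round(skewness(sample_means), 2))
print("SD of sample means:  ", round(pstdev(sample_means), 3),
      "vs sigma/sqrt(n) =", round(1 / n ** 0.5, 3))
```

The sample means should come out far less skewed than the raw scores, with an SD close to sigma/sqrt(n).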
23
Q

effect size

- small, med, large

A

describes relationship among variables in terms of size/amt/strength >> descriptive >> shows extent to which results are meaningful

effect size based on Cohen’s d:

small: 0.2
med: 0.5
large: 0.8

24
Q

purposes of inferential stats (2)

A

parameter estimation: estimate value of population parameter based on random sample

hypothesis testing: whether effect occurred by chance or not (probability)

25
Q

sampling distribution of Xi vs Xavg

- which has smaller SD?

A
Xi ~ N(u, sigma) 
Xavg ~ N(u, sigma/sqrt(n))
- N = normal(gaussian) distribution
- u is always the same (population)
- Xavg has smaller SD because it is divided by sqrt(n) >> gets smaller as sample size increases
26
Q

confidence intervals

  • inc n? sample mean?
  • 95%?
  • formula?
A

as n increases, our confidence in the sample mean increases (CI gets narrower)

  • sample mean gets closer to popn mean w increased sample size
  • typically CI is 95% >> a randomly drawn sample mean has a 95% chance of being within ~2 SE (sigma/sqrt(n)) of u, so centering that +/- 2 SE range on the sample mean gives an interval that has a 95% chance of containing u = CI
  • CI = Xavg +/- z*sigma/sqrt(n)
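The CI formula as a minimal Python sketch, assuming sigma is known and using z = 1.96 for 95%. The sample mean, sigma and sample sizes are hypothetical:

```python
from math import sqrt

def z_confidence_interval(x_avg, sigma, n, z=1.96):
    """Xavg +/- z * sigma / sqrt(n), returned as (lower, upper)."""
    margin = z * sigma / sqrt(n)
    return x_avg - margin, x_avg + margin

wide = z_confidence_interval(x_avg=100, sigma=15, n=25)
narrow = z_confidence_interval(x_avg=100, sigma=15, n=100)
print("n = 25: ", wide)
print("n = 100:", narrow)
```

Quadrupling n halves the width of the interval, since the margin scales with 1/sqrt(n).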
27
Q

finding z-score from CI

- 95% vs 99% CI z score difference?

A
  • pick CI (eg. 95%)
  • add in one of the tails (because 95% is the middle +/- 2 SDs, and the z-score table excludes only the upper tail) >> 1-0.95 = 0.05 (both tails) >> one tail = 0.025 >> 0.95 + 0.025 = 0.975
  • find 0.975 on z-score table >> find z score
  • the more confidence you want (eg. 95 vs 99%) the bigger your z score will be
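The same steps as a Python sketch, with `NormalDist.inv_cdf` standing in for reading the z-score table backwards. The function name is made up for illustration:

```python
from statistics import NormalDist

def z_for_confidence(level):
    """z that leaves (1 - level) / 2 in each tail, eg. level = 0.95 >> area 0.975."""
    one_tail = (1 - level) / 2
    return NormalDist().inv_cdf(1 - one_tail)

z_95 = z_for_confidence(0.95)
z_99 = z_for_confidence(0.99)
print(f"95% CI: z = {z_95:.3f}, 99% CI: z = {z_99:.3f}")
```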
28
Q

student’s t distribution

  • when do you use t?
  • is z or t wider?
  • effect of sample size?
  • df?
  • CI?
A
  • looks similar to gaussian (symmetric, bell-shaped, extends to +/- inf) but has heavier tails (not gaussian!)
  • used when you don’t have the population SD (sigma), and you only have a sample SD
  • in general, t will have a wider distribution because it is less exact >> more uncertainty
  • unlike z, t distribution depends on sample size: as you increase n, t gets closer to gaussian (approximates normal distribution)
  • defined by degrees of freedom (n-1)

CI = Xavg +/- t*s/sqrt(n)

s/sqrt(n) is standard error of the sample mean!
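A minimal Python sketch of the t-based CI. The critical t is passed in as a number read from a t table (2.262 for df = 9, 95%, two tails) rather than computed, and the sample is invented for illustration:

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(sample, t_crit):
    """Xavg +/- t * s / sqrt(n), with s the sample SD (n - 1 in the denominator)."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    x_avg = mean(sample)
    return x_avg - t_crit * standard_error, x_avg + t_crit * standard_error

sample = [4, 5, 6, 5, 4, 6, 5, 5, 6, 4]   # hypothetical data, n = 10 >> df = 9
low, high = t_confidence_interval(sample, t_crit=2.262)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Swapping in z = 1.96 for the same data would give a slightly narrower interval, which is the "t is wider" point from this card.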

29
Q

finding t score from CI

- what does increasing sample size do?

A
  • t table is indexed by a single tail’s probability while z table gives everything other than one tail >> find the value of one tail based on CI (eg. 95% >> one tail = 0.025)
  • find degrees of freedom (n-1)
  • find corresponding t value
  • increasing df (inc sample size) will decrease t value >> approximates gaussian (eg. as df approaches inf, t approaches 1.96, which is the z score for gaussian 95% CI)
30
Q

error bars

  • narrow vs wide?
  • common error bars?
A

narrow error bar = increased confidence
wide error bar = lots of noise, less confidence
Common error bars:
- range
- SD >> will not change, no matter the size of the sample, if it represents the population SD
- SE >> most common in BNS (SD/sqrt(n)) >> decreases as n increases
- CI >> recommended >> like a stat test >> Xavg +/- t*SE >> based on t and SE, so increasing sample size, which affects both of these, will decrease CI by a lot.

31
Q

H0 vs H1

A

H0: null hypothesis
H1: research hypothesis

32
Q

p > a vs. p < a

A

p > a: retain H0 >> fail to reject >> results are not significant
p < a: reject H0 >> results are significant

33
Q

probability density graph for p value vs alpha level

A
  • if value (observation) is greater than criterion, reject H0 >> greater obs value = less area under curve = smaller p value
  • area under the curve beyond the observation line = p value
  • area under the curve beyond the criterion = alpha level (typically 0.05 >> cannot be greater/more liberal)
    - criterion will change based on alpha
34
Q

Standard error

- t value?

A
  • if you used sample SD and not population SD, you will have a t distribution >> SE is the estimated standard dev of the sampling distribution of the sample mean Xavg
  • since Xavg dist is approximately gaussian, SE can be used to approx CI (eg. Xavg +/- 2 SE >> ~95% CI)
  • t value = (Xavg - u)/SE
  • find t value of data (obs) and compare to tcrit from table
  • if t is bigger than tcrit then you reject H0
    - t distribution is for the null hypothesis >> big t is unlikely, therefore we would reject H0
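The steps above as a short Python sketch. The data and the H0 value of 4.5 are hypothetical, and tcrit = 2.262 is the two-tailed table value for alpha = 0.05 with df = 9:

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (Xavg - u) / SE, with SE = s / sqrt(n)."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - mu0) / standard_error

sample = [4, 5, 6, 5, 4, 6, 5, 5, 6, 4]   # hypothetical data, n = 10 >> df = 9
t_obs = one_sample_t(sample, mu0=4.5)
T_CRIT = 2.262                            # from a t table, not computed here
reject_h0 = abs(t_obs) > T_CRIT
print(f"t = {t_obs:.3f}, reject H0: {reject_h0}")
```

Here the observed t falls short of tcrit, so H0 is retained.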
35
Q

one sample t test:

  • 1 tail
  • 2 tail
  • is it harder to reject 1 tail or 2 tail?
A

1 tail: directionality specified, all of probability a is in one tail only

two tails: directionality not specified (can be bigger or smaller than value you are comparing to); must divide a by 2 because it can be either of the tails

  • harder to reject 2 tails because there is possibility for error on either side, so tcrit (dependent on alpha) will be further from the mean; less likely to get data that is further from the mean
36
Q

95% CI from t value (2 tails)

A
  • if you’re finding 95% CI based on t, you must use the NON-DIRECTIONAL t value because it takes both tails into account
  • a 2 tailed one sample t test is equivalent to asking if the H0 value is within the 95% CI
  • CI = Xavg +/- t*SE >> calculate the interval
37
Q

does failing to reject the H0 prove it’s correct?

A

NO. It only shows that the data is consistent w the H0. Study could just be underpowered and not show the effect because it couldn’t detect the difference

38
Q

power

  • correct decisions (2)
  • type 1 error
  • type 2 error
A

probability of making a correct decision of rejecting an incorrect H0 (assuming it is actually false)

  • correct decision:
    - H0 is incorrect and you reject H0 (hit)
    - H0 is correct and you retain H0 (correct reject)
  • type 1 error (a): H0 is correct and you reject H0 (false alarm); occurs when you get a large t value just by chance
  • type 2 error (B): H0 is incorrect and you retain H0 (miss)
    - B depends on alpha level (if a is too low it increases chance of type 2 error >> miss), sample size (inc sample size dec chance of type 2), and effect size (a small effect would go undetected >> inc chance of type 2 error >> depends on cohen’s d)

(reject H0 = yes, retain H0 = no; H0 incorrect = signal present, H0 correct = signal absent)

  • Power = 1-B
39
Q

How can we increase power? (3)

A
  1. increase P(type 1 error) >> would increase false alarm rate, and misses would go down
    - not helpful: by convention type 1 error (a) cannot exceed 5%
  2. increase separation between the sample mean and the population mean (H0 value)
    - not helpful: this is not possible. you can’t change the data
  3. increase the sample size
    - this is helpful! increasing n will decrease SE (spread) >> tcrit is moved closer to the population mean >> easier to reject the H0 >> increases power

*even if you reject H0, you are not proving H1 correct >> other hypotheses that align w data could also be correct
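A rough Python sketch of how the three levers move power. To keep it simple it assumes a one-tailed z test with known sigma (not the t test from the cards), with the effect expressed as Cohen's d; the function name and example numbers are illustrative:

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_one_tailed_z(effect_size_d, n, alpha=0.05):
    """Power = 1 - B for a one-tailed z test when the true effect is d SDs."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)   # criterion under H0
    shift = effect_size_d * sqrt(n)          # how far the H1 curve sits from H0, in SE units
    return 1 - STD_NORMAL.cdf(z_crit - shift)

base = power_one_tailed_z(0.5, n=16)
print(f"baseline (d = 0.5, n = 16, a = 0.05): {base:.3f}")
print(f"1. bigger alpha (a = 0.10):           {power_one_tailed_z(0.5, 16, alpha=0.10):.3f}")
print(f"2. bigger separation (d = 0.8):       {power_one_tailed_z(0.8, 16):.3f}")
print(f"3. bigger sample (n = 64):            {power_one_tailed_z(0.5, 64):.3f}")
```

All three raise power in this sketch, but only the sample size is something the researcher can freely change.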

40
Q

Choice of statistical test

  • scale of measurement: ratio/interval or categorical
  • # of groups/levels of IV: 1, 2, 3+
  • experimental design: within or between subjects
A
  • scale of measurement
    1. ratio/interval: t test/ANOVA
    2. categorical: chi-square
  • # of groups/levels of IV
    1. 1 group = one sample t test
    2. 2 groups = 2 sample t test
    3. 3+ groups = ANOVA
  • experimental design
    1. within sub: paired t test, repeated measures ANOVA
    2. between sub: 2 sample t test, one way ANOVA