L5 - Critical thinking about statistical inference Flashcards
List of things from the last block/lectures we need to know, so if you don’t remember them, revise them
For some I included an answer in brackets as well, where the book described it nicely in a few words
- Null hypothesis and alternative hypothesis, the difference between the two (the null is the one most costly to reject falsely); formulating these refers to population properties, not sample properties
- t-statistic, probabilities (to calculate any probability we need a collective, which can be constructed by assuming H0, imagining an infinite number of experiments and calculating t each time; each t is a single event of the collective)
↪ t-distribution (the distribution of the infinite number of ts in the collective)
- p-value and α (they are objective probabilities, i.e. relative long-run frequencies)
↪ Neither α nor p tells us how probable the null hypothesis is (they are not P(H|D))
- β, power (1-β; P(reject H0|H0 false))
- sensitivity, specificity
- Type I error (we will make this error in α proportion of our decisions; P(rejecting H0|H0 true)), Type II error (P(accepting H0|H0 false))
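A minimal simulation sketch (my own illustration, not from the lecture) of the collective idea: assume H0, run many imaginary experiments, compute t each time, and check that the long-run proportion of |t| beyond the critical value is about α. The sample size, seed and number of simulated experiments are arbitrary assumptions.

```python
# Construct the "collective" under H0: many experiments, one t per experiment.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20                      # participants per simulated experiment
n_experiments = 100_000     # stand-in for the "infinite" collective

# Under H0 the population mean is 0; draw data and compute a one-sample t each time
data = rng.normal(loc=0.0, scale=1.0, size=(n_experiments, n))
t_values = data.mean(axis=1) / (data.std(axis=1, ddof=1) / np.sqrt(n))

# The long-run proportion of |t| beyond the critical value approximates alpha
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
print("Proportion of |t| > t_crit under H0:", np.mean(np.abs(t_values) > t_crit))
# ~0.05: alpha is a property of the collective of decisions, not of any single test
```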
Look at picture 7
Consider the statements about the study and say whether they are true or not
All of them are incorrect. Throughout the flashcards it should become clear why that is the case. We’ll also revisit them at the end and explain why they are incorrect
What is important to remember about the alpha level and the p-value when interpreting statistical results?
A common misunderstanding is that the p-value and alpha level say something about the probability of the null hypothesis; they do not (and they say nothing about the alternative hypothesis at all).
This misunderstanding of the definition of the p-value leads people to draw fallacious conclusions about their results
What does the misinterpretation of the p-value help explain?
It helps explain the motivation of people to obtain significant results
- Explains why peer-review process is often focused on checking whether the results were significant instead of focusing on the content of the paper
- Explains why issues with reproducibility occur (e.g. rounding a p-value of 0.051 down to 0.05)
What are the different statements people use to report non-significant results (p>0.05) as almost significant?
Not important to remember them; they are just examples of the lengths people go to in order to mislead the reader into thinking the results are significant
- a certain trend toward significance (p=0.08)
- approached the borderline of significance (p=0.07)
- just very slightly missed the significance level (p=0.086)
- near-marginal significance (p=0.18)
- only slightly non-significant (p=0.0738)
- provisionally significant (p=0.073)
- quasi-significant (p=0.09)
What is the analogy of the conflict that goes on in researchers’ heads when they find non-significant results?
It’s a silly example, no need to remember, he included it in the lecture more for fun than for actual learning
Picture 1
What is the point of using p-value when it forces people to seek significant results at all costs?
Playing the devil’s advocate: how likely is (at least) this statistic if there were no difference in the population?
- What if I’m not measuring a systematic difference in the population, but just random variation? → Is the difference to be expected if there is nothing else going on but, for example, random sampling?
If there were actually nothing going on, the probability of finding this result (or a more extreme one) would not be that high
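A small hedged example (my own) of that devil’s advocate question: given an observed t-statistic (the value and degrees of freedom below are made up for illustration), how probable is a result at least this extreme if nothing is going on?

```python
# Probability of "this statistic or more extreme" under H0 for a hypothetical t and df
from scipy import stats

t_observed, df = 2.064, 24                      # hypothetical values for illustration
p_value = 2 * stats.t.sf(abs(t_observed), df)   # two-sided: this result or more extreme
print(round(p_value, 3))
# ≈ 0.05: a result this extreme would occur only about 5% of the time if H0 were true
```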
How did p-value come about? What did Fisher propose?
Significance testing!
- Formulate H0: the hypothesis to be ‘nullified’
- Report the exact level of significance (p-value), without further discussion about accepting or rejecting hypotheses (for the reader to decide how they want to interpret this value)
- Only do this if you know almost nothing about the subject
↪ ‘A scientific fact should be regarded as experimentally established only if a properly designed experiment rarely fails to give this level of significance’
What did Neyman & Pearson suggest as an alternative to Fisher’s approach?
They thought Fisher’s approach was less useful, as no clear alternative hypothesis is specified
Hypothesis testing!
- Formulate two statistical hypotheses, determine alpha, beta & sample size for the experiment, in a deliberate way (expected x value) before collecting the data
- If the data falls in the rejection region of H1, accept H2. This does not mean that you believe H2 is true, only that you behave as if it were
- Only use this procedure if there is a clear disjunction & if a cost-benefit assessment is possible
So basically, we’re setting behavioural rules: even though we don’t know whether the H0 is true or not, we won’t be wrong very often if it is true and we won’t be wrong very often if it is false.
We can also put this in a frequency tree (picture 2)
Since the two approaches didn’t agree with each other, what did we end up with? What is the issue with this?
We ended up with the null ritual
- Set up a statistical null hypothesis of “no mean difference” or “zero correlation.” Don’t specify the predictions of your research hypothesis or of any alternative substantive hypotheses
- Use 5% as a convention for rejecting the null. If significant, accept your research hypothesis
- Always perform this procedure
This approach also introduces many fallacies - we will discuss these in the next block
What are 4 fallacies in statistical inference?
- P-values equal the probability that the (null) hypothesis is true
- Alpha equals the probability of making an error
- Failing to reject H0 is evidence for H0
- Power is irrelevant when results are significant
P-values equal the probability that the (null) hypothesis is true
Which kind of probability do statements involving alpha, power and p-values relate to?
Statements in which alpha, power and p-values occur relate to:
1. Frequentist or objective probabilities
2. Conditional probabilities
1. Frequentist probability
What is subjective probability?
Probability is the degree of belief that something is the case in the world
- This expresses a degree of uncertainty: e.g. how sure are you that you have chosen the right answer to an MC question?
1. Frequentist probability
What is objective probability?
Probability is the extent to which something IS the case in the world
- These probabilities exist independently of our states of knowledge
- This, for example, expresses the relative frequency in the long run: e.g. an infinite number of coin tosses (a reference class or collective)
- Probabilities need to be discovered by examining the world, not by reflecting on what we know or how much we believe
What is a reference class or collective?
The hypothetical infinite set of events; the long-run relative frequency is a property of the collective as a whole, not of any single event
- It might be the set of all potential tosses of a coin using a certain tossing mechanism → a single toss of the coin (a singular event) doesn’t have a probability; only the collective of tosses has one
1. Frequentist probability
What is the reason, according to frequentist probability, why we cannot infer the probability of the null hypothesis from the p-value?
The null hypothesis is either true or it isn’t, just as a single event either occurs or doesn’t
- A hypothesis is not a collective, hence it does not have an objective probability
- With p-values (Fisher) and the Neyman-Pearson paradigm we talk about objective probability
1. Frequentist probability
What does the probability in the Neyman-Pearson paradigm apply to?
The long-term error rates apply to our behaviour, not to the hypothesis itself: they concern whether we decide to reject it or not, and how often we will be wrong over a long run of such decisions
➡ Objective probability
2. Conditional probabilities
What is the difference between P(D|H) and P(H|D)?
P(D|H) = probability of obtaining some data given a hypothesis
E.g. P(‘getting 5 threes in 25 rolls of a die’|’I have a fair die’)
- For this probability we can set up a relevant collective consisting of an infinite number of events: throwing a fair die 25 times and counting the number of threes
- We can determine the proportion of such events in which the number of threes is 5 = a probability we can calculate
But we cannot calculate P(H|D) (e.g. the probability that the hypothesis that I have a fair die is true, given I obtained 5 threes in 25 rolls) because there is no collective; the hypothesis is simply true or not
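A brief sketch (my own illustration) of how such a P(D|H) can actually be computed for the die example, using the binomial distribution:

```python
# P(D|H) for the die example: probability of the data, given the die is fair
from scipy import stats

p_exact = stats.binom.pmf(5, n=25, p=1/6)     # exactly 5 threes in 25 rolls
p_at_least = stats.binom.sf(4, n=25, p=1/6)   # 5 or more threes ("this or more extreme")
print(f"P(exactly 5 threes | fair die)  = {p_exact:.3f}")
print(f"P(5 or more threes | fair die) = {p_at_least:.3f}")
# There is no analogous calculation for P(fair die | 5 threes): the hypothesis is not a collective
```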
2. Conditional probabilities
Why, if we know P(D|H), do we not know P(H|D)?
- The inverse conditional probabilities can have very different values (example in the next flashcard)
- It is meaningless to assign objective probability to a hypothesis
2. Conditional probabilities
An example that shows that we can’t just reverse conditional probabilities
P(‘dying within two years’|’head bitten off by a shark’) = 1
P(‘head was bitten off by a shark’|’died in the last two years’) ≈ 0
2. Conditional probabilities
What is the reason, according to conditional probabilities, why we cannot infer the probability of the null hypothesis from the p-value?
The p-value is a conditional probability, P(this data or more extreme|H0); the null hypothesis is not a collective, so its probability cannot be obtained by simply inverting that conditional
What is a counterexample?
Using the same structure of the argument but with different terms to make it clear that the argument doesn’t hold
2. Conditional probabilities
How can we use counterexamples to show that we can’t invert conditional probabilities, i.e. that we can’t conclude from a low p-value that H0 is false and the alternative hypothesis is true?
Long flashcard but bear with me, it makes sense
P1) If H0 is true, probably not this data
P2) This data
C) H0 is not true
Counterexample:
P1) If someone is a Dutch national, (s)he probably doesn’t live in Amsterdam
P2) Sjinkus lives in Amsterdam
C) Sjinkus is not a Dutch national
This argument is not valid (it deals only in probabilities, not certainties) and not forceful (the premises do not provide strong enough support for the conclusion)
- P(AMS|Dutch national) = 0.05
- P(AMS|non-Dutch national, i.e. all the other people in the world) = 0.0001
↪ The latter is an extremely small probability, so we can’t conclude that Sjinkus is not a Dutch national just because he lives in Amsterdam
- We have to compare the two probabilities to draw valid conclusions, but with p-values we only look at the null hypothesis and say nothing about the likelihood of the alternative hypothesis
- So living in AMS is much more likely assuming that someone is a Dutch national than assuming that someone is not a Dutch national
↪ A hypothesis only looks likely or unlikely when compared to a meaningful alternative hypothesis
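A tiny sketch (my own, reusing the flashcard’s illustrative numbers) of what comparing the two conditional probabilities looks like:

```python
# Compare P(D|H) under both hypotheses instead of looking at one in isolation
p_ams_given_dutch = 0.05        # P(lives in AMS | Dutch national), from the flashcard
p_ams_given_not_dutch = 0.0001  # P(lives in AMS | not a Dutch national), from the flashcard

likelihood_ratio = p_ams_given_dutch / p_ams_given_not_dutch
print(f"Living in Amsterdam is {likelihood_ratio:.0f}x more likely if Sjinkus IS a Dutch national")
# Looking only at P(AMS | Dutch) = 0.05 (the "p-value" of the argument) and rejecting
# "Dutch national" would get the inference exactly backwards
```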
2. Conditional probabilities
What is a third way of understanding why we cannot infer that the null is false just from obtaining a p-value lower than alpha?
In an ideal (unrealistic) world, we know the base rate of our null hypothesis, the sensitivity (power) and the specificity
But in the real world we don’t know the base rate (picture 3); we can, however, go through a thought experiment in which we assign a base rate based on knowledge from already conducted studies
2. Conditional probabilities
Demonstrating the thought process with an example
Picture 4
We have 1000 hypotheses; for 100 of them the null turns out to be true and for 900 the null turns out to be false. That’s our base rate: the probability of drawing one hypothesis at random and it being true or false
- That is an objective interpretation: the base rate of the null hypothesis being true is the proportion of true null hypotheses versus false null hypotheses
We give a value to sensitivity (0.8 is common in psychology) and specificity (0.95 is common in psych.) and calculate the probability of a real effect given that we rejected the null hypothesis
Ex1: Base rate (of a real effect) = 0.9
P(real effect|reject H0) = 0.99
The sensitivity and specificity remain constant
Ex2: Base rate (of a real effect) = 0.1
P(real effect|reject H0) = 0.64
The probability changed depending on the base rate.
P(real effect|reject H0) > 0.5, so should the argument be considered forceful? No, because the value also depends on sensitivity and specificity, so if these change, the probability changes as well (picture 5)
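A minimal sketch (my own) that reproduces the flashcard’s numbers for the thought experiment, computing P(real effect|reject H0) from the base rate, sensitivity and specificity:

```python
# P(real effect | reject H0) from base rate, sensitivity (power) and specificity
def p_real_effect_given_rejection(base_rate, sensitivity=0.8, specificity=0.95):
    """base_rate = proportion of tested hypotheses for which a real effect exists."""
    true_positives = sensitivity * base_rate                # real effect, H0 rejected
    false_positives = (1 - specificity) * (1 - base_rate)   # no effect, H0 still rejected
    return true_positives / (true_positives + false_positives)

print(round(p_real_effect_given_rejection(0.9), 2))  # 0.99 (Ex1)
print(round(p_real_effect_given_rejection(0.1), 2))  # 0.64 (Ex2)
```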
2. Conditional probabilities
So why is the p-value ≠ the probability that the null hypothesis is true, according to this thought experiment?
Because rejection of the null is based only on the specificity, while inverting the conditional probability requires sensitivity, specificity and the base rate
So in this argument we only have the specificity, not the sensitivity and the base rate:
P1) If H0 is true, probably not this data
P2) This data
C) H0 is not true
Alpha equals the probability of making an error
Saying that the probability of making an error is alpha is not correct, because we don’t know whether the null hypothesis is true or not, and alpha assumes the null is true
- It doesn’t say anything about Type II error (which looks at the probability of making an error when H0 is false)
If you’ve rejected H0 at alpha = 0.05, the probability that you’ve made an error is 5% - why is this also a fallacy?
Look at picture 6
When we talk about rejecting H0, we look at the circled part of the frequency trees, i.e. all rejections, not just rejections made when H0 is true
Failing to reject H0 is evidence for H0
Same as saying a non-significant result means that the H0 is true
Why did Neyman and Pearson introduce power in their analysis?
So that they could say something about the sensitivity (power) of their analysis
- Because with a large enough amount of data even a tiny effect will eventually reach significance; that’s why you want to look at the power of the test, so the conclusion reflects the effect itself and not just the number of observations
A strict application of their logic means setting the risks of both Type I and II errors (α and β) in advance before collecting the data
How do we control β, which we determine before collecting the data?
- Estimate the effect size we’re interested in
- Estimate the data variance
↪ Do this based on knowledge from past studies about the same concept or do a pilot study
Once these two are determined, a power table can tell us how many participants we need to keep β at our predetermined level
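A rough simulation sketch (my own, not a recipe from the lecture) of how an assumed effect size and variance translate into power for a given sample size; the effect size, SD and sample sizes below are arbitrary assumptions:

```python
# Simulation-based power: proportion of experiments (with a real effect) that reject H0
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulated_power(n_per_group, effect=0.5, sd=1.0, alpha=0.05, n_sims=10_000):
    """Proportion of simulated two-group experiments in which H0 is rejected."""
    rejections = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, sd, n_per_group)
        b = rng.normal(effect, sd, n_per_group)   # a real difference exists
        if stats.ttest_ind(a, b).pvalue < alpha:
            rejections += 1
    return rejections / n_sims

for n in (20, 64, 100):
    print(n, round(simulated_power(n), 2))  # power reaches ~0.80 around n = 64 per group for d = 0.5
```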
What is the difference between absence of evidence and evidence of absence? And how does it explain the fallacy of failing to reject H0 is evidence for H0
Absence of evidence - the experiment did not yield a conclusive result, perhaps because too few observations were taken
Evidence of absence - the experiment did yield a conclusive result, but it favours the null hypothesis
The p-value cannot discriminate between the two, even though evidence of absence offers much stronger support for H0
What is the invalid argumentation logic behind replication and power?
P1) Study 1 finds an effect of size X with Z participants
P2) Study 2 is a direct replication of 1 with Z participants
C) Study 2 is sufficiently powered
What should we do to make the replication study sufficiently powered to detect the effect as well?
We should increase the number of participants in the second experiment to account for inflated significant results, sampling variability, subtle contextual differences, publication bias and regression to the mean
Difference between power and sensitivity?
Power - the probability of correctly rejecting the null hypothesis when it is false (i.e. avoiding a Type II error); the ability of the test to detect an effect if there is one
Sensitivity - the ability of a test to correctly identify true positives
- Power applies more broadly to hypothesis testing whereas sensitivity relates to the performance of a test
- In hypothesis testing, power is analogous to sensitivity in that they both refer to correctly identifying true positives, but they are used in slightly different contexts
Why is checking for significance also an issue for replications?
- Because of a lack of sensitivity: underpowered studies make for inconclusive replication attempts (49% of replications were inconclusive but are often reported as conclusive failures to replicate)
- Because of lack of differentiation: is the found effect in the replication meaningfully different from the original?
What is the invalid argumentation logic behind the fallacy of failing to reject H0 is evidence for H0
P1) Manipulation X has an effect
P2) There’s no significant difference between conditions in the degree to which participants noticed manipulation X
C )The effect of manipulation X was not noticed
If we dare to claim that there is no effect, we have to report the sensitivity of our test (the higher the sensitivity, the higher the power)
Power is irrelevant when results are significant
What is the (invalid) argumentation logic behind the fourth fallacy?
P1) P<.05
C) I have found an effect
(P2) When I have found an effect, it is no longer relevant what the probability is of finding an effect if H0 is not true
C2) Power is not relevant
Why should we report the effect size as well as the significance?
The informativeness of rejecting the null is affected by sample size and by power
↪ When power or sample size increase, so does the P (Real effect|reject H0) and vice versa
- That’s the difference between making inferences about a hypothesis being true/false and deciding on a course of action (rejecting the null with an error rate that is controlled in the long run) → we didn’t show an effect, we rejected the null
- Very small or unimportant effects will be statistically significant if sufficiently large amounts of data are collected, and very large and important effects will be missed if the sample size is too small
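A quick illustrative simulation (my own; effect sizes and sample sizes are arbitrary assumptions) of that last point: with enough data a tiny effect becomes "significant", while a large effect can be missed with a small sample.

```python
# Tiny effect + huge sample vs large effect + small sample
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

tiny_effect, huge_n = 0.02, 200_000
large_effect, small_n = 0.8, 8

a = rng.normal(0.0, 1.0, huge_n)
b = rng.normal(tiny_effect, 1.0, huge_n)
print("tiny effect, huge n:   p =", round(stats.ttest_ind(a, b).pvalue, 4))   # almost certainly < .05

c = rng.normal(0.0, 1.0, small_n)
d = rng.normal(large_effect, 1.0, small_n)
print("large effect, small n: p =", round(stats.ttest_ind(c, d).pvalue, 4))   # often > .05
```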
Remember the statements from flashcard 3 (for the statements look at picture 6)
Why is each incorrect?
(1) and (3) - stats never allow for absolute proof or disproof
(2) and (4) - refer to the probability of hypotheses which cannot be correct since objective probability refers to a collective of events, not the truth of a hypothesis
(5) - refers to the probability of a single event (this particular decision) being correct, which cannot be an objective probability since objective probability does not apply to single events
(6) - description of power, not significance
What are stopping rules?
Rules that define the conditions under which you will stop collecting data for a study
- They should be defined beforehand in your sampling plan: how many participants you will run
What are the different stopping rules and what are the issues with them?
- Run a first batch of participants and, if significance hasn’t been reached, run additional participants → we are now doing two different significance tests, inflating the α level
- Keep running until the test is significant → even if H0 is true, you will eventually obtain a ‘significant’ result if you collect data for long enough
↪ Although this rule has a power of 1, it also has an α of 1!
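A small simulation sketch (my own, not from the lecture) of the first stopping rule: peek after a first batch and add a second batch only if the result is not yet significant. Batch sizes and the number of simulations are arbitrary assumptions; the realised Type I error rate comes out clearly above the nominal 0.05.

```python
# Optional stopping: test after n1, add n2 more if not significant, test again
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alpha, n1, n2, n_sims = 0.05, 20, 20, 20_000
false_positives = 0

for _ in range(n_sims):
    # H0 is true: both groups come from the same population
    a, b = rng.normal(0, 1, n1), rng.normal(0, 1, n1)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1
        continue
    # Not significant yet, so "run additional participants" and test again
    a = np.concatenate([a, rng.normal(0, 1, n2)])
    b = np.concatenate([b, rng.normal(0, 1, n2)])
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print("Realised Type I error rate:", false_positives / n_sims)  # noticeably above 0.05
```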
How does Neyman-Pearson approach solve the issues with the stopping rules?
They came up with a standard stopping rule: use power calculations in advance to determine how many participants are needed, i.e. determine the sampling plan before data collection
- Both α and β can then be controlled at known, acceptable levels
Why is inflation of alpha level a problem if we conduct two t-tests?
If we conduct one t-test, the probability that it is significant by chance alone is 0.05 if we test at the 0.05 level
If we conduct two t-tests, the probability that at least 1 is significant by chance alone is slightly less than 0.10
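A one-line check (my own) of the "slightly less than 0.10" claim, using the probability that at least one of k independent tests is significant by chance alone:

```python
# Familywise error rate for k independent tests, each at alpha = 0.05
alpha = 0.05
for k in (1, 2, 5, 10):
    print(k, round(1 - (1 - alpha) ** k, 4))  # k = 2 gives 0.0975, just under 0.10
```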
What can we do to control Type I error in two t-tests?
In order to control Type I error, if we perform a number of tests, we need to test each one at a stricter level of significance in order to keep the overall alpha level at 0.05
- This is done by applying a correction, e.g. Bonferroni: conduct each individual test at the 0.05/k level of significance (where k is the number of hypotheses/comparisons); the overall alpha will then be no higher than 0.05
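A minimal illustration (my own) of the Bonferroni idea: with k comparisons, testing each at 0.05/k keeps the overall (familywise) alpha at or below 0.05 (the familywise calculation below assumes independent tests).

```python
# Bonferroni correction: per-test threshold of alpha / k
alpha, k = 0.05, 4
per_test_alpha = alpha / k
familywise = 1 - (1 - per_test_alpha) ** k   # for k independent tests
print(f"Test each hypothesis at {per_test_alpha:.4f}; familywise alpha ≈ {familywise:.4f}")
# 0.0125 per test; familywise ≈ 0.0491, i.e. no higher than 0.05
```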