Module 3 Flashcards

Question

How do we decide what counts as a small p-value and what counts as a large one?

Answer 1

By setting the significance level of the test. Where to set this level is up to the manager or test user. Although the choice belongs to the user, standard practice is to set it at 10%, 5%, or 1%.

Answer 2

The p-value is considered small if it is below the significance level alpha. This leads to rejecting H0 in favor of H1.

Answer 3

Large p-values are those that exceed alpha and therefore do not allow H0 to be rejected.
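The decision rule described in Answers 2 and 3 can be sketched in Python. The function name `decide` and the default `alpha = 0.05` are illustrative assumptions, not part of the course material, and this sketch treats a p-value exactly equal to alpha as not small.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the significance level alpha (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Small p-value (below alpha): reject H0 in favor of H1.
    if p_value < alpha:
        return "reject H0"
    # Large p-value: the data do not allow H0 to be rejected.
    return "do not reject H0"


print(decide(0.03, alpha=0.05))
print(decide(0.03, alpha=0.01))
```

The same p-value can lead to different decisions depending on the alpha chosen by the test user.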

Answer 4

Type I Error (False Positive): Acting on a change that doesn’t truly improve the experience. In UX, this might mean deploying an ineffective design. Formally: H0 is rejected although it is true.

Answer 5

Type II Error (False Negative): Missing an opportunity to make a beneficial change. In UX, this might mean sticking with a suboptimal design. Formally: H0 is not rejected although H1 is true.
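The two error types can be summarized as a small lookup on what is true in the population and what the test decided. This is a minimal sketch; `classify_outcome` is a hypothetical helper, and it assumes that exactly one of H0 and H1 is true.

```python
def classify_outcome(h0_is_true: bool, h0_rejected: bool) -> str:
    """Label a test decision given the (normally unknown) truth about H0."""
    if h0_is_true and h0_rejected:
        return "type I error"       # false positive
    if not h0_is_true and not h0_rejected:
        return "type II error"      # false negative
    return "correct decision"


# Enumerate the four possible combinations of truth and decision.
for h0_is_true in (True, False):
    for h0_rejected in (True, False):
        print(h0_is_true, h0_rejected, classify_outcome(h0_is_true, h0_rejected))
```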

Answer 6

For the calculation of the p-value, we assume the scenario where H0 is true in the population.

Answer 7

A type I error can occur only if we reject H0. The decision to reject H0 is made only if the observed p-value is less than alpha. Thus, the p-value is lower than alpha whenever a type I error occurs.

Answer 8

The rejection region of the null hypothesis (H0) is the range of test statistic values that lead to rejecting H0. It plays a critical role in deciding whether observed UX data provides enough evidence to favor the alternative hypothesis H1.

Answer 9

H0 typically represents the assumption that there is **no effect** or **no difference**.

**Examples in UX:**

* There is no difference in conversion rates between two designs.
* A new navigation menu does not reduce task completion time.

Answer 10

H1 represents what you're trying to demonstrate (e.g., **that there is an effect** or **a difference**).

**Examples in UX:**

* The new design increases conversions.
* The navigation menu reduces task completion time.

Answer 11

Defining the Rejection Region: the rejection region is based on the significance level (α), the threshold for how much risk of a Type I error (rejecting H0 when it is true) you are willing to accept. Common values are α = 0.05 (5%), α = 0.01 (1%), and α = 0.10 (10%). Example: at α = 0.05, you are willing to accept a 5% chance of incorrectly rejecting H0.
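To make the link between α and the rejection region concrete, here is a minimal sketch for a z-test, that is, a test whose statistic follows a standard normal distribution under H0. The choice of a z-test and the helper names `critical_value` and `in_rejection_region` are assumptions made for illustration; the cards do not specify a particular test.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def critical_value(alpha: float, two_sided: bool = True) -> float:
    """Boundary of the rejection region for a z-test at level alpha."""
    tail = alpha / 2 if two_sided else alpha
    return STANDARD_NORMAL.inv_cdf(1 - tail)


def in_rejection_region(z: float, alpha: float, two_sided: bool = True) -> bool:
    """True if the observed statistic z leads to rejecting H0."""
    c = critical_value(alpha, two_sided)
    return abs(z) > c if two_sided else z > c


# A smaller alpha pushes the boundary outward and shrinks the rejection region.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(critical_value(alpha), 3))
```

In the one-sided case this sketch only covers an upper-tail alternative (H1 states that the parameter is larger than under H0).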

Answer 12

A lower risk of type I error is associated with an increased risk of type II error.

Answer 13

A hypothesis test is a statistical procedure that helps us choose between two opposing hypotheses about a population.

Answer 14

The p-value summarizes the information in the sample data that is relevant to the test, and the test decision is based on it.

Answer 15

The calculation of the p-value assumes that H0 is true in the population.

Answer 16

If the conclusion is weak, we keep H0 because there is not enough evidence to reject it.

Answer 17

Because we conclude in favor of H1 when the data provide statistical evidence against H0, we state the test's result by saying "we reject H0". To interpret it, we say something such as: **The observed data provide statistical evidence that H0 is false**, spelling out what H0 corresponds to in the specific application context. It would also be acceptable to mention statistical evidence that H1 is true.

Answer 18

Welch's test has certain validity conditions, notably the independence of the two samples used.

Answer 19

If we're unable to verify the validity conditions of a test, it may be because the test under consideration is not compatible with the structure of the data collected. In that case, another test might be more appropriate!

Answer 20

It would be inappropriate to look at the sampled data and then conduct a one-sided test whose hypotheses are dictated by what has been observed in the data. Indeed, the significance level of the test would be biased: it would no longer be the chosen alpha value.

Answer 21

When the test leans towards H1, it is a **strong** statement. Indeed, it means the data provide statistical evidence that H1 is true.

Answer 22

We will say "we reject the null hypothesis" or "we reject H0". ## Footnote In the report: "The data demonstrate that...".

We will say "we do not reject H0" or "we do not reject the null hypothesis" instead of "we accept the null hypothesis" or "we accept H0". As for the interpretation, it must be clear that the result is weak: in the report we will write "the data do not allow us to...", and so on.

Answer 23

Indeed, the maximum probability that the test makes a type I error is equal to the significance level alpha used.
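A small simulation can illustrate this bound. The sketch below repeatedly draws samples from a population in which H0 is true and counts how often a two-sided z-test rejects it; the observed rejection rate should be close to alpha. The setup (normal data with known standard deviation 1, `n = 30`, 4000 repetitions, a fixed seed) and the function name are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_one_error_rate(alpha: float = 0.05, n: int = 30,
                        repetitions: int = 4000, seed: int = 0) -> float:
    """Share of simulated tests that reject H0 although H0 is true."""
    rng = random.Random(seed)          # local generator, reproducible runs
    z_dist = NormalDist()
    rejections = 0
    for _ in range(repetitions):
        # H0 is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * sqrt(n)    # z statistic with known sigma = 1
        p_value = 2 * (1 - z_dist.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1            # every rejection is a type I error
    return rejections / repetitions


print(type_one_error_rate(alpha=0.05))
print(type_one_error_rate(alpha=0.10))
```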

Answer 24

By setting alpha to 1%, 5%, or 10%, the manager caps the risk of a type I error at that percentage.

Answer 25

Type II errors are not controlled directly, so we look at the power, which is defined as one minus the probability of making a type II error.

Answer 26

1. The test does not reject H0, and a type II error occurs.
2. The test rejects H0, which is the right decision.

Answer 27

The power is the complement of the probability of a type II error. It measures the test's ability to make the right decision, namely to detect that H1 is true when that is the case in the population.

Answer 28

* The significance level used
* The order of magnitude of the estimation error committed
* The size of the gap between the actual situation (what is true in the population) and what is assumed under H0
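These three factors can be seen together in the power formula of a one-sided z-test, used here only as a convenient illustration. In this sketch, `gap` is the distance between the true mean and the value assumed under H0, and `sigma / sqrt(n)` stands in for the size of the estimation error; the function name, the parameters, and the choice of a z-test with known standard deviation are all assumptions.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power(gap: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test of H0: mu = mu0 against H1: mu > mu0."""
    standard_error = sigma / sqrt(n)              # size of the estimation error
    z_crit = Z.inv_cdf(1 - alpha)                 # boundary of the rejection region
    beta = Z.cdf(z_crit - gap / standard_error)   # probability of a type II error
    return 1 - beta


print("baseline      ", round(power(0.5, 1.0, 25), 3))
print("larger alpha  ", round(power(0.5, 1.0, 25, alpha=0.10), 3))
print("smaller error ", round(power(0.5, 1.0, 100), 3))
print("larger gap    ", round(power(1.0, 1.0, 25), 3))
```

Each of the last three lines changes one factor relative to the baseline, and each change raises the power. When the gap is zero, the function returns alpha, which is the type I error risk.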

Answer 29

High power is associated with a low risk of type II error.

Answer 30

Lowering the risk of type II error means accepting a higher risk of type I error. Higher power and a higher risk of type I error go together.

Answer 31

To achieve higher power, the user of the test would need to allow a higher risk of type I error by choosing a larger alpha value. But accepting a higher risk of type I error is not very appealing: we would like to keep it as low as possible. The test user should therefore find another way to increase the power of the test if it is insufficient at the alpha level they are comfortable with.

Answer 32

If the estimation error tends to be small, the risk that the test will make an error is small and therefore the power is large.

Answer 33

Since the risk of type I error is controlled in a test by the significance level alpha, it is the type II error that is impacted by a high level of estimation error.

Answer 34

Many of these factors are beyond the control of the test user. For example, when comparing averages, the amount of variation in the variables under study, as measured by their standard deviation, plays a key role. The greater the variation in the variables, the more difficult it is to estimate the average accurately, which increases the risk of error in the test.

Answer 35

With larger sample sizes, the estimation error is likely to be smaller; hence the type II error risk is small and the power is large. Conversely, smaller samples are likely to have larger estimation errors, leading to a greater type II error risk and lower power.

Answer 36

The power measures the capacity of the test to correctly favor H1 when this hypothesis is true in the population. If there is a large gap between the real situation in the population and what is assumed in H0, then it is easy for a test to detect this discrepancy, and therefore the power is high. Conversely, if the gap between the actual situation in the population and what is assumed in H0 is small, then it is difficult for a test to detect this deviation, and therefore the power is small.

Answer 37

For the same significance level alpha and a given power, a small sample is enough to detect a large gap between reality and what is stated in H0, whereas a large sample is needed to detect a small gap with the same power.
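As a rough illustration of this relationship, the sketch below uses the standard sample-size formula for a one-sided z-test with known standard deviation, n = ((z_(1-alpha) + z_(power)) * sigma / gap)^2, rounded up. The test type, the function name, and the example values are assumptions, not taken from the course material.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(gap: float, sigma: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Smallest n giving the requested power in a one-sided z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # controls the type I error risk
    z_power = z.inv_cdf(power)       # controls the type II error risk
    return ceil(((z_alpha + z_power) * sigma / gap) ** 2)


# Same alpha and power, two different gaps (hypothetical values).
print("large gap:", required_sample_size(gap=0.5, sigma=1.0))
print("small gap:", required_sample_size(gap=0.1, sigma=1.0))
```

Dividing the gap by five multiplies the required sample size by roughly twenty-five, because the gap enters the formula squared.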

Answer 38

The ability of a test to detect that H0 is false grows as the sample size increases.

Answer 39

Type II error: this is a possible pitfall when the test does not reject H0, namely failing to detect that H1 is true. What is reassuring is that the probability of committing a type I error is small: it is at most equal to alpha.

Answer 40

* Sample Size
* Alpha (Significance Level)
* P-Value (Standard Deviation)
* Mean (Average)
* Graphical Representation

Answer 41

The parameter p is often used instead of µ to indicate a proportion. ## Footnote Note that this has no effect on the results, because the notation for the parameter is arbitrary and, furthermore, we have seen that a proportion is also an average anyway!
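The point that a proportion is also an average can be checked on 0/1 data. The conversion data below are hypothetical, with 1 meaning the visitor converted and 0 meaning they did not.

```python
from statistics import fmean

# Hypothetical outcomes for ten visitors: 1 = converted, 0 = did not convert.
conversions = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

proportion = sum(conversions) / len(conversions)  # share of 1s
average = fmean(conversions)                      # mean of the 0/1 values

print(proportion, average)
```

Both quantities are computed from the same sum divided by the same count, which is why a test on a proportion can be treated as a test on an average.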

Answer 42

α (Alpha)

(67 cards)