Module 3 Flashcards

1
Q

What is a Hypothesis test?

A

A hypothesis test is a statistical procedure that aims to decide whether a statement, called the null hypothesis, is plausible given the data of a sample, or whether it must be rejected.

2
Q

What are the 5 steps of the procedure for testing a hypothesis?

A
  1. Formulate the hypotheses
  2. Calculate the p-value
  3. Make a decision
  4. Ensure the test validity
  5. Appropriately interpret the result
3
Q

How are the hypotheses formulated?

A

There are two opposite statements, called H0 and H1, between which we will have to decide.

H0 is often called the null hypothesis, and H1 the alternative hypothesis.

Hypotheses are always statements regarding the population studied.

4
Q

Why is the p-value important?

A

The p-value is important because it is the measure on which the decision rule is based: it determines whether we decide in favor of H0 or H1.

5
Q

How do we assess the importance of the difference between the sample data and H0?

A

We compute a ratio. The numerator is the observed difference between the sample data and what is expected under H0. The denominator is the estimated standard deviation of that measured difference.

6
Q

Define a test statistic.

A

It is the standardized measure that allows us to judge the difference between the data and H0.

The test statistic is a measure that distinguishes between H0 and H1 and that is corrected for the estimation error.

The value of the test statistic is usually calculated using software.
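
As a rough illustration of the ratio described above, here is a minimal Python sketch for the case where H0 states a value for a population mean. The function name `standardized_distance` and the waiting-time data are hypothetical, chosen only for this example; other tests use other formulas.

```python
import math
import statistics


def standardized_distance(sample, mean_under_h0):
    """Observed difference from H0 divided by its estimated standard deviation."""
    n = len(sample)
    # Numerator: what we observed minus what H0 leads us to expect.
    observed_difference = statistics.mean(sample) - mean_under_h0
    # Denominator: estimated standard deviation of the sample mean
    # (the estimation error).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return observed_difference / standard_error


# Hypothetical data: waiting times in minutes, with H0 stating a mean of 5.
waiting_times = [4.1, 5.3, 3.8, 4.6, 4.9, 4.2, 5.1, 3.9]
statistic = standardized_distance(waiting_times, mean_under_h0=5.0)
print(round(statistic, 2))
```

A value far from zero means the sample is far from what H0 predicts, relative to the estimation error.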

7
Q

What does a large test statistic or distance suggest?

A

If the test statistic or distance is large, meaning that what is observed in the sample is far from what is expected under H0, it is a sign that H0 is probably false.

This suggests we conclude in favor of H1.

8
Q

What does a small test statistic or distance suggest?

A

If the distance is small, the data is compatible with H0 and it makes sense to keep the assumption.

9
Q

How can the values of a test statistic be described?

A

The values of the test statistic can be described by a probability distribution.

10
Q

On what basis is the deviation of the data from H0 judged?

A

The deviation of the data from H0 is judged on the basis of the probability distribution of the sample distance under H0.

11
Q

What does it likely mean if the distance, or test statistic, calculated from the observed data is situated in the tails of the distribution?

A

This means our data are unlikely under H0.

This justifies the conclusion in favor of H1.

12
Q

What does it mean if the test statistic is in the center of the curve?

A

If the test statistic that we calculate is in the center of the curve, this means our data are compatible with H0.

Therefore, there is no reason to reject the hypothesis.

13
Q

How do we universally interpret the test statistic?

A

The distribution differs from one test statistic to another, so there are no universal markers for interpreting the test statistic.
Fortunately, the test statistic can be converted to another measure that is universal: the p-value.

14
Q

What is the P-Value?

A

The p-value is a measure of the compatibility between the observed data and H0. It is a probability.

Unlike the test statistic, the p-value is interpreted in exactly the same way for all statistical tests.

The p-value is usually calculated by software.

15
Q

How is the P-Value calculated?

A

The p-value is usually calculated by software.

To calculate it, we assume H0 is true for the population.

The p-value is defined as the probability of observing data that are as far from, or further from, what is expected under H0 than what we actually observed, should the data collection be repeated and the same test procedure applied.

Hence, in the bank example, it is the probability that newly collected data would be at least as far from H0 as the measured distance of -2.31.
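
The "repeat the data collection" idea in this definition can be mimicked by simulation. The sketch below is only an illustration under strong assumptions that are not in the course material: a normal population with known standard deviation, a two-sided comparison, and hypothetical values for `n`, `mu0` and `sigma`. Only the distance -2.31 comes from the text, and the result need not match any figure quoted for the bank example.

```python
import math
import random


def simulated_p_value(observed_distance, n, mu0, sigma,
                      repetitions=20_000, seed=0):
    """Share of repeated studies, run with H0 true, that land at least as
    far from H0 as the observed study did (two-sided comparison)."""
    rng = random.Random(seed)
    at_least_as_far = 0
    for _ in range(repetitions):
        # Collect a new sample from a population in which H0 is true.
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        # Standardized distance between this new sample and H0.
        distance = (sample_mean - mu0) / (sigma / math.sqrt(n))
        if abs(distance) >= abs(observed_distance):
            at_least_as_far += 1
    return at_least_as_far / repetitions


# Hypothetical study settings; -2.31 is the distance mentioned above.
p = simulated_p_value(observed_distance=-2.31, n=25, mu0=100.0, sigma=15.0)
print(f"simulated p-value: {p:.3f}")
```

In practice software computes this probability directly from the probability distribution of the test statistic rather than by simulation.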

16
Q

Explain how the P-value functions

A

If H0 is true in the population, then, if we applied exactly the same methodology to conduct a new study, there would be a 2.3% chance of obtaining data equivalent to or further away from H0 than the data we observed in the current study.

17
Q

What does a small P-Value indicate?

A

A small p-value indicates that the data collected are unlikely under the scenario where H0 is true in the population. This is statistical evidence against H0.

This suggests we reject H0 in favor of H1.

18
Q

What does a large P-Value indicate?

A

A large p-value indicates that the data collected are plausible under the scenario where H0 is true.

H0 therefore cannot be rejected.

19
Q

What does mu (μ) represent?

A

It represents the mean of a population, or the expected value.

20
Q

What does n represent?

A

It represents the sample size, that is, the number of observations or participants in a study.

21
Q

What are H0 and H1?

A

In hypothesis testing there are two mutually exclusive hypotheses: the null hypothesis (H0) and the alternative hypothesis (H1).

22
Q

True or false: we only need to examine H1 to determine whether the test is one-sided or two-sided.

A

True.

23
Q

What indicates a one-sided test?

A

A “smaller than” (<) or “larger than” (>) sign in H1 indicates a one-sided test.

24
Q

What indicates a two-sided test?

A

A “different from” (≠) sign in H1 indicates a two-sided test.
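
To make the one-sided versus two-sided distinction concrete, the sketch below shows how the sign in H1 selects which tail of the distribution counts as "further from H0". It assumes, purely for illustration, that the test statistic follows a standard normal distribution under H0; the function name `p_value` and the labels "smaller", "larger" and "different" are hypothetical.

```python
from statistics import NormalDist


def p_value(statistic, alternative):
    """p-value for a statistic assumed standard normal under H0.

    alternative is the sign used in H1: "smaller", "larger" or "different".
    """
    standard_normal = NormalDist()
    if alternative == "smaller":    # one-sided: only the left tail counts
        return standard_normal.cdf(statistic)
    if alternative == "larger":     # one-sided: only the right tail counts
        return 1 - standard_normal.cdf(statistic)
    if alternative == "different":  # two-sided: both tails count
        return 2 * standard_normal.cdf(-abs(statistic))
    raise ValueError("alternative must be 'smaller', 'larger' or 'different'")


for sign in ("smaller", "different", "larger"):
    print(sign, round(p_value(-2.0, sign), 4))
```

The same statistic gives different p-values depending on H1, which is why the hypotheses have to be fixed before the calculation.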

25
Q

How do we decide what is a small p-value, and a large p-value?

A

By determining the significance level of the test.

Choosing where to set this level is up to the manager, or the test user.

Although the significance level depends on the user, it is standard practice to set it at 10%, 5%, or 1%.

26
Q

When will the p-value be considered small?

A

The p-value will be considered small if it is below the significance level alpha.

This leads to the rejection of H0, in favor of H1.

27
Q

When will the p-value be considered large?

A

Large p-values are those exceeding alpha; they do not allow H0 to be rejected.
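
The decision rule from the last two cards fits in a few lines of Python. This is a minimal sketch: the function name `decide` is hypothetical, and treating a p-value exactly equal to alpha as "do not reject" is an assumption made here, since the cards only cover values below or above alpha.

```python
def decide(p_value, alpha=0.05):
    """Compare the p-value with the significance level alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Small p-value (below alpha): the data are unlikely under H0.
    if p_value < alpha:
        return "reject H0"
    # Otherwise the data remain compatible with H0.
    return "do not reject H0"


print(decide(0.023, alpha=0.05))  # reject H0
print(decide(0.023, alpha=0.01))  # do not reject H0
```

The same p-value can lead to different decisions depending on the alpha chosen by the test user.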

28
Q

What is a type 1 error?

A

Type I Error (False Positive): Acting on a change that doesn’t truly improve the experience. In UX, this might mean deploying an ineffective design.

In other words, H0 is rejected although it is true.

29
Q

What is a type 2 error?

A

Type II Error (False Negative): Missing an opportunity to make a beneficial change. This might mean sticking with a suboptimal design.

In other words, H0 is not rejected although H1 is true.

30
Q

What do we assume when calculating the p-value?

A

For the calculation of the p-value, we assume the scenario where H0 is true in the population.

31
Q

When are we exposed to a type I error?

A

We are exposed to a type I error only if we reject H0.

The decision to reject H0 is made only if the observed p-value is less than alpha.

Thus, the p-value is lower than alpha whenever a type I error occurs.

32
Q

What is the rejection region for H0?

A

The rejection region of the null hypothesis (H0) is the range of test statistic values that lead to rejecting H0. It plays a critical role in deciding whether observed UX data provides enough evidence to favor the alternative hypothesis H1.

33
Q

What does The Null Hypothesis (H0) assume?

A

H0 typically represents the assumption that there is no effect or no difference.

Examples in UX:
There is no difference in conversion rates between two designs.

A new navigation menu does not reduce task completion time.

34
Q

What does The Alternative Hypothesis (H1) assume?

A

H1 represents what you’re trying to demonstrate (e.g., that there is an effect or a difference):

Examples in UX:

The new design increases conversions.

The navigation menu reduces task completion time.

35
Q

How is the rejection region defined?

A

Defining the Rejection Region
The rejection region is based on:

Significance Level (α): The threshold for how much risk you’re willing to accept for a Type I error (rejecting H0, when it’s true).

Common values are α = 0.05 (5%), α = 0.01 (1%), and α = 0.10 (10%).

Example: At α = 0.05, you’re willing to accept a 5% chance of incorrectly rejecting H0.
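
As an illustration of how alpha fixes the rejection region, the sketch below computes the cut-off values of the test statistic. It assumes, for the sake of the example only, that the test statistic follows a standard normal distribution under H0; the function name `rejection_bounds` and the labels for the alternative are hypothetical.

```python
import math
from statistics import NormalDist


def rejection_bounds(alpha, alternative="different"):
    """Return (lower, upper); H0 is rejected when the statistic falls
    below lower or above upper. Assumes a standard normal statistic."""
    standard_normal = NormalDist()
    if alternative == "different":   # two-sided: alpha split over both tails
        cut = standard_normal.inv_cdf(1 - alpha / 2)
        return (-cut, cut)
    if alternative == "smaller":     # one-sided: all of alpha in the left tail
        return (standard_normal.inv_cdf(alpha), math.inf)
    if alternative == "larger":      # one-sided: all of alpha in the right tail
        return (-math.inf, standard_normal.inv_cdf(1 - alpha))
    raise ValueError("alternative must be 'smaller', 'larger' or 'different'")


for alpha in (0.10, 0.05, 0.01):
    lower, upper = rejection_bounds(alpha)
    print(f"alpha={alpha}: reject H0 if statistic < {lower:.2f} or > {upper:.2f}")
```

A smaller alpha pushes the bounds further into the tails, so more extreme data are needed before H0 is rejected.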

36
Q

What is a lower risk of type 1 error associated with?

A

A lower risk of type I error is associated with an increased risk of type II error.

37
Q

What is a hypothesis test?

A

A hypothesis test is a statistical procedure that helps us choose between two opposite hypotheses, regarding a population.

38
Q

What is the test decision based on?

A

The p-value captures the information in the sample data that is relevant to the test, and it is the p-value that the test decision is based on.

39
Q

What does the calculation of the p-value assume?

A

The calculation of the p-value assumes that H0 is true in the population.

40
Q

What do we do when the conclusion is weak?

A

If the conclusion is weak, we keep H0, because of the lack of evidence that we should do the opposite.

41
Q

How do we report the test's result?

A

Since we conclude in favor of H1 when the data provide statistical evidence against H0, we give the test's result by saying “we reject H0”.

To interpret it, we say something such as: “The data observed provide statistical evidence that H0 is false”, spelling out what H0 corresponds to in the specific application context.

It would also be acceptable to mention statistical evidence that H1 is true.

42
Q

What validity conditions does Welch’s test have?

A

Welch’s test has certain validity conditions, notably the independence of the two samples used.
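
The card does not spell out the calculation, so the following is only a sketch of the standard textbook form of Welch's statistic for comparing two means from independent samples, together with the Welch-Satterthwaite degrees of freedom. The function name `welch_statistic` and the task-time data are hypothetical.

```python
import math
import statistics


def welch_statistic(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two
    independent samples whose variances may differ."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Estimated variance of each sample mean.
    var_mean_a = statistics.variance(sample_a) / n_a
    var_mean_b = statistics.variance(sample_b) / n_b
    difference = statistics.mean(sample_a) - statistics.mean(sample_b)
    statistic = difference / math.sqrt(var_mean_a + var_mean_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    degrees_of_freedom = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1)
    )
    return statistic, degrees_of_freedom


# Hypothetical task completion times (seconds) for two independent groups.
design_a = [12.1, 9.8, 11.4, 10.9, 13.0, 10.2]
design_b = [9.1, 8.7, 10.3, 9.9, 8.2, 9.4, 10.0]
statistic, degrees_of_freedom = welch_statistic(design_a, design_b)
print(round(statistic, 2), round(degrees_of_freedom, 1))
```

The two groups must contain different, unrelated participants for the independence condition to hold; turning the statistic into a p-value is normally left to statistical software.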

43
Q

What is the possible reason for not being able to verify the validity conditions?

A

If we’re unable to verify the validity conditions of a test, it may be because the test under consideration is not compatible with the structure of the data collected and that another test might be more appropriate!

44
Q

Why is it inappropriate to observe the data before we have identified the hypotheses?

A

It would be inappropriate to look at the sampled data and then conduct a one-sided test whose hypotheses are dictated by what has been observed in the data. The significance level of the test would be biased: it would no longer be the alpha value.

45
Q

What does it mean when the test leans towards H1?

A

When the test leans towards H1, it is a strong statement: it means the data provide statistical evidence that H1 is true.

46
Q

How do we word the finding, depending on whether the test favors H1?

A

When the test favors H1, we say “we reject the null hypothesis” or “we reject H0”.

In the report: “The data demonstrate that…”.

When the test does not favor H1, we say “we do not reject H0” or “we do not reject the null hypothesis”, instead of “we accept the null hypothesis” or “we accept H0”.

As for the interpretation, it must be clear that the result is weak: we will mention in the report “the data do not allow us to…”, and so on.

47
Q

What is the maximum probability of a type I error equal to?

A

The maximum probability that the test makes a type I error is equal to the alpha significance level used.

48
Q

How is the risk of a type I error controlled?

A

By setting alpha to 1%, 5%, or 10%, the manager restricts the risk of a type I error to this percentage.
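
A small simulation can make this control visible: when H0 is true and many studies are run, roughly a share alpha of them reject H0 by mistake. The sketch below relies on assumptions chosen only for illustration (normal population, known standard deviation, two-sided test, hypothetical sample size and number of studies).

```python
import math
import random
from statistics import NormalDist


def type_one_error_rate(alpha, n=30, studies=4000, seed=1):
    """Share of simulated studies that reject H0 although H0 is true."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    mu0, sigma = 0.0, 1.0  # the population satisfies H0: mean equals mu0
    rejections = 0
    for _ in range(studies):
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        statistic = (sample_mean - mu0) / (sigma / math.sqrt(n))
        p_value = 2 * standard_normal.cdf(-abs(statistic))
        if p_value < alpha:  # H0 rejected although true: a type I error
            rejections += 1
    return rejections / studies


for alpha in (0.01, 0.05, 0.10):
    print(alpha, type_one_error_rate(alpha))
```

The simulated rates should land close to the chosen alpha values, up to simulation noise.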

49
Q

How is a type II error controlled?

A

Type II errors are not controlled directly, so we look instead at the power, which is defined as one minus the probability of making a type II error.

50
Q

When H1 is true, which two things can happen?

A
  1. The test does not reject H0 and a type two error occurs
  2. The test rejects H0, which is the right decision.
51
Q

What is the Power of the Hypothesis Test?

A

The power is the complement of the probability of a type II error occurring.

The power measures the test's ability to make the right decision, namely to detect that H1 is true when that is the case in the population.

52
Q

What factors influence the power of a hypothesis test?

A
  • The significance level used
  • The order of magnitude of the estimation error committed
  • The size of the gap between the actual situation (what is true in the population) and what is assumed under H0
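
The three factors listed above can be seen in a closed-form power calculation for one simple case. The sketch assumes a one-sided ("larger") test on a mean with a normally distributed statistic and a known standard deviation; the function name `power` and all numeric values are hypothetical. Here the estimation error is sigma divided by the square root of n.

```python
import math
from statistics import NormalDist


def power(alpha, gap, sigma, n):
    """Probability of rejecting H0 when the true mean exceeds the H0 value
    by `gap` (one-sided test, known standard deviation sigma)."""
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1 - alpha)
    # Gap expressed in units of the estimation error.
    standardized_gap = gap / (sigma / math.sqrt(n))
    return 1 - standard_normal.cdf(critical_value - standardized_gap)


base = power(alpha=0.05, gap=2.0, sigma=10.0, n=50)
print("baseline:            ", round(base, 3))
print("larger alpha:        ", round(power(0.10, 2.0, 10.0, 50), 3))
print("smaller estimation error:", round(power(0.05, 2.0, 10.0, 200), 3))
print("larger gap:          ", round(power(0.05, 4.0, 10.0, 50), 3))
```

Each change (larger alpha, smaller estimation error, larger gap) raises the power relative to the baseline.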
53
Q

What is a high power associated with?

A

A high power is associated with a low type two error risk.

54
Q

Explain how the risks of making type I and type II errors vary in opposite directions when we conduct a test.

A

Lowering the risk of type two error means a higher risk of type one error. Higher power and high risk of type one error go together.

55
Q

How can we achieve higher power by changing the risk of a type I error?

A

To achieve higher power, the user of the test would need to allow a higher risk of type I error by choosing a larger alpha value.

But accepting a higher risk of type I error is not very appealing. We would like to keep it as low as possible.

The test user should therefore find another way to increase the power of the test if it is insufficient at the alpha level they are comfortable with.

56
Q

What does the power tend to be when the estimation error is small?

A

If the estimation error tends to be small, the risk that the test will make an error is small and therefore the power is large.

56
Q

Which type of error is impacted by a high level of estimation error?

A

Since the risk of type one error is controlled in a test by this alpha significance level, it is the type two error that is impacted by a high level of estimation error.

57
Q

What does the magnitude of the estimation error depend on?

A

It depends on several factors, many of which are beyond the control of the test user. For example, when comparing averages, the amount of variation in the variables under study, as measured by their standard deviation, plays a key role.

The greater the variation in the variables, the more difficult it is to estimate the average accurately, which increases the risk of error in the test.

58
Q

How does the sample size affect the power?

A

With larger sample sizes, the estimation error is likely to be smaller; hence the type II error risk is smaller and the power is larger. Conversely, smaller samples are likely to have larger estimation errors, leading to a greater type II error risk and lower power.

59
Q

What does the power measure?

A

The power measures the capacity of the test to correctly favor H1 when this hypothesis is true in the population.

If there is a large gap between the real situation in the population and what is supposed in H0, then it is easy to detect this discrepancy with a test and therefore the power is high.

Conversely, if the gap between the actual situation in the population and what is supposed in H0 is small, then it is difficult for a test to detect this deviation and therefore the power is small.

60
Q

How does controlling the sample size affect the power?

A

For the same alpha significance level and a given power, a small sample is enough to detect a large gap between reality and what is stated in H0, whereas a large sample is needed to detect a small gap with the same power.
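
This trade-off can be illustrated with a sample-size formula for one simple case. The sketch assumes a one-sided test on a mean with a normally distributed statistic and a known standard deviation; the function name `required_sample_size` and the numeric values are hypothetical, and other tests need other formulas.

```python
import math
from statistics import NormalDist


def required_sample_size(alpha, power, gap, sigma):
    """Smallest n that reaches the target power for a one-sided test on a
    mean, assuming a known standard deviation sigma."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) * sigma / gap) ** 2)


# Same alpha and same target power, different gaps to detect.
n_large_gap = required_sample_size(alpha=0.05, power=0.80, gap=5.0, sigma=10.0)
n_small_gap = required_sample_size(alpha=0.05, power=0.80, gap=1.0, sigma=10.0)
print(n_large_gap, n_small_gap)
```

Under these assumptions, dividing the gap by five multiplies the required sample size by about twenty-five.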

61
Q

What does a greater sample size give a test the ability to do?

A

The greater the sample size, the greater the ability of a test to detect that H0 is false.

62
Q

What are the possible pitfalls when the test does not reject H0?

A

A type II error is a possible pitfall when the test does not reject H0: failing to detect that H1 is true.

By contrast, when the test rejects H0, the only possible error is a type I error. What is reassuring is that the probability of committing it is small: it is at most alpha.

63
Q

Which information does a good description of the data provide?

A
  • Sample size
  • Alpha (significance level)
  • P-value
  • Standard deviation
  • Mean (average)
  • Graphical representation
64
Q

What does the parameter p represent?

A

The parameter p is often used instead of µ to indicate a proportion.

Note that this has no effect on the results, because the notation for the parameter is arbitrary and, furthermore, we have seen that a proportion is also an average anyway!
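
The remark that a proportion is also an average can be checked on a tiny example: if each observation is coded 1 (event occurred) or 0 (it did not), the mean of the codes is the proportion. The data and variable names below are hypothetical.

```python
import statistics

# Hypothetical outcomes for 10 visitors: 1 = converted, 0 = did not convert.
conversions = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

# Proportion computed by counting the successes.
proportion = conversions.count(1) / len(conversions)
# Average of the same 0/1 values.
average = statistics.mean(conversions)

print(proportion, average)
```

Both lines compute the same quantity, which is why tests on a proportion p behave like tests on a mean µ.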

65
Q

The probability of making a type I error is the probability of rejecting H0 when it is true. We can establish that this probability is at most ______.

A

α (Alpha)
