Final exam revision Flashcards

1
Q

What is an alpha level?

A

Also known as the significance level.
The probability of rejecting the null hypothesis when it is actually true.
The alpha level is typically set at 0.05, which means there is a 5% chance of incorrectly rejecting a true null hypothesis.
A lower alpha level makes the test more stringent, so there is a lower chance of a Type I error (a false positive). However, a lower alpha level also means a higher chance of missing a real difference or relationship (a Type II error).
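As a rough illustration of what the alpha level controls, the sketch below simulates many z-tests on data for which the null hypothesis is true and counts how often it is rejected. The function name, sample size, seed, and the choice of a z-test with known standard deviation are all assumptions made for the example.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, n_sims=5000, seed=1):
    """Simulate two-sided z-tests on data where the null (mean = 0) is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0, 1) for _ in range(n)]  # known sd = 1
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # z = mean / (sd / sqrt(n)) with sd = 1
        if abs(z) > z_crit:
            rejections += 1  # a false positive, because the null is true
    return rejections / n_sims

rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(rate_05, rate_01)
```

The rejection rate should land close to the chosen alpha, and the stricter alpha should reject less often.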

2
Q

What is a type I error?

A

False-positive: occurs if an investigator rejects a null hypothesis that is actually true in the population.

3
Q

What is a type II error?

A

False-negative: This occurs if the investigator fails to reject a null hypothesis that is actually false in the population.

4
Q

What’s the difference between one-sided and two-sided tests?

A

A one-sided test specifies the direction in which we expect the difference to go, while a two-sided test does not specify a direction of difference.

For example, if we were testing a new drug to see if it was effective in reducing blood pressure, we might only be interested in detecting a difference in the mean blood pressure if it was lower in the group that took the drug. In this case, we would use a one-sided test, which has more power to detect a difference in the expected direction at the same alpha level but cannot detect a difference in the opposite direction.
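To make the contrast concrete, here is a minimal sketch that computes a one-sided and a two-sided p-value from the same test statistic. It assumes a z-test, and the value z = -1.8 is made up for illustration.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one-sided 'lower' p-value, two-sided p-value) for a z statistic."""
    std_normal = NormalDist()
    p_lower = std_normal.cdf(z)                    # H1: mean is lower
    p_two = 2 * (1 - std_normal.cdf(abs(z)))       # H1: mean is different
    return p_lower, p_two

# Hypothetical z statistic for the drug group versus the control group.
p_one_sided, p_two_sided = p_values(-1.8)
print(p_one_sided, p_two_sided)
```

With this statistic the one-sided p-value falls below 0.05 while the two-sided p-value does not, because the two-sided test splits alpha between both tails.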

5
Q

What is a measurement unit?

A

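The units in which data is measured.
For example, height can be measured in centimetres or inches, weight can be measured in kilograms or pounds, and time can be measured in seconds, minutes, or hours.
In experimental design, the term can also mean the individual item on which a measurement is taken (for example, a single leaf on a plant), which may be nested within an experimental unit.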

6
Q

What is an experimental unit?

A

The individuals or objects to which a treatment is independently applied in an experiment.
For example, if you are conducting an experiment to test the effects of a new fertilizer on plant growth, and each plant receives its own fertilizer treatment, the experimental units would be the plants.

7
Q

Write down a linear regression model

A

y = b0 + b1x + e

where:

y is the dependent variable
b0 is the intercept
b1 is the slope
x is the independent variable
e is the error
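
One common way to estimate b0 and b1 from data is ordinary least squares. The sketch below assumes that method; the function name and the data are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares estimates for y = b0 + b1*x + e; returns (b0, b1, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]  # estimates of e
    return b0, b1, residuals

# Illustrative data only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, residuals = fit_line(x, y)
print(b0, b1)
```

The residuals are the observed values minus the fitted line, and with an intercept in the model they sum to approximately zero.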

8
Q

How do you reduce type II error?

A

False-negative
1. Increasing the sample size. A larger sample size will increase the power of the test, which is the probability of rejecting the null hypothesis when it is false.
2. Increasing the effect size. A larger effect size will also increase the power of the test.
3. Using a more sensitive test. There are a number of different statistical tests that can be used to compare two groups. Some tests are more sensitive to differences between the groups than others.
4. Reducing the variability in the data. Variability in the data can reduce the power of the test. This can be done by controlling for confounding variables or by using a more precise measurement instrument.
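
The effect of sample size and effect size on power can be sketched with a simple power formula. The example assumes a one-sided, one-sample z-test with a standardized effect size (mean difference divided by the standard deviation), so reducing variability shows up as a larger standardized effect. The function name and input values are hypothetical.

```python
from statistics import NormalDist

def power_one_sided_z(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test for a standardized effect size."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centred on effect_size * sqrt(n).
    return 1 - std_normal.cdf(z_crit - effect_size * n ** 0.5)

power_small_n = power_one_sided_z(effect_size=0.5, n=10)
power_large_n = power_one_sided_z(effect_size=0.5, n=40)
power_large_effect = power_one_sided_z(effect_size=0.8, n=10)
print(power_small_n, power_large_n, power_large_effect)
```

Since the Type II error rate is 1 minus the power, any change that raises power lowers the chance of a false negative.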

9
Q

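How do you reduce type I error?

A

Use a lower alpha level (for example, 0.01 instead of 0.05). The trade-off is a higher chance of a Type II error.
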
10
Q

What is the effect size in statistics?

A

A measure of the strength of the relationship between two variables, or of the size of the difference between groups.

11
Q

How do you increase effect size in statistics?

A

Generally, effect size is calculated by taking the difference between the two groups (e.g., the mean of the treatment group minus the mean of the control group) and dividing it by a standard deviation (for example, that of the control group, or the pooled standard deviation of both groups).
So lowering the standard deviation gives a higher effect size.
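A minimal sketch of this calculation is shown below. It assumes Cohen's d with a pooled standard deviation, which is one common choice rather than the only one, and the data are made up.

```python
from statistics import mean, variance

def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / pooled_var ** 0.5

# Same mean difference (2 units) with more and less variable data.
d_noisy = cohens_d([12, 14, 16], [10, 12, 14])
d_tight = cohens_d([13, 14, 15], [11, 12, 13])
print(d_noisy, d_tight)
```

Both comparisons have the same difference in means, but the less variable data produce the larger effect size.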

12
Q

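Define Student's t-distribution

A

A probability distribution used in t-tests, for example to compare the means of two groups when the population standard deviation is estimated from the sample.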
It is a bell-shaped distribution, but it has heavier tails than the normal distribution.
The t-distribution is a family of distributions, and the shape of the distribution depends on the degrees of freedom.
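The heavier tails can be checked numerically. The sketch below writes out the standard t density by hand (the helper name `t_pdf` is an assumption, not a library function) and compares its height at x = 3 with the standard normal density.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

normal_tail = NormalDist().pdf(3)
# Density in the tail (x = 3) for increasing degrees of freedom.
t_tails = {df: t_pdf(3, df) for df in (2, 10, 100)}
print(normal_tail, t_tails)
```

The tail density shrinks toward the normal value as the degrees of freedom increase.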

13
Q

What are degrees of freedom?

A

The degrees of freedom are the number of independent values that are free to vary when a statistic is estimated (for example, n - 1 for a one-sample t-test). The larger the degrees of freedom, the more closely the t-distribution resembles the normal distribution.

14
Q

What is an absolute value of a critical value?

A

The absolute value of a critical value is the distance from the centre of the test statistic's null distribution to the point beyond which the probability of observing a value at least as extreme equals the significance level. In other words, it is the threshold the test statistic must exceed for the null hypothesis to be rejected.
For example, if the significance level is 0.05, then the critical value for a two-tailed z-test is 1.96. This means that if the absolute value of the test statistic is greater than 1.96, the null hypothesis can be rejected at the 5% significance level.
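The 1.96 figure can be reproduced from the standard normal distribution. This short sketch assumes a z-test and uses Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Two-tailed: alpha is split between both tails, so look up the 1 - alpha/2 quantile.
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)
# One-tailed: all of alpha sits in one tail.
z_crit_one_tailed = std_normal.inv_cdf(1 - alpha)

print(round(z_crit_two_tailed, 2), round(z_crit_one_tailed, 3))
```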

15
Q

What are the linear model assumptions?

A

Linearity: The relationship between the independent and dependent variables is linear. This means that the data points should roughly follow a straight line when plotted on a scatter plot.

Homoscedasticity: The variance of the residuals is constant across all values of the independent variable. This means that the spread of the data points around the regression line should be the same for all values of the independent variable.

Normality: The residuals are normally distributed. This means that the residuals should follow a bell-shaped curve.

Independence: The residuals are independent of each other. This means that the value of one residual should not be related to the value of any other residual.

No multicollinearity: When there is more than one independent variable, the independent variables are not highly correlated with each other.

16
Q

Why do we replicate?

A

Replication helps to reduce the effects of random error.
Replication can help to identify systematic errors.
Replication can help to build confidence in the results.

17
Q

How do you calculate DFres?

A

n - k
where:
n = sample size
k = number of levels within the predictor variable

18
Q

How do you calculate MS?

A

MS (mean square) = SS / df
where SS is the sum of squares and df is the corresponding degrees of freedom.

19
Q

How do you calculate the F-value?

A

MSgroups / MSres
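
The formulas on cards 17 to 19 can be combined into a one-way ANOVA calculation. The sketch below is a minimal version on made-up data; the function name is hypothetical, and DFgroups = k - 1 is a standard result that the cards themselves do not state.

```python
def one_way_anova(groups):
    """Return df, mean squares, and F for a one-way ANOVA on a list of groups."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Sums of squares between groups and within groups (residual).
    ss_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))
    ss_res = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    df_groups = k - 1      # assumption: standard one-way ANOVA result
    df_res = n - k         # DFres = n - k
    ms_groups = ss_groups / df_groups   # MS = SS / df
    ms_res = ss_res / df_res
    return {"df_groups": df_groups, "df_res": df_res,
            "ms_groups": ms_groups, "ms_res": ms_res,
            "f_value": ms_groups / ms_res}   # F = MSgroups / MSres

# Illustrative data: three groups of three observations.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```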

20
Q

In a coefficient table, how do you calculate t-value?

A

estimate / std.error

21
Q

How do you calculate whether the model fits the data? (R-squared)

A

r.sq = 1 - (SSres / SStotal)
r.sq values range from 0 to 1; the closer to 1, the better the fit.
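As a small worked sketch of this formula, the code below computes r.sq from observed values and fitted values. The function name and the numbers are assumptions for illustration; the fitted values are taken to come from a least-squares line.

```python
def r_squared(observed, fitted):
    """r.sq = 1 - SSres / SStotal."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_total

# Illustrative values only.
observed = [2, 4, 5, 4, 5]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r_sq = r_squared(observed, fitted)
print(r_sq)
```

Here SSres is 2.4 and SStotal is 6, so the model explains about 60% of the variation in the response.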

22
Q

What are blocking factors?

A

Blocking factors:
Variables that are known to affect the outcome of the experiment, but that are not of primary interest to the researcher.

E.g. In a study of the effects of a new drug on blood pressure, researchers might block the participants by age group. This would help to ensure that the effects of age are not confounded with the effects of the drug.

23
Q

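What are covariates?

A

Covariates: Variables (often continuous) that are not of primary interest but may affect the response, and that are measured and included in the analysis to account for their effect.

E.g. In a study of the effects of a new educational program on student achievement, researchers might control for students’ socioeconomic status. This would help to ensure that the effects of socioeconomic status are not confounded with the effects of the educational program.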

24
Q

Describe observational study

A

Cannot isolate causal drivers from the effects of confounding variables.
Studies in the wild.

25
Q

Describe experimental study

A

Can potentially isolate causal drivers from the effects of confounding variables.
Studies in the lab or under controlled conditions.

26
Q

Describe lurking variables

A

Variables that can make experiments useless. Their influence can be neutralized with good experimental design (blocking, covariates).

Z -> Y and Z -> X
X -> Y
where:
Z = lurking variable (not of interest)
X = predictor variable
Y = response variable

27
Q

What are pseudoreplicates?

A

Measurement units within experimental units.
Data points that are not independent of each other but are treated as if they were independent replicates.
This can happen when the same experimental unit is measured multiple times, or when multiple experimental units are nested within a larger unit. Pseudoreplication can lead to inflated Type I error rates, so it is important to be aware of it when designing and analyzing experiments.
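
One simple way to avoid treating measurement units as replicates is to average them within each experimental unit before analysis. The sketch below assumes that approach (other methods, such as mixed models, also exist), and the plant and leaf data are made up.

```python
from statistics import mean

# Hypothetical data: three leaves measured on each of three plants.
# The plant is the experimental unit; each leaf is a measurement unit.
leaf_lengths = {
    "plant_1": [4.0, 4.2, 4.4],
    "plant_2": [5.0, 5.2, 5.4],
    "plant_3": [3.9, 4.1, 4.3],
}

# Collapse to one value per experimental unit.
plant_means = {plant: mean(values) for plant, values in leaf_lengths.items()}

n_measurements = sum(len(values) for values in leaf_lengths.values())
n_experimental_units = len(plant_means)
print(n_measurements, n_experimental_units)
```

The analysis then uses the number of plants as the sample size rather than the number of leaves.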

28
Q

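What is an eigenvalue?

A

A number that tells you how much a particular vector (an eigenvector) is stretched or shrunk when it is multiplied by a matrix, without its direction changing (a negative eigenvalue flips it).
For example, let’s say we have a matrix that represents a reflection across the x-axis. The eigenvalues of this matrix are 1 and -1. This means that vectors along the x-axis are left unchanged, while vectors along the y-axis are flipped. A rotation by 90 degrees, by contrast, has no real eigenvalues, because it changes the direction of every non-zero vector.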
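
The defining property, that a matrix times an eigenvector equals the eigenvalue times that same vector, can be checked directly. The matrix and vectors below are chosen for illustration and the helper name is hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

A = [[2, 1],
     [1, 2]]

v1 = [1, 1]    # eigenvector of A with eigenvalue 3
v2 = [1, -1]   # eigenvector of A with eigenvalue 1

stretched = mat_vec(A, v1)   # equals 3 * v1
unchanged = mat_vec(A, v2)   # equals 1 * v2
print(stretched, unchanged)
```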

29
Q

Word equation:

A

response = predictor + error

30
Q

Formal statistical model

A

response.i = b0 + b1 * predictor.i + error.i

Where:
response.i is the value of the response for individual i
b0 is the intercept, which quantifies the mean response when predictor = 0
b1 is the regression slope of the predictor (the change in mean response per unit increase in the predictor)
error.i is the residual variation in response for individual i