Data Analysis I Flashcards

1
Q

Give 3 factors that determine how much weight we should place on the results of a particular study.

A
  1. How well characterised are the reagents/equipment/methods?
  2. How well is the experiment designed - good controls, possible biases?
  3. How many times has the result been successfully reproduced? (statistical confidence)
2
Q

What is the main reason for calculating a mean of multiple observations?

A
  • Random errors tend to cancel each other out when a mean is taken
  • Any effect we observe is less likely to be random error and more likely to be real
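The cancelling of random errors can be illustrated with a small simulation. This is a minimal sketch, not part of the original cards: the true value, noise level, sample sizes and the function name `spread_of_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def spread_of_means(n_per_mean, n_repeats=2000, true_value=10.0, noise_sd=2.0, seed=0):
    """Standard deviation of means, each taken over n_per_mean noisy observations.

    All parameter values are hypothetical; the noise is assumed to be Gaussian.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(n_repeats):
        sample = [rng.gauss(true_value, noise_sd) for _ in range(n_per_mean)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

single = spread_of_means(1)     # each "mean" is just one observation
averaged = spread_of_means(25)  # each mean averages 25 observations

print(f"spread of single observations: {single:.2f}")
print(f"spread of means of 25:         {averaged:.2f}")
```

The means of 25 observations scatter far less around the true value than single observations do, which is why an effect seen in a mean is less likely to be random error.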
3
Q

Describe 3 approaches used to assess the reliability of experimental results.

A
  1. Visual approach - s.e.m. error bars
  2. Numerical approach - p values and statistical significance
  3. Quantitative approach - confidence intervals
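As a rough illustration of the first and third approaches, the sketch below computes the s.e.m. and an approximate 95% confidence interval for a mean. It assumes the normal-approximation multiplier 1.96, which is only reasonable for larger samples (small samples would normally use a t-distribution value instead). The function names and the data are hypothetical.

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def approx_ci95(values):
    """Approximate 95% confidence interval for the mean.

    Assumption: uses the normal multiplier 1.96 rather than a t-distribution
    critical value, so it is too narrow for small samples.
    """
    centre = statistics.mean(values)
    half_width = 1.96 * sem(values)
    return (centre - half_width, centre + half_width)

measurements = [1, 2, 3, 4, 5]  # made-up data for illustration
print("mean:", statistics.mean(measurements))
print("s.e.m.:", round(sem(measurements), 3))
print("approx. 95% CI:", approx_ci95(measurements))
```

The s.e.m. is what would be drawn as an error bar; the interval is the numerical range within which the true mean plausibly lies.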
4
Q

In clinical trials, what is the current preferred approach for assessing whether an effect seen in a study is real or due to chance?

A

Confidence intervals

5
Q

Explain the meaning of the 0-1 p-value scale.

A
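  • The p-value is the probability of seeing an effect at least this large if only random error were at work
  • Close to 1 means the data look like random error
  • Close to 0 means the data do not look like random error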
6
Q

Briefly outline how we should interpret p-values above and below 0.05.

A
  • P > 0.05 - result is statistically unreliable
  • P = 0.01-0.05 - effect is worth considering but may still be due to random chance
  • P = 0.01 or less - fairly convincing effect - it seems unlikely that this is due to random error
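The three bands above can be written as a simple lookup. This is only a sketch of the rule of thumb on this card; the function name `interpret_p` is hypothetical, and the wording of each band is taken from the card rather than from any formal standard.

```python
def interpret_p(p):
    """Map a p-value onto the rule-of-thumb bands from the card above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p <= 0.01:
        return "fairly convincing effect"
    if p <= 0.05:
        return "worth considering, but may still be random chance"
    return "statistically unreliable"

for p in (0.2, 0.03, 0.004):
    print(p, "->", interpret_p(p))
```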
7
Q

What is the disadvantage of using p-values to assess the significance of an effect?

A
  • Doesn’t tell you effect size
  • Doesn’t tell you confidence intervals

P-values should only be used in combination with visual data and/or confidence intervals

8
Q

Give functional definitions for “statistically significant” and “not statistically significant”.

A
  • Statistically significant - worth considering; more likely to be a real effect than random error
  • Not statistically significant - result is statistically unreliable; it may be real, but may well just be due to random error

Statistically significant does not mean biologically significant!

9
Q

Biological significance is determined by what?

A

The size of the effect, i.e. the difference between the experiment and the control.

This is quantified by confidence intervals and can be seen in error bars.
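As an illustration, the effect size and its confidence interval can be estimated from two sets of measurements. This is a minimal sketch with made-up data and a hypothetical function name. It assumes the normal multiplier 1.96, which is too optimistic for samples as small as these and is used here only to keep the arithmetic visible.

```python
import math
import statistics

def difference_with_ci95(experiment, control):
    """Difference between two means with an approximate 95% confidence interval.

    Assumption: normal approximation (1.96); small samples would normally
    use a t-distribution critical value instead.
    """
    diff = statistics.mean(experiment) - statistics.mean(control)
    # Standard error of the difference between two independent means
    se = math.sqrt(statistics.variance(experiment) / len(experiment)
                   + statistics.variance(control) / len(control))
    return diff, (diff - 1.96 * se, diff + 1.96 * se)

experiment = [5, 6, 7]  # made-up measurements
control = [2, 3, 4]     # made-up measurements
diff, (low, high) = difference_with_ci95(experiment, control)
print(f"effect size: {diff:.2f}, approx. 95% CI: {low:.2f} to {high:.2f}")
```

Whether a difference of this size matters biologically is a judgement about the system being studied, not something the interval decides by itself.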

10
Q

Give the advantages and disadvantages of using statistical significance.

A

Advantages:

  • Useful in quality control applications

Disadvantages:

  • Very misleading
  • Statistically significant results may still be random noise (i.e. the null hypothesis may still be correct)
  • Effects that are not statistically significant may still be real
  • Statistical significance does not prove biological significance
11
Q

Which 2 tests are most commonly used to determine whether the difference between means is due to random error?

A

Student’s t-test and the Mann-Whitney U-test

12
Q

When is it appropriate to use Student’s t-test?

A

When the data is normally distributed. Slight variations from normal distribution are not a problem, but highly skewed data should be assessed by a different test.

13
Q

When should the Mann-Whitney U-test be used?

A

If the data definitely isn’t normally distributed. This test is “non-parametric”.
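To show what the two tests actually compare, here is a minimal sketch of the test statistics using only the Python standard library. The function names and data are hypothetical, and the sketch stops at the statistic. In practice a statistics library would also convert it into a p-value, for example `scipy.stats.ttest_ind` and `scipy.stats.mannwhitneyu` in SciPy.

```python
import math
import statistics

def student_t_statistic(a, b):
    """Student's t statistic for two independent samples (pooled variance).

    Works on the actual values, so it assumes roughly normal data.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))

def mann_whitney_u(a, b):
    """Mann-Whitney U for sample a: pairs where a's value beats b's (ties count 0.5).

    Uses only the ordering of the values, so no distribution is assumed.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

group_a = [1, 2, 3]  # made-up data
group_b = [2, 3, 4]  # made-up data
print("t statistic:", round(student_t_statistic(group_a, group_b), 3))
print("U statistic:", mann_whitney_u(group_a, group_b))
```

Because U depends only on which values are larger, a single extreme outlier changes it far less than it changes the t statistic, which is why the non-parametric test suits highly skewed data.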
