Lecture 11: Statistics Flashcards

1
Q

Data analysis and error

A

> Data is analysed to separate the truth from the error
Error/uncertainty occurs from:
- Measurements – resolution error or calibration uncertainty
○ Reduce error by taking more accurate readings
- Sampling – reduce error by enlarging number studied

2
Q

Types of uncertainty- random

A

> Scatter of measurements about a best value
Arises from poor resolution, equipment noise, fatigue
Cannot be removed, only reduced (e.g. by repeating measurements)

3
Q

Types of uncertainty- Systematic

A

> From poor calibration or a mistake in methodology, e.g. equipment readings that drift with temperature
Gives a constant error called bias
Can be removed

4
Q

The 3 factors affecting error

A

> Precision
Accuracy
Reproducibility

5
Q

Factors affecting error: Precision

A

> Precision is the tendency of values to cluster closely together
- Reflected in the number of significant figures
- Affected by the ability to refine the measurement, e.g. weighing to 1 g or 0.001 g requires different balances

6
Q

Factors affecting error: accuracy

A

> Accuracy is the tendency to match the “true value”
- Affected by systematic error, e.g. contamination
- Not easily verified
- One check: do different methods agree?

7
Q

Factors affecting error: reproducibility

A

> Reproducibility is “repeatability”
- Affected by random error
- Affects sensitivity/discrimination
- Estimated by replication

8
Q

Measurement of uncertainty

A

> Absolute uncertainty is the actual magnitude of the uncertainty
- An approximate value based on the precision of the measurements
- Calculate from the change in values; n is the number of values
Relative uncertainty is the uncertainty as a fraction or percentage of the measured value

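
The relationship between absolute and relative uncertainty can be sketched in a few lines of Python. The function name and the example mass are illustrative assumptions, not part of the lecture.

```python
def relative_uncertainty(measured_value, absolute_uncertainty):
    """Return the relative uncertainty as a fraction of the measured value."""
    if measured_value == 0:
        raise ValueError("relative uncertainty is undefined for a zero measurement")
    return abs(absolute_uncertainty / measured_value)


# Hypothetical example: a mass of 2.50 g with an absolute uncertainty of 0.05 g
fraction = relative_uncertainty(2.50, 0.05)
percent = fraction * 100  # the same quantity expressed as a percentage
print(f"fraction: {fraction:.3f}, percentage: {percent:.1f}%")
```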
9
Q

Communicating uncertainty

A

> Quote the uncertainty rounded to 1 s.f., then round the related measurement to the same level of significance
Exception: for uncertainties beginning with a 1, a further figure may be quoted
If no uncertainty is given, the implied uncertainty is in the next significant figure
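
A minimal Python sketch of the rounding rule, assuming "a further figure" means 2 s.f. whenever the uncertainty begins with a 1. The function name is hypothetical and the numbers are made up for illustration.

```python
def quote_with_uncertainty(value, uncertainty):
    """Round the uncertainty to 1 s.f. (2 s.f. if it begins with a 1),
    then round the value to the same decimal place."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Scientific notation exposes the leading digit and the power of ten
    mantissa, exponent = f"{uncertainty:e}".split("e")
    sig_figs = 2 if mantissa.startswith("1") else 1
    decimals = (sig_figs - 1) - int(exponent)
    return round(value, decimals), round(uncertainty, decimals)


# Hypothetical readings
print(quote_with_uncertainty(12.3456, 0.0237))  # uncertainty kept to 1 s.f.
print(quote_with_uncertainty(12.3456, 0.0137))  # uncertainty begins with 1, so 2 s.f.
```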

10
Q

How to remove uncertainty/error

A

> Repeat measurements to form a series
- Random errors cause the values to cluster around the mean
Some values deviate significantly
- Called outliers
- Plot values on a scatter plot to show outliers
» Find those separate from the clustered values

11
Q

Types of statistical distribution

A

> Normal (parametric) data
- Most continuous biological data is normally distributed
Non-normal (non-parametric) data
- Binomial
» Data in proportions or counts
» There are only 2 states
- Poisson
» Data is in counts
» Rare events or very large samples

12
Q

Frequency distribution

A

> Frequency = count the occurrences of each distinct outcome
For range, add frequencies together
Show in histogram
- Narrow spaces between columns for clarity
- Area of column equal to frequency
Column height is frequency density
Shows whether the data is sharp or broad, symmetrical or skewed, single-peaked or bimodal
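
Counting frequencies, and adding them together for a range, might look like this in Python. The observations are made-up values used only for illustration.

```python
from collections import Counter

# Hypothetical observations of a discrete outcome
observations = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

# Frequency = number of occurrences of each distinct outcome
frequency = Counter(observations)
for outcome in sorted(frequency):
    print(outcome, frequency[outcome])

# For a range (here 3 to 4 inclusive), add the frequencies together
frequency_3_to_4 = sum(frequency[value] for value in range(3, 5))
print("3 to 4:", frequency_3_to_4)
```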

13
Q

Frequency equation

A

frequency density = frequency / width of frequency interval
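
A short Python sketch of the equation, using hypothetical intervals of unequal width. It also checks that column area (density times width) gives back the frequency.

```python
def frequency_density(frequency, interval_width):
    """Frequency density = frequency / width of the frequency interval."""
    if interval_width <= 0:
        raise ValueError("interval width must be positive")
    return frequency / interval_width


# Hypothetical intervals as (lower bound, upper bound, frequency)
intervals = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]

densities = [frequency_density(f, upper - lower) for lower, upper, f in intervals]

# Column area (height x width) recovers the frequency of each interval
areas = [d * (upper - lower) for d, (lower, upper, _) in zip(densities, intervals)]
print(densities)
print(areas)
```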

14
Q

Normal distribution and frequency data

A

> Continuous quantitative data
- Length, height, weight etc.
- Plot frequency (y-axis) against variable (x-axis)
- Fewer data points at the edges
Most data lie in the middle, around the mean
Frequency (y) and the variable (x) are related through the mean and standard deviation
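
For reference, the standard normal-distribution curve that links the two is given below. The symbols are not used on the original card and are assumptions of this note: μ is the mean, σ is the standard deviation, and f(x) is the frequency density at x.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```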

15
Q

Standard deviation and the normal distribution

A

> Approx. 2/3 (68%) of data lie within 1 SD of the mean
Approx. 95% of data lie within 2 SD of the mean
Approx. 99.7% of data lie within 3 SD of the mean
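
These approximate figures can be checked against the normal distribution itself. The sketch below uses the standard identity that the fraction within k SD of the mean is erf(k/√2); the function name is an illustrative choice.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```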

16
Q

How to test for normal distribution

A

> Check whether 2 SD either side of the mean stays within the possible range for the variable (e.g. a length cannot be negative)
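
A rough Python sketch of this check, assuming the sample standard deviation is used. The function name, limits and data are hypothetical. Passing the check only means the data are not obviously non-normal.

```python
import math
import statistics


def two_sd_within_range(data, lowest_possible, highest_possible):
    """True if mean - 2 SD and mean + 2 SD both lie inside the possible range."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (an assumption)
    return lowest_possible <= mean - 2 * sd and mean + 2 * sd <= highest_possible


# Hypothetical measurements that cannot be negative
clustered = [48, 50, 51, 49, 52, 50]
skewed = [1, 1, 2, 2, 3, 12]

print(two_sd_within_range(clustered, 0, math.inf))
print(two_sd_within_range(skewed, 0, math.inf))
```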

17
Q

Interpretive databases

A

> Forensic science often based on small sets of experimental data
Can use data from database of surveys or technical information from manufacturers
Compare your data to database

18
Q

Probability

A

> All outcomes equally likely
Count the number of outcomes
Probability is between 0 (outcome never occurs) and 1 (outcome always occurs)
Expressed as fraction, decimal or %

19
Q

Probability equation

A

Probability = number of selected outcomes / total number of possible outcomes
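
As a small illustration in Python, assuming equally likely outcomes (a fair six-sided die is the hypothetical example):

```python
from fractions import Fraction


def probability(selected_outcomes, possible_outcomes):
    """Number of selected outcomes divided by total number of possible outcomes."""
    if not 0 <= selected_outcomes <= possible_outcomes or possible_outcomes == 0:
        raise ValueError("need 0 <= selected <= possible, with possible > 0")
    return Fraction(selected_outcomes, possible_outcomes)


# Rolling a 5 or a 6 on a fair die: 2 selected outcomes out of 6
p = probability(2, 6)
print(p, float(p), f"{float(p):.1%}")  # fraction, decimal and percentage forms
```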

20
Q

Probability of specified outcomes

A

> Probability of outcome A written as P(A)
Probability of outcome B written as P(B)
P(A and B) = P(A) x P(B), if A and B are independent
P(A or B) = P(A) + P(B), if A and B are mutually exclusive
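
The two rules can be checked by counting outcomes directly. This sketch uses fair dice as a hypothetical example: the product rule is applied to two independent dice, and the sum rule to two mutually exclusive results of one die.

```python
from fractions import Fraction
from itertools import product

p_six = Fraction(1, 6)

# "And" rule for independent events: both of two dice show a six
p_both_six = p_six * p_six
rolls = list(product(range(1, 7), repeat=2))  # all 36 equally likely pairs
counted_both_six = Fraction(sum(1 for a, b in rolls if a == 6 and b == 6), len(rolls))

# "Or" rule for mutually exclusive events: one die shows a 1 or a 2
p_one_or_two = Fraction(1, 6) + Fraction(1, 6)
counted_one_or_two = Fraction(sum(1 for face in range(1, 7) if face in (1, 2)), 6)

print(p_both_six, counted_both_six)
print(p_one_or_two, counted_one_or_two)
```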

21
Q

Why is probability important?

A

> Use it to calculate the likelihood of finding evidence
- Probability of the evidence given guilt
- Probability of the evidence given innocence
The ratio of these is called the likelihood ratio (LR)
LR = probability of evidence given guilt / probability of evidence given innocence
High number (LR much greater than 1) suggests guilt
Low number (LR much less than 1) suggests innocence
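
A minimal Python sketch of the likelihood ratio. Both probabilities are invented for illustration and do not come from any real case or database.

```python
def likelihood_ratio(p_evidence_given_guilt, p_evidence_given_innocence):
    """LR = P(evidence given guilt) / P(evidence given innocence)."""
    if p_evidence_given_innocence == 0:
        raise ValueError("LR is undefined when P(evidence given innocence) is 0")
    return p_evidence_given_guilt / p_evidence_given_innocence


# Hypothetical values: evidence very likely if guilty, rare if innocent
lr = likelihood_ratio(0.9, 0.01)
print(f"LR = {lr:.0f}")
```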