Data Collection Statistics and Errors Flashcards
sample
subset of elements
population
all possible elements
It is important to choose a sample that is
random and representative.
Randomisation
prevents bias arising from a subjective choice of units.
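A minimal Python sketch of randomised selection, assuming a hypothetical population of 1000 numbered units. The names `population`, `rng` and `sample` are illustrative and not part of the original cards.

```python
import random

# Hypothetical population: 1000 numbered units (illustrative only).
population = list(range(1000))

# A seeded generator makes the draw replicable.
rng = random.Random(42)

# Simple random sample without replacement:
# every unit has the same chance of being chosen, so no subjective choice is involved.
sample = rng.sample(population, k=50)
```

Fixing the seed is a design choice for reproducibility; in a real study the seed (or the randomisation procedure) would be recorded alongside the data.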
Good data –
i.e. statistically valid data – should also be replicable.
Independent variable (factor):
The presumed cause in an experimental study
Dependent variable (response):
The presumed effect in an experimental study
Controlled variable:
A variable that the investigator does not wish to examine in a study and therefore holds constant, so that it cannot influence the result.
Continuous variable:
A variable that is not restricted to particular values (other than limited by the accuracy of the measuring instrument), e.g. temperature, velocity, pressure…
Discrete variable:
A variable having only integer or ‘stepped’ values. For example, number of attempts to learn a task, number of days it rains in a month, number of rooms in a building…
data=
pattern + residuals
Pattern:
how two variables are (seemingly) related.
Residuals:
the deviations of the data points from the proposed pattern.
Data points with unusually large residuals are called outliers.
Do not ignore or remove outliers – they can contain valuable information
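The idea "data = pattern + residuals" can be sketched in Python, assuming the proposed pattern is a least-squares straight line. The measurements below are made-up illustrative values, and the variable names are hypothetical.

```python
# Hypothetical measurements (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares straight line y = slope * x + intercept (the proposed pattern).
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Residual = observed value minus the value predicted by the pattern.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
```

A point whose residual is much larger than the others would be a candidate outlier worth investigating rather than deleting.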
quoting errors
Only defined constants (e.g. 𝑐,𝜋,𝑒…) possess exact values (i.e. do not need to be quoted with errors)
For anything else, we need to quote errors!
experimental errors 4
‘Error’ is an indication of the uncertainty in an experiment or measurement.
In effect, all you can determine in an experiment is an estimate.
Even an unbiased and consistent experiment will possess some error margins.
These errors, however, should become smaller with more repeats.
There are two main types of errors:
Systematic errors
e.g. if this guy’s tattoo artist was bad
Random errors
e.g. if he keeps flexing & extending his wrist while you measure
repetition leads to
different outcomes that have statistical regularity.
3 common causes of random error
Fluctuating ambient conditions (temperature, pressure etc.)
Human error.
An instrument is only as good as the person reading it!
Chaos!
systematic error 3
Systematic errors in general are typically caused by faulty (or badly calibrated) equipment or a badly designed experiment.
They can be dangerous as they cannot be detected or reduced by repeating the experiment with the same equipment.
As such, if we are to identify systematic errors, we need to have a good understanding of the nature of the experiment and the instruments involved.
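A small simulation can illustrate why repeats do not reveal a systematic error. This is a sketch under stated assumptions: the true value, the offset and the noise level are all hypothetical numbers chosen for illustration.

```python
import random

rng = random.Random(0)   # seeded so the run is replicable

true_value = 10.0        # hypothetical true quantity
offset = 0.5             # hypothetical systematic error (e.g. a miscalibrated zero)
noise_sd = 0.2           # hypothetical random scatter of each reading


def measure():
    """One simulated reading: true value + systematic offset + random noise."""
    return true_value + offset + rng.gauss(0.0, noise_sd)


readings = [measure() for _ in range(10000)]
mean_reading = sum(readings) / len(readings)

# The random part averages away, but the offset remains in the mean.
bias = mean_reading - true_value
```

Even with many repeats, `bias` stays close to the offset: averaging shrinks only the random scatter.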
standard deviation definition
provides a measure of how much, on average, the individual values in a data set of repeated measurements differ from the mean (i.e. the ‘assumed’ value).
standard deviation equation
$\sigma_x = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2}$
$\bar{x}$ is the mean value of $x$ for a given data set, meaning that $(x_i - \bar{x})$ gives the ‘individual deviation’ of each value from the mean.
The standard deviation is then simply the square root of the mean of the squared individual deviations.
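The calculation can be followed step by step in Python. This sketch uses the 1/N form given on the card (some texts divide by N − 1 instead), and the five readings are hypothetical.

```python
import math

# Hypothetical repeated measurements of the same quantity.
values = [9.8, 10.1, 10.0, 9.9, 10.2]

N = len(values)
mean = sum(values) / N

# Individual deviation of each value from the mean.
deviations = [v - mean for v in values]

# Square root of the mean of the squared deviations (1/N form).
sigma = math.sqrt(sum(d ** 2 for d in deviations) / N)
```

For these numbers the mean is 10.0 and the squared deviations sum to 0.10, so `sigma` is the square root of 0.02, roughly 0.14.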
quoting numbers sensibly
Quote the error to one significant figure, and round the value to the same decimal place as that digit.
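One possible way to apply this rounding rule in Python is sketched below. The helper name `quote` is hypothetical, and the function assumes a strictly positive error.

```python
import math


def quote(value, error):
    """Round error to 1 significant figure and value to the same decimal place."""
    if error <= 0:
        raise ValueError("error must be positive")
    # Power of ten of the error's first significant digit.
    exponent = math.floor(math.log10(error))
    rounded_error = round(error, -exponent)
    rounded_value = round(value, -exponent)
    return rounded_value, rounded_error
```

For example, `quote(9.81234, 0.0321)` keeps two decimal places, giving a value of 9.81 with an error of 0.03.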
standard error
$SE = \sigma_{\bar{x}} = \sigma_x / \sqrt{N}$, where $N$ is the number of observations or measurements.
What happens as N→∞?
The standard error tends to 0.
More measurements give a smaller error on the mean.
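The 1/√N behaviour can be checked numerically. This sketch assumes a fixed, hypothetical standard deviation of 0.5; the function name `standard_error` is illustrative.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean for standard deviation sigma and n measurements."""
    return sigma / math.sqrt(n)


sigma = 0.5  # hypothetical standard deviation of the individual readings

# The standard error shrinks as the number of measurements grows.
for n in (10, 100, 1000, 10000):
    print(n, standard_error(sigma, n))
```

Because of the square root, reducing the standard error by a factor of 10 takes 100 times as many measurements.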
error propagation addition and subtraction
The absolute error of the result is simply the sum of the two original absolute errors.
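A short Python sketch of this rule, using hypothetical helper names and made-up values. It follows the simple sum stated on the card; note that the errors add for subtraction as well as for addition.

```python
def add_with_error(a, da, b, db):
    """Return (a + b, absolute error) using the simple sum-of-errors rule."""
    return a + b, da + db


def subtract_with_error(a, da, b, db):
    """Return (a - b, absolute error); the errors still add, they never cancel."""
    return a - b, da + db


# Hypothetical lengths in cm: 10.0 ± 0.2 and 5.0 ± 0.1
total, total_err = add_with_error(10.0, 0.2, 5.0, 0.1)
diff, diff_err = subtract_with_error(10.0, 0.2, 5.0, 0.1)
```

Both results carry an absolute error of about 0.3 cm, even though the difference itself is much smaller than the sum.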
error propagation multiplication or division
and powers
look at slides