Basic Statistical Toolbox Flashcards
mathematical statistics
concerned with theoretical foundations (probability theory)
applied statistics
concerned with modelling of data and the errors in our observations to make inferences about the physical system we are observing
statistical error
Uncertainty in the measurement of a physical quantity that is essentially unpredictable – just
as likely to yield a measurement that is too large as one that is too small.
‘common sense principle’
if we repeat our measurements many times and average the results then average length = ‘true’ length
systematic error
Uncertainty in the measurement of a physical quantity that is always systematically too large
or too small. (Measurement is biased)
Note that here the systematic error enters not when we make our measurements, but
when we analyse them
Systematic flaws in our data analysis methods, rather than in
our data themselves, are just as serious but
easier to fix
plausible reasoning
probability measures our degree of belief that something is true
prob(X)=1
certain that X is true
prob(X)=0
certain that X is false
In astronomy we generally measure continuous variables which can take on
infinitely many possible values
with infinitely many possible values, p(X) is no longer a probability but a
probability density function
probabilities are never
negative
p(x) ≥ 0 for all x
we compute probabilities by
measuring the area under the pdf curve
i.e. the integral from a to b of p(x) dx
normalisation
integral of p(x) dx from -infinity to infinity = 1
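The last three cards can be checked numerically. A minimal sketch (assuming SciPy and NumPy are available; the choice of a unit Gaussian and the interval (-1, 1) is illustrative only): compute a probability as the area under the pdf, and verify the normalisation condition.

```python
import numpy as np
from scipy.integrate import quad

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Unit-normalised Gaussian pdf."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# prob(a < x < b) = integral from a to b of p(x) dx
prob, _ = quad(gaussian_pdf, -1.0, 1.0)   # ~0.683, the one-sigma probability

# normalisation: the integral over the whole real line is 1
total, _ = quad(gaussian_pdf, -np.inf, np.inf)
```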
important pdfs:
- poisson
- uniform
- central/normal/gaussian
examples of poisson pdfs
number of photons counted by CCD
number of galaxies counted by galaxy survey
photons in laser beam
poisson pdf assumes
detections are independent and there is a constant rate μ
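A hedged sketch of these assumptions, using `scipy.stats.poisson` (assumed available; the rate mu = 3 counts per exposure is an arbitrary illustrative value): with independent detections at constant rate mu, the Poisson pmf is normalised and has mean mu.

```python
from scipy.stats import poisson

mu = 3.0                    # assumed constant rate, e.g. mean photon count per exposure
k = range(0, 50)            # enough terms that the truncated tail is negligible
pmf = poisson.pmf(k, mu)

total = sum(pmf)                                  # ~1 (normalisation)
mean = sum(ki * pi for ki, pi in zip(k, pmf))     # ~mu, the expected count
```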
uniform pdf
p(x) = 1/(b-a) when a < x < b
0 otherwise
what is a cdf
cumulative distribution function
cdf P(a)=
integral from -infinity to a of p(x)dx = prob(x<a)
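The cdf definition above can be checked by direct integration. A sketch assuming SciPy is available (a = 0.5 and the standard normal pdf are arbitrary choices): integrate p(x) from -infinity to a and compare against SciPy's closed-form normal cdf.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

a = 0.5
# P(a) = integral from -infinity to a of p(x) dx = prob(x < a)
P_a, _ = quad(norm.pdf, -np.inf, a)
closed_form = norm.cdf(a)   # SciPy's analytic cdf, for comparison
```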
the nth moment of a pdf is defined as (discrete case)
sum from x=a to x=b of x^n p(x) Δx
nth moment of a pdf for the continuous case
integral from a to b of x^n p(x) dx
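A minimal sketch of the continuous-case moment integral, assuming SciPy is available (the uniform pdf on (2, 5) is an illustrative choice): the 1st moment should equal (a+b)/2 and the 2nd moment (b³-a³)/(3(b-a)).

```python
from scipy.integrate import quad

a, b = 2.0, 5.0
p = lambda x: 1.0 / (b - a)            # uniform pdf on (a, b)

def moment(n):
    """nth moment: integral from a to b of x^n p(x) dx."""
    return quad(lambda x: x**n * p(x), a, b)[0]

mean = moment(1)       # (a+b)/2 = 3.5
mean_sq = moment(2)    # (b^3 - a^3) / (3*(b-a)) = 13.0
```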
1st moment is called the
mean or expectation value
second moment is called
the mean square
the variance for the discrete case is defined as
sum from x=a to x=b of (x - <x>)^2 p(x) Δx
the variance for the continuous case
integral from a to b of (x - <x>)^2 p(x) dx
the variance is often written as
σ^2
σ=sqrt(σ^2) is called
the standard deviation
in general, var[x]=
<x^2> - <x>^2
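The shortcut var[x] = <x^2> - <x>^2 can be sanity-checked on simulated data. A sketch assuming NumPy is available (uniform draws on (2, 5) are an arbitrary choice; the true variance is (b-a)²/12 = 0.75): both the definition and the shortcut should agree.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(2.0, 5.0, 1_000_000)   # samples from a uniform pdf on (2, 5)

var_direct = np.mean((x - x.mean()) ** 2)       # definition: <(x - <x>)^2>
var_shortcut = np.mean(x**2) - x.mean() ** 2    # shortcut: <x^2> - <x>^2
# both should approximate (b-a)^2 / 12 = 0.75
```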
the median divides the CDF into
two equal halves
prob(x<xmed)=
prob(x>xmed)=0.5
the mode is the value of x at which
the pdf is a maximum
for a normal pdf, mean=
median=mode=μ
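This last card can be verified numerically. A sketch assuming SciPy and NumPy are available (mu = 2, sigma = 1.5 are arbitrary illustrative values): the mean, the median (the point where the cdf reaches 0.5), and the mode (the peak of the pdf, found on a fine grid) all coincide at mu for a normal pdf.

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 2.0, 1.5
mean = norm.mean(loc=mu, scale=sigma)         # first moment = mu
median = norm.ppf(0.5, loc=mu, scale=sigma)   # cdf inverse at 0.5 = mu

xs = np.linspace(mu - 5.0, mu + 5.0, 100_001)
mode = xs[np.argmax(norm.pdf(xs, loc=mu, scale=sigma))]   # pdf peak, ~mu
```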