Exam 1 Lecture 2 Flashcards
what does statistics give us tools to do?
- accept conclusions that have a high probability of being correct
- reject conclusions that have a high probability of being incorrect
Gaussian Distribution (AKA normal distribution)
if an experiment is repeated a great many times and if the errors are purely random:
- the results tend to cluster symmetrically about the average value
- the more times the experiment is repeated, the more closely the results approach a Gaussian distribution
the smaller the standard deviation, s, the more closely the data…?
are clustered about the mean (high precision)
greater precision does not necessarily imply greater accuracy!!
standard deviation (s)
measures how closely data are clustered about the mean
relative standard deviation (RSD)
standard deviation expressed as a percentage of the mean
variance
s^2
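A minimal sketch of how the four quantities above relate, using Python's standard `statistics` module. The data values are made up for illustration and do not come from the lecture.

```python
import statistics

# Hypothetical replicate measurements (made-up values for illustration only)
data = [10.1, 10.3, 9.9, 10.2, 10.0]

mean = statistics.mean(data)
s = statistics.stdev(data)            # sample standard deviation (n - 1 in the denominator)
rsd = 100 * s / mean                  # relative standard deviation, as a percentage of the mean
variance = statistics.variance(data)  # sample variance, equal to s squared

print(f"mean = {mean:.3f}, s = {s:.3f}, RSD = {rsd:.2f}%, variance = {variance:.4f}")
```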
in a gaussian curve, the sum of the probabilities of all measurements must be what?
unity (1); the probability of observing a value within a certain interval is proportional to the area under the curve over that interval
the area under the whole curve from z= negative infinity to positive infinity adds up to 1.
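The area statement can be checked numerically. This is a sketch, assuming a standard Gaussian curve expressed in z units (z = distance from the mean in standard deviations); the helper name `area_between` is hypothetical.

```python
import math

def area_between(z1, z2):
    """Area under the standard Gaussian curve between z1 and z2."""
    # Cumulative area from negative infinity up to z
    def cdf(z):
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return cdf(z2) - cdf(z1)

print(area_between(-1, 1))                # fraction of measurements within 1 standard deviation of the mean
print(area_between(-2, 2))                # fraction within 2 standard deviations
print(area_between(-math.inf, math.inf))  # whole curve
```

The last call covers the whole curve, so it returns 1; the narrower intervals return the fraction of measurements expected to fall inside them.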
what does the standard deviation measure (Gaussian curve)
the width of the Gaussian curve
the larger the standard deviation, the broader the curve
the more times a quantity is measured…
the more confident you can be that the mean is close to the population mean
- uncertainty decreases in proportion to 1/sqrt(n)
- u = s/sqrt(n), where u measures the uncertainty in the mean and s measures the uncertainty in a single measurement x; u approaches 0 as n approaches infinity, whereas s approaches a constant value
- you can decrease the uncertainty by a factor of 2 by making 4 times as many measurements
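A short sketch of the 1/sqrt(n) relationship. The value of s is an assumed constant chosen for illustration, and `uncertainty_of_mean` is a hypothetical helper name.

```python
import math

def uncertainty_of_mean(s, n):
    """Standard uncertainty of the mean: u = s / sqrt(n)."""
    return s / math.sqrt(n)

s = 0.20  # assumed standard deviation of a single measurement (illustrative)

# Each step uses 4 times as many measurements as the one before
for n in (4, 16, 64):
    print(f"n = {n:2d}  u = {uncertainty_of_mean(s, n):.4f}")
```

Each fourfold increase in n halves u, while s itself stays the same.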
F-test
used to compare two variances (s^2 values)
- Fcalculated = s1^2/s2^2, where the larger variance goes in the numerator so that Fcalculated >= 1
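- first step toward answering “are the mean values of two sets of measurements statistically different from each other when experimental uncertainty is considered?”: the F-test checks whether the two standard deviations differ significantly, which determines which form of the t test applies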
null hypothesis
states that two sets of data are drawn from populations with the same properties
- observed differences arise only from random variation in measurements
- reject the null hypothesis if there is less than a 5% probability of observing the experimental results when the two populations have the same value of the property being compared
when should you reject the null hypothesis (F-test)
if Fcalculated>Ftable
- there is <5% chance that the two data sets came from populations with the same population standard deviation
- the difference is considered significant
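The F-test decision can be sketched as follows. Both data sets are made up, and `F_TABLE` stands in for the value you would read from the F table used in your course at the matching degrees of freedom; 6.39 is assumed here as the 5% upper-tail value for 4 and 4 degrees of freedom.

```python
import statistics

def f_calculated(set_a, set_b):
    """Ratio of the two sample variances, with the larger variance on top."""
    var_a = statistics.variance(set_a)
    var_b = statistics.variance(set_b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical data sets (made-up values for illustration only)
set_a = [10.1, 10.3, 9.9, 10.2, 10.0]
set_b = [10.4, 9.6, 10.5, 9.8, 10.2]

F_TABLE = 6.39  # assumption: look this up in your own F table

f = f_calculated(set_a, set_b)
if f > F_TABLE:
    print(f"F = {f:.2f} > {F_TABLE}: standard deviations differ significantly")
else:
    print(f"F = {f:.2f} <= {F_TABLE}: difference is not significant")
```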
Student’s t
statistical tool used to find confidence intervals and to compare mean values measured by different methods or in different experiments
confidence intervals
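confidence interval = x ± ts/sqrt(n), where x is the sample mean, s is the sample standard deviation, n is the number of measurements, and t is Student's t for n - 1 degrees of freedom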
- if we were to repeat n measurements many times, the 95% confidence interval would include the true population mean (whose value we do not know) in 95% of the sets of n measurements
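A sketch of the confidence interval calculation, mean ± ts/sqrt(n). The data are made up, and t must be looked up for the chosen confidence level and n - 1 degrees of freedom; 2.776 is the tabulated value for 95% confidence and 4 degrees of freedom.

```python
import math
import statistics

def confidence_interval(data, t):
    """Return (low, high) for mean ± t*s/sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    half_width = t * statistics.stdev(data) / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical replicate measurements (made-up values for illustration only)
data = [10.1, 10.3, 9.9, 10.2, 10.0]
t_95 = 2.776  # Student's t, 95% confidence, 4 degrees of freedom

low, high = confidence_interval(data, t_95)
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```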
t Test (comparison of means)
determines whether there is a statistically significant difference between the means x1 and x2
- if you make two sets of measurements of the same quantity, generally x1 does not equal x2 due to random variations in measurements
- “Are the means of two sets of measurements statistically different?”
when should you reject the null hypothesis (t Test)?
if tcalculated > ttable
- there is a <5% chance that the two data sets came from populations with the same population mean; the difference is considered significant
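The cards do not give a formula for tcalculated, so this sketch assumes the pooled-standard-deviation form, which applies when the F-test finds no significant difference between the two standard deviations. The data are made up; 2.306 is the tabulated t for 95% confidence and n1 + n2 - 2 = 8 degrees of freedom.

```python
import math
import statistics

def t_calculated(set_1, set_2):
    """t for comparing two means, assuming the pooled-standard-deviation form."""
    n1, n2 = len(set_1), len(set_2)
    var1 = statistics.variance(set_1)
    var2 = statistics.variance(set_2)
    # Pooled standard deviation from both sets
    s_pooled = math.sqrt((var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2))
    diff = abs(statistics.mean(set_1) - statistics.mean(set_2))
    return diff / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

# Hypothetical data sets (made-up values for illustration only)
set_1 = [10.1, 10.3, 9.9, 10.2, 10.0]
set_2 = [10.5, 10.7, 10.4, 10.6, 10.3]
T_TABLE = 2.306  # Student's t, 95% confidence, 8 degrees of freedom

t = t_calculated(set_1, set_2)
if t > T_TABLE:
    print(f"t = {t:.2f} > {T_TABLE}: the means differ significantly")
else:
    print(f"t = {t:.2f} <= {T_TABLE}: difference is not significant")
```

If the F-test had found the standard deviations to be significantly different, a different form of the t test would be needed; that case is not covered here.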
Grubbs Test for Outliers
if Gcalculated > Gtable, the questionable point should be discarded; only one outlier may be rejected using the Grubbs test
Gcalculated = |questionable value - x| / s, where x is the mean and s is the standard deviation of the complete data set (questionable point included)
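A sketch of the Grubbs calculation, taking G as the absolute distance of the questionable value from the mean divided by the sample standard deviation s of the full data set. The data are made up, the helper name `grubbs_g` is hypothetical, and 1.822 is assumed as the tabulated 95% critical value for 6 observations; check it against your own table.

```python
import statistics

def grubbs_g(data):
    """Return (questionable value, G) for the point farthest from the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # computed with the questionable point included
    questionable = max(data, key=lambda value: abs(value - mean))
    return questionable, abs(questionable - mean) / s

# Hypothetical measurements with one suspicious value (made-up for illustration)
data = [10.1, 10.3, 9.9, 10.2, 10.0, 11.5]
G_TABLE = 1.822  # assumption: 95% confidence, 6 observations

point, g = grubbs_g(data)
if g > G_TABLE:
    print(f"G = {g:.3f} > {G_TABLE}: discard {point}")
else:
    print(f"G = {g:.3f} <= {G_TABLE}: retain {point}")
```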