PART II: Python For Basic Data Analysis Flashcards
What are the functions you have to import to plot/show a graph?
From pylab (or matplotlib) and numpy
plot, show, xlabel, title, legend, xlim, etc.
-import data using loadtxt
-use linspace to get x values and then calc y
-assign x and y manually with lists
-can also use errorbar()
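A minimal sketch of the workflow above, assuming matplotlib and numpy are installed. All data values and the output filename are made up for illustration; with data in a file, loadtxt would replace the hand-typed lists.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Use linspace to get x values, then calculate y from them
x = np.linspace(0.0, 10.0, 50)
y = np.sin(x)

# Or assign x and y manually with lists (made-up measurements)
x_data = [1.0, 3.0, 5.0, 7.0, 9.0]
y_data = [0.8, 0.2, -0.9, 0.6, 0.4]
y_err = [0.1, 0.1, 0.2, 0.1, 0.15]

lines = plt.plot(x, y, label="sin(x)")                       # smooth curve
plt.errorbar(x_data, y_data, yerr=y_err, fmt="o", label="data")  # points with error bars
plt.xlabel("x")
plt.ylabel("y")
plt.title("Example plot")
plt.xlim(0, 10)
plt.legend()
plt.savefig("example_plot.png")  # in an interactive session, call plt.show() instead
```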
How can we modify the lines and colours on a plot?
Colours: r,g,b,c,m,y,k,w
Line styles: - (solid), -- (dashed), : (dotted); o draws circle markers instead of a line
Scatter plots
Use scatter() from pylab if you don't want to connect the dots
Density plots
- for chi^2 mostly
- use imshow() from pylab
- by default the y axis runs top to bottom, x left to right
- flip y so it runs bottom to top with origin="lower"
- parameter extent=[xl,xu,yl,yu] gives range of x and y values
- aspect=# specifies aspect ratio of x and y axes
- colorbar() shows range of colours to help read density plot
- also different colour schemes exist! (Ex. spectral(), named nipy_spectral() in newer matplotlib)
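A rough sketch of a density plot using the parameters listed above. The chi^2 surface, the parameter names a and b, the viridis colour scheme, and the output filename are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical chi^2 surface over two fit parameters, minimum placed at (a, b) = (2, 1)
a = np.linspace(0.0, 4.0, 80)   # plotted along x
b = np.linspace(0.0, 2.0, 40)   # plotted along y
A, B = np.meshgrid(a, b)        # shape (40, 80): rows follow b, columns follow a
chi2 = ((A - 2.0) / 0.5) ** 2 + ((B - 1.0) / 0.25) ** 2

image = plt.imshow(
    chi2,
    origin="lower",                     # put row 0 at the bottom so y increases upward
    extent=[a[0], a[-1], b[0], b[-1]],  # [xl, xu, yl, yu]
    aspect="auto",                      # a number here sets the axis aspect ratio instead
    cmap="viridis",                     # one of many available colour schemes
)
plt.colorbar(image, label="chi^2")      # shows which colour maps to which value
plt.xlabel("a")
plt.ylabel("b")
plt.savefig("density_plot.png")
```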
What kinds of errors or uncertainties are there?
Systematic: shifts every measurement in the same direction (a consistent bias)
Random: scatters measurements in both directions at random
What is discrepancy?
The difference between two measured values of the same quantity
When is discrepancy significant?
If it’s larger than both error ranges combined (as in, there’s no overlap of error bars)
When are the two measurements not consistent?
If the discrepancy is significant
Which means if it’s larger than both error ranges combined
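The overlap rule above can be sketched as a small check. The function name and the measurement values are made up for illustration.

```python
def is_significant(a, err_a, b, err_b):
    """Return True when the error bars of a ± err_a and b ± err_b do not overlap."""
    discrepancy = abs(a - b)
    return discrepancy > err_a + err_b

# Made-up measurements of the same quantity
print(is_significant(9.8, 0.1, 9.5, 0.1))  # True: discrepancy 0.3 exceeds 0.1 + 0.1
print(is_significant(9.8, 0.2, 9.5, 0.2))  # False: discrepancy 0.3 is within 0.2 + 0.2
```

When the function returns True, the two measurements are not consistent.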
Error propagation: sums, differences, products, quotients, how???
Sums and differences: ABSOLUTE errors add in quadrature (square root of the sum of squares)
Products and quotients: RELATIVE errors add in quadrature
For compound expressions, propagate one step at a time in BEDMAS order
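A short sketch of these rules, assuming the measurements are independent. The helper names and all numbers are made up for illustration.

```python
import math

def add_sub_error(err_a, err_b):
    """Absolute error on a + b or a - b: absolute errors in quadrature."""
    return math.hypot(err_a, err_b)

def mul_div_error(result, a, err_a, b, err_b):
    """Absolute error on a * b or a / b: relative errors in quadrature,
    scaled back up by the size of the result."""
    relative = math.hypot(err_a / a, err_b / b)
    return abs(result) * relative

# Made-up measurements: a = 10.0 ± 0.3, b = 5.0 ± 0.4, c = 2.0 ± 0.1
a, err_a = 10.0, 0.3
b, err_b = 5.0, 0.4
c, err_c = 2.0, 0.1

err_sum = add_sub_error(err_a, err_b)                # error on a + b
err_prod = mul_div_error(a * b, a, err_a, b, err_b)  # error on a * b
err_quot = mul_div_error(a / b, a, err_a, b, err_b)  # error on a / b

# BEDMAS for (a + b) * c: do the bracket first, then the product
s = a + b
err_total = mul_div_error(s * c, s, err_sum, c, err_c)

print(f"a + b = {a + b:.1f} ± {err_sum:.2f}")
print(f"a * b = {a * b:.1f} ± {err_prod:.2f}")
print(f"a / b = {a / b:.2f} ± {err_quot:.2f}")
print(f"(a + b) * c = {s * c:.1f} ± {err_total:.2f}")
```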
Error propagation: two correlated or dependent conditions
Condition with lowest error is considered in error calculations
What is probability?
The chance of getting one of a subset of N outcomes out of a total of T equally likely possible outcomes:
P = N/T
Properties of probability?
- 0 ≤ P ≤ 1
What do we do if we have two independent conditions? (Probability)
Neither affects the probability of the other, so the probability of both happening is the product: P(A and B) = P(A) × P(B)
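As an illustration with two fair dice (an example chosen here, not taken from the cards), multiplying the individual probabilities gives the same answer as counting outcomes with P = N/T.

```python
from fractions import Fraction
from itertools import product

# Independent events: each die shows a six with probability 1/6
p_six = Fraction(1, 6)
p_both = p_six * p_six  # multiply the probabilities

# Cross-check by counting: N favourable outcomes out of T equally likely ones
outcomes = list(product(range(1, 7), repeat=2))  # all 36 (die1, die2) pairs
favourable = [o for o in outcomes if o == (6, 6)]
p_counted = Fraction(len(favourable), len(outcomes))

print(p_both, p_counted)  # 1/36 1/36
```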
How do we represent statistical significance?
- we use n*sig (the probability of a result at least n standard deviations away from the mean of a Gaussian distribution)
- 1sig = 0.317
- 2sig = 0.0455
- outcomes are significant if the p-value is less than or equal to the sig level
- assume two-sided probabilities
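The two-sided Gaussian tail probabilities can be computed with the complementary error function from the standard library. A minimal sketch, with a hypothetical function name:

```python
import math

def two_sided_p(n_sigma):
    """Probability of landing at least n_sigma standard deviations from the
    mean of a Gaussian, counting both tails."""
    return math.erfc(n_sigma / math.sqrt(2))

for n in (1, 2, 3):
    print(n, round(two_sided_p(n), 4))
# 1 0.3173
# 2 0.0455
# 3 0.0027
```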
Define p-value
Probability of getting data at least as extreme as ours if the null hypothesis were true
Higher = data consistent with “nothing happening”
Lower = data harder to explain by the null hypothesis, so the result is more significant