Model Fitting Flashcards

1
Q

{x1, x2, …, xM} =

A

random sample from pdf p(x) with mean μ and variance σ^2

2
Q

sample mean

A

μ hat = (1/M) sum from i=1 to M of xi

3
Q

as sample size increases, sample mean

A

becomes increasingly concentrated around the true mean μ

4
Q

var(μ hat)=

A

σ^2/M

5
Q

for any pdf with finite variance σ^2, as M approaches infinity, μ hat follows

A

a normal pdf with mean μ and variance σ^2/M
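
A quick numerical sketch of the cards above (the exponential pdf and the sample sizes below are assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 100            # sample size
n_trials = 10_000  # number of independent samples of size M

# Exponential pdf with mean mu = 1 and variance sigma^2 = 1 (deliberately non-normal)
samples = rng.exponential(scale=1.0, size=(n_trials, M))

# Sample mean of each trial: mu_hat = (1/M) * sum of the xi
mu_hats = samples.mean(axis=1)

print(mu_hats.mean())  # ~1.0   (close to the true mean mu)
print(mu_hats.var())   # ~0.01  (close to sigma^2 / M)
```

Despite the skewed parent pdf, a histogram of mu_hats already looks close to normal, as the central limit theorem predicts.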

6
Q

the central limit theorem explains

A

the importance of the normal pdf in statistics

but it is still based on the asymptotic behaviour of an infinite ensemble of samples that we didn’t actually observe

7
Q

bivariate normal pdf

A

p(x,y), which is specified by the five parameters μx, μy, σx, σy and ρ

often used in the physical sciences to model the joint pdf of two random variables

8
Q

the first four parameters of the bivariate normal pdf are

A

equal to the following expectation values

E(x)=μx
E(y)=μy
var(x)=σx^2
var(y)=σy^2

9
Q

the parameter ρ is known as the

A

correlation coefficient

10
Q

what does the correlation coefficient satisfy?

A

E[(x-μx)(y-μy)] = ρσxσy

11
Q

if ρ = 0, then

A

x and y are independent (true for the bivariate normal; in general, zero correlation does not imply independence)

12
Q

what is E[(x-μx)(y-μy)] = ρσxσy also known as?

A

the covariance of x and y, often denoted cov(x,y)
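
A sketch verifying this identity on samples drawn from a bivariate normal (the values of σx, σy and ρ below are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
mu_x, mu_y = 0.0, 0.0
sigma_x, sigma_y, rho = 2.0, 3.0, 0.8    # illustrative parameter values

cov_xy = rho * sigma_x * sigma_y         # cov(x,y) = rho*sigma_x*sigma_y = 4.8
cov_matrix = [[sigma_x**2, cov_xy],
              [cov_xy, sigma_y**2]]
x, y = rng.multivariate_normal([mu_x, mu_y], cov_matrix, size=100_000).T

# Sample estimate of E[(x - mu_x)(y - mu_y)]: should be close to 4.8
print(np.mean((x - mu_x) * (y - mu_y)))
print(np.cov(x, y)[0, 1])   # numpy's covariance estimate agrees
```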

13
Q

what does the covariance define?

A

how one variable (x) varies with another variable (y)

14
Q

ρ > 0

A

positive correlation
y tends to increase as x increases

15
Q

ρ < 0

A

negative correlation
y tends to decrease as x increases

16
Q

the contours of the bivariate normal pdf become narrower and steeper as

A

|ρ| approaches 1

17
Q

what is Pearson’s product-moment correlation coefficient?

A

r

given sampled data, used to estimate the correlation between variables
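
A sketch of computing r with numpy (the synthetic data below are constructed so that the true correlation is about 0.8):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 0.8 * x + rng.normal(scale=0.6, size=500)   # correlated with x by construction

# Pearson's r: the sample estimate of the correlation coefficient
r = np.corrcoef(x, y)[0, 1]
print(r)   # ~0.8
```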

18
Q

if p(x,y) is bivariate normal, then r is

A

an estimator of ρ

19
Q

the correlation coefficient is a unitless version of

A

the covariance

20
Q

if x and y are independent variables, cov(x,y)=

A

0

so p(x,y)=p(x)p(y)

21
Q

the method of least squares

A

the workhorse method for fitting lines and curves to data in the physical sciences

a useful demonstration of the underlying statistical principles

22
Q

ordinary least squares

A

the scatter in a plot of (x, y) is assumed to arise from errors in only one of the two variables

23
Q

ordinary least squares - can write

A

yi = a + bxi + εi

24
Q

what is εi?

A

the residual of the ith data point

i.e. the difference between the observed value of yi and the value predicted by the best fit, characterised by parameters a and b

25
Q

we assume that the εi are

A

an independent and identically distributed (i.i.d.) random sample from some underlying probability density function with mean zero and variance σ^2

(residuals are equally likely to be positive or negative and all have equal variance)

26
Q

dS/da = 0 when

A

a = a_LS, the least-squares estimate (here S = sum of the squared residuals εi^2)
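
Setting dS/da = 0 and dS/db = 0 gives closed-form estimates; a sketch on synthetic data (the true values a = 1.5, b = 0.7 are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 20)
y = 1.5 + 0.7 * x + rng.normal(scale=0.5, size=x.size)   # yi = a + b*xi + eps_i

# Closed-form ordinary least-squares estimates from dS/da = 0, dS/db = 0
b_ls = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a_ls = y.mean() - b_ls * x.mean()
print(a_ls, b_ls)           # close to the true a = 1.5, b = 0.7

print(np.polyfit(x, y, 1))  # cross-check: numpy returns [b, a]
```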

27
Q

Weighted least squares is an efficient method that makes good use of

A

small data sets

28
Q

weighted least squares - in the case where σi^2 is constant for all i, the formulae

A

reduce to those for the unweighted case
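
A sketch of weighted least squares for a straight line with weights wi = 1/σi^2; if all σi are equal, the weights cancel and the formulae below reduce to the ordinary unweighted ones. The data and per-point errors are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0.0, 10.0, 20)
sigma = rng.uniform(0.3, 1.0, size=x.size)        # per-point uncertainties sigma_i
y = 1.5 + 0.7 * x + rng.normal(scale=sigma)

w = 1.0 / sigma**2                                # weights w_i = 1 / sigma_i^2

# Weighted least-squares estimates for y = a + b*x
x_w = np.average(x, weights=w)                    # weighted means
y_w = np.average(y, weights=w)
b_wls = np.sum(w * (x - x_w) * (y - y_w)) / np.sum(w * (x - x_w) ** 2)
a_wls = y_w - b_wls * x_w
print(a_wls, b_wls)   # close to the true a = 1.5, b = 0.7
```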

29
Q

principle of maximum likelihood is a method to

A

estimate the parameters of a distribution so that it best fits the observed data

30
Q

principle of maximum likelihood - first

A

decide which model we think best describes the process of generating the data.

31
Q

Maximum likelihood estimation is a method that will find the values

A

of μ and σ that result in the curve that best fits the data

32
Q

Assuming all events are independent, the total probability of observing all of the data is

A

the product of observing each data point individually (i.e. the product of the individual probabilities)
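
A sketch of the cards above for a normal model: because the observations are independent, the likelihood is a product, so we maximise the sum of log-probabilities (equivalently, minimise its negative). The data and starting values are assumed for illustration:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
data = rng.normal(loc=4.0, scale=2.0, size=200)   # "observed" data

# Independence => likelihood = product of individual pdfs,
# so the log-likelihood is a sum of log-pdfs.
def neg_log_likelihood(params):
    mu, log_sigma = params                        # fit log(sigma) to keep sigma > 0
    return -np.sum(norm.logpdf(data, loc=mu, scale=np.exp(log_sigma)))

result = minimize(neg_log_likelihood, x0=[0.0, 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(mu_hat, sigma_hat)   # close to the true mu = 4.0, sigma = 2.0
print(norm.fit(data))      # scipy's built-in normal MLE, for comparison
```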

33
Q

when is χ² used?

A

when there are definite, countable outcomes, e.g. flipping a coin, or measuring whether an email arrival rate is constant in time => no errors on the measurements

34
Q

when is reduced χ² used?

A

when we know there is uncertainty or variance in a measured quantity, e.g. measuring the flux from a galaxy => errors on the measurements

35
Q

Poisson distribution, k=

A

1 (mean)

36
Q

normal distribution, k=

A

2 (mean and variance)

37
Q

degrees of freedom=

A

N - k - 1

38
Q

For the reduced χ², we don’t know the number of possible outcomes, so the degrees of freedom are based on

A

the number of data points
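
A sketch of a reduced χ² for measurements with error bars (the flux values, errors, and constant-flux model are assumed for illustration; the degrees of freedom follow the N - k - 1 formula above, with N the number of data points):

```python
import numpy as np

# Hypothetical flux measurements with per-point errors
y = np.array([10.2, 9.7, 10.5, 9.9, 10.1, 10.4])
err = np.array([0.3, 0.4, 0.3, 0.5, 0.3, 0.4])

# Model: constant flux, fitted as the weighted mean (k = 1 parameter)
model = np.average(y, weights=1.0 / err**2)

chi2_value = np.sum(((y - model) / err) ** 2)
dof = y.size - 1 - 1                  # N - k - 1, with N = number of data points
print(chi2_value / dof)               # reduced chi^2; ~1 indicates a good fit
```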

39
Q

p-value

A

If the null hypothesis were true, how probable is it that we would measure as large a value of χ², or larger?
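
Given a χ² value and its degrees of freedom, the p-value is the upper-tail probability; a sketch with scipy (the numbers are illustrative):

```python
from scipy.stats import chi2

# P(measuring a chi^2 this large or larger, if the null hypothesis is true)
p_value = chi2.sf(9.5, df=4)   # illustrative chi^2 value and dof
print(p_value)                 # ~0.05; compare against the threshold on the next card
```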

40
Q

standard value to reject a hypothesis

A

a p-value < 0.05

41
Q

If we obtain a very small p-value (e.g. a few percent), we can interpret this as

A

providing little support for the null hypothesis, which we may then choose to reject.
