Statistics Flashcards

1
Q

P-value

A

probability, given H0 is true, of obtaining a test statistic at least as extreme as the one observed from the data (a small p value provides evidence against H0)
(if the p value is small then the observation would be unlikely under the null hypothesis, so the data is statistically significant enough to reject it)
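A minimal sketch of this computation for a two-sided z test, using only the standard library (function names are my own):

```python
import math

def normal_cdf(z):
    # CDF of the standard normal, via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_sided_p_value(z):
    # probability, under H0, of a test statistic at least as extreme as |z|
    return 2 * (1 - normal_cdf(abs(z)))

p = two_sided_p_value(1.96)  # ~0.05, right on the boundary of 5% significance
```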

2
Q

Confidence interval

A

a range of plausible values for an unknown parameter, constructed so that it contains the true parameter with a stated probability (e.g. 95%); it measures the degree of uncertainty in the sampling method
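As a sketch, the 95% interval for a mean with known variance is x̄ ± 1.96σ/√n (function name is my own):

```python
import math

def mean_ci(xbar, sigma, n, z=1.96):
    # 95% CI for a mean with known sigma: xbar ± z * sigma / sqrt(n)
    half_width = z * sigma / math.sqrt(n)
    return (xbar - half_width, xbar + half_width)

low, high = mean_ci(xbar=10.0, sigma=2.0, n=100)  # (9.608, 10.392)
```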

3
Q

Odds ratio

A

describes the strength of the association between two events: for a 2x2 table it is ad/bc, i.e. the odds of A in the presence of B divided by the odds of A in the absence of B, (p1/(1-p1)) / (p2/(1-p2))
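A sketch of the ad/bc computation for a 2x2 table (function name and cell layout are my own conventions):

```python
def odds_ratio(a, b, c, d):
    # 2x2 table: a, b = event / no event with B present; c, d = with B absent
    # OR = (a/b) / (c/d) = ad / bc
    return (a * d) / (b * c)
```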

4
Q

Z distribution

A

the standard normal distribution; used when the variance is known, or as an approximation when the sample size is large

5
Q

t distribution

A

used in place of the normal distribution when the variance is unknown and must be estimated from the sample; important for small sample sizes (tends to the normal as n grows)

6
Q

power

A

probability of avoiding a type II error (where a type II error is failing to reject a false null hypothesis): power = 1 - P(type II error) = P(reject H0 | H0 false)
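A sketch of the power of a one-sided z test for a shift of size delta (formula is the standard normal approximation; names are my own):

```python
import math

def normal_cdf(z):
    # CDF of the standard normal, via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def power_one_sided_z(delta, sigma, n, z_alpha=1.645):
    # power = P(reject H0 | true difference is delta) = 1 - P(type II error)
    return normal_cdf(delta * math.sqrt(n) / sigma - z_alpha)
```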

7
Q

Type 1 error

A

H0 is rejected but H0 is true (a false positive)

8
Q

Type 2 error

A

H0 is accepted (not rejected) but H0 is false (a false negative)

9
Q

Unbiased

A

the expected value of the estimator equals the parameter, E(T) = θ (where the estimator T is an estimate of the parameter θ)

10
Q

MSE

A

MSE(T) = Var(T) + bias(T)^2
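The decomposition as a one-line function (name is my own):

```python
def mse(variance, bias):
    # mean squared error of an estimator T: MSE(T) = Var(T) + bias(T)^2
    return variance + bias ** 2
```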

11
Q

consistent

A

the statistic tends (in probability) to the parameter as n increases

12
Q

Method of moments

A

equate the population moments to the sample moments — e.g. set E(X) = x̄ and solve in terms of the parameter; equate the variance (second moment) as well if the distribution has multiple parameters
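For example, for an Exp(λ) sample, E(X) = 1/λ, so equating to the sample mean gives λ̂ = 1/x̄ (a sketch; function name is my own):

```python
def mom_exponential_rate(sample):
    # Exp(lambda): E(X) = 1/lambda, so equating to the sample mean gives 1/xbar
    xbar = sum(sample) / len(sample)
    return 1 / xbar
```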

13
Q

MLE

A

method for choosing the ‘best’ parameter value: the one that maximises the likelihood, i.e. the probability of the parameter producing the observed sample
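A sketch for Bernoulli trials: a crude grid search over the log-likelihood recovers the closed-form MLE p̂ = successes/n (names are my own):

```python
import math

def bernoulli_log_likelihood(p, successes, n):
    # log-likelihood of observing `successes` in n Bernoulli(p) trials
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

def mle_grid(successes, n, steps=1000):
    # crude grid search over p; the maximiser matches p_hat = successes / n
    return max(range(1, steps),
               key=lambda i: bernoulli_log_likelihood(i / steps, successes, n)) / steps
```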

14
Q

Sufficient

A

a statistic is sufficient if it contains all the information in the sample about the parameter

15
Q

Neyman-Fisher factorisation theorem

A

T is sufficient for θ if and only if the likelihood can be factorised as L(θ; x) = h(x) g(t(x), θ)

16
Q

Cramér–Rao lower bound

A

the smallest the variance of any unbiased estimator can become is 1/I(θ), where I(θ) is the Fisher information

17
Q

Invariance property

A

if g is a 1-1 monotonic function and θ̂ is the MLE of θ, then g(θ̂) is the MLE of g(θ)

18
Q

significance level/ size

A

α = P(type I error) = P(reject H0 | H0 true)

19
Q

Statistic to use when comparing two variances

A

F statistic (the ratio of two independent chi-squared variables, each divided by its degrees of freedom)

20
Q

What to do if testing two means (non independent)

A

take the within-pair differences between the two samples, then apply a one-sample t test to the differences (paired t test)
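The reduction to a one-sample test can be sketched as (function name is my own):

```python
import math

def paired_t_statistic(x, y):
    # reduce paired samples to their differences, then a one-sample t test vs 0
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    dbar = sum(d) / n
    s = math.sqrt(sum((di - dbar) ** 2 for di in d) / (n - 1))
    return dbar / (s / math.sqrt(n))  # compare against t with n-1 df
```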

21
Q

difference between classical and Bayesian

A

classical: θ is an unknown fixed parameter and the likelihood is a function of the sample
Bayesian: the parameter is treated as a random variable onto which we assign prior beliefs

22
Q

model deviance

A

sum of the squared differences between the observed values and the model’s fitted values (the residual sum of squares)

23
Q

least squares

A

approximate a solution (the parameters of the model) by minimising the sum of the squared residuals
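For simple linear regression the minimiser has a closed form (a sketch; function name is my own):

```python
def least_squares_line(x, y):
    # closed-form simple linear regression minimising the squared residuals
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return slope, intercept
```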

24
Q

Gauss-Markov Theorem

A

If β̂ is the least squares estimator of β, then aᵀβ̂ is the unique linear unbiased estimator of aᵀβ with minimum variance

25
Q

test for existence of regression

A

using the F statistic (e.g. comparing the difference of model deviances, scaled by degrees of freedom)

26
Q

When to transform variables of a model

A

we transform the variables of a model (Y or X) if the residuals don’t look random (e.g. they show a pattern or non-constant variance)

27
Q

total sum of squares

A

the total variability in y

28
Q

coefficient of determination (R^2)

A

proportion of variability explained by the regression (model)
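A sketch of the computation, R² = 1 − SS_res/SS_tot (function name is my own):

```python
def r_squared(y, fitted):
    # R^2 = 1 - SS_res / SS_tot: proportion of variability explained by the model
    ybar = sum(y) / len(y)
    ss_tot = sum((yi - ybar) ** 2 for yi in y)        # total variability in y
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - ss_res / ss_tot
```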

29
Q

ANOVA

A

uses the F statistic to test for the existence of regression, i.e. whether the model explains a significant share of the variability

30
Q

decision rule

A

formal rule which spells out the circumstances under which you would reject the null hypothesis

31
Q

Bernoulli distribution

A

two outcomes (success or failure), with probability p

32
Q

Binomial distribution

A

two outcomes (success or failure), repeated n times, probability p, x is number of successes

33
Q

Geometric distribution

A

two outcomes (success or failure), x is number of trials until a success occurs

34
Q

Poisson distribution

A

models counts — the number of events occurring at a given rate λ (originally used to approximate the binomial distribution with large n and small p)

35
Q

chi-squared distribution

A

the square of a standard normal variable has a chi-squared distribution with 1 degree of freedom; a sum of n independent squared standard normals has n degrees of freedom

36
Q

Central limit theorem

A

when independent random variables are added, their properly normalised sum tends toward a normal distribution even if the original variables themselves are not normally distributed
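A quick simulation sketch (the seed and sample sizes are arbitrary choices): each summand is uniform, yet the normalised sum behaves like N(0, 1).

```python
import random

random.seed(0)
# each draw: sum of 12 Uniform(0,1) minus 6 has mean 0 and variance 1;
# by the CLT this normalised sum is approximately N(0, 1)
samples = [sum(random.random() for _ in range(12)) - 6 for _ in range(10_000)]
mean = sum(s for s in samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```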

37
Q

Confounding variable

A

say we want to show the effects of X on Y then C is a confounder if it has an effect on X and Y. So confounding is when the true effect of X on Y is hidden by another variable.

38
Q

Interaction

A

an interaction is when the effect of one variable on the outcome depends on another variable.

39
Q

What information would you need from a clinician in order to perform a sample size calculation?

A

We need the power (equivalently the type II error rate), the type I error rate, the clinically relevant difference, the known variance σ², and for a two-sample test the allocation ratio between treatment groups. Then we approximate n via the Z distribution (normal) and iterate to refine n based on the t distribution.
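The normal-approximation step can be sketched as follows (one-sample case; the default z values assume two-sided α = 0.05 and 80% power, and the function name is my own):

```python
import math

def sample_size_one_mean(delta, sigma, z_alpha=1.96, z_beta=0.84):
    # normal-approximation n for detecting a clinically relevant difference
    # delta with two-sided alpha = 0.05 (z=1.96) and 80% power (z=0.84);
    # in practice this is then refined by iterating with the t distribution
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)
```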

40
Q

Standard error

A

a measure of the statistical accuracy of an estimate, equal to the standard deviation of the sampling distribution of the estimate

41
Q

Standard deviation

A

a quantity expressing by how much the members of a group differ from the mean value for the group