Final Flashcards

1
Q

The probability of either event A or event B occurring when they are not mutually exclusive

A

Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B)
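
A minimal Python sketch (probabilities are made up) applying the rule:

```python
# Addition rule for two events that are not mutually exclusive.
p_a = 0.30        # Pr(A), assumed value
p_b = 0.40        # Pr(B), assumed value
p_a_and_b = 0.12  # Pr(A and B), assumed value

p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # 0.58
```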

2
Q

Bayes' theorem

A

Pr(A | B) = Pr(B | A) Pr(A) / Pr(B)
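
A minimal Python sketch applying the theorem to a hypothetical diagnostic-test example (all values made up; A = "has disease", B = "tests positive"):

```python
p_a = 0.01               # Pr(A): prevalence (assumed)
p_b_given_a = 0.95       # Pr(B | A): sensitivity (assumed)
p_b_given_not_a = 0.05   # Pr(B | not A): false-positive rate (assumed)

p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)  # total probability of testing positive
p_a_given_b = p_b_given_a * p_a / p_b                  # Bayes' theorem
print(round(p_a_given_b, 3))                           # ~0.161
```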

3
Q

Steps of hypothesis testing

A

1) state the null and alternative hypotheses
2) compute test statistic
3) determine p-value
4) draw appropriate conclusions
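
A minimal sketch of the four steps using a one-sample t-test on made-up data (assumes scipy is available):

```python
from scipy import stats

# 1) state hypotheses: H0: mu = 10  vs.  HA: mu != 10
sample = [9.1, 10.4, 8.7, 9.9, 9.2, 10.1, 8.8, 9.5]  # made-up measurements

# 2) compute the test statistic and 3) determine the p-value
t_stat, p_value = stats.ttest_1samp(sample, popmean=10)

# 4) draw appropriate conclusions
alpha = 0.05
print(t_stat, p_value, "reject H0" if p_value <= alpha else "fail to reject H0")
```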

4
Q

significance level

A
  • Greek letter alpha
  • commonly 0.05 in biology
  • probability used as a criterion for rejecting the null hypothesis
  • if the p-value is less than or equal to alpha, reject the null hypothesis
5
Q

type I error

A
  • rejecting a true null hypothesis

- determined by significance level

6
Q

type II error

A
  • failing to reject a false null hypothesis

- a low type II error rate means high power

7
Q

power

A

probability that a random sample will lead to rejection of a false null hypothesis

8
Q

non-significant result

A

failing to reject the null hypothesis

9
Q

assumptions of the chi-squared test

A
  • none of the categories have an expected frequency of less than 1
  • No more than 20% of categories have an expected frequency of less than 5
  • random sample
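
A sketch of checking the expected-frequency assumptions before a chi-squared goodness-of-fit test, with made-up counts (assumes numpy and scipy):

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([18, 30, 22, 10])           # made-up category counts
expected = np.full(4, observed.sum() / 4)       # e.g. an equal-proportions null

# assumption checks from the card
assert (expected >= 1).all(), "a category has expected frequency < 1"
assert (expected < 5).mean() <= 0.20, "more than 20% of expected frequencies < 5"

chi2, p = chisquare(f_obs=observed, f_exp=expected)
print(chi2, p)
```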
10
Q

poisson distribution

A

describes the number of successes in blocks of time or space, when successes happen independently of each other and with equal probability at every instant in time or point in space

  • random pattern (meets the criteria for the distribution)
  • clumped or dispersed patterns (do not meet the criteria for the distribution)
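
A small sketch comparing made-up counts per block to the Poisson probabilities implied by their mean (assumes numpy and scipy):

```python
import numpy as np
from scipy.stats import poisson

counts = np.array([0, 2, 1, 3, 0, 1, 2, 1, 0, 2])  # made-up successes per block
mu = counts.mean()                                  # estimated rate per block

# expected probability of 0, 1, 2, 3 successes per block under a Poisson model
for k in range(4):
    print(k, round(poisson.pmf(k, mu), 3))
```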
11
Q

the odds ratio

A
  • measures the magnitude of association between two categorical variables when each has only two categories
  • one variable is the response, the other is explanatory (it defines the groups whose odds of success are being compared)
  • odds(o) = probability of success / probability of failure
  • odds ratio (o1/o2) = odds of success in one group divided by odds of success in a second group
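
A minimal sketch of odds and the odds ratio from a made-up 2×2 table:

```python
# Rows: group 1 (e.g. treatment) and group 2 (e.g. control); columns: success / failure.
success_1, failure_1 = 30, 20
success_2, failure_2 = 15, 35

o1 = success_1 / failure_1   # Pr(success)/Pr(failure) reduces to a ratio of counts
o2 = success_2 / failure_2

odds_ratio = o1 / o2
print(odds_ratio)            # (30/20) / (15/35) = 3.5
```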
12
Q

relative risk

A
  • another commonly used measure of the association between two categorical variables when both have just two categories
  • RR = probability of undesired outcome in treatment group / probability of undesired outcome in control group
  • will be approximately equal to the odds ratio when the focal outcome is rare
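
A minimal sketch of relative risk from made-up counts:

```python
bad_treatment, n_treatment = 8, 100   # undesired outcomes in the treatment group
bad_control, n_control = 20, 100      # undesired outcomes in the control group

rr = (bad_treatment / n_treatment) / (bad_control / n_control)
print(round(rr, 2))                   # 0.4: treatment risk is 40% of control risk
```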
13
Q

positives vs. negatives for correlation coefficient

A
  • the sum of cross-products will be positive if: most of the observations lie in the lower-left or upper-right corners
  • the sum will be negative if: most observations lie in the upper-left and lower-right corners
  • the sum will be close to zero if: the scatter of observations fills all corners of the plane
14
Q

assumptions of the correlation coefficient

A
  • the data have a bivariate normal distribution
  • the relationship between X and Y is linear
  • the frequency distributions of X and Y, taken separately, are normal
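
A minimal sketch of computing Pearson's r on made-up, roughly linear data (assumes scipy):

```python
from scipy.stats import pearsonr

x = [1.0, 2.1, 2.9, 4.2, 5.1]
y = [2.3, 3.9, 6.1, 8.0, 10.2]

r, p = pearsonr(x, y)
print(r, p)   # r close to +1 for an increasing, roughly linear relationship
```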
15
Q

departures from bivariate normality

A
  • funnel
  • outlier
  • non-linear
16
Q

measurement error

A

the difference between the true value of a variable for an individual and its measured value

  • error in either X or Y tends to weaken the correlation between the variables
  • the same thing happens with uncorrelated error in both X and Y
  • with measurement error, r will tend to underestimate ρ (it lies closer to zero than the true correlation, a bias called attenuation)
17
Q

regression

A

predicts the value of one numerical variable from values of another numerical variable
-fits a line through the data to determine how steeply one variable changes with the other

18
Q

least-squares regression line

A

the line for which the sum of all squared deviations in Y is the smallest

19
Q

residuals

A

measure the scatter of points above and below the least-squares regression line
-crucial for evaluating the fit of the line to the data
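
A minimal sketch of a least-squares fit and its residuals on made-up data (assumes numpy):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

slope, intercept = np.polyfit(x, y, deg=1)   # line minimizing the sum of squared deviations in Y
residuals = y - (slope * x + intercept)      # scatter of points above and below the line

print(slope, intercept)
print(residuals)
```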

20
Q

deterministic vs. stochastic models

A
  • deterministic: no randomness is involved –> will always produce the same output given the initial state
  • stochastic: non-deterministic –> subsequent states of the system are determined probabilistically
21
Q

markov property

A

memorylessness: the probability of the next state depends only on the current state, not on the sequence of states that preceded it

22
Q

types of markov chains

A
  • absorbing –> has at least one absorbing state, and from every state it is possible to reach an absorbing state (not necessarily in one step)
  • transient
23
Q

ergodic (irreducible) markov chains

A

-it is possible to go from every state to every other state (not necessarily in one move)

24
Q

regular markov chains

A

a chain is regular if some power of its transition matrix has only positive elements (its state probabilities converge to a unique stationary distribution)
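
A minimal sketch of the regularity check on a made-up transition matrix (assumes numpy):

```python
import numpy as np

P = np.array([[0.0, 1.0],
              [0.5, 0.5]])   # made-up transition probabilities (rows sum to 1)

# P itself has a zero entry, but P^2 is strictly positive, so the chain is regular
print((np.linalg.matrix_power(P, 2) > 0).all())   # True
```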

25
Q

p-value

A

probability of getting the data, or something as or more unusual, if the null hypothesis were true

26
Q

larger samples give more information because…

A
  • more power

- tend to estimate parameters with narrower confidence intervals

27
Q

transformations

A

Y = aX^b –> ln(Y) = ln(a) + b·ln(X) –> power function
Y = ab^X –> ln(Y) = ln(a) + X·ln(b) –> exponential function
-transformed data often have a better residual plot
-try transformations to see whether they bring an outlier closer to the rest of the distribution
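
A minimal sketch showing that log-transforming made-up power-function data makes it linear (assumes numpy):

```python
import numpy as np

a, b = 2.0, 1.5                       # assumed parameters of Y = a * X^b
x = np.array([1, 2, 4, 8, 16], dtype=float)
y = a * x**b

# fit a line to the log-log data: ln(Y) = ln(a) + b*ln(X)
slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
print(slope, np.exp(intercept))       # recovers b = 1.5 and a = 2.0
```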

28
Q

________ are a subset of __________

-ergodic and regular markov chains

A

regular markov chains are a subset of ergodic markov chains (every regular chain is ergodic, but not every ergodic chain is regular)

29
Q

regression

A
  • predicts Y from X
  • provides the rate of change
  • assumes relationship between x and y can be described by a line
30
Q

correlation vs. regression

A
  • correlation measures the strength of the association

- regression measures the rate of change

31
Q

residual plots assumptions

A
  • a roughly symmetric cloud of points above and below the line y = 0
  • little noticeable curvature as we move along the x-axis
  • approximately equal variance of points above and below the line at all values of x