Stats Flashcards

1
Q

z-test

A

variance is known
one sample: z = (ybar - mu)/(sigma/sqrt(n))
two samples: z = (ybar1 - ybar2)/sqrt(sigma1^2/n1 + sigma2^2/n2)
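
A minimal Python sketch of the two z statistics on this card, assuming the population standard deviations are known. The function names and example numbers are illustrative assumptions, not part of the deck.

```python
import math

def z_one_sample(ybar, mu0, sigma, n):
    """z = (ybar - mu0) / (sigma / sqrt(n)), with sigma known."""
    return (ybar - mu0) / (sigma / math.sqrt(n))

def z_two_sample(ybar1, ybar2, sigma1, sigma2, n1, n2):
    """z = (ybar1 - ybar2) / sqrt(sigma1^2/n1 + sigma2^2/n2)."""
    return (ybar1 - ybar2) / math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

# Illustrative numbers only.
z1 = z_one_sample(ybar=52.0, mu0=50.0, sigma=4.0, n=16)
z2 = z_two_sample(10.0, 8.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)
```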

2
Q

t-test

A

variance is not known

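one sample: t = (ybar - mu)/(s/sqrt(n)), df = n - 1
two samples (pooled): t = (ybar1 - ybar2)/(sp sqrt(1/n1 + 1/n2)), df = n1 + n2 - 2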
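
A short Python sketch of the t statistic when the variance is estimated from the data. It assumes the usual one-sample form and, for two samples, a pooled variance (equal-variance assumption); the function names and data are hypothetical.

```python
import math
import statistics

def t_one_sample(sample, mu0):
    """Return (t, df) with t = (ybar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample std dev, n - 1 divisor
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(n)), n - 1

def t_two_sample_pooled(s1, s2):
    """Return (t, df) using the pooled variance sp^2."""
    n1, n2 = len(s1), len(s2)
    sp2 = ((n1 - 1) * statistics.variance(s1)
           + (n2 - 1) * statistics.variance(s2)) / (n1 + n2 - 2)
    diff = statistics.mean(s1) - statistics.mean(s2)
    return diff / math.sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2

t1, df1 = t_one_sample([4.0, 5.0, 6.0, 7.0, 8.0], mu0=5.0)
t2, df2 = t_two_sample_pooled([4.0, 5.0, 6.0, 7.0, 8.0],
                              [2.0, 3.0, 4.0, 5.0, 6.0])
```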

3
Q

CLT

A

zn = (sum(xi) - n mu)/(sigma sqrt(n)), approximately N(0, 1) for large n
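
A small simulation sketch of the standardized sum, assuming Uniform(0, 1) draws (mean 1/2, variance 1/12). The sample size, repetition count, and seed are arbitrary choices for illustration.

```python
import math
import random
import statistics

def standardized_sum(sample, mu, sigma):
    """(sum(x_i) - n*mu) / (sigma * sqrt(n))."""
    n = len(sample)
    return (sum(sample) - n * mu) / (sigma * math.sqrt(n))

random.seed(0)  # fixed seed so the run is repeatable
mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and std dev of Uniform(0, 1)
z_values = [
    standardized_sum([random.random() for _ in range(30)], mu, sigma)
    for _ in range(2000)
]
z_mean = statistics.mean(z_values)   # should be near 0
z_sd = statistics.stdev(z_values)    # should be near 1
```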

4
Q

95% confidence interval

A

ybar +/- z(SE Mean), z = 1.96 for 95%

SE Mean = s/sqrt(n)
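
A sketch of the interval ybar +/- z(SE Mean) in Python, using the normal quantile from the standard library. For small samples a t quantile would normally replace z; that substitution is not shown here, and the data are made up.

```python
import math
import statistics
from statistics import NormalDist

def mean_interval(sample, level=0.95):
    """Return (lower, upper) = ybar -/+ z * s / sqrt(n)."""
    n = len(sample)
    ybar = statistics.mean(sample)
    se_mean = statistics.stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return ybar - z * se_mean, ybar + z * se_mean

lower, upper = mean_interval([4.0, 5.0, 6.0, 7.0, 8.0])
```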

5
Q

ANOVA

A

SS(treat), df = a-1, MS(treat) = SS(treat)/(a-1), F = MS(treat)/MS(E)
SS(E), df = N-a, MS(E) = SS(E)/(N-a)
SS(T), df = N-1
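
The ANOVA table can be sketched in a few lines of Python. This assumes a one-way layout given as a list of treatment groups; the function name, dictionary keys, and data are illustrative.

```python
def one_way_anova(groups):
    """Build the one-way ANOVA quantities from a list of treatment groups."""
    a = len(groups)                      # number of treatments
    N = sum(len(g) for g in groups)      # total observations
    grand_mean = sum(sum(g) for g in groups) / N
    ss_treat = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    ss_error = ss_total - ss_treat
    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / (N - a)
    return {
        "SS_treat": ss_treat, "df_treat": a - 1, "MS_treat": ms_treat,
        "SS_E": ss_error, "df_E": N - a, "MS_E": ms_error,
        "SS_T": ss_total, "df_T": N - 1,
        "F": ms_treat / ms_error,
    }

table = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
```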

6
Q

a in ANOVA

A

number of treatments

7
Q

n in ANOVA

A

number of blocks (written b on the RCBD cards)

8
Q

i in ANOVA

A

treatment index (i = 1, ..., a)

9
Q

j in ANOVA

A

block index (j = 1, ..., n)

10
Q

residual

A

yij - average(yi), i.e. observation minus its treatment average

11
Q

3 model adequacy checking graphs

A

(1) normal probability plot of residuals
(2) residuals vs. predicted values plot
(3) residuals vs. time (run order) plot

12
Q

normal prob plot

A

checks normality of residuals; catches outliers and signals a need to transform
x = residual
y = normal % probability

13
Q

predicted values plot

A

tests homogeneity of variance; remedy by control, randomization, or transformation
x = predicted yi
y = residual

14
Q

time series plot

A

tests independence of errors
x = run order (time)
y = residual

15
Q

tests for equality of variance

A

(1) Bartlett’s test

(2) modified Levene’s test

16
Q

Box Cox

A

selects the power transformation y^lambda of the response (the lambda that minimizes SS(E))

17
Q

Contrasts

A

(1) orthogonal

(2) Scheffé’s method - contrasts don’t need to be specified in advance

18
Q

Comparing Means

A

(1) Fisher LSD - does not control the overall (experimentwise) error rate
(2) Tukey’s test - controls the overall error rate
(3) Dunnett’s test - compares each treatment with a control

19
Q

Determining sample size

A

(1) operating characteristic (OC) curves

(2) specifying std dev

20
Q

Random Effects Model

A

Levels are randomly selected from a larger population of possible levels

21
Q

Randomized Complete Block Design (RCBD)

A
  • blocks represent a restriction on randomization
  • control of nuisance factors

22
Q

SS(treat)

A

(1/b) sum(yi.^2) - y..^2/N, where yi. is a treatment total and y.. the grand total

23
Q

SS(block)

A

(1/a) sum(y.j^2) - y..^2/N, where y.j is a block total and y.. the grand total

24
Q

SS(E)

A

SS(T) - SS(treat) - SS(block)

25
Q

SS(T)

A

sum(yij^2) - y..^2/N, summed over all observations

26
Q

df for RCBD

A
SS(Treat) = a - 1
SS(blocks) = b-1
SS(E) = (a-1)(b-1)
SS(T) = N-1
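
The RCBD sums of squares and degrees of freedom can be sketched together in Python. This assumes one observation per treatment-block cell, stored as y[i][j] for treatment i and block j; the function name and data are hypothetical.

```python
def rcbd_anova(y):
    """y[i][j] = observation for treatment i in block j (one per cell)."""
    a, b = len(y), len(y[0])
    N = a * b
    correction = sum(map(sum, y)) ** 2 / N                    # y..^2 / N
    ss_total = sum(v ** 2 for row in y for v in row) - correction
    ss_treat = sum(sum(row) ** 2 for row in y) / b - correction
    ss_block = sum(sum(col) ** 2 for col in zip(*y)) / a - correction
    ss_error = ss_total - ss_treat - ss_block                 # by subtraction
    df = {"treat": a - 1, "block": b - 1,
          "error": (a - 1) * (b - 1), "total": N - 1}
    f_treat = (ss_treat / df["treat"]) / (ss_error / df["error"])
    return {"SS_treat": ss_treat, "SS_block": ss_block, "SS_E": ss_error,
            "SS_T": ss_total, "df": df, "F_treat": f_treat}

result = rcbd_anova([[1, 2, 3], [3, 4, 5], [5, 6, 10]])
```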
27
Q

Latin Square

A
  • blocking in 2 directions
  • 2 restrictions on randomization
  • disadvantage - small error DF; remedy by replicating the square (e.g. with additional operators)
28
Q

Latin Square setup

A
SS(Treatments), df = p-1
SS(Rows), df = p-1
SS(columns), df = p-1
SS(E), df = (p-1)(p-2)
SS(T), df = p^2-1
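
A quick Python check that these degrees of freedom add up, together with one simple way to build a p x p Latin square. The cyclic construction is an assumption for illustration; in practice the rows, columns, and letters would also be randomized.

```python
def cyclic_latin_square(p):
    """Each letter appears once in every row and every column."""
    letters = [chr(ord("A") + i) for i in range(p)]
    return [[letters[(r + c) % p] for c in range(p)] for r in range(p)]

def latin_square_df(p):
    """Degrees of freedom for a single p x p Latin square."""
    return {"treatments": p - 1, "rows": p - 1, "columns": p - 1,
            "error": (p - 1) * (p - 2), "total": p * p - 1}

square = cyclic_latin_square(4)
df = latin_square_df(4)
```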
29
Q

Crossover

A
  • eliminate issue of time
  • may still have residual (carryover) effect (mixing of results)

30
Q

Graeco Latin Square

A

blocks in 3 directions

31
Q

Main effect

A

average(y at A+) - average(y at A-); for a 2^2 design with one replicate, sum(A+)/2 - sum(A-)/2

32
Q

Interaction

A

diff(A’s at B+)/2 - diff(A’s at B-)/2

33
Q

SS(A)

A

(1/(bn)) sum(yi..^2) - y...^2/(abn)

34
Q

SS(int)

A

(1/n) sum(yij.^2) - y...^2/(abn) - SS(A) - SS(B)

35
Q

df for factorial design

A
A = a-1
B = b-1
AB = (a-1)(b-1)
error = ab(n - 1)
T = abn - 1
36
Q

SS(blocks)

A

(1/(ab)) sum(y..k^2) - y...^2/(abn)

37
Q

SS(A) for factorial

A

[a + ab - b - (1)]^2/(4n)
n is number of replicates
4 represents 2^2, would be 8 for 2^3

38
Q

SS(T) df for factorial

A

4n - 1

39
Q

Main effect for factorial

A

A = [a + ab - b - (1)]/(2n)

2 represents 2^(k-1) with k = 2, would be 4 for 2^3
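
The contrast-based formulas for a 2^2 design can be sketched in Python. The inputs are the treatment-combination totals (1), a, b, ab over n replicates; the function name and the totals used are illustrative assumptions.

```python
def two_squared_analysis(one, a, b, ab, n):
    """Effects, sums of squares, and regression coefficients for a 2^2 design.

    one, a, b, ab are totals over the n replicates of each treatment combination.
    """
    contrasts = {
        "A": a + ab - b - one,
        "B": b + ab - a - one,
        "AB": ab + one - a - b,
    }
    return {
        name: {
            "effect": c / (2 * n),      # contrast / (n * 2^(k-1)), k = 2
            "SS": c ** 2 / (4 * n),     # contrast^2 / (n * 2^k)
            "coef": c / (4 * n),        # regression coefficient = effect / 2
        }
        for name, c in contrasts.items()
    }

out = two_squared_analysis(one=80, a=100, b=60, ab=90, n=3)
```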

40
Q

Coefficient for regression

A

effect/2

41
Q

R^2

A

SS(model)/SS(total)

42
Q

Orthogonality

A

(1) equal number of + and - in each column
(2) sum of elements in column = 0
(3) I * col -> unchanged
(4) products of any 2 columns yields a column already on table

43
Q

VIF

A

1/(1 - Rj^2), with Rj^2 from regressing xj on the other regressors
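
A minimal Python sketch of the VIF for the special case of two regressors, where the R^2 from regressing one on the other equals their squared correlation. The helper names and data are hypothetical; with more regressors a full regression of each xj on the rest would be needed.

```python
import math

def pearson_r(x, y):
    """Sample correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2), with R^2 = r^2 when there are only two regressors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

vif_orthogonal = vif_two_regressors([-1, 1, -1, 1], [-1, -1, 1, 1])
vif_correlated = vif_two_regressors([1, 2, 3, 4], [1, 2, 4, 3])
```

Orthogonal columns give a VIF of 1, which is why orthogonal designs have no variance inflation.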

44
Q

Types of error

A
  • standard error (for regression coefficient)
  • pure (from replication)
  • lack of fit (from pooling)
  • residual (PE + LOF)
45
Q

Dispersion effect

A

look at ranges

46
Q

Half normal

A

plot of absolute effect (coefficient) estimates against half-normal quantiles; points far off the line are active

47
Q

Defining relation

A

I = …

48
Q

Design generator

A

A = BC (aliasing)

49
Q

Resolution

A

Length of the shortest word in the defining relation
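
A sketch of how the defining relation, resolution, and aliases follow from the generator words, ignoring signs. Words multiply by cancelling repeated letters (A*A = I), which is a symmetric difference of letter sets. The example generators D = AB and E = AC are an assumed 2^(5-2) design, not taken from the deck.

```python
from itertools import combinations

def multiply(w1, w2):
    """Multiply two words mod 2: letters appearing twice cancel."""
    return "".join(sorted(set(w1) ^ set(w2)))

def defining_relation(generator_words):
    """All products of the generator words (signs ignored)."""
    words = set()
    for r in range(1, len(generator_words) + 1):
        for combo in combinations(generator_words, r):
            word = ""
            for g in combo:
                word = multiply(word, g)
            words.add(word)
    return sorted(words, key=lambda w: (len(w), w))

def resolution(generator_words):
    """Length of the shortest word in the defining relation."""
    return min(len(w) for w in defining_relation(generator_words))

def aliases(effect, generator_words):
    """Effects confounded with `effect`."""
    return [multiply(effect, w) for w in defining_relation(generator_words)]

# Assumed generators D = AB and E = AC, i.e. I = ABD = ACE.
relation = defining_relation(["ABD", "ACE"])
res = resolution(["ABD", "ACE"])
a_aliases = aliases("A", ["ABD", "ACE"])
```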

50
Q

Family

A

I = +/- ABC

51
Q

Confirmation Experiment

A

Set factors at the chosen levels, run, and compare the observed response with the regression model prediction

52
Q

Choosing a design

A

highest resolution

53
Q

Number of treatment combinations

A

2^(5-2) = 8

54
Q

Folding

A

change signs for all factors; odd-length words in the defining relation become negative
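
A small Python sketch of a full fold-over, assuming a 2^(3-1) half fraction with generator C = AB (I = +ABC). Reversing every sign gives the other member of the family (I = -ABC), and the two halves together form the full 2^3 design. The function names are illustrative.

```python
from itertools import product

def half_fraction():
    """2^(3-1) design with C = AB, so A*B*C = +1 in every run."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

def fold_over(design):
    """Reverse the sign of every factor in every run."""
    return [tuple(-x for x in run) for run in design]

base = half_fraction()
folded = fold_over(base)
combined = base + folded
```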

55
Q

Combined defining relation

A

multiply - words, copy + words

56
Q

Aliases

A

1/2([i] + [i]’)

57
Q

Plackett Burman

A

different class of resolution III design

  • number of runs needs to be a multiple of 4
  • non-regular
  • non-geometric
  • not flexible - cannot be represented by cubes
58
Q

Super saturated

A

P-B design, sort on last column, delete all - or + runs

  • k > N-1

59
Q

k

A

number of factors

60
Q

Treatment design

A
  • know how design is confounded
  • prevent nuisance variables
  • signal what we know and don’t know
61
Q

Experimental design

A
  • Randomize to prevent bias
  • Figure out execution

62
Q

Estimate correct alias

A
  • prior knowledge of system
  • interaction plot
  • p-values for each individually
  • run other half
63
Q

Empirical vs Mechanistic

A

derived vs. theoretical law

64
Q

Regression

A

no statement of effect, not causal

65
Q

Missing data point

A

Slightly different regression

66
Q

Standard dev versus Confidence Interval

A

Variability in raw data versus variability in means

67
Q

Prediction interval

A

interval for a single future observation, e.g. around a confirmation run; wider than the CI on the mean

68
Q

Lack of fit

A

how well points fit regression

69
Q

2 error terms for regression

A

pure, lack of fit

70
Q

Response Surface Methodology

A

sequential process, method/path of steepest ascent

71
Q

Procedure for method of steepest ascent/descent

A

(1) 1st order model
(2) check error, interactions, quadratic effects (curvature)
(3) choose step sizes in coded units: delta x1 = 1; delta x2 = (b2/b1) delta x1
(4) convert coded x to natural units
(5) test with new factor levels and keep stepping
(6) perform new factorial with region of exploration centered around optimal points
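
The stepping part of the procedure can be sketched in Python. This assumes a fitted first-order model with coefficients b1, b2, ... in coded units; the steps are taken proportional to the coefficients, with the base factor moving one coded unit per step. The function names, coefficients, centers, and half ranges are all hypothetical.

```python
def steepest_ascent_steps(coefs, base_index=0, base_step=1.0, n_steps=4):
    """Coded points along the path of steepest ascent, starting at the center.

    Steps are proportional to the coefficients, scaled so the base factor
    moves base_step coded units per step (in the direction of its sign).
    """
    scale = base_step / abs(coefs[base_index])
    delta = [b * scale for b in coefs]
    return [[k * d for d in delta] for k in range(n_steps + 1)]

def to_natural(coded, centers, half_ranges):
    """natural = center + coded * half_range for each factor."""
    return [c + x * h for x, c, h in zip(coded, centers, half_ranges)]

path = steepest_ascent_steps([2.0, 1.0], n_steps=3)
first_step = to_natural(path[1], centers=[35.0, 155.0], half_ranges=[5.0, 5.0])
```

For steepest descent, the same sketch applies with the signs of the coefficients reversed.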

72
Q

Why use center point?

A
  • estimate pure error if you don’t want to replicate the whole design
  • check for curvature
  • add df for error
73
Q

Central composite design

A
  • n(f) factorial runs, n(c) center-point runs, 2k axial runs
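
A sketch that generates the coded runs of a central composite design in Python. It assumes a full 2^k factorial portion and, when no axial distance is supplied, uses the rotatable choice alpha = n(f)^(1/4); the function name and defaults are illustrative.

```python
from itertools import product

def central_composite(k, n_center=1, alpha=None):
    """Coded runs: 2^k factorial points, 2k axial points, n_center center points."""
    factorial = [tuple(float(v) for v in run)
                 for run in product((-1, 1), repeat=k)]
    if alpha is None:
        alpha = len(factorial) ** 0.25   # rotatable design
    axial = []
    for j in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[j] = sign * alpha
            axial.append(tuple(point))
    center = [tuple([0.0] * k)] * n_center
    return factorial + axial + center

design = central_composite(2, n_center=5)
```

Passing alpha=1.0 gives a face-centered design, with the axial points on the faces of the cube.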
74
Q

Sequential central composite design

A

(1) 1st order -> lack of fit

(2) introduce axial points to allow quadratic terms

75
Q

Rotatable CCD

A
  • indicates a good design for fitting the model
  • similar variances for points of interest when rotated (prediction variance depends only on distance from the center)

76
Q

Box-Behnken

A
  • one factor is always at the center
  • all points equidistant from center point, leads to equal variance
  • spherical, no points at vertices
77
Q

If the design calls for a negative (“-”) value for time

A
  • don’t collect, missing value
  • change other factor -> shift design
  • constrained region - D-optimal
  • inscribed CCD (inside of box)
  • face-centered -> replace axial points with face points
78
Q

Evolutionary operation

A
  • constant monitoring and improving
  • slight changes
  • more data to find smaller differences
  • longer period of time, lurking
79
Q

Mixture

A
  • factor levels not independent
  • lattice simplex
  • centroid simplex
80
Q

Lattice Simplex

A

{p, m}
p = number of components in the mixture (sugar, cream)
m = number of equal steps per component; proportions take values 0, 1/m, ..., 1 (m = 3: sugar = 0, 1/3, 2/3, 1)
p = 3 means 2D, m = 2 means 3 points on edge
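
A short Python sketch that enumerates the {p, m} lattice points, assuming each proportion is a multiple of 1/m and the proportions sum to 1. The function name is hypothetical; exact fractions avoid rounding in the sums.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(p, m):
    """All blends of p components with proportions in {0, 1/m, ..., 1} summing to 1."""
    return [
        tuple(Fraction(i, m) for i in combo)
        for combo in product(range(m + 1), repeat=p)
        if sum(combo) == m
    ]

lattice_3_2 = simplex_lattice(3, 2)   # 3 vertices + 3 edge midpoints
lattice_3_3 = simplex_lattice(3, 3)
```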

81
Q

Centroid simplex

A

2^p - 1 runs
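
The 2^p - 1 count comes from one blend per non-empty subset of components, mixed in equal proportions. A minimal Python sketch, with an illustrative function name:

```python
from fractions import Fraction
from itertools import combinations

def simplex_centroid(p):
    """One equal-proportion blend for every non-empty subset of the p components."""
    points = []
    for r in range(1, p + 1):
        for subset in combinations(range(p), r):
            points.append(tuple(
                Fraction(1, r) if i in subset else Fraction(0)
                for i in range(p)
            ))
    return points

centroid_3 = simplex_centroid(3)   # 3 pure blends, 3 binary blends, 1 overall centroid
```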

82
Q

Lattice vs. centroid

A

lattice is more flexible than centroid

83
Q

Axial blends

A

axial points in the interior

84
Q

Model Adequacy

A

checked 2nd time around