Lab 8 Flashcards

1
Q

There are five components to an experiment:

A

hypothesis, experimental design, experimental execution, statistical analysis, and interpretation.

2
Q

experimental design

A

By experimental design is meant only “the logical structure of the experiment”

A full description of the objectives of an experiment should specify the nature of the experimental units to be employed, the number and kinds of treatments (including “control” treatments) to be imposed, and the properties or responses (of the experimental units) that will be measured.

3
Q

manner in which treatments are assigned

A

Once these have been decided upon, the design of an experiment specifies the manner in which treatments are assigned to the available experimental units, the number of experimental units (replicates) receiving each treatment, the physical arrangement of the experimental units, and often, the temporal sequence in which treatments are applied to and measurements made on the different experimental units.

4
Q

successful execution

A

requires that the experimenter avoid introducing systematic error (bias) and minimize random error.

5
Q

In experimental work,

A

the primary function of statistics is to increase the clarity, conciseness, and objectivity with which results are presented and interpreted.

Statistical analysis and interpretation are the least critical aspects of experimentation, in that if purely statistical or interpretative errors are made, the data can be reanalyzed. On the other hand, the only complete remedy for design or execution errors is repetition of the experiment.

6
Q

Two classes of experiments may be distinguished:

A

mensurative and manipulative.

7
Q

Mensurative experiments

A

involve only the making of measurements at one or more points in space or time; space or time is the only “experimental” variable or “treatment.”

Tests of significance may or may not be called for.

Usually do not involve the imposition by the experimenter of some external factor(s) on experimental units.

If they do involve such an imposition (e.g., comparison of the responses of high-elevation vs. low-elevation oak trees to experimental defoliation), all experimental units are “treated” identically.

8
Q

Example 1. We wish to determine how quickly maple (Acer) leaves decompose when on a lake bottom in 1 m of water.

So we make eight small bags of nylon netting, fill each with maple leaves, and place them in a group at a spot on the 1-m isobath.

After 1 mo we retrieve the bags, determine the amount of organic matter lost (“decomposed”) from each, and calculate a mean decomposition rate.

A

This procedure is satisfactory as far as it goes. However, it yields no information on how the rate might vary from one point to another along the 1-m isobath; the mean rate we have calculated from our eight leaf bags is a tenuous basis for making generalizations about “the decomposition rate on the 1-m isobath of the lake.”

Such a procedure is usually termed an experiment simply because the measurement procedure is somewhat elaborate, often involving intervention in or prodding of the system.

If we had taken eight temperature measurements or eight dredge samples for invertebrates, few persons would consider those procedures and their results to be “experimental” in any way.

9
Q

Example 2. We wish, using the basic procedure of Example 1, to test whether the decomposition rate of maple leaves differs between the 1-m and the 10-m isobaths.

So we set eight leaf bags on the 1-m isobath and another eight bags on the 10-m isobath, wait a month, retrieve them, and obtain our data.

Then we apply a statistical test (e.g., t test or U test) to see whether there is a significant difference between decomposition rates at the two locations.

A

We can call this a comparative mensurative experiment. Though we use two isobaths (or “treatments”) and a significance test, we still have not performed a true or manipulative experiment.

We are simply measuring a property of the system at two points within it and asking whether there is a real difference (“treatment effect”) between them.
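A minimal sketch of the comparison just described, assuming SciPy is available; the decomposition rates below are hypothetical, invented only to show the mechanics of the t test and the U test.

# Hypothetical per-bag decomposition rates (fraction of organic matter lost per month)
from scipy import stats

rate_1m = [0.42, 0.38, 0.45, 0.40, 0.37, 0.44, 0.41, 0.39]    # eight bags, 1-m isobath (hypothetical)
rate_10m = [0.30, 0.33, 0.28, 0.35, 0.31, 0.29, 0.34, 0.32]   # eight bags, 10-m isobath (hypothetical)

t_stat, t_p = stats.ttest_ind(rate_1m, rate_10m)                               # two-sample t test
u_stat, u_p = stats.mannwhitneyu(rate_1m, rate_10m, alternative="two-sided")   # Mann-Whitney U test

print(f"t test: t = {t_stat:.2f}, P = {t_p:.4f}")
print(f"U test: U = {u_stat:.1f}, P = {u_p:.4f}")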

10
Q

Using Examples 1 and 2 to make a proper mensurative experiment.

A

To achieve our vaguely worded purpose in Example 1, perhaps any sort of distribution of the eight bags on the 1-m isobath was sufficient.

In Example 2, however, we have indicated our goal to be a comparison of the two isobaths with respect to decomposition rate of maple leaves.

Thus we cannot place our bags at a single location on each isobath.

That would not give us any information on variability in decomposition rate from one point to another along each isobath. We require such information before we can validly apply inferential statistics to test our null hypothesis that the rate will be the same on the two isobaths.

So on each isobath we must disperse our leaf bags in some suitable fashion.

There are many ways we could do this. Locations along each isobath ideally should be picked at random, but bags could be placed individually (eight locations), in groups of two each (four locations), or in groups of four each (two locations).

Furthermore, we might decide that it was sufficient to work only with the isobaths along one side of the lake, etc.

Assuring that the replicate samples or measurements are dispersed in space (or time) in a manner appropriate to the specific hypothesis being tested is the most critical aspect of the design of a mensurative experiment.
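A minimal sketch of one such dispersion scheme (eight individually placed bags per isobath); the isobath length and the positions drawn are hypothetical.

# Pick eight random positions (distance along the isobath, in metres) per isobath.
# Grouping bags in pairs or fours would simply mean drawing fewer locations.
import random

random.seed(1)
isobath_length_m = 400          # hypothetical length of each isobath
for isobath in ("1 m", "10 m"):
    positions = sorted(round(random.uniform(0, isobath_length_m), 1)
                       for _ in range(8))
    print(f"{isobath} isobath bag positions: {positions}")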

11
Q

Example 3. Out of laziness, we place all eight bags at a single spot on each isobath.

It will still be legitimate to apply a significance test to the resultant data.

However, and the point is the central one of this essay, if a significant difference is detected, this constitutes evidence only for a difference between two (point) locations; one “happens to be” a spot on the 1-m isobath, and the second “happens to be” a spot on the 10-m isobath.

A

Such a significant difference cannot legitimately be interpreted as demonstrating a difference between the two isobaths, i.e., as evidence of a “treatment effect.”

For all we know, such an observed significant difference is no greater than we would have found if the two sets of eight bags had been placed at two locations on the same isobath.

12
Q

Pseudoreplication

A

If we insist on interpreting a significant difference in Example 3 as a “treatment effect” or real difference between isobaths, then we are committing what I term pseudoreplication.

Pseudoreplication may be defined, in analysis of variance terminology, as the testing for treatment effects with an error term inappropriate to the hypothesis being considered.

In Example 3 an error term based on eight bags at one location was inappropriate.
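A hedged sketch (all numbers hypothetical) contrasting the appropriate and the pseudoreplicated error term for the properly dispersed design of Example 2, assuming four locations per isobath with two bags per location: the between-isobath test should be run on the location means, because the locations, not the bags, are the units of replication.

import numpy as np
from scipy import stats

bags = {   # bags[isobath] -> per-location lists of per-bag decomposition rates (hypothetical)
    "1m":  [[0.44, 0.40], [0.37, 0.39], [0.46, 0.43], [0.41, 0.38]],
    "10m": [[0.31, 0.29], [0.35, 0.33], [0.27, 0.30], [0.34, 0.32]],
}

# Appropriate analysis: one mean per location (n = 4 locations per isobath)
means_1m = [np.mean(loc) for loc in bags["1m"]]
means_10m = [np.mean(loc) for loc in bags["10m"]]
t_ok, p_ok = stats.ttest_ind(means_1m, means_10m)
print(f"locations as replicates:               t = {t_ok:.2f}, P = {p_ok:.4f}")

# Pseudoreplicated analysis: every bag treated as an independent replicate
all_1m = [x for loc in bags["1m"] for x in loc]
all_10m = [x for loc in bags["10m"] for x in loc]
t_bad, p_bad = stats.ttest_ind(all_1m, all_10m)
print(f"bags as replicates (pseudoreplicated): t = {t_bad:.2f}, P = {p_bad:.4f}")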

13
Q

Pseudoreplication in mensurative experiments

A

In mensurative experiments generally, pseudoreplication is often a consequence of the actual physical space over which samples are taken or measurements made being smaller or more restricted than the inference space implicit in the hypothesis being tested.

In manipulative experiments, pseudoreplication most commonly results from use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent.

Pseudoreplication thus refers not to a problem in experimental design (or sampling) per se but rather to a particular combination of experimental design (or sampling) and statistical analysis which is inappropriate for testing the hypothesis of interest.

14
Q

MANIPULATIVE EXPERIMENTS

A

Whereas a mensurative experiment may consist of a single treatment (Example 1), a manipulative experiment always involves two or more treatments, and has as its goal the making of one or more comparisons.

The defining feature of a manipulative experiment is that the different experimental units receive different treatments and that the assignment of treatments to experimental units is or can be randomized.

Note that in Example 2 the experimental units are not the bags of leaves, which are more accurately regarded only as measuring instruments, but rather the eight physical locations where the bags are placed.
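A minimal sketch of randomized assignment of treatments to experimental units; the eight locations and the two treatment labels are hypothetical.

import random

random.seed(42)
locations = [f"location_{i}" for i in range(1, 9)]    # the experimental units
treatments = ["treatment_A"] * 4 + ["treatment_B"] * 4

random.shuffle(treatments)                            # random assignment of treatments to units
for unit, trt in zip(locations, treatments):
    print(unit, "->", trt)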

15
Q

Critical features of a controlled experiment

A

Manipulative experimentation is subject to several classes of potential problems.

These are listed as “sources of confusion” (Table 1); an experiment is successful to the extent that these factors are prevented from rendering its results inconclusive or ambiguous.

It is the task of experimental design to reduce or eliminate the influence of those sources numbered 1 through 6.

For each potential source, Table 1 lists the one or more features of experimental design that will accomplish this reduction.

Most of these features are obligatory.

16
Q

Refinements in the execution of an experiment may further reduce these sources of confusion.

A

However, such refinements cannot substitute for the critical features of experimental design: controls, replication, randomization, and interspersion.

One can always assume that certain sources of confusion are not operative and simplify experimental design and procedures accordingly.

This saves much work.

However, the essence of a controlled experiment is that the validity of its conclusions is not contingent on the concordance of such assumptions with reality.

Against the last source of confusion listed (Table 1), experimental design can offer no defense.

17
Q

Table 1. Potential sources of confusion in an experiment and means for minimizing their effects

A
  1. temporal change
    - control treatments
  2. procedure effects
    - control treatments
  3. experimenter bias
    - randomized assignment of experimental units to treatments
    - randomization in conduct of the other procedures
    - “blind” procedures (usually employed only where measurement involves a large subjective element)
  4. experimenter-generated variability (random error)
    - replication of treatments
  5. initial or inherent variability among experimental units
    - replication of treatments
    - interspersion of treatments
    - concomitant observations
  6. nondemonic intrusion
    - replication of treatments
    - interspersion of treatments
  7. demonic intrusion
    - eternal vigilance, exorcism, human sacrifices, etc.
18
Q

Controls.

A

is another of those unfortunate terms having several meanings even within the context of experimental design.

In Table 1, I use control in the most conventional sense, i.e., any treatment against which one or more other treatments is to be compared.

It may be an “untreated” treatment (no imposition of an experimental variable), a “procedural” treatment (as when mice injected with saline solution are used as controls for mice injected with saline solution plus a drug), or simply a different treatment.

At least in experimentation with biological systems, controls are required primarily because biological systems exhibit temporal change.

If we could be absolutely certain that a given system would be constant in its properties, over time, in the absence of an experimentally imposed treatment, then a separate control treatment would be unnecessary.

Measurements on an experimental unit prior to treatment could serve as controls for measurements on the experimental unit following treatment.

19
Q

In many kinds of experiments, control treatments have a second function:

A

to allow separation of the effects of different aspects of the experimental procedure.

Thus, in the mouse example above, the “saline solution only” treatment would seem to be an obligatory control.

Additional controls, such as “needle insertion only” and “no treatment” may be useful in some circumstances.

20
Q

A broader and perhaps more useful (though less conventional) definition of “control” would

A

include all the obligatory design features listed beside “Sources of confusion” numbers 1-6 (Table 1). “Controls” (sensu stricto) control for temporal change and procedure effects.

21
Q

Randomization controls

A

Randomization controls for (i.e., reduces or eliminates) potential experimenter bias in the assignment of experimental units to treatments and in the carrying out of other procedures.

22
Q

Replication controls

A

Replication controls for the stochastic factor, i.e., among-replicates variability inherent in the experimental material or introduced by the experimenter or arising from nondemonic intrusion.

23
Q

Interspersion controls

A

for regular spatial variation in properties of the experimental units, whether this represents an initial condition or a consequence of nondemonic intrusion.

In this context it seems perfectly accurate to state that, for example, an experiment lacking replication is also an uncontrolled experiment; it is not controlled for the stochastic factor.

The custom of referring to replication and control as separate aspects of experimental design is so well established, however, that “control” will be used hereafter only in this narrower, conventional sense.

24
Q

A third meaning of control in experimental contexts is

A

regulation of the conditions under which the experiment is conducted.

It may refer to the homogeneity of experimental units, to the precision of particular treatment procedures, or, most often, to the regulation of the physical environment in which the experiment is conducted.

25
Q

Thus some investigators would speak of an experiment conducted with inbred white mice in the laboratory at 25° ± 1°C as being “better controlled” or “more highly controlled” than an experiment conducted with wild mice in a field where temperature fluctuated between 15° and 30°C.

A

This is unfortunate usage, for the adequacy of the true controls (i.e., control treatments) in an experiment is independent of the degree to which the physical conditions are restricted or regulated.

Nor is the validity of the experiment affected by such regulation.

Nor are the results of statistical analysis modified by it; if there are no design or statistical errors, the confidence with which we can reject the null hypothesis is indicated by the value of P alone.

These facts are little understood by many laboratory scientists.

26
Q

“Hold constant all variables except the one of interest.”

A

This third meaning of control undoubtedly derives in part from misinterpretation of the ancient but ambiguous dictum, “Hold constant all variables except the one of interest.”

This refers not to temporal constancy, which is of no general value, but only to the desired identity of experimental and control systems in all respects except the treatment variable and its effects.

27
Q

Replication, randomization, and independence.

A

Replication and randomization both have two functions in an experiment: they improve estimation and they permit testing. Only their roles in estimation are implied in Table 1.

With respect to testing, the “main purpose [of replication], which there is no alternative method of achieving, is to supply an estimate of error [i.e., variability] by which the significance of these comparisons is to be judged … [and] the purpose of randomization … is to guarantee the validity of the test of significance, this test being based on an estimate of error made possible by replication”

28
Q

Replication

A

reduces the effects of “noise” or random variation or error, thereby increasing the precision of an estimate of, e.g., the mean of a treatment or the difference between two treatments.

29
Q

Randomization

A

eliminates possible bias on the part of the experimenter, thereby increasing the accuracy of such estimates.

30
Q

In exactly what way does randomized assignment of treatments to experimental units confer “validity” on an experiment?

A

A clear, concise answer is not frequently found. It guarantees “much more than merely that the experiment is unbiased,” though that is important. It guarantees that, on the average, “errors” are independently distributed, that “pairs of plots treated alike are* not nearer together or further apart than, or in any other relevant way distinguishable from, pairs of plots treated differently” except insofar as there is a treatment effect (Fisher 1926:506). (*In her paraphrase of this statement, Box [1978:146] inserts at this point the very important qualifier, “on the average.”)

31
Q

In operational terms, a lack of independence of errors

A

prohibits us from knowing α, the probability of a type I error. In going through the mechanics of a significance test, we may specify, for example, that α = 0.05 and look up the corresponding critical value of the appropriate test criterion (e.g., t or F).

However, if errors are not independent, then true α is probably higher or lower than 0.05, but in any case unknown. Thus interpretation of the statistical analysis becomes rather subjective.
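One way to see this concretely is a small simulation; it is a sketch with invented parameters, not anything from the original text. Samples within each “treatment” share a single location effect, so the errors are not independent; there is no treatment effect, yet a t test on those samples run at a nominal α = 0.05 rejects the true null far more than 5% of the time.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_bags = 2000, 8
rejections = 0
for _ in range(n_sims):
    loc_a = rng.normal(0.0, 1.0)                 # shared location effect, "treatment" A
    loc_b = rng.normal(0.0, 1.0)                 # shared location effect, "treatment" B
    a = loc_a + rng.normal(0.0, 0.3, n_bags)     # eight correlated subsamples
    b = loc_b + rng.normal(0.0, 0.3, n_bags)
    if stats.ttest_ind(a, b).pvalue < 0.05:      # nominal alpha = 0.05
        rejections += 1

print(f"empirical type I error rate: {rejections / n_sims:.2f} (nominal 0.05)")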

32
Q

Demonic intrusion.

A

If you worked in areas inhabited by demons you would be in trouble regardless of the perfection of your experimental designs.

If a demon chose to “do something” to each experimental unit in treatment A but to no experimental unit in treatment B, and if his/her/its visit went undetected, the results would be misleading.

One might also classify the consequences of certain design or execution errors as demonic intrusion.

For example, if effects of fox predation are studied using fenced and unfenced fields, hawks may be attracted to the fence posts and use them as perches from which to search for prey.

Later, foxes may get credit for treatment effects generated in the fenced fields by the hawks.

Whether such non-malevolent entities are regarded as demons or whether one simply attributes the problem to the experimenter’s lack of foresight and the inadequacy of procedural controls is a subjective matter.

It will depend on whether we believe that a reasonably thoughtful experimenter should have been able to foresee the intrusion and to take steps to forestall it.

33
Q

nondemonic intrusion

A

By nondemonic intrusion is meant the impingement of chance events on an experiment in progress.

This sort of intrusion occurs in all experimental work, adding to the “noise” in the data.

Most of the time the effect of any single chance event is immeasurably slight.

However, by definition, the nature, magnitude, and frequency of such chance events are not predictable, nor are their effects.

If an event impinges on all experimental units of all treatments there is no problem.

Every change in weather during a field experiment would represent such a “chance” event.

34
Q

Potentially more troublesome are chance events that affect only one or a few experimental units.

A

An experimental animal may die, a contamination event may occur, or a heating system may malfunction.

Some chance events may be detected, but most will not be. Experimenters usually strive to minimize the occurrence of chance events because they reduce the power of an experiment to detect real treatment effects.

However, it is also important to minimize the probability of concluding there is a treatment effect when there is not one.

Replication and interspersion of treatments provide the best insurance against chance events producing such spurious treatment effects (Table 1).

35
Q

Excerpt from Magnusson 1997:

A

[Concepts related to statistics that apply to your proposal:]

1) It is not possible to prove something; you can only disprove it.

2) You cannot test a hypothesis with the data that were used to erect it.

3) The observations used in statistical tests must be independent in relation to the question.

4) Sources of pseudoreplication (observations not independent) may be spatial, temporal, or phylogenetic.

9) The number of samples necessary and how they should be distributed is a biological question, and the best method to evaluate the number of independent observations is to use hypothetical graphs.

10) If no information about variability in the variables to be measured is available, the following guidelines usually work for ecological data.

To estimate the number of independent observations necessary when all of the variables are continuous (multiple regression), multiply the number of independent variables by 10.

When all of the variables are categorical (ANOVA), multiply together the numbers of levels in the factors and multiply the resultant by 4.

If you have a mixture of categorical and continuous variables (ANCOVA), sum the number of levels in the categorical variables and multiply the resultant by 10.
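Expressed as a minimal sketch in code; the function names are mine, and the arithmetic simply restates the guidelines quoted above.

def n_regression(n_continuous_predictors):
    """Multiple regression: 10 observations per continuous predictor."""
    return 10 * n_continuous_predictors

def n_anova(levels_per_factor):
    """ANOVA: multiply the numbers of levels together, then multiply by 4."""
    n = 1
    for levels in levels_per_factor:
        n *= levels
    return 4 * n

def n_ancova(levels_per_categorical_factor):
    """ANCOVA: sum the numbers of levels in the categorical variables, times 10."""
    return 10 * sum(levels_per_categorical_factor)

print(n_regression(3))      # 3 continuous predictors      -> 30
print(n_anova([2, 3]))      # a 2 x 3 factorial            -> 24
print(n_ancova([2, 3]))     # 2-level and 3-level factors  -> 50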

36
Q

Trophic Cascades in Salt Marsh Ecosystems

A

ecologist Brian Silliman explains how he uses manipulative field experiments to study salt marsh ecosystems.

His approach revealed that these systems are under top-down control from consumers and predators.

Salt marshes were once considered examples of bottom-up regulation in which population sizes are determined by abiotic factors and nutrient availability.

Silliman observed that salt marsh grass was often covered with snails and wondered what the snails were eating.

Through a series of cage experiments, Silliman demonstrated that the snails control the amount of marsh grass by facilitating a fungal infection that impedes the growth of the grass.

He also showed that blue crabs control the number of snails and thereby protect the marsh grass from overgrazing.

This is an excellent example of how confronting a long-held assumption with data can refine our understanding of the natural world.