stats Flashcards

1
Q

tables

A

enables the reader to quickly and easily find the actual data values

2
Q

a disadvantage of tables

A

do not lend themselves to composing a mental picture of the trends occurring

3
Q

graphs

A

show trends or patterns that are easily visualized

4
Q

line graphs are useful for

A

demonstrating principles

5
Q

independent variable

A

the horizontal axis

6
Q

dependent variable

A

the vertical axis

7
Q

figures

A

graphs, maps, photographs, and other illustrations

8
Q

frequency distribution

A

summarize data

9
Q

range

A

interval between the largest and smallest values

10
Q

what classes are frequency distributions divided into

A

classes of equal size, with the frequency of each class recorded

11
Q

histograms

A

useful diagrams to plot frequency distributions
special type of bar graph in which each vertical bar represents an interval of values
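A minimal sketch of how a frequency distribution with equal-size classes feeds a histogram. The data values and the function name `frequency_distribution` are made up for illustration, and the sketch assumes the values are not all identical (otherwise the class width would be zero).

```python
def frequency_distribution(values, n_classes):
    """Split the range into n_classes equal-width classes and count values in each."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes          # range divided into equal classes
    counts = [0] * n_classes
    for v in values:
        # The largest value is placed in the last class rather than a new one.
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    classes = [(lo + i * width, lo + (i + 1) * width) for i in range(n_classes)]
    return classes, counts


values = [2, 3, 3, 4, 5, 5, 5, 6, 7, 8, 9, 10]   # hypothetical measurements
classes, counts = frequency_distribution(values, 4)

# Text histogram: one bar per class, bar length = class frequency.
for (start, end), count in zip(classes, counts):
    print(f"{start:4.1f} to {end:4.1f} | {'#' * count}")
```

Each printed bar stands for one interval of values, which is what distinguishes a histogram from an ordinary bar graph of categories.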

12
Q

vertical or horizontal lines are allowed

A

partial horizontal

13
Q

regression

A

minimizes the subjectivity in determining relationships between variables

14
Q

plot B as a function of A

A

B will be on the y
A will be on the x

15
Q

exception for A going on the y

A

depth, height, altitude

16
Q

bar graphs

A

wish to summarize quantities in different categories

17
Q

stacked bar graph

A

sections of each bar are stacked on top of each other

18
Q

mean

A

the average of all observations

19
Q

Xi

A

value of the ith observation

20
Q

N

A

total number of observations

21
Q

variance

A

measure of variability or spread in the data
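A short sketch tying together the mean, Xi, N, and variance cards. The observations are made up, and the sketch assumes the sample variance (dividing by N - 1); a population variance would divide by N instead.

```python
import statistics

observations = [4.0, 7.0, 5.0, 9.0, 5.0]   # X_1 ... X_N (made-up values)
n = len(observations)                       # N, the total number of observations
mean = sum(observations) / n                # sum of all X_i divided by N

# Sample variance: squared deviations from the mean, divided by N - 1.
sample_variance = sum((x - mean) ** 2 for x in observations) / (n - 1)

# The standard library gives the same answers.
assert mean == statistics.mean(observations)
assert abs(sample_variance - statistics.variance(observations)) < 1e-12

print(f"mean = {mean}, variance = {sample_variance}")
```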

22
Q

formulation of hypothesis steps

A
  1. form the null and alternative hypotheses
  2. figure out which statistical test to use
  3. calculate the test statistic and degrees of freedom
  4. find the critical value from a stats table
  5. reject or fail to reject (accept) the null
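The steps above can be sketched for a two-sample comparison. This is an illustration under stated assumptions, not a prescribed method: the measurements are made up, the helper `pooled_t` is a hypothetical name, the two groups are assumed to have equal variances, and the critical value 2.306 is the standard two-tailed t-table entry for alpha = 0.05 with 8 degrees of freedom.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for an equal-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, df


# Hypotheses: null = the two group means are equal; alternative = they differ.
# Choice of test: two groups with a continuous response, so a t test.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]   # made-up measurements
group_b = [4.1, 4.4, 3.9, 4.6, 4.0]

# Test statistic and degrees of freedom.
t_calc, df = pooled_t(group_a, group_b)

# Critical value looked up in a t table (two-tailed, alpha = 0.05, df = 8).
T_CRIT = 2.306

# Decision: compare the calculated statistic with the critical value.
decision = "reject the null" if abs(t_calc) > T_CRIT else "fail to reject the null"
print(f"t = {t_calc:.2f}, df = {df}: {decision}")
```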
23
Q

differences between two means

A

t test

24
Q

difference between several means

A

ANOVA

25
test of correlation
correlation
26
test of association
chi square
27
test of independence
chi square
28
discrete
cannot be represented by fractions; values are whole-number counts, classes, or categories (for example: gender vs. major)
29
continuous
can be represented by numbers and/or fractions
30
both independent and dependent variable are discrete
chi square
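A minimal sketch of the chi-square calculation for two discrete variables. The counts are invented (rows could be gender, columns could be major), `chi_square_independence` is a hypothetical helper name, and 3.841 is the standard table critical value for alpha = 0.05 with 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


counts = [[30, 10],    # made-up contingency table
          [20, 40]]
chi2, df = chi_square_independence(counts)

CHI2_CRIT = 3.841      # table value for alpha = 0.05, df = 1
print(f"chi-square = {chi2:.2f}, df = {df}, reject null: {chi2 > CHI2_CRIT}")
```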
31
independent is discrete, dependent is continuous
t test
32
both are continuous
correlation
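A small sketch of a correlation between two continuous variables, assuming the Pearson correlation coefficient is the one intended (the card does not say which). The data and the function name `pearson_r` are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up continuous measurements
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)
print(f"r = {r:.3f}")           # values near +1 or -1 indicate a strong linear relationship
```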
33
Test statistics usually are very large
when your null hypothesis is very wrong, and usually very small when your null hypothesis is correct.
34
p-value
the probability that your results or more extreme results than yours could have occurred due to chance even when the null hypothesis is actually true
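One way to make this definition concrete is an exact permutation test, sketched below. This is a swapped-in illustration rather than the method the card has in mind: it relabels the pooled data in every possible way and counts how often the difference in means is at least as extreme as the observed one. The data and the name `permutation_p_value` are made up.

```python
from itertools import combinations


def permutation_p_value(sample_a, sample_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(sample_a) + list(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    total = sum(pooled)
    observed = abs(sum(sample_a) / n_a - sum(sample_b) / n_b)

    as_extreme = 0
    n_splits = 0
    # Under the null the group labels are arbitrary, so try every relabelling.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            as_extreme += 1
    return as_extreme / n_splits


group_a = [5.1, 4.9, 5.6, 5.8, 6.0]   # made-up measurements
group_b = [4.1, 4.4, 3.9, 4.6, 4.0]
p = permutation_p_value(group_a, group_b)
print(f"p = {p:.4f}")
```

Here every value in one group exceeds every value in the other, so only 2 of the 252 possible splits are as extreme as the observed one.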
35
tcalc > tcrit
reject the null
36
When you say that you reject the null hypothesis at p < 0.05, you are essentially saying
“There is a less than 5% chance that I could have gotten these results if the null hypothesis were true, so I would rather conclude that the null is not true than accept such an unlikely outcome.”
37
Paired t-test
the two samples are paired or dependent because they contain the same subjects. Conversely, an independent samples t test contains different subjects in the two samples
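A brief sketch of a paired t-test, which works on the per-subject differences rather than on the two samples separately. The before/after scores and the helper name `paired_t` are hypothetical; 2.776 is the standard two-tailed t-table value for alpha = 0.05 with 4 degrees of freedom.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [a - b for b, a in zip(before, after)]   # one difference per subject
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1


before = [10, 12, 9, 11, 13]    # same five subjects measured twice (made up)
after = [12, 14, 10, 13, 14]
t_calc, df = paired_t(before, after)

T_CRIT = 2.776                  # two-tailed, alpha = 0.05, df = 4
print(f"t = {t_calc:.2f}, df = {df}, reject null: {abs(t_calc) > T_CRIT}")
```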
38
t test
used when the data of two samples are statistically independent
39
inductive reasoning
explanations/hypotheses/theories
40
deductive reasoning
predictions
41
high values tend to go
down
42
low values tend to
go up
43
flaws in statistical thinking
stats and probability are not intuitive; we tend to jump to conclusions; we tend to be overconfident; we see patterns in random data; we don't realize that coincidences are common; we find it hard to combine probabilities (e.g. the Monty Hall problem); we are fooled by regression to the mean
44
replication
is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated.
45
statistical sample
The group of replicated measurements that is used to help estimate natural variability
46
population
the entire group of individuals from which a subset (a statistical sample) is selected
47
random sample
all individuals in a population should have an equal probability of being selected so that the proportions sampled can help us estimate the probability that similar samples would occur in the future
48
sampling bias
occurs when something about the sampling process causes a particular type of individual in the population to be more likely to be sampled
49
Sampling error
It is the amount by which samples will differ due to chance
50
Systematic sampling
techniques attempt to overcome this problem by “using information about the population” to choose a more representative sample
51
Sample size determination
is the act of choosing the number of observations or replicates to include in a statistical sample
52
Power
the power of a statistical test is the probability that the test will detect: 1) a pattern in the population if the pattern truly exists or 2) the effect of a specific condition on the population if the effect truly exists
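Power can be estimated by simulation, as in the rough sketch below. Everything here is an assumption for illustration: normally distributed data with a standard deviation of 1, 10 observations per group, an equal-variance two-sample t test, and the table critical value 2.101 (two-tailed, alpha = 0.05, df = 18). The function name `estimate_power` is hypothetical.

```python
import math
import random
import statistics

N_PER_GROUP = 10
T_CRIT_DF18 = 2.101   # two-tailed, alpha = 0.05, df = 2 * 10 - 2


def estimate_power(effect, n_sims=500, seed=1):
    """Fraction of simulated experiments in which the t test detects a true effect."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(effect, 1.0) for _ in range(N_PER_GROUP)]
        pooled_var = (statistics.variance(control) + statistics.variance(treated)) / 2
        std_err = math.sqrt(pooled_var * 2 / N_PER_GROUP)
        t = (statistics.mean(treated) - statistics.mean(control)) / std_err
        if abs(t) > T_CRIT_DF18:
            rejections += 1
    return rejections / n_sims


weak = estimate_power(0.2)     # small true difference between groups
strong = estimate_power(1.5)   # large true difference between groups
print(f"power for a weak effect: {weak:.2f}, for a strong effect: {strong:.2f}")
```

A stronger true effect is detected in a larger share of the simulated experiments, which is why power depends on effect size as well as sample size.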
53
effect size
the strength of a pattern; a strong pattern is a large effect size
54
Pseudoreplication
includes experimental designs in which treatments are not replicated (though samples may be) or replicates are not statistically independent resulting in an inflation of the reported number of samples or replicates.
55
bivariate
dealing with two variables, usually an independent variable and a dependent variable
56
Multivariate statistics
will deal with more than two variables (e.g. three or more dependent variables, or any combination of multiple independent and dependent variables)
57
community ecology
the study of populations of two or more different species occupying the same geographical area at a particular time
58
Classification
the placement of species and/or sampled locations into groups used to distinguish different kinds of communities from each other if there appear to be clear distinctions
59
ordination
the arrangement or ‘ordering’ of species and/or sample locations along environmental gradients can be more useful when there are not clearly distinct kinds of communities because they grade into one another with fuzzy boundaries
60
community data matrix
has taxa (usually species) as rows and samples as columns (Table 1) or vice versa
61
Principal Components Analysis
takes your cloud of data points, and rotates it such that the maximum variability is visible. Another way of saying this is that it identifies your most important differences.
62
Detrended Correspondence Analysis
DCA only represents the patterns of dependent variables (species abundance) but does not directly compare the species abundances to the possible independent variables that cause them. We could use the DCA to make hypotheses about the causes of the species distributions.
63
triplot
It is called a triplot because it simultaneously displays three pieces of information: samples as points, species as points, and environmental variables as lines.
64
Nonparametric statistics
include several different statistical methods in which the data are not assumed to come from prescribed models that are ‘custom fit’ to the data by a small number of parameters
65
parametric statistics
use general model descriptions associated with one or more numerical parameters, which can be adjusted to allow the models to be applied to a variety of data sets. Examples: the normal distribution model, the Poisson distribution model, and the binomial distribution model
66
permutations
These permutations keep the actual data intact, but randomly associate the environmental data with the species data
67
Direct Gradient Analysis (DGA)
Thus, DGA is best coupled with an ordination (multivariate) technique like CCA.
68
canonical correspondence analysis
If we directly include environmental variables as independent variables we are changing our DCA into a CCA
69
centroid
basically the center of a cloud of points
70
PCA 1
a ‘best fit’ line for the cloud of points
71
eigenvalue
ranked from highest to lowest; they are related to the amount of variation explained by each axis
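A small sketch of how eigenvalues relate to variation explained, using a two-variable PCA so the eigenvalues of the covariance matrix can be written in closed form. The data and the function name `pca_eigenvalues_2d` are made up, and the sketch assumes PCA on the sample covariance matrix (dividing by N - 1).

```python
import math


def pca_eigenvalues_2d(xs, ys):
    """Eigenvalues of the 2x2 sample covariance matrix, highest first."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n   # the centroid of the cloud
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[var_x, cov_xy], [cov_xy, var_y]].
    centre = (var_x + var_y) / 2
    spread = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    return [centre + spread, centre - spread]


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up measurements
y = [2.0, 4.0, 5.0, 4.0, 5.0]
eigenvalues = pca_eigenvalues_2d(x, y)

# Share of the total variation captured by each axis (PCA 1 first).
explained = [value / sum(eigenvalues) for value in eigenvalues]
print([f"{share:.1%}" for share in explained])
```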
72
95% confidence interval
is a range of values that has a 95% chance of containing the true single value that you are trying to estimate.
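A minimal sketch of a 95% confidence interval for a mean, assuming a small sample from a roughly normal population so that a t critical value applies. The sample is made up; 2.776 is the standard two-tailed t-table value for alpha = 0.05 with 4 degrees of freedom.

```python
import math
import statistics

T_CRIT_DF4 = 2.776                        # two-tailed, alpha = 0.05, df = n - 1 = 4

sample = [5.1, 4.9, 5.6, 5.8, 6.0]        # made-up measurements
mean = statistics.mean(sample)
std_err = statistics.stdev(sample) / math.sqrt(len(sample))

# Interval = estimate plus or minus (critical value x standard error).
half_width = T_CRIT_DF4 * std_err
ci = (mean - half_width, mean + half_width)
print(f"mean = {mean:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```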
73
Poisson distribution
The random distribution of numbers of sightings
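As an illustration, the Poisson model has a single parameter, the average number of sightings per sampling unit. The sketch below evaluates its probability formula; the average of 2.0 sightings and the function name `poisson_pmf` are assumptions for the example.

```python
import math


def poisson_pmf(k, mean_rate):
    """Probability of exactly k sightings when the average is mean_rate."""
    return mean_rate ** k * math.exp(-mean_rate) / math.factorial(k)


MEAN_SIGHTINGS = 2.0   # hypothetical average sightings per plot
for k in range(5):
    print(f"P({k} sightings) = {poisson_pmf(k, MEAN_SIGHTINGS):.3f}")
```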
74
binomial distribution
two parameters; you are distinguishing between two (and only two) possible outcomes
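A short sketch of the binomial model, assuming its two parameters are the number of trials and the probability of one of the two outcomes on each trial. The numbers and the function name `binomial_pmf` are illustrative.

```python
import math


def binomial_pmf(k, n_trials, p_success):
    """Probability of exactly k 'successes' in n_trials independent two-outcome trials."""
    return (math.comb(n_trials, k)
            * p_success ** k
            * (1 - p_success) ** (n_trials - k))


# Example: 4 trials, each with a 50% chance of the first outcome.
for k in range(5):
    print(f"P({k} of 4) = {binomial_pmf(k, 4, 0.5):.4f}")
```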
75
intervals overlap
not different
76
The standard statistical technique to detect a nonrandom relationship between two continuous variables
correlation
77
species richness is a
discrete variable