stats Flashcards

1
Q

tables

A

enables the reader to quickly and easily find the actual data values

2
Q

a disadvantage of tables

A

do not lend themselves to composing a mental picture of the trends occurring

3
Q

graphs

A

show trends or patterns that are easily visualized

4
Q

line graphs are useful for

A

demonstrating principles

5
Q

independent variable

A

the horizontal axis

6
Q

dependent variable

A

the vertical axis

7
Q

figures

A

graphs, maps, photographs, and other illustrations

8
Q

frequency distribution

A

summarize data

9
Q

range

A

interval between the largest and smallest values

10
Q

what classes are frequency distributions divided into

A

classes of equal size; the frequency of each class is recorded

11
Q

histograms

A

useful diagrams to plot frequency distributions
special type of bar graph in which each vertical bar represents an interval of values
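A minimal sketch of how a frequency distribution could be built, assuming the classes are made by cutting the range into equal-size intervals. The function name, the data, and the choice of four classes are all invented for illustration, and the sketch assumes the values are not all identical.

```python
def frequency_distribution(values, num_classes):
    """Split the range of `values` into equal-size classes and count each class."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / num_classes  # every class has the same size
    counts = [0] * num_classes
    for v in values:
        # The largest value would land just past the last class, so clamp it.
        index = min(int((v - lowest) / width), num_classes - 1)
        counts[index] += 1
    edges = [lowest + i * width for i in range(num_classes + 1)]
    return edges, counts


# Hypothetical measurements, invented for illustration only.
lengths = [2, 3, 3, 5, 6, 6, 7, 8, 9, 10]
edges, counts = frequency_distribution(lengths, 4)

# A sideways text histogram: one bar per interval of values.
for i, count in enumerate(counts):
    print(f"{edges[i]:5.1f} to {edges[i + 1]:5.1f} | {'#' * count}")
```

Each row of the printout is one class interval, and the bar length is the class frequency, which is the same information a histogram shows with vertical bars.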

12
Q

vertical or horizontal lines are allowed

A

partial horizontal

13
Q

regression

A

minimizes the subjectivity in determining relationships between variables

14
Q

plot B as a function of A

A

B will be on the y
A will be on the x

15
Q

exception for A going on the y

A

depth, height, altitude

16
Q

bar graphs

A

used when you wish to summarize quantities in different categories

17
Q

stacked bar graph

A

sections of each bar are stacked on top of each other

18
Q

mean

A

the average of all observations

19
Q

Xi

A

value of the ith observation

20
Q

N

A

total number of observations

21
Q

variance

A

measure of variability or spread in the data
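A short sketch tying together the mean, Xi, N, and variance cards. It assumes the sample form of the variance (dividing by N - 1); the cards do not say which form is meant, and the data are made up.

```python
import statistics


def mean(observations):
    """Sum of every X_i divided by N, the total number of observations."""
    n = len(observations)
    return sum(observations) / n


def sample_variance(observations):
    """Average squared deviation from the mean, dividing by N - 1 (an assumption)."""
    n = len(observations)
    m = mean(observations)
    return sum((x_i - m) ** 2 for x_i in observations) / (n - 1)


data = [4, 8, 6, 5, 7]  # hypothetical observations
print(mean(data))             # 6.0
print(sample_variance(data))  # 2.5
print(statistics.variance(data))  # cross-check against the standard library
```

A larger variance means the observations sit further from the mean on average.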

22
Q

formulation of hypothesis steps

A
  1. form the null and alternative hypotheses
  2. figure out which statistical test to use
  3. calculate the test statistic and degrees of freedom
  4. find the critical value from a stats table
  5. reject or fail to reject (accept) the null
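The steps above can be sketched with an independent two-sample t test. This is only an illustration: the data are invented, the pooled-variance form of the test is an assumption, and the critical value 2.306 is the standard two-tailed table entry for alpha = 0.05 with 8 degrees of freedom.

```python
import math


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for an independent two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)  # sum of squares, group A
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)  # sum of squares, group B
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypotheses: null = the two means are equal; alternative = they differ.
# Test choice: two means from different subjects, so an independent t test.
group_a = [5, 6, 7, 8, 9]  # hypothetical data
group_b = [1, 2, 3, 4, 5]  # hypothetical data

# Calculate the test statistic and degrees of freedom.
t_calc, df = pooled_t_statistic(group_a, group_b)

# Critical value read from a t table (two-tailed, alpha 0.05, df 8).
T_CRIT = 2.306

# Decision: reject the null when |t_calc| exceeds the critical value.
reject_null = abs(t_calc) > T_CRIT
print(round(t_calc, 3), df, reject_null)
```

With these made-up numbers the calculated t is well above the table value, so the null would be rejected.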
23
Q

differences between two means

A

t test

24
Q

difference between several means

A

ANOVA

25
Q

test of correlation

A

correlation

26
Q

test of association

A

chi square

27
Q

test of independence

A

chi square

28
Q

discrete

A

cannot take fractional values; data fall into classes, categories, or whole-number counts
for example: gender vs. major

29
Q

continuous

A

can be represented by numbers and/or fractions

30
Q

both independent and dependent variable are discrete

A

chi square
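A hedged sketch of the chi-square calculation for two discrete variables laid out as a contingency table. The counts are invented, the function name is hypothetical, and 3.841 is the standard table value for alpha = 0.05 with 1 degree of freedom.

```python
def chi_square_statistic(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_sq, df


# Hypothetical counts: rows are one discrete variable, columns the other.
table = [[30, 10],
         [10, 30]]
chi_sq, df = chi_square_statistic(table)

CHI_CRIT = 3.841  # table value for alpha 0.05, df 1
print(chi_sq, df, chi_sq > CHI_CRIT)
```

Here every expected count is 20, so the large gap between observed and expected counts produces a statistic far above the critical value.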

31
Q

independent is discrete, dependent is continuous

A

t test

32
Q

both are continuous

A

correlation
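As a rough illustration of correlation between two continuous variables, the sketch below computes a Pearson correlation coefficient by hand. Pearson's r is an assumption (the card does not name a specific coefficient), and the data are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements of two continuous variables.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))  # about 0.775
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.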

33
Q

Test statistics usually are very large

A

when your null hypothesis is very wrong, and usually very small when your null hypothesis is correct.

34
Q

p-value

A

the probability that your results or more extreme results than yours could have occurred due to chance even when the null hypothesis is actually true

35
Q

tcalc > tcrit

A

reject the null

36
Q

When you say that you reject the null hypothesis at p < 0.05, you are essentially saying

A

“There is a less than 5% chance that I could have gotten these results if the null hypothesis were true, so I would rather conclude that the null is not true than accept such an unlikely outcome.”

37
Q

Paired t-test

A

the two samples are paired or dependent because they contain the same subjects. Conversely, an independent samples t test contains different subjects in the two samples

38
Q

t test

A

used when the data of two samples are statistically independent

39
Q

inductive reasoning

A

explanations/hypotheses/theories

40
Q

deductive reasoning

A

predictions

41
Q

high values tend to go

A

down

42
Q

low values tend to

A

go up

43
Q

flaws in statistical thinking

A

stats and probability are not intuitive
we tend to jump to conclusions
we tend to be overconfident
we see patterns in random data
we don’t realize that coincidences are common
we find it hard to combine probabilities (e.g. the Monty Hall problem)
we are fooled by regression to the mean

44
Q

replication

A

is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated.

45
Q

statistical sample

A

The group of replicated measurements that is used to help estimate natural variability

46
Q

population

A

the whole group of individuals from which a subset (a statistical sample) is selected

47
Q

random sample

A

all individuals in a population should have an equal probability of being selected so that the proportions sampled can help us estimate the probability that similar samples would occur in the future

48
Q

sampling bias

A

occurs when something about the sampling process causes a particular type of individual in the population to be more likely to be sampled

49
Q

Sampling error

A

It is the amount by which samples will differ due to chance

50
Q

Systematic sampling

A

techniques that attempt to overcome the problem of unrepresentative samples by “using information about the population” to choose a more representative sample

51
Q

Sample size determination

A

is the act of choosing the number of observations or replicates to include in a statistical sample

52
Q

Power

A

the power of a statistical test is the probability that the test will detect: 1) a pattern in the population if the pattern truly exists or 2) the effect of a specific condition on the population if the effect truly exists

53
Q

effect size

A

the strength of a pattern; a strong pattern means a large effect size

54
Q

Pseudoreplication

A

includes experimental designs in which treatments are not replicated (though samples may be) or replicates are not statistically independent resulting in an inflation of the reported number of samples or replicates.

55
Q

bivariate

A

dealing with two variables, usually an independent variable and a dependent variable

56
Q

Multivariate statistics

A

will deal with more than two variables (e.g. three or more dependent variables, or any combination of multiple independent and dependent variables)

57
Q

community ecology

A

the study of populations of two or more different species occupying the same geographical area at a particular time

58
Q

Classification

A

the placement of species and/or sampled locations into groups
used to distinguish different kinds of communities from each other if there appear to be clear distinctions

59
Q

ordination

A

the arrangement or ‘ordering’ of species and/or sample locations along environmental gradients
can be more useful when there are not clearly distinct kinds of communities because they grade into one another with fuzzy boundaries

60
Q

community data matrix

A

has taxa (usually species) as rows and samples as columns (Table 1) or vice versa

61
Q

Principal Components Analysis

A

takes your cloud of data points, and rotates it such that the maximum variability is visible. Another way of saying this is that it identifies your most important differences.

62
Q

Detrended Correspondence Analysis

A

DCA only represents the patterns of dependent variables (species abundance) but does not directly compare the species abundances to the possible independent variables that cause them
We could use the DCA to make hypotheses about the causes of the species distributions

63
Q

triplot

A

It is called a triplot because it simultaneously displays three pieces of information: samples as points, species as points, and environmental variables as lines.

64
Q

Nonparametric statistics

A

include several different statistical methods in which the data are not assumed to come from prescribed models that are ‘custom fit’ to the data by a small number of parameters

65
Q

parametric statistics

A

use general model descriptions associated with one or more numerical parameters, which can be adjusted to allow the models to be applied to a variety of data sets. Examples: the normal distribution model, the Poisson distribution model, and the binomial distribution model

66
Q

permutations

A

These permutations keep the actual data intact, but randomly associate the environmental data with the species data
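A simplified stand-in for the idea, not the ordination procedure itself: one environmental variable and one species abundance, with the association measured by Pearson's r. Instead of random shuffles, the sketch tries every possible reordering so the result is reproducible. All names and data are hypothetical.

```python
import itertools
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(env, abundance):
    """Share of reorderings whose |r| is at least as large as the observed |r|."""
    observed = abs(pearson_r(env, abundance))
    as_extreme = 0
    total = 0
    # The data values stay intact; only the pairing of env with abundance changes.
    for shuffled in itertools.permutations(env):
        total += 1
        if abs(pearson_r(shuffled, abundance)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


env = [1, 2, 3, 4, 5]         # hypothetical environmental measurements
abundance = [2, 4, 6, 8, 10]  # hypothetical species abundances
p_value = permutation_p_value(env, abundance)
print(round(p_value, 4))
```

Only 2 of the 120 possible pairings (the original order and its exact reverse) give a relationship as strong as the observed one, so a pattern this strong would rarely arise from a chance pairing.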

67
Q

Direct Gradient Analysis (DGA)

A

DGA is best coupled with an ordination (multivariate) technique like CCA.

68
Q

canonical correspondence analysis

A

If we directly include environmental variables as independent variables we are changing our DCA into a CCA

69
Q

centroid

A

basically the center of a cloud of points

70
Q

PCA 1

A

a ‘best fit’ line for the cloud of points

71
Q

eigenvalue

A

they are ranked from the highest to the lowest
These are related to the amount of variation explained by each axis
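A small two-variable sketch of how eigenvalues relate to the variation explained by each axis. It assumes the eigenvalues are taken from the covariance matrix of the data, which is one common way to run a PCA; real analyses usually involve many variables and a linear-algebra library. The data and function names are invented.

```python
import math


def covariance_terms(xs, ys):
    """Return (var_x, cov_xy, var_y) using the N - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y


def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of the matrix [[a, b], [b, c]], ranked highest first."""
    centre = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return centre + spread, centre - spread


xs = [1, 2, 3, 4, 5]  # hypothetical variable 1
ys = [2, 4, 5, 4, 5]  # hypothetical variable 2
var_x, cov_xy, var_y = covariance_terms(xs, ys)
first, second = eigenvalues_2x2_symmetric(var_x, cov_xy, var_y)

# Each eigenvalue's share of the total is the variation explained by that axis.
share_axis_1 = first / (first + second)
print(round(first, 3), round(second, 3), round(share_axis_1, 3))
```

The two eigenvalues add up to the total variance of the data, so the first axis here accounts for roughly 90% of the variation.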

72
Q

95% confidence interval

A

is a range of values that has a 95% chance of containing the true single value that you are trying to estimate.
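A minimal sketch of a 95% confidence interval for a mean, assuming the usual t-based formula (mean plus or minus the critical t times the standard error). The data are made up, and 2.306 is the two-tailed table value for alpha = 0.05 with 8 degrees of freedom.

```python
import math


def confidence_interval_95(sample, t_crit):
    """Return (lower, upper) bounds around the sample mean."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    standard_error = math.sqrt(variance / n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin


sample = [4, 8, 6, 5, 7, 6, 6, 5, 7]  # hypothetical observations, N = 9
T_CRIT_DF8 = 2.306                    # t table: two-tailed, alpha 0.05, df = N - 1 = 8
lower, upper = confidence_interval_95(sample, T_CRIT_DF8)
print(round(lower, 2), round(upper, 2))  # 5.06 6.94
```

A larger sample or less variable data shrinks the standard error and therefore narrows the interval.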

73
Q

Poisson distribution

A

The random distribution of numbers of sightings

74
Q

binomial distribution

A

two parameters
used when you are distinguishing between two (and only two) possible outcomes
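For illustration, the two parameters of the binomial distribution are the number of trials n and the probability p of one of the two outcomes on each trial. The sketch below computes the probability of exactly k such outcomes; the coin-flip numbers are an invented example.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k 'successes' in n independent two-outcome trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical example: exactly 5 heads in 10 flips of a fair coin.
prob_five_heads = binomial_pmf(5, 10, 0.5)
print(prob_five_heads)  # 0.24609375

# The probabilities over every possible count add up to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(round(total, 6))
```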

75
Q

intervals overlap

A

treated as not significantly different

76
Q

The standard statistical technique to detect a nonrandom relationship between two continuous variables

A

correlation

77
Q

species richness is a

A

discrete variable