Midterm Review Flashcards

1
Q

How are variables classified?

A

By value type: numerical or categorical.

2
Q

Continuous variables: give an example

A

Can take infinitely many values within a range, usually with fractions or decimals; uncountable. Ex: cow weight, core body temp in dogs

3
Q

Discrete variables

A

Finite or countable values, usually integers. Ex: # of eggs in a nest, # of stars around a planet

4
Q

What are categorical variables?

A

Not numeric; the data fit into categories

5
Q

How can quantitative variables be broken down?

A

As either continuous or discrete

6
Q

Nominal variables, give an example

A

Have values that are named categories, ex: coat colors, biological sex

7
Q

How are categorical variables broken down?

A

Nominal or ordinal

8
Q

Ordinal variables, give an example

A

Ordered named categories, ex: pain score rated mild, moderate, or severe

9
Q

Independent variable

A
  1. effect, predictor or explanatory variable
  2. exerts an influence on the outcome you wish to measure
  3. can be actively manipulated
10
Q

Dependent variable

A
  1. Outcome or response variable
  2. What you measure or record
11
Q

Frequency

A

How often a particular value shows up in the data

12
Q

What does a histogram show you?

A

Center, spread, shape

13
Q

Taxonomy of frequency histogram shapes (6)

A

a. symmetric, bell-shaped
b. symmetric, not bell-shaped
c. skewed to the right (positively skewed)
d. skewed to the left (negatively skewed)
e. negative exponential
f. bimodal

14
Q

Why look at frequency distributions?

A
  1. insight into sample
  2. detect outliers
  3. check assumptions of statistical tests
15
Q

What does a bivariate scatterplot show?

A

The relationship between 2 quantitative variables; shows its strength and direction

16
Q

What are the three measures of central tendency?

A

Mean, median, mode
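
A minimal Python sketch of the three measures, using the standard library `statistics` module. The egg counts are made-up illustration data, not from the course.

```python
import statistics

# Hypothetical sample: number of eggs counted in seven nests
eggs = [2, 3, 3, 4, 5, 5, 5]

mean = statistics.mean(eggs)      # sum of the values divided by how many there are
median = statistics.median(eggs)  # middle value once the data are sorted
mode = statistics.mode(eggs)      # most frequent value

print(mean, median, mode)
```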

17
Q

5 Measures of Dispersion

A
  1. Range
  2. Mean deviation
  3. Standard deviation
  4. Variance
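
The four measures listed above can be sketched in Python as follows. The weights are hypothetical, and the sample (n - 1) versions of variance and SD are assumed.

```python
import statistics

# Hypothetical sample of five weights
weights = [4.0, 6.0, 8.0, 10.0, 12.0]
mean = statistics.mean(weights)

data_range = max(weights) - min(weights)                              # largest minus smallest
mean_deviation = sum(abs(x - mean) for x in weights) / len(weights)   # average distance from the mean
variance = statistics.variance(weights)                               # sample variance, s^2 (divides by n - 1)
sd = statistics.stdev(weights)                                        # sample SD, s = sqrt(s^2)

print(data_range, mean_deviation, variance, sd)
```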
18
Q

Define mean

A

Arithmetic average of the data set: the sum of all observations divided by the number of observations

19
Q

Median

A

Middle measurement in an ordered set of observations

20
Q

Draw and label a box plot

A
21
Q

What are the advantages of a box plot

A
  1. visual representation
  2. comparison
  3. identify central tendency and spread
  4. identify outliers
22
Q

What is the standard deviation (s)

A

Measures the spread of the data: how far from the mean the observations typically are. Larger s = observations farther from the mean.

23
Q

Variance = s^2

A

Used to calculate the SD (the SD is the square root of the variance)

24
Q

Statistical population

A

Aggregate of all units under study; has the true mean, SD, and other population parameters

25
Q

Sample population

A

The specific group you will collect data from

26
Q

Define blocking in experiments, examples

A

Grouping experimental units into similar subsets, ex: location, family, genotype

27
Q

Describe two-step blocking procedure

A
  1. divide experimental units into homogeneous subsets (blocks)
  2. randomly assign treatments within each block
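
The two steps above can be sketched in Python. The pen names, unit labels, and treatment names are hypothetical, and the sketch assumes treatments are randomized within each block.

```python
import random

rng = random.Random(1)  # fixed seed so the example is repeatable

# Step 1: experimental units already divided into homogeneous blocks (hypothetical pens)
blocks = {
    "pen_A": ["a1", "a2", "a3", "a4"],
    "pen_B": ["b1", "b2", "b3", "b4"],
}
treatments = ["control", "drug"]

# Step 2: randomly assign treatments within each block
assignment = {}
for block, units in blocks.items():
    shuffled = units[:]          # copy so the original order is untouched
    rng.shuffle(shuffled)        # random order within the block
    for position, unit in enumerate(shuffled):
        assignment[unit] = treatments[position % len(treatments)]

for unit in sorted(assignment):
    print(unit, assignment[unit])
```

Each block ends up with the same number of units per treatment, so block-to-block differences do not get mixed up with the treatment effect.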
28
Q

What are poor sampling designs?

A
  1. Haphazard sampling
  2. Convenience or opportunity sampling
  3. Pseudoreplication
29
Q

Discuss pseudoreplication

A

Occurs whenever individual measurements that are not independent are analyzed as if they are independent.
Results in an artificially inflated sample size (n)

30
Q

2 benefits of random sampling

A
  1. unbiased
  2. high precision
31
Q

Discuss high bias

A

Repeated samples give estimates that systematically diverge from the population parameter in the same fashion

32
Q

Frequency distribution

A

How often each specific value shows up in a data set
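
A small Python sketch of tallying a frequency distribution with `collections.Counter`. The egg counts are made-up illustration data.

```python
from collections import Counter

# Hypothetical sample: number of eggs counted in seven nests
eggs = [2, 3, 3, 4, 5, 5, 5]

frequency = Counter(eggs)  # maps each value to how many times it occurs

# Text version of a frequency histogram: one '#' per occurrence
for value in sorted(frequency):
    print(value, "#" * frequency[value])
```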

33
Q

Probability distribution

A

All possible values of a random variable and the probability of each value (or range of values)

34
Q

Normal distribution

A

most common, symmetry around the mean, bell shaped, 68-95-99.7 rule

35
Q

IQR

A

Interquartile range (Q3 - Q1): the range of the middle 50% of the sample
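
A minimal Python sketch of the IQR using `statistics.quantiles`. The data are hypothetical, and quartile conventions differ slightly between software packages, so other tools may report slightly different cut points.

```python
import statistics

# Hypothetical ordered sample
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # spread of the middle 50% of the sample

print(q1, q2, q3, iqr)
```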

36
Q

What does variance measure?

A

Variability about the mean: the average squared deviation of the observations from the mean

37
Q

Standard deviation

A

Measure of how dispersed the data is about the mean

38
Q

Coefficient of variation

A

Measure of dispersion of data points around the mean, expressed as a percentage of the mean (CV = SD / mean x 100)
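
A short Python sketch of the coefficient of variation. The function name and both data sets are hypothetical; they are chosen so that two samples on very different scales have the same relative spread.

```python
import statistics

def coefficient_of_variation(values):
    # SD expressed as a percentage of the mean
    return statistics.stdev(values) / statistics.mean(values) * 100

mouse_weights_g = [18.0, 20.0, 22.0]     # hypothetical, grams
cow_weights_kg = [540.0, 600.0, 660.0]   # hypothetical, kilograms

cv_mouse = coefficient_of_variation(mouse_weights_g)
cv_cow = coefficient_of_variation(cow_weights_kg)
print(cv_mouse, cv_cow)
```

Because the CV is unitless, it allows spread to be compared between samples measured on different scales.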

39
Q

Confounding variable

A

Unmeasured third variable affecting both the independent and dependent variable

40
Q

Spurious Association

A

When two variables are correlated but don’t have a causal relationship

41
Q

Extraneous variables

A

Not measured; affects the dependent variable

42
Q

Estimation

A

Using sample data to make inferences about the population

43
Q

Point estimate

A

A single value used as the estimate of a parameter (ex: the sample mean as an estimate of the population mean)

44
Q

Interval estimate

A

A range of values for a parameter; gives an interval as an estimate for a parameter

45
Q

Confidence Interval

A

An interval estimate with a stated confidence level (ex: 95%) that it contains the true population parameter being estimated
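
A rough Python sketch of a 95% confidence interval for a mean. It assumes the large-sample normal approximation (z of about 1.96); for a small sample like this one, a critical value from the t distribution would normally be used instead. The temperatures are made-up illustration data.

```python
import math
import statistics

# Hypothetical core body temperatures (deg C) from ten dogs
sample = [38.1, 38.4, 38.6, 38.3, 38.5, 38.2, 38.7, 38.4, 38.3, 38.5]

n = len(sample)
mean = statistics.mean(sample)                  # point estimate
se = statistics.stdev(sample) / math.sqrt(n)    # standard error of the mean

z = statistics.NormalDist().inv_cdf(0.975)      # about 1.96 for a 95% interval
lower, upper = mean - z * se, mean + z * se     # interval estimate

print(round(lower, 2), round(upper, 2))
```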

46
Q

Central Limit Theorem

A

The distribution of sample means approaches a normal distribution as the sample size gets larger, regardless of the population’s distribution
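
A small simulation sketch of this idea in Python. It assumes an exponential (strongly right-skewed) population and uses skewness as a rough measure of how far the sample means are from a symmetric, normal-like shape. The helper names are hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def skewness(values):
    # About 0 for a symmetric distribution, positive for a right-skewed one
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(n, reps=5000):
    # Draw `reps` samples of size n from an exponential population and keep each mean
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

skew_small = skewness(sample_means(2))    # means of tiny samples stay skewed
skew_large = skewness(sample_means(50))   # means of larger samples look more symmetric

print(round(skew_small, 2), round(skew_large, 2))
```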

47
Q

68-95-99.7 rule

A

68% of observations within 1 SD of the mean
95% within 2 SD
99.7% within 3 SD
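
These percentages can be checked against the standard normal distribution. A minimal Python sketch using `statistics.NormalDist`:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Proportion of a normal population lying within k SDs of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(k, round(proportion * 100, 1))
```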

48
Q

Steps for Hypothesis Testing (for a t-test with pooled variances)

A
  1. State formal statistical hypothesis
    a) biological question
    b) null hypothesis
    c) alternate hypothesis
  2. Choose an appropriate statistical test, justify your choice
    a) 1-sample t-test
    H0: μ = specified mean value
    HA: μ ≠ specified mean value
    b) independent samples t-test
    H0: μ1 = μ2
    HA: μ1 ≠ μ2
    c) paired (dependent) t-test
    H0: μdiff = 0
    HA: μdiff ≠ 0
  3. Check assumptions
    Test of normality:
    H0: sample data come from a normal population distribution (p > 0.05: fail to reject)
    HA: sample data do not come from a normal population distribution
    Shapiro-Wilk
    Kolmogorov-Smirnov
    Check homogeneity of variances (2-sample independent t-test):
    a) box plot
    b) variance ratio
    c) Levene’s test
    H0: variance of 2 groups is equal
    HA: variance of 2 groups is not equal
  4. Run analysis, compare to reference set
  5. Evaluate evidence against null
    p > 0.05: fail to reject H0
    p < 0.05: reject H0
  6. Write summary statement
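
The calculation behind the analysis step can be sketched for an independent samples t-test with pooled variances. The two groups are hypothetical, and the critical value 2.306 is taken from a standard t table (two-tailed, alpha = 0.05, df = 8) rather than computed.

```python
import math
import statistics

# Hypothetical measurements from two unrelated groups
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [1.0, 2.0, 3.0, 4.0, 5.0]

n1, n2 = len(group_a), len(group_b)
var1, var2 = statistics.variance(group_a), statistics.variance(group_b)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the difference

t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / se
df = n1 + n2 - 2

T_CRITICAL = 2.306  # from a t table: two-tailed, alpha = 0.05, df = 8
reject_h0 = abs(t_stat) > T_CRITICAL

print(t_stat, df, reject_h0)
```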

49
Q

Type 1 error

A

Incorrectly reject true null hypothesis

50
Q

Type 2 error

A

Fail to reject a false null hypothesis

51
Q

How to limit type 1 errors

A

Set the significance level (alpha) at 0.05 and only reject H0 if p < alpha

52
Q

How to limit type 2 errors

A

Maximize statistical power

53
Q

When do you use a one-sample t-test?

A

Compare mean of a random sample from normal population to mean expressed in H0

54
Q

Independent samples t-test

A

Compare means between two unrelated samples

55
Q

Paired sample or dependent t-tests

A

compare 2 samples with data in matched pairs
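
A minimal Python sketch of the paired calculation: the test works on the within-pair differences. The before/after values are hypothetical.

```python
import math
import statistics

# Hypothetical matched pairs: the same four animals measured before and after
before = [10.0, 12.0, 9.0, 11.0]
after = [12.0, 15.0, 10.0, 13.0]

# One difference per matched pair
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)

t_stat = mean_diff / se_diff   # tests H0: mean difference = 0
df = n - 1                     # number of pairs minus 1

print(t_stat, df)
```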

56
Q

2-tailed tests

A

Allow you to detect differences in either direction

57
Q

1-tailed tests

A

Not common; must be specified before data is collected; detect a difference in only one direction

58
Q

tests of normality

A

Test how well sample data fits a normal distribution
Shapiro-Wilk
Kolmogorov-Smirnov

59
Q

How to check the homogeneity of variances?

A
  1. side by side box plots
  2. calculate variance ratio (largest/smallest variance in SPSS)
  3. Levene’s test
60
Q

Levene’s test

A

Compares the absolute deviations of data values within each sample
H0 - the variance (the spread) of the two groups is the same
HA - the variance (the spread) of the two groups is not the same
p < 0.05 reject null (two samples do not have equal variances)
p > 0.05 fail to reject null (no evidence that the two samples have unequal variances)
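
A rough Python sketch of the Levene statistic, assuming the version based on absolute deviations from each group's mean (some software uses the median instead). The function name and data are hypothetical; the resulting W would be compared to an F distribution with (k - 1, N - k) degrees of freedom to get a p value.

```python
import statistics

def levene_w(groups):
    # Absolute deviation of each value from its own group mean
    devs = [[abs(x - statistics.fmean(g)) for x in g] for g in groups]
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand_mean = sum(sum(d) for d in devs) / n_total
    # One-way ANOVA F ratio computed on the deviations
    between = sum(len(d) * (statistics.fmean(d) - grand_mean) ** 2 for d in devs)
    within = sum((v - statistics.fmean(d)) ** 2 for d in devs for v in d)
    return (between / (k - 1)) / (within / (n_total - k))

w_same = levene_w([[1, 2, 3], [11, 12, 13]])   # same spread, different means
w_diff = levene_w([[1, 2, 3], [0, 10, 20]])    # very different spreads

print(w_same, w_diff)
```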

61
Q

how are degrees of freedom calculated

A

Between-groups df: number of groups in your factor minus 1. Within-groups df: total number of observations minus the number of groups.

62
Q

Why not do lots of t-tests?

A

Each extra test inflates the overall chance of a type 1 error
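
A short Python sketch of how fast the error rate grows. It assumes the tests are independent, which pairwise t-tests on the same groups are not exactly, so treat the numbers as an approximation. The function name is hypothetical.

```python
def familywise_error(n_groups, alpha=0.05):
    # Number of pairwise comparisons among n_groups groups
    n_tests = n_groups * (n_groups - 1) // 2
    # Chance of at least one false rejection across all tests (independence assumed)
    return n_tests, 1 - (1 - alpha) ** n_tests

n_tests, error_rate = familywise_error(4)
print(n_tests, round(error_rate, 2))
```

With only four groups there are six pairwise t-tests, and the chance of at least one type 1 error is already well above 0.05.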

63
Q

Statistical hypothesis for ANOVA

A

H0: all means are equal
HA: not all means are equal

64
Q

2 steps of ANOVA

A

1) Global F-Test
2) Post-hoc tests

65
Q

ANOVA test statistic ratio

A

between group variation/within group variation
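
A minimal Python sketch of this ratio for a one-way ANOVA, along with its two degrees of freedom. The function name and the three groups are hypothetical.

```python
import statistics

def one_way_f(groups):
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group variation: group means around the grand mean
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_f(groups)
print(f_stat, df_between, df_within)
```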

66
Q

ANOVA summary statement

A

Name of test, degrees of freedom, F statistic, p value

67
Q

Tukey’s HSD

A

Compares all possible pairs of means; tells which specific group means are different from each other

68
Q

Why use non-linear transformations?

A

Change histogram shape and meet test assumptions

69
Q

Transformation only changes the what?

A

Distribution of the values

70
Q

What to do if your data passes assumptions

A

1) t-test
2) ANOVA

71
Q

4 Qualities of Non-Parametric Test

A

1) do not compare means
2) no or fewer assumptions
3) not sensitive to outliers
4) based on ranks of data values

72
Q

Which has more statistical power: parametric or nonparametric

A

Parametric

73
Q

What type of error is a nonparametric test more likely to have?

A

Type II error - fail to reject a false H0

74
Q

The Nonparametric independent 2-sample t-test twin

A

Wilcoxon-Mann-Whitney Rank Sum Test
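
A rough Python sketch of how this test works on ranks instead of raw values. It computes only the U statistics, not a p value; the helper names and data are hypothetical.

```python
def average_ranks(values):
    # Rank from smallest (1) to largest; tied values share the average rank
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(sample_a, sample_b):
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n1, n2 = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n1])             # rank sum of the first sample
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a
    return u_a, u_b

u_a, u_b = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u_a, u_b)
```

Because only the ranks enter the calculation, an extreme outlier changes the result no more than any other largest value would.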

75
Q

The nonparametric equivalent of the Dependent t-test

A

Wilcoxon Signed-Ranks Test

76
Q

The non parametric equivalent of the ANOVA

A

Kruskal-Wallis test, with Dunn’s test as the post-hoc