Midterm Review Flashcards

1
Q

How are variables classified?

A

By the type of value they take: numerical (quantitative) or categorical.

2
Q

Continuous variables: give an example

A

Can take any value within a range (infinitely many possible values), usually with fractions or decimals; measured rather than counted. Ex: cow weight, core body temperature in dogs

3
Q

Discrete variables

A

Countable values, usually integers. Ex: # of eggs in a nest, # of moons around a planet

4
Q

What are categorical variables?

A

Not numeric; each observation falls into a category

5
Q

How can quantitative variables be broken down?

A

As either continuous or discrete

6
Q

Nominal variables, give an example

A

Have values that are named categories, ex: coat colors, biological sex

7
Q

How are categorical variables broken down?

A

Nominal or ordinal

8
Q

Ordinal variables, give an example

A

Ordered named categories. Ex: stages of disease (cancer), levels of pain, BMI category

9
Q

Independent variable

A
  1. predictor or explanatory variable (called an effect or factor in some models)
  2. exerts an influence on the outcome you wish to measure
  3. can be actively manipulated
10
Q

Dependent variable

A
  1. Outcome or response variable
  2. What you measure or record
11
Q

Frequency

A

How often a given value occurs in the data set

12
Q

What does a histogram show you?

A

Center, spread, shape

13
Q

Taxonomy of frequency histogram shapes (6)

A

a. symmetric, bell-shaped
b. symmetric, not bell-shaped
c. skewed to the right (positively skewed)
d. skewed to the left (negatively skewed)
e. negative exponential
f. bimodal

14
Q

Why look at frequency distributions?

A
  1. insight into sample
  2. detect outliers
  3. check assumptions of statistical tests
15
Q

What does a bivariate scatterplot show?

A

The relationship between 2 quantitative variables; shows its strength and direction

16
Q

What are the three measures of central tendency?

A

Mean, median, mode
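
As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The egg counts below are made-up values chosen only for the example.

```python
import statistics

# Hypothetical sample: number of eggs counted in five nests.
eggs = [2, 3, 3, 5, 7]

mean_eggs = statistics.mean(eggs)      # sum of values / number of values
median_eggs = statistics.median(eggs)  # middle value of the sorted data
mode_eggs = statistics.mode(eggs)      # most frequent value

print(mean_eggs, median_eggs, mode_eggs)
```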

17
Q

4 Measures of Dispersion

A
  1. Range
  2. Mean deviation
  3. Standard deviation
  4. Variance
18
Q

Define mean

A

Arithmetic average of the data set: the sum of the observations divided by the number of observations

19
Q

Median

A

Middle measurement in an ordered set of observations

20
Q

Draw and label a box plot

A
21
Q

What are the advantages of a box plot (make 4 points)

A
  1. visual representation
  2. comparison
  3. identify central tendency and spread
  4. identify outliers
22
Q

What is the standard deviation (s)

A

A measure of spread: how far from the mean the observations typically are. A larger s means the observations lie farther from the mean.

23
Q

Variance = s^2

A

The average squared deviation from the mean; the SD is its square root
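
A minimal sketch of how s^2 and s relate, assuming the usual sample formula with n - 1 in the denominator. The data are hypothetical.

```python
import math
import statistics

weights = [5, 6, 7, 8, 9]  # hypothetical sample

mean_w = statistics.mean(weights)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
s2 = sum((x - mean_w) ** 2 for x in weights) / (len(weights) - 1)

# The standard deviation is the square root of the variance.
s = math.sqrt(s2)

print(s2, s)
```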

24
Q

Statistical population

A

Aggregate of all units under study; it has the true mean, SD, and other population parameters

25
Q

Sample population

A

The specific group you will collect data from

26
Q

Define blocking in experiments, examples

A

Grouping experimental units into similar subsets, ex: location, family, genotype

27
Q

Describe two-step blocking procedure

A
  1. divide experimental units into homogeneous subsets (blocks)
  2. randomly assign treatments within each block
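
The two steps above can be sketched as follows. The function name, the litter blocks, and the treatment labels are all hypothetical, and the sketch assumes each block has exactly as many units as there are treatments.

```python
import random

def randomized_block_design(blocks, treatments, seed=0):
    """Randomly assign every treatment once within each block."""
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = treatments[:]   # copy so the caller's list is untouched
        rng.shuffle(shuffled)      # step 2: randomize within the block
        for unit, treatment in zip(units, shuffled):
            assignment[unit] = (block_name, treatment)
    return assignment

# Step 1: units already grouped into homogeneous subsets (here, by litter).
blocks = {
    "litter_1": ["pup_a", "pup_b", "pup_c"],
    "litter_2": ["pup_d", "pup_e", "pup_f"],
}
treatments = ["control", "low_dose", "high_dose"]

design = randomized_block_design(blocks, treatments)
for unit, (block_name, treatment) in design.items():
    print(unit, block_name, treatment)
```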
28
Q

What are poor sampling designs?

A
  1. Haphazard sampling
  2. Convenience or opportunity sampling
  3. Pseudoreplication
29
Q

Discuss pseudoreplication

A

When observations are not statistically independent but are treated as if they are.
Artificially inflates the sample size (n).
Ex: treating multiple cells from the same animal as independent observations

30
Q

2 benefits of random sampling

A
  1. unbiased
  2. high precision
31
Q

Discuss high bias

A

Repeated samples give estimates that systematically diverge from the population parameter in the same fashion, aiming in the wrong place

32
Q

Frequency distribution

A

How often each value shows up in a data set
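
A small sketch of tallying a frequency distribution with `collections.Counter`. The clutch sizes are invented for illustration, and the row of `#` marks is only a rough text histogram.

```python
from collections import Counter

# Hypothetical data: clutch sizes recorded in eight nests.
clutch_sizes = [3, 4, 4, 5, 3, 4, 6, 4]

# Count how many times each distinct value occurs.
freq = Counter(clutch_sizes)

for value in sorted(freq):
    print(value, freq[value], "#" * freq[value])
```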

33
Q

What is a probability distribution

A

All possible values of a random variable (in a given range) and the probability of each

34
Q

Normal distribution (make 4 points)

A

1. most common
2. symmetric around the mean
3. bell-shaped
4. 68-95-99.7 rule

35
Q

IQR

A

interquartile range, range of middle 50% of sample
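
A minimal sketch of computing the quartiles and IQR with the standard library. Statistical packages use slightly different quartile conventions, so the `"inclusive"` method chosen here is an assumption and other software may give slightly different cut points. The 1.5 x IQR fences are the common box-plot convention, not something stated on the card.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical sample

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # range of the middle 50% of the sample

# Common box-plot convention (an assumption here): points beyond
# 1.5 * IQR from the quartiles are flagged as potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(q1, q2, q3, iqr, lower_fence, upper_fence)
```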

36
Q

What does variance measure?

A

Variability around the mean (the average squared deviation from the mean)

37
Q

Standard deviation

A

Measure of how dispersed the data is about the mean

38
Q

Coefficient of variation

A

Measure of dispersion of data points around the mean, expressed as a percentage of the mean (SD / mean x 100)
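
As a worked example, assuming the usual definition CV = (SD / mean) x 100 and a made-up sample:

```python
import statistics

weights = [5, 6, 7, 8, 9]  # hypothetical sample

mean_w = statistics.mean(weights)
sd_w = statistics.stdev(weights)  # sample SD (n - 1 denominator)

# Coefficient of variation: SD as a percentage of the mean.
cv_percent = sd_w / mean_w * 100

print(round(cv_percent, 2))
```

Because the CV is unitless, it can be used to compare spread between variables measured on different scales.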

39
Q

Confounding variable

A

Unmeasured third variable affecting both the independent and dependent variable

40
Q

Spurious Association

A

When two variables are correlated but don’t have a causal relationship

41
Q

Extraneous variables

A

Not measured; affects the dependent variable

42
Q

Estimation

A

using sample data to make inferences about the population

43
Q

Point estimate

A

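A single value used as the estimate of a population parameter (e.g., the sample mean as an estimate of the population mean)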

44
Q

Interval estimate

A

A range of values for a parameter; gives an interval as an estimate for a parameter

45
Q

Confidence Interval

A

An interval estimate with a stated confidence level (e.g., 95%): about that percentage of intervals built this way from repeated samples would contain the true population parameter
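
A hedged sketch of a 95% confidence interval for a mean, using mean ± t x SE. The data are hypothetical, and the critical value 2.776 is the two-tailed t-table value for df = 4 at alpha = 0.05, hard-coded here because the standard library has no t distribution.

```python
import math
import statistics

sample = [5, 6, 7, 8, 9]  # hypothetical sample, n = 5
n = len(sample)

mean_x = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

# Assumption: two-tailed critical t for df = n - 1 = 4, alpha = 0.05,
# looked up from a t table.
T_CRIT_DF4 = 2.776

lower = mean_x - T_CRIT_DF4 * se
upper = mean_x + T_CRIT_DF4 * se

print(round(lower, 3), round(upper, 3))
```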

46
Q

Central Limit Theorem

A

The distribution of the sample means approaches normal the larger the sample gets, regardless of the population’s distribution
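
A small simulation can make this concrete. The sketch below draws repeated samples from a right-skewed (exponential) population, which is an arbitrary choice for illustration, and compares the sample means for a small and a large sample size. The function name and the number of repetitions are assumptions.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def sample_means(n, reps=2000):
    """Means of `reps` samples of size n from a right-skewed population."""
    return [
        statistics.mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

means_small = sample_means(2)   # n = 2: still clearly skewed
means_large = sample_means(50)  # n = 50: close to bell-shaped

# The means cluster around the population mean (1.0 here), and their
# spread shrinks as the sample size grows.
print(statistics.mean(means_large))
print(statistics.stdev(means_small), statistics.stdev(means_large))
```

Plotting histograms of `means_small` and `means_large` would show the second one much closer to a normal shape.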

47
Q

68-95-99.7 rule

A

68% within 1 SD of the mean
95% within 2 SD of the mean
99.7% within 3 SD of the mean
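
These percentages can be checked against the standard normal distribution with `statistics.NormalDist` (Python 3.8+). The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```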

48
Q

Steps for Hypothesis Testing (for a t-test with pooled variances)

A
  1. State the formal statistical hypotheses
    a) biological question
    b) null hypothesis
    c) alternative hypothesis
  2. Choose an appropriate statistical test and justify your choice
    a) 1-sample t-test
       H0: μ = specified mean value
       HA: μ ≠ specified mean value
    b) independent samples t-test
       H0: μ1 = μ2
       HA: μ1 ≠ μ2
    c) paired (dependent) t-test
       H0: μdiff = 0
       HA: μdiff ≠ 0
  3. Check the assumptions
    a) normality (Shapiro-Wilk or Kolmogorov-Smirnov test)
       H0: sample data come from a normal population distribution
       HA: sample data do not come from a normal population distribution
       p > 0.05: fail to reject H0, so normality is a reasonable assumption
    b) homogeneity of variances (2-sample independent t-test)
       check with box plots, the variance ratio, or Levene’s test
       H0: variances of the 2 groups are equal
       HA: variances of the 2 groups are not equal
  4. Run the analysis and compare the test statistic to its reference distribution
  5. Evaluate the evidence against the null
       p > 0.05: fail to reject H0
       p < 0.05: reject H0
  6. Write a summary statement
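
A minimal sketch of the analysis step for an independent samples t-test with pooled variances, written with the standard library only. The function name and the data are hypothetical. The critical value 2.306 is the two-tailed t-table value for df = 8 at alpha = 0.05, and comparing |t| against it stands in for the p-value that statistical software would report.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n1 - 1) * statistics.variance(sample1)
        + (n2 - 1) * statistics.variance(sample2)
    ) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    return t, df

treated = [5, 6, 7, 8, 9]  # hypothetical group 1
control = [1, 2, 3, 4, 5]  # hypothetical group 2

t_stat, df = pooled_t(treated, control)

# Assumption: two-tailed critical t for df = 8, alpha = 0.05 (t table).
T_CRIT_DF8 = 2.306
reject_h0 = abs(t_stat) > T_CRIT_DF8

print(t_stat, df, reject_h0)
```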

49
Q

Type 1 error

A

Incorrectly reject true null hypothesis

false positive

50
Q

Type 2 error

A

Fail to reject H0 when it is actually false; false negative

51
Q

How to limit type 1 errors

A

Set a strict significance level before testing (alpha = 0.05) and only reject H0 if p < alpha

52
Q

How to limit type 2 errors

A

Maximize statistical power

53
Q

When do you use a one-sample t-test?

A

when you want to compare the mean of a sample to a known or hypothesized population mean, and you only have data from a single sample

54
Q

What does the Independent samples t-test compare?

A

Compare means between two unrelated samples

55
Q

What does the paired samples (dependent) t-test compare?

A

The means of two related measurements taken on a single group (e.g., before and after treatment)
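
A hedged sketch of the paired calculation: the test works on the per-subject differences, with t = mean(diff) / (SD(diff) / sqrt(n)) and df = n - 1. The before/after values are invented for illustration.

```python
import math
import statistics

# Hypothetical measurements on the same five animals.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]

# The paired test reduces the two columns to one column of differences.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

se_diff = statistics.stdev(diffs) / math.sqrt(n)
t_paired = statistics.mean(diffs) / se_diff
df_paired = n - 1

print(round(t_paired, 3), df_paired)
```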

56
Q

What do 2-tailed tests allow you to detect?

A

Allow you to detect differences in either direction

57
Q

Discuss 1-tailed tests

A

not common, must be specified before data is collected, detect difference in only one direction

58
Q

Name two tests of normality and discuss what they tell you

A

Test how well sample data fits a normal distribution
Shapiro-Wilk
Kolmogorov-Smirnov

59
Q

How to check the homogeneity of variances?

A
  1. side by side box plots
  2. calculate the variance ratio (largest variance / smallest variance, in SPSS)
  3. Levene’s test
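
The variance ratio in point 2 is simple to compute by hand. The sketch below uses two made-up groups; what counts as "too large" a ratio depends on the course or textbook rule of thumb, which is not given here.

```python
import statistics

group_a = [5, 6, 7, 8, 9]    # hypothetical, tightly clustered
group_b = [2, 6, 7, 8, 12]   # hypothetical, more spread out

variances = [statistics.variance(group_a), statistics.variance(group_b)]

# Variance ratio: largest sample variance divided by the smallest.
variance_ratio = max(variances) / min(variances)

print(variances, variance_ratio)
```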
60
Q

What does Levene’s test compare? What are H0 and HA for Levene’s test? When do you reject or fail to reject the null hypothesis?

A

checks to see if samples to be compared come from population with same variance

H0 - the variance (the spread) of the two groups is the same

HA - the variance (the spread) of the two groups is not the same

p < 0.05: reject the null (the two samples do not have equal variances)

p > 0.05: fail to reject the null (no evidence that the variances differ, so equal variances can be assumed)

61
Q

how are degrees of freedom calculated?

A

sample size (n) minus the number of parameters estimated

62
Q

Why not do lots of t-tests?

A

Running many tests inflates the overall type 1 error rate
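
A short sketch of why the error rate inflates, assuming the tests are independent (a simplification): the chance of at least one false positive across k tests is 1 - (1 - alpha)^k. The function name is hypothetical.

```python
import math

def familywise_error(k, alpha=0.05):
    """Chance of at least one type 1 error across k independent tests."""
    return 1 - (1 - alpha) ** k

groups = 4
# Number of pairwise t-tests needed to compare every pair of groups.
pairwise_tests = math.comb(groups, 2)

print(pairwise_tests, round(familywise_error(pairwise_tests), 3))
```

With 4 groups there are 6 pairwise comparisons, so the overall false positive risk is well above the nominal 5%.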

63
Q

What is the statistical hypothesis for ANOVA?

A

H0: all means are equal
HA: not all means are equal

64
Q

2 steps of ANOVA

A

1) Global F-Test
2) Post-hoc tests

65
Q

What is the ANOVA test statistic ratio

A

F = between-group variation / within-group variation
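
A minimal sketch of the ratio for a one-way ANOVA, computed as mean square between groups over mean square within groups. The function name and the three groups are made up, and no p-value is computed because the standard library has no F distribution.

```python
import statistics

def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Between-group sum of squares: group means vs. the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations vs. their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # hypothetical data
f_value = f_statistic(groups)

print(f_value)
```

Passing groups with identical means, such as `[[1, 2, 3], [1, 2, 3]]`, gives F = 0 because there is no between-group variation.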

66
Q

ANOVA summary statement

A

Name of test, degrees of freedom, F statistic, p-value

67
Q

The Nonparametric independent 2-sample t-test twin

A

Wilcoxon-Mann-Whitney rank-sum test (Mann-Whitney U test)
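
To show what "based on ranks" means in practice, here is a hedged sketch of the U statistic computed by pairwise counting, which is equivalent to the rank-sum formulation. The function name and the data are hypothetical, and the sketch stops at the statistic without computing a p-value.

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2): pairwise 'wins' for each sample, ties count 0.5."""
    u1 = 0.0
    for a in sample1:
        for b in sample2:
            if a > b:
                u1 += 1.0
            elif a == b:
                u1 += 0.5
    # Every pair is a win for one side or a shared tie.
    u2 = len(sample1) * len(sample2) - u1
    return u1, u2

treated = [5, 6, 7, 8, 9]  # hypothetical group 1
control = [1, 2, 3, 4, 5]  # hypothetical group 2

u_treated, u_control = mann_whitney_u(treated, control)
print(u_treated, u_control)
```

Only the ordering of the values matters, so an extreme outlier in either group would not change U.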

68
Q

What does Tukey’s HSD compare?

A

Compares all possible pairs of means; tells which specific group means are different from each other

69
Q

A transformation only changes what?

A

Distribution of the values

70
Q

What tests to do if your data passes assumptions

A

1) t-test
2) ANOVA

71
Q

4 Qualities of Non-Parametric Test

A

1) do not compare means (work with ranks or medians instead)
2) no or fewer assumptions
3) not sensitive to outliers
4) based on ranks of data value

72
Q

Which has more statistical power: parametric or nonparametric

A

Parametric

73
Q

What type of error is a nonparametric test more likely to have?

A

Type II error - failing to reject a false H0; less likely to detect a true effect

74
Q

The nonparametric equivalent of the Dependent t-test

A

Wilcoxon signed-rank test

75
Q

The non-parametric equivalent of the ANOVA

A

Kruskal-Wallis test, with Dunn’s test as the post-hoc

76
Q

For ANOVA, what does it mean when
F = 0
F = 1
F is large

A

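F = 0: group means are identical (no between-group variation)
F = 1: between-group variation is about the same as within-group variation, so only small differences among group means
F is large: big differences between group means relative to the variation within groups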

77
Q

Advantages of Non-parametric tests (make 3 points)

A

1) more widely applicable,
2) not sensitive to outliers
3) generally any sample distribution OK

78
Q

Disadvantages of non-parametric tests

A

1) lower statistical power
2) if the assumptions of the parametric test are met, the parametric test is more powerful

79
Q

State H0 and HA for a 1-sample t-test

A

H0: population mean = specified value
HA: population mean ≠ specified value

80
Q

State hypothesis for independent samples t-test

A

H0: μ1 = μ2
HA: μ1 ≠ μ2

81
Q

State hypothesis for paired dependent t-tests

A

H0: μdiff = 0
HA: μdiff ≠ 0

82
Q

H0 and HA for ANOVA

A

H0: there is no difference among the means of the populations being studied (all means are equal)
HA: at least one population mean differs from the others