Midterm Review Flashcards
How are variables classified?
Value
Numerical or categorical.
Continuous variables: give an example
Infinite possible values, usually containing fractions or decimals, uncountable. Ex: cow weight, core body temp in dogs
Discrete variables
Finite, usually integers, countable. Ex: # of eggs in a nest, # of moons around a planet
What are categorical variables?
Not numeric; the data fit into categories
How can quantitative variables be broken down?
As either continuous or discrete
Nominal variables, give an example
Have values that are named categories, ex: coat colors, biological sex
How are categorical variables broken down?
Nominal or ordinal
Ordinal variables, give an example
Ordered named categories. Ex: stages of disease (cancer), levels of pain, BMI category
Independent variable
- effect, predictor or explanatory variable
- exerts an influence on the outcome you wish to measure
- can be actively manipulated
Dependent variable
- Outcome or response variable
- What you measure or record
Frequency
how often a data point shows up
What does a histogram show you?
Center, spread, shape
Taxonomy of frequency histogram shapes (6)
a. symmetric, bell-shaped
b. symmetric, not bell-shaped
c. skewed to the right (positively skewed)
d. skewed to the left (negatively skewed)
e. negative exponential
f. bimodal
Why look at frequency distributions?
- insight into sample
- detect outliers
- check assumptions of statistical tests
What does a bivariate scatterplot show?
The relationship between 2 quantitative variables; shows its strength and direction
What are the three measures of central tendency?
Mean, median, mode
4 Measures of Dispersion
- Range
- Mean deviation
- Standard deviation
- Variance
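A minimal sketch of the first two measures, range and mean deviation, on a made-up sample (the cow weights are hypothetical values chosen only for illustration):

```python
# Hypothetical sample of cow weights in kg (illustrative values only)
weights = [400.0, 600.0, 800.0, 1000.0, 1200.0]

n = len(weights)
mean = sum(weights) / n

# Range: largest observation minus smallest observation
data_range = max(weights) - min(weights)

# Mean deviation: average absolute distance of each observation from the mean
mean_deviation = sum(abs(x - mean) for x in weights) / n

print(data_range, mean_deviation)
```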
Define mean
Average of the data set: the sum of the observations divided by the number of observations
Median
Middle measurement in an ordered set of observations
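A short sketch of the three measures of central tendency using Python's standard `statistics` module. The egg counts are a hypothetical sample, not course data:

```python
import statistics

# Hypothetical sample: number of eggs in seven nests
eggs = [2, 3, 3, 4, 5, 6, 12]

mean = statistics.mean(eggs)      # sum of observations / number of observations
median = statistics.median(eggs)  # middle value once the data are sorted
mode = statistics.mode(eggs)      # most frequent value

print(mean, median, mode)
```

Note how the single large nest (12) pulls the mean above the median.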
Draw and label a box plot
What are the advantages of a box plot (make 4 points)
- visual representation
- comparison
- identify central tendency and spread
- identify outliers
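A sketch of the numbers a box plot is drawn from, on a hypothetical sample. The 1.5 x IQR fence used to flag outliers is a common convention and is an assumption here, not something stated on the card; quartile values also depend on the method used (this uses the default of `statistics.quantiles`):

```python
import statistics

# Hypothetical sample: number of eggs in seven nests
eggs = sorted([2, 3, 3, 4, 5, 6, 12])

# Quartiles: Q1, median, Q3 (default "exclusive" method)
q1, median, q3 = statistics.quantiles(eggs, n=4)
iqr = q3 - q1  # interquartile range = width of the box

# Assumed convention: points beyond 1.5 * IQR from the box are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in eggs if x < lower_fence or x > upper_fence]

print("min:", eggs[0], "Q1:", q1, "median:", median, "Q3:", q3, "max:", eggs[-1])
print("outliers:", outliers)
```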
What is the standard deviation (s)
A measure of the spread of the data: how far from the mean the observations typically are. Larger s = observations farther from the mean.
Variance = s^2
Used to calculate the SD (s is the square root of the variance)
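A minimal sketch of the variance and SD relationship, assuming the sample formulas (dividing by n - 1), which is what `statistics.variance` and `statistics.stdev` compute. The weights are hypothetical:

```python
import math
import statistics

# Hypothetical sample of cow weights in kg (illustrative values only)
weights = [400.0, 600.0, 800.0, 1000.0, 1200.0]

# Sample variance s^2: sum of squared deviations from the mean / (n - 1)
variance = statistics.variance(weights)

# Sample standard deviation s: square root of the variance
sd = statistics.stdev(weights)

print(variance, sd, math.isclose(sd, math.sqrt(variance)))
```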
Statistical population
Aggregate of all units under study; has the true mean and SD (the population parameters)
Sample population
The specific group you will collect data from
Define blocking in experiments, examples
Grouping experimental units into similar subsets, ex: location, family, genotype
Describe two-step blocking procedure
- divide experimental units into homogeneous subsets (blocks)
- randomly assign treatments to the units within each block
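The two steps can be sketched as follows. The function name `assign_within_blocks`, the block names, and the treatments are all hypothetical, made up for this illustration:

```python
import random

def assign_within_blocks(units_by_block, treatments, seed=None):
    """Randomly assign treatments to units separately inside each block."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # random order within this block only
        for i, unit in enumerate(shuffled):
            # Cycle through treatments so each block stays balanced
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Step 1: units already grouped into homogeneous subsets (here, by location)
units_by_block = {
    "pasture_A": ["a1", "a2", "a3", "a4"],
    "pasture_B": ["b1", "b2", "b3", "b4"],
}

# Step 2: randomize treatments within each block
assignment = assign_within_blocks(units_by_block, ["control", "drug"], seed=1)
print(assignment)
```

Because the shuffle happens inside each block, every block receives every treatment equally often.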
What are poor sampling designs?
- Haphazard sampling
- Convenience or opportunity sampling
- Pseudoreplication
Discuss pseudoreplication
When observations are not statistically independent but are treated as if they are
Results in an artificially inflated sample size (n)
ex: treating multiple cells from the same animals as independent
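One common way to handle the cell example is to average the cells within each animal first, so that the animal is the unit of analysis. A sketch with hypothetical measurements (animal names and values are made up):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (animal, measurement on one cell from that animal)
measurements = [
    ("animal_1", 10.0), ("animal_1", 12.0), ("animal_1", 11.0),
    ("animal_2", 20.0), ("animal_2", 22.0),
    ("animal_3", 15.0),
]

# Group the cell measurements by the animal they came from
by_animal = defaultdict(list)
for animal, value in measurements:
    by_animal[animal].append(value)

# One value per animal: cells from the same animal are not independent
animal_means = {animal: mean(values) for animal, values in by_animal.items()}

n_pseudoreplicated = len(measurements)  # counts every cell as independent
n_independent = len(animal_means)       # counts animals

print(n_pseudoreplicated, n_independent, animal_means)
```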
2 benefits of random sampling
- unbiased
- high precision
Discuss high bias
Repeated samples give estimates that systematically diverge from the population parameter in the same fashion, aiming in the wrong place
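A small simulation sketch of this idea. The population, the sample size, and the biased scheme (only the heaviest half can be sampled) are all assumptions made up for illustration:

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 simulated body weights
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Assumed biased scheme: only the heaviest half can ever be sampled
heaviest_half = sorted(population)[5_000:]

def average_estimate(pool, n=30, repeats=500):
    """Mean of many sample means, each from a random sample of the pool."""
    estimates = [statistics.mean(rng.sample(pool, n)) for _ in range(repeats)]
    return statistics.mean(estimates)

# Random sampling: estimates scatter around the true mean
random_error = average_estimate(population) - true_mean

# Biased sampling: estimates all miss in the same direction
biased_error = average_estimate(heaviest_half) - true_mean

print(f"random sampling error: {random_error:+.2f}")
print(f"biased sampling error: {biased_error:+.2f}")
```

Taking more samples does not fix the biased scheme: the estimates stay off in the same direction.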
Frequency distribution
How often each specific value shows up in a data set
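A minimal sketch of building a frequency distribution with `collections.Counter`, printed as a rough text histogram. The egg counts are hypothetical:

```python
from collections import Counter

# Hypothetical sample: number of eggs in nine nests
eggs_per_nest = [2, 3, 3, 4, 3, 2, 5, 4, 3]

# Frequency of each value: how many times it appears in the data set
frequency = Counter(eggs_per_nest)

# One row per value, one "#" per occurrence
for value in sorted(frequency):
    print(value, "#" * frequency[value])
```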