exam 1 Flashcards
statistics
a field of mathematics that develops and studies methods to collect, analyze, interpret, and present empirical evidence
empirical vs anecdotal evidence
empirical - information obtained by observing or measuring patterns, often through experimentation
anecdotal - evidence collected in a casual or informal manner that relies heavily on personal testimony or conclusions (not statistical data collection)
data
a collection of numerical facts or information from which conclusions can be drawn
raw data
unformatted data (numerical measurements, instrument readings, text) that has not been processed or analyzed
replicates
parallel measurements of a phenomenon to estimate variability in your sample (the number of replicates = n)
sampling effort
how much data do we need?
precision and accuracy
precision - how fine the divisions on a scale of measurement are
accuracy - how close to the truth our measurement is
(accuracy is the priority)
descriptive statistics
quantitative description of observations sampled from a population (mathematically summarizing patterns, data centers, and variability without making conclusions about overall meaning of data)
data distribution (histogram)
sampled values arranged in rank order, grouped into bins, and graphically presented as the frequency of each bin
normal distribution
an arrangement of data in which most values cluster in the middle of the range and the rest taper off symmetrically toward either extreme
log-normal distribution
data are clustered at low values, but there are some much higher values (positive skew)
(can be made normal by applying a logarithm function)
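A minimal sketch of the log transform, using simulated data rather than real measurements. The `skewness` helper, the seed, and the sample size are illustrative assumptions; skew near 0 suggests a roughly symmetric distribution.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

random.seed(1)  # fixed seed so the sketch is repeatable

# Simulated log-normal data: many low values, a few much higher ones
raw = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Apply the logarithm to every value
logged = [math.log(v) for v in raw]

raw_skew = skewness(raw)
log_skew = skewness(logged)
print(f"skew before log: {raw_skew:.2f}")  # strongly positive
print(f"skew after log:  {log_skew:.2f}")  # close to zero
```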
central tendency
numeric value describing a central position in a dataset. mean, median, and mode are valid measures
skew
positive (tail toward high values), negative (tail toward low values), or none (symmetric, as in a normal distribution)
central limit theorem
if a population with finite variance is sufficiently sampled, the mean of all the sample means will be approximately equal to the mean of the population, AND the distribution of the sample means will approach a normal distribution
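The theorem can be illustrated with a small simulation. This is a sketch under assumptions: the population is a simulated exponential distribution (strongly skewed, mean 1), and the sample size of 50 and the 2000 repeats are arbitrary choices.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

POP_MEAN = 1.0  # mean of an exponential distribution with rate 1

def sample_mean(n):
    """Draw n values from the skewed population and return their mean."""
    draws = [random.expovariate(1.0) for _ in range(n)]
    return statistics.fmean(draws)

# Repeat the sampling many times and keep each sample's mean
sample_means = [sample_mean(50) for _ in range(2000)]

grand_mean = statistics.fmean(sample_means)
median_of_means = statistics.median(sample_means)
spread_of_means = statistics.stdev(sample_means)

print(f"population mean:      {POP_MEAN:.3f}")
print(f"mean of sample means: {grand_mean:.3f}")
print(f"median of the means:  {median_of_means:.3f}")
print(f"sd of the means:      {spread_of_means:.3f}")
```

The mean and median of the sample means land close together, which is what a roughly symmetric, normal-looking distribution would give even though the population itself is skewed.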
main steps in the scientific method
planning - what are you going to do? learn the system, develop ideas about how the system works (maybe do a pilot study), decide on a hypothesis, figure out what data you will need
recording - collect and properly record data; data can take many forms and must be recorded extremely carefully
analysis - interrogate data to test the hypothesis; analysis cannot be successful if data gathering was not designed with analysis in mind; should allow you to reject or fail to reject the null hypothesis
reporting - dissemination methods and media will depend on the type of work and audience; statistical results must be reported using proper conventions; graphs must be properly labelled
types of data
continuous - data that can take any value (usually measured)
discrete - numerical data that can take a limited number of values (often counted)
ordinal - data in categories that can be placed in order, but magnitude of difference between categories is not fixed
categorical - data in categories that can’t be usefully ordered
null and alternative hypothesis
null hypothesis - no change, effect, or difference (Ho)
alternative hypothesis - what you want to show (Ha or H1)
sampling strategies
random - best choice
systematic - transects (sampling on a created line)
mixed - stratified random sampling
haphazard - used when random sampling is not practical
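The first three strategies can be sketched in code. The site IDs, the two strata, and the sample sizes below are hypothetical, chosen only for illustration.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

sites = list(range(1, 101))  # hypothetical site IDs 1..100

# random: every site has the same chance of being chosen
random_sample = random.sample(sites, 10)

# systematic: every 10th site from a random start, like points along a transect
start = random.randrange(10)
systematic_sample = sites[start::10]

# stratified random: split the sites into strata, then sample randomly within each
strata = {"low": sites[:50], "high": sites[50:]}  # hypothetical strata
stratified_sample = [
    site for group in strata.values() for site in random.sample(group, 5)
]

print("random:    ", sorted(random_sample))
print("systematic:", systematic_sample)
print("stratified:", sorted(stratified_sample))
```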
mean, median, mode
mean - sum of observations divided by the number of observations in the sample
median - the middle score for the sampled data that has been arranged by order of magnitude
mode - the most frequent score in a sampled dataset
(equations)
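A short worked example of the three measures, using Python's standard `statistics` module. The scores are made up for illustration.

```python
import statistics

scores = [2, 3, 3, 5, 7]  # made-up sample

# mean: sum of observations divided by number of observations
manual_mean = sum(scores) / len(scores)   # (2 + 3 + 3 + 5 + 7) / 5 = 4.0
mean_value = statistics.mean(scores)      # 4

# median: middle value once the data are ranked
median_value = statistics.median(scores)  # 3

# mode: most frequent value
mode_value = statistics.mode(scores)      # 3

print(mean_value, median_value, mode_value)
```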
data in quartiles
divide data into quarters and use five number summary
steps -
rank data from smallest to largest
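smallest value (minimum) is the first number, largest value (maximum) is the fifth
median is the third
the second (first quartile) is the median of the values below the median; the fourth (third quartile) is the median of the values above the median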
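The steps above can be sketched as a small function. `five_number_summary` is a hypothetical helper, and it assumes the "median of each half" rule for quartiles; other quartile conventions exist and can give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max). Needs at least two values."""
    ranked = sorted(values)           # rank data from smallest to largest
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]             # values below the median position
    upper = ranked[half + n % 2:]     # for odd n, skip the median itself
    return (
        ranked[0],                    # first number: smallest
        statistics.median(lower),     # second: middle of the lower half
        statistics.median(ranked),    # third: median
        statistics.median(upper),     # fourth: middle of the upper half
        ranked[-1],                   # fifth number: largest
    )

summary = five_number_summary([9, 1, 6, 3, 12, 4, 8])
print(summary)  # (1, 3, 6, 9, 12)
```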
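dividing by n-1 to calculate variance
penalty for having a small number of replicates (dividing by n-1 instead of n makes the sample variance slightly larger, and the effect is biggest when n is small)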
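A worked comparison of dividing by n and by n-1. The replicate values are made up for illustration.

```python
import statistics

replicates = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up measurements
n = len(replicates)

mean_value = sum(replicates) / n                        # 5.0
sum_sq = sum((x - mean_value) ** 2 for x in replicates)  # 32.0

var_n = sum_sq / n                  # 4.0, dividing by n
var_n_minus_1 = sum_sq / (n - 1)    # about 4.57, dividing by n-1

print(f"divide by n:   {var_n:.2f}")
print(f"divide by n-1: {var_n_minus_1:.2f}")

# The standard library offers both: pvariance divides by n, variance by n-1
print(statistics.pvariance(replicates), statistics.variance(replicates))
```

With 8 replicates the two results differ by about 14%; with 100 replicates the difference would be about 1%.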
shapiro-wilk test and how to interpret
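takes a data distribution and tests whether it is significantly different from a normal distribution (Ho: the data are normally distributed)
p-value < 0.05 = significantly different from normal, reject Ho; p-value >= 0.05 = fail to reject Ho (no evidence against normality)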
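A minimal sketch of the interpretation rule. `interpret_shapiro` is a hypothetical helper and the measurements are made up. If SciPy is installed, `scipy.stats.shapiro` supplies the test statistic and p-value; otherwise the sketch falls back to a made-up p-value.

```python
def interpret_shapiro(p_value, alpha=0.05):
    """Turn a Shapiro-Wilk p-value into a decision about Ho (data are normal)."""
    if p_value < alpha:
        return "reject Ho: data differ significantly from normal"
    return "fail to reject Ho: no evidence against normality"

try:
    from scipy import stats  # optional dependency

    data = [4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.9, 6.4]  # made-up measurements
    statistic, p_value = stats.shapiro(data)
    print(f"W = {statistic:.3f}, p = {p_value:.3f}")
    print(interpret_shapiro(p_value))
except ImportError:
    # SciPy not available: show the rule with a made-up p-value
    print(interpret_shapiro(0.03))
```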