Hypotheses, Prediction, Scientific Method Flashcards
what are hypothesis testing alternatives?
Bayesian methods and likelihood techniques
what are the main characteristics of “new” hypothesis testing alternatives?
prior information
several hypotheses
inferences from multiple models
parameters which refer to biological processes/states
how can the scientist use his knowledge in these “new” hypothesis testing alternatives?
the knowledge of the scientist is used to formulate hypotheses (testable ideas) and select models that represent them
how detailed should a model be?
this can be objectively assessed depending on the data available to estimate the model's parameters
what is the probability of observing this data if the null hypothesis was true? (which statistical method is used to answer this question?)
we answer that with the p-value, which gives the probability of observing the current values, or more extreme ones, given that the null hypothesis is true
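A minimal sketch of what this looks like in practice, using scipy's two-sample t-test on simulated data (all numbers are invented for illustration):

```python
# Minimal sketch: computing a p-value with a two-sample t-test.
# The data below are simulated purely for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
control = rng.normal(10.0, 2.0, size=30)   # e.g., plant height, control plots
treated = rng.normal(11.0, 2.0, size=30)   # e.g., plant height, fertilized plots

# Null hypothesis: the two group means are equal.
t_stat, p_value = stats.ttest_ind(control, treated)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# p is the probability of a t-statistic at least this extreme
# if the null hypothesis were true.
```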
what is missed by a p-value based analysis?
it does not evaluate whether the hypothesis/model itself is strong, as it only evaluates the chance of extreme results under the null hypothesis
what is the likelihood of competing hypotheses considering the data?
it is proportional to the likelihood of the data given each hypothesis
give 3 characteristics of likelihood methods for testing hypotheses
- competing models can have different parameters
- the data are fixed and the hypotheses vary
- relative support of each model given the data
what is goal and main assumption of maximum likelihood estimations?
- goal: find the parameter estimates that define the distribution under which the observed data are most probable
- assumption: because each observation is independent, the total probability of observing all of the data is the product of the probabilities of observing each data point individually
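A minimal sketch of maximum likelihood estimation under these assumptions, fitting a normal distribution by minimizing the summed negative log-likelihood (data and starting values are invented):

```python
# Minimal sketch of maximum likelihood estimation: find the normal
# distribution parameters (mu, sigma) that maximize the probability
# of the observed data. Because observations are assumed independent,
# the joint likelihood is the product of individual densities, so we
# minimize the summed negative log-likelihood instead.
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(2)
data = rng.normal(5.0, 1.5, size=100)  # simulated observations

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)          # keep sigma positive
    return -np.sum(stats.norm.logpdf(data, loc=mu, scale=sigma))

fit = minimize(neg_log_lik, x0=[0.0, 0.0])
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(f"mu_hat = {mu_hat:.2f}, sigma_hat = {sigma_hat:.2f}")
```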
how to evaluate likelihood models?
by AIC (Akaike’s Information Criterion): the best model loses the least information when attempting to approximate reality (lowest AIC value); the assessment of a given model depends on the performance of the other models being compared with it
what are Akaike weights for?
to account for uncertainty (if we selected these models many times, what is the probability of model x being the best one?)
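A minimal sketch of computing AIC values and Akaike weights; the log-likelihoods and parameter counts are hypothetical numbers, not from any real model set:

```python
# Minimal sketch: AIC and Akaike weights for three hypothetical models.
import numpy as np

log_lik = np.array([-120.3, -118.9, -118.5])  # maximized log-likelihoods
n_params = np.array([2, 3, 5])                # parameters per model

aic = 2 * n_params - 2 * log_lik              # AIC = 2p - 2 ln(L)
delta = aic - aic.min()                        # differences from best model
weights = np.exp(-0.5 * delta) / np.sum(np.exp(-0.5 * delta))

for i, (a, w) in enumerate(zip(aic, weights)):
    print(f"model {i + 1}: AIC = {a:.1f}, Akaike weight = {w:.2f}")
# The weight approximates the probability that a model would be
# selected as best if model selection were repeated many times.
```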
compare a traditional hypothesis testing with a likelihood method in terms of how groups are compared
instead of comparing means, we compare competing hypotheses in terms of how much support the data provide for the effects of each treatment; the comparison can be multivariate
what is the main difference between bayesian and likelihood methods?
pre-existing information enters as probabilities (priors) instead of likelihoods
describe a Bayesian analysis
prior knowledge for each parameter is a requirement; the prior summarizes pre-existing data sets; the posterior (the new knowledge from the new data set) = previous knowledge modified by what was learned from the new data
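A minimal sketch of such an update using a conjugate Beta-Binomial model (the prior and the survival data are hypothetical):

```python
# Minimal sketch of a Bayesian update: prior knowledge about a
# survival probability is modified by new data via a conjugate
# Beta-Binomial model. All numbers are hypothetical.
from scipy import stats

# Prior: earlier studies suggest survival near 0.6 (Beta(6, 4)).
a_prior, b_prior = 6, 4

# New data: 14 survivors out of 20 marked individuals.
survived, died = 14, 6

# Posterior = prior modified by the new data (conjugate update).
a_post, b_post = a_prior + survived, b_prior + died
posterior = stats.beta(a_post, b_post)

lo, hi = posterior.interval(0.95)
print(f"posterior mean = {posterior.mean():.2f}")
print(f"95% credible interval = ({lo:.2f}, {hi:.2f})")
```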
what is a unique feature of bayesian methods?
output of Bayesian analyses: probabilities of hypotheses (no other method does it)
describe output for frequentist, likelihood, and bayesian methods
frequentist stats say that if the experiment were repeated many times, 95% of the resulting 95% confidence intervals would contain the true parameter value
likelihood: relative support for each model given the data; likelihoods are proportional to probabilities
bayesian: probabilities of hypotheses (no other method does it)
what is a confidence interval?
range of values we are fairly sure our true value lies in
confidence, in statistics, is another way to describe probability. exemplify this.
if you repeatedly construct confidence intervals with a 95% confidence level, about 95 out of 100 of those intervals will contain the true value
what happens with the confidence interval if I am comparing a highly variable group with a uniform group?
the confidence interval will be narrower for the uniform group (more certainty)
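A minimal sketch illustrating this with simulated groups of equal size but different spread:

```python
# Minimal sketch: 95% confidence intervals for the mean of a highly
# variable group versus a uniform (low-variability) group.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
variable_group = rng.normal(50, 15, size=25)   # high spread
uniform_group = rng.normal(50, 2, size=25)     # low spread

for name, x in [("variable", variable_group), ("uniform", uniform_group)]:
    ci = stats.t.interval(0.95, df=len(x) - 1,
                          loc=x.mean(), scale=stats.sem(x))
    print(f"{name}: 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
# The uniform group yields a narrower interval (more certainty).
```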
which method answers these questions: what are the odds of alternative results? what are the expected values of alternative decisions? what is the frequency distribution of results?
these are questions which can only be answered by Bayesian methods
compare likelihood and bayesian methods in terms of parameters
likelihood: maximization of possible values across parameters; Bayesian: integration across parameters
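In symbols (a standard textbook formulation added here for clarity, with θ the parameters and y the data): the likelihood approach profiles across parameters by maximization, while the Bayesian approach integrates across parameters weighted by the prior.

```latex
\hat{L} = \max_{\theta} L(\theta \mid y)
\qquad \text{vs.} \qquad
p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta
```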
problem with Bayesian methods: when we don’t know much about parameters; what is the solution?
meta-analysis: summarizing pre-existing knowledge
what are meta-analysis types in terms of parameter estimation?
estimate the value of a single parameter based on several studies; or estimate several parameters whose values differ in each population (which can be visualized in a histogram)
what is a problem with meta-analyses in terms of how samples are taken?
we can’t guarantee that the samples were taken randomly (eg., more productive sites are usually chosen)
why are ecological data and conclusions different than other fields?
natural variability and complex interactions interfere with results, are of interest to us, and make data analysis a complex task
instead of having a single correct hypothesis, we usually end up with several partially correct hypotheses. Thus, we obtain a solid understanding of a phenomenon by looking at several studies over time instead of looking at a single one (eg., how trophic interactions shape communities)
why are new statistical methods more adequate for ecological studies?
ecological processes are characterized by several forces acting together; each portion of this complex system will have hypotheses that are more or less true. Thus, new statistics (likelihoods and Bayesian probabilities) are more aligned with ecology because they allow us to evaluate several hypotheses in terms of their relative importance to explain a given process
what can be the contribution of new statistics to the precautionary principle and for adaptive management of natural resources?
- Precautionary principle: a practice is forbidden unless you can prove the absence of effects; this is questionable: how large should an effect be to justify regulation? The new statistics can help quantify the probability of competing hypotheses and the size of the effects.
- Adaptive management of natural resources: Bayesian methods update parameter estimates as improved data arrive (e.g., after each harvest)
what was the original idea behind p-values?
could my results be produced by random chance?
what is a null hypothesis?
what we want to disprove
what is a p-value?
what are the chances of getting results at least as extreme as the ones actually observed? This is the p-value (the smaller the value, the stronger the evidence against the null hypothesis)
p-value of 0.01 means that there is a…
a 1% chance of observing results at least as extreme as these if the null hypothesis were true; it summarizes the data given one specific null hypothesis
does a p-value of 0.01 mean that there is only a 1% chance that the accepted hypothesis is wrong?
no, as it does not calculate the odds of an effect being true in the first place
describe a problem with p-value
it diverts attention from effect-size analysis
what is p-hacking?
dropping conditions or doing other manipulations so that significant p-value is achieved
what are the recommendations for analyzing data moving forward from traditional p-value based analyses?
- Confidence intervals (precision of the estimate) and effect sizes (magnitude of the effect)
- move to Bayesian methods (probabilities of outcome instead of potential frequency of outcome)
- try several methods and compare results
- full disclosure in the paper (how was sample size determined? was any data excluded? how were manipulations and measurements made?)
- use an open science framework database to publish results of a first round of experiments, then replicate studies and publish
complete the sentence for traditional hypothesis testing: what is the chance of observing the current effects (or more extreme ones)…
if the true effect is zero?
when we want to falsify a hypothesis, what analysis is appropriate?
p-values
what is a difficulty of likelihood techniques?
selecting the candidate set of models that will represent the hypotheses
what is the problem when analyzing with small populations in a frequentist approach?
a small decrease in population growth, for example, will likely be non-significant (low statistical power); this would be better analyzed as an exercise in parameter estimation only
when are frequentist approaches useful?
frequentist approaches are useful for carefully designed experiments which aim to falsify a given hypothesis
what are three distinct goals for statistical modeling in ecology?
data exploration, inference, and prediction
Populations, communities, and ecosystem-level properties fluctuate through time in response to internal and external drivers. Describe these drivers.
Internal drivers include intraspecific density dependence, demographic stochasticity, interspecific interactions, and food web dynamics. External drivers are typically related to environmental conditions, and weather is perhaps the most variable.
Why is quantifying the relative impacts of internal forcing versus weather on ecological dynamics a core goal of ecology with new relevance?
because we are now attempting to predict ecological responses to climate change
what is the goal of exploratory studies?
describe patterns in the data and generate hypotheses about nature
what is a type 1 error? how can it happen during exploratory analysis?
making false discoveries. it can happen when many potential covariates are combined indiscriminately (a few of them will be highly correlated with the response by chance)
what are the 3 good practices for exploratory analyses?
only consider plausible relationships
correct p-values for multiple testing to reduce the chance of false discoveries (see the sketch after this list)
communicate exploratory nature of analysis (proposed hypotheses will only be valid if confirmed by subsequent independent studies)
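A minimal sketch of the multiple-testing correction mentioned above, using the Benjamini-Hochberg false discovery rate procedure from statsmodels (the p-values are invented):

```python
# Minimal sketch: correcting p-values for multiple testing with the
# Benjamini-Hochberg false discovery rate procedure.
import numpy as np
from statsmodels.stats.multitest import multipletests

p_values = np.array([0.001, 0.008, 0.020, 0.041, 0.340, 0.720])
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05,
                                         method="fdr_bh")
for p, p_adj, r in zip(p_values, p_adjusted, reject):
    print(f"raw p = {p:.3f} -> adjusted p = {p_adj:.3f}, discovery: {r}")
```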
what is inference analysis?
evaluate the strength of evidence in a data set for some statement about nature
what is the main difference between exploratory and inferential analysis?
exploratory: hypotheses are generated
inference: tests hypotheses based on a priori knowledge of ecological systems; eg. null hypothesis significance testing
we need independent datasets for both of these techniques
are conclusions obtained via statistical inference automatically accepted as scientific fact?
no, as the reliability of statements is tied to a particular set of data obtained by particular methods under particular conditions; thus, it requires replication and validation across a range of conditions before they are accepted as scientific facts
what happens when we are comparing too many models in inferential analyses?
the true goal of the analysis may then be exploratory or predictive; inferential analyses carry a small risk of false discoveries only when a small set of models (2 or 3) is compared
what is distinguishing feature of modeling for prediction?
the need to test predictive models “out of sample”: using independent data that were not used to fit the model. Exploratory modeling can help identify important predictors even if the causal mechanisms are currently not understood, but without validation on independent data, the predictive skill of the model is unknown.
what is the difference between predictive and inferential analyses when looking at a linear model?
predictive analyses focus on the response variable (which model will best predict y values for new observations of the covariates in x?)
inferential analyses focus on estimates of the regression coefficients (which ones are positive, negative, non-zero? which covariates are more important?)
what are the 2 main components of the AIC (Akaike’s Information Criterion) formula?
the number of parameters (p)
the maximized likelihood of the model (its likelihood evaluated at the maximum likelihood parameter estimates)
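In standard notation (not spelled out on the card), with p the number of parameters and L̂ the maximized likelihood:

```latex
\mathrm{AIC} = 2p - 2\ln \hat{L}
```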
what are the pitfalls of AIC analyses?
Proponents of AIC argue that this approach avoids arbitrary P-value cut-offs (Burnham and Anderson 2002), but in practice researchers have relied on equally arbitrary and less interpretable cut-offs for the difference in AIC values required to conclude that one model is more supported by the data than another.
what is regularization (eg., 2p in AIC equations)? what would happen if it was not there?
2p = regulates model complexity by penalizing models with too many parameters
without this penalty, the models with the most parameters (the most complex models) would always be chosen as the best ones
define regularization.
regularization involves selecting a model or estimating parameter values based on a weighted average of a goodness-of-fit measure and a complexity penalty.
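A minimal sketch of this idea using the lasso, whose penalty shrinks unhelpful coefficients to exactly zero (the data are simulated; only the first two of ten covariates truly matter):

```python
# Minimal sketch of regularization: the lasso selects a model by
# minimizing a weighted sum of a goodness-of-fit term and a
# complexity penalty on the coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 10))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

# alpha weights the penalty: larger alpha -> simpler model,
# more coefficients shrunk exactly to zero.
model = Lasso(alpha=0.1).fit(X, y)
print("estimated coefficients:", np.round(model.coef_, 2))
```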
what is model validation? why is it important?
comparing model predictions with observations that were not used to “train,” or fit, the model. Out-of-sample validation is important because it is easy for a model to reproduce patterns in the training data, but much harder to accurately predict the outcome in another situation, which is what we expect predictive models to do.
what is the process of out-of-sample validation like?
- randomly split the available data into training and validation sets
- fit the candidate models using the training data
- use each model to make predictions for the validation data
- quantify model errors using a summary measure such as root mean squared error
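A minimal sketch of these steps with a placeholder linear model on simulated data:

```python
# Minimal sketch of out-of-sample validation: random train/validation
# split, fit on training data, score predictions with RMSE.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, 0.0, -2.0]) + rng.normal(scale=1.0, size=200)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_val)
rmse = np.sqrt(np.mean((y_val - pred) ** 2))
print(f"validation RMSE = {rmse:.2f}")
```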
what can we do if our dataset is too small to do an out-of-sample validation technique?
when data sets are too small for out-of-sample validation, we can use cross-validation (repeatedly splitting the data into folds so that every observation is used for both fitting and validation)
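A minimal sketch of k-fold cross-validation on a small simulated data set:

```python
# Minimal sketch of k-fold cross-validation, useful when the data set
# is too small to hold out a separate validation set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(6)
X = rng.normal(size=(40, 3))                 # small data set
y = X @ np.array([1.5, 0.0, -2.0]) + rng.normal(scale=1.0, size=40)

scores = cross_val_score(LinearRegression(), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0),
                         scoring="neg_root_mean_squared_error")
print(f"mean CV RMSE = {-scores.mean():.2f}")
```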
how can we screen for potentially important covariates when doing an exploratory analysis?
we can calculate the correlation coefficient between population growth rate and each covariate and establish an arbitrary cut-off value (eg., 0.3)
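A minimal sketch of this screening step; the covariate names and the 0.3 cut-off are illustrative:

```python
# Minimal sketch of covariate screening: correlate population growth
# rate with each candidate covariate and keep those exceeding an
# arbitrary cut-off (|r| > 0.3). All data are simulated.
import numpy as np

rng = np.random.default_rng(7)
growth_rate = rng.normal(size=50)
covariates = {f"cov_{i}": rng.normal(size=50) for i in range(8)}
covariates["rainfall"] = growth_rate * 0.8 + rng.normal(scale=0.5, size=50)

kept = []
for name, x in covariates.items():
    r = np.corrcoef(growth_rate, x)[0, 1]
    if abs(r) > 0.3:
        kept.append((name, round(r, 2)))
print("covariates passing screen:", kept)
```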
describe the steps for exploratory analysis
- define the research question (eg., which covariates are associated with population growth rates)
- screen many covariates for potential associations (eg., correlation coefficients with arbitrary threshold)
- assess statistical evidence more rigorously (e.g., fit linear mixed models including the covariates with high correlation coefficients, followed by likelihood comparisons of the full model against reduced versions to exclude covariates that do not change the full-model results)
- fit final model and check for inconsistencies (eg., estimates are positive in Nov but negative in Dec?)
- correct for multiple comparisons (control the expected proportion of false significant results)
- consider alternative approaches (sliding climate windows to average weather data from a group of periods instead of analyzing periods individually)
describe the steps for inference.
- formulate competing hypotheses: a priori hypotheses based on the literature or deduced from theory
- translate the hypotheses into alternative models (e.g., for null-hypothesis significance testing)
describe the steps for prediction.
- define the predictive goal: eg., objective is to predict butterfly population size (on log scale) at time t given data on log population size in year t − 1 and information about weather during the period in between the observations
- choose model selection approach (eg., regularization or AIC?)
- choose a model validation approach (e.g., split the data set into training and validation subsets)
does an understanding of the process translate into improved predictive skill?
- not necessarily: using covariates from an inference study can lead to a loss of predictive accuracy, for example
give an example to explain why a given predictive model testing effects of climate on population growth may be weak
a given population may be responding to many weak climate signals instead of some stronger ones
describe classical errors for each exploratory, inference, and predictive studies.
exploratory: overfitting and making false discoveries
inference: misrepresenting exploratory analyses as inference studies
predictive: not testing predictive power of models in new, independent data
while in traditional statistics we are aiming at establishing significant differences among means, what kind of comparisons can we make using new methods (maximum likelihood and bayesian)?
because p-values in traditional analyses do not estimate evidence, the new techniques are more appropriate for quantifying the relative support in the data for competing hypotheses about the effects of each treatment