Stats 5 - Writing Statistics Flashcards
What are descriptive statistics and what should you report?
Descriptive Statistics - Important characteristics:
- Define sample size (n)
- Provide measures of central tendency and spread (depending on the data type – normal, Poisson or binomial)
Descriptive statistics are reported at the start of the results
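Example (a minimal sketch, assuming a small hypothetical sample of masses; the values and variable names are illustrative only). It shows the pieces listed above: sample size, a measure of central tendency and a measure of spread, with the median and quartiles alongside the mean and SD for data that may not be normal.

```python
import statistics

# Hypothetical sample of body masses in grams (illustrative values only).
masses = [12.1, 15.3, 14.8, 13.9, 40.2, 14.1, 13.5, 15.0]

n = len(masses)                                  # sample size
mean = statistics.mean(masses)                   # central tendency (normal data)
sd = statistics.stdev(masses)                    # sample SD, n - 1 denominator
median = statistics.median(masses)               # central tendency (non-normal data)
q1, _, q3 = statistics.quantiles(masses, n=4)    # lower and upper quartiles

summary = (
    f"n = {n}, mean = {mean:.2f} g (SD = {sd:.2f}), "
    f"median = {median:.2f} g (IQR = {q1:.2f} to {q3:.2f})"
)
print(summary)
```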
When reporting a one-sample or two-sample t-test, what should you report?
Reminder
- One-sample t-test compares your sample mean to a global mean
- Two-sample t-test compares the means of two samples
Reporting
- What test you performed? Why you performed it?
- Report results –> Statistically different or not? –> Include:
- T-Value
- df
- p-value
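Example (a sketch under stated assumptions, not a definitive implementation): the t-value and df for both tests can be computed by hand as below. The two-sample version assumes equal variances (the pooled, Student's form); the function names and data are hypothetical. The p-value needs the t distribution, so it is left to a statistics package here.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return (t, df) for a one-sample t-test against the reference mean mu."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu) / se, n - 1

def two_sample_t(a, b):
    """Return (t, df) for a pooled-variance two-sample t-test (equal variances assumed)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

# Hypothetical log masses for two groups (illustrative values only).
temperate = [2.1, 2.5, 2.3, 2.8, 2.6]
tropical = [1.9, 2.0, 2.2, 1.8, 2.1]

t1, df1 = one_sample_t(temperate, mu=2.0)
t2, df2 = two_sample_t(temperate, tropical)
print(f"One-sample: t = {t1:.2f}, df = {df1}")
print(f"Two-sample: t = {t2:.2f}, df = {df2}")
```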
When reporting a simple F-test to compare variances what should you report?
Reminder:
- Simple F-test compares the variances of two samples
- Useful way to investigate the homogeneity of variance assumption for two-sample t-tests, as homogeneous variance is an assumption of the standard (pooled-variance) two-sample t-test.
Reporting
- What test you performed and why?
- Report the results –> variation is statistically different or not? –> Include F-Value, df and p-value
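Example (a minimal sketch; the function name and data are hypothetical): the F-value is the ratio of the two sample variances, and each sample contributes its own df (n - 1). The p-value needs the F distribution and is not computed here.

```python
import statistics

def variance_ratio(a, b):
    """Return (F, df_numerator, df_denominator) for the ratio var(a) / var(b)."""
    f_value = statistics.variance(a) / statistics.variance(b)
    return f_value, len(a) - 1, len(b) - 1

# Hypothetical samples (illustrative values only).
group_a = [2.1, 2.5, 2.3, 2.8, 2.6]
group_b = [1.9, 2.0, 2.2, 1.8, 2.1, 2.4]

f_value, df_num, df_den = variance_ratio(group_a, group_b)
print(f"F = {f_value:.2f}, df = {df_num}, {df_den}")
```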
When reporting a Wilcoxon test, what should you include?
Wilcoxon Test - Remember:
- A Wilcoxon test compares either two sample medians or a sample median to a global median (a non-parametric alternative to the t-test)
- The comparison is great for non-normally distributed data, such as count or proportion data.
Reporting
- What test you performed and why?
- Result of the test –> Medians are statistically significantly different or not? –> Include W-value and P-value
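Example (a sketch of one common definition of W for the two-sample case, assumed here rather than taken from the cards): pool the two samples, rank them with ties sharing their average rank, sum the ranks of the first sample and subtract n(n + 1)/2. The function name and data are hypothetical, and the p-value is not computed.

```python
def rank_sum_w(x, y):
    """Return W for a two-sample Wilcoxon rank-sum test.

    W is the sum of the ranks of x in the pooled data minus n_x(n_x + 1)/2,
    with tied values given the average of their ranks.
    """
    pooled = sorted(list(x) + list(y))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        # Advance j past every value tied with pooled[i].
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i + 1 .. j; store their mean.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum = sum(rank_of[v] for v in x)
    return rank_sum - len(x) * (len(x) + 1) / 2

# Hypothetical count data for two groups (illustrative values only).
site_a = [3, 5, 8, 12, 4]
site_b = [1, 2, 2, 6, 3]
print(f"W = {rank_sum_w(site_a, site_b)}")
```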
When reporting a One-way analysis of variance, what should you report?
One-way Analysis of Variance - Remember:
- An analysis of variance compares between-group variance to within-group variance to discern whether the categorical grouping factor – here “Climate” – has an effect on the response variable (log Mass in grams).
Basically, used to decide whether a term explains a significant amount of observed variation
Reporting
- What test you performed and why?
- Report results - did your term explain a significant amount of the variation –> Report the F value, Df for Sum sq. + residuals, and P-value
Note –> you can follow up with a Tukey test to investigate the intricacies
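Example (a minimal sketch; the function name, group labels and values are hypothetical): the F value is the between-group mean square divided by the within-group (residual) mean square, reported with both df values. The p-value needs the F distribution and is not computed here.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_residual) for a dict of group name -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = statistics.mean(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(values) * (statistics.mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-group (residual) sum of squares: values around their group mean.
    ss_within = sum(
        (v - statistics.mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_residual = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_residual)
    return f_value, df_between, df_residual

# Hypothetical log masses grouped by Climate (illustrative values only).
log_mass_by_climate = {
    "Cold": [2.9, 3.1, 3.4, 3.0],
    "Temperate": [2.4, 2.6, 2.8, 2.5],
    "Tropical": [2.0, 2.2, 2.1, 2.3],
}
f_value, df_b, df_r = one_way_anova(log_mass_by_climate)
print(f"F = {f_value:.2f}, df = {df_b}, {df_r}")
```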
When reporting a correlation test, what should you include?
Correlation test - Reminder:
- A correlation looks at the association/correlation of two variables. It can NOT be used to infer causation –> we don’t know which one is the independent or dependent variable, we just know they are associated
- Pearson’s correlation test is used for two normally distributed variables and Spearman’s for two non-normal variables.
Reporting
- What test you performed and why?
- Report result –> what correlation/association was found? –> Include correlation coefficient, T-value (Pearson) or S-value (Spearman), df and p-value
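Example (a sketch for the Pearson case only; the function name and data are hypothetical): the correlation coefficient r, the T-value derived from it, and df = n - 2. It assumes r is not exactly 1 or -1, and the p-value is left to a statistics package.

```python
import math
import statistics

def pearson_r_and_t(x, y):
    """Return (r, t, df) for Pearson's correlation between x and y."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))   # undefined when r is exactly +/-1
    return r, t, df

# Hypothetical paired measurements (illustrative values only).
body_length = [10.2, 11.5, 12.1, 13.8, 15.0, 15.9]
wing_span = [20.1, 22.0, 21.5, 25.2, 27.9, 27.0]

r, t, df = pearson_r_and_t(body_length, wing_span)
print(f"r = {r:.2f}, t = {t:.2f}, df = {df}")
```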
When reporting a simple linear regression, what should you include?
Simple Linear Regression - Reminder:
- A linear regression examines the effect of a continuous explanatory variable on a continuous response –> Continuous Response ~ continuous explanatory variable
- Different to correlation tests, because a simple linear regression assumes a direction of effect: we assign independent and dependent variables and fit a straight line with an intercept and slope. A significant fit on its own does not prove causality.
Reporting
- What test you performed and why?
- Did you find a significant regression? –> include F-Value, Df, p-value and adjusted R2
- Propose an equation using the coefficients, given that the regression is significant
Note -> You may also consider including the coefficient table –> more formal
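Example (a minimal least-squares sketch; the function name, dictionary keys and data are hypothetical): it returns the pieces listed above, namely the intercept and slope for the proposed equation, the F-Value with its df, and the adjusted R2. The p-value is not computed here.

```python
import statistics

def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x with summary statistics."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_total = sum((b - my) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    df_model, df_resid = 1, n - 2
    r2 = 1 - ss_resid / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid
    f_value = ((ss_total - ss_resid) / df_model) / (ss_resid / df_resid)
    return {
        "intercept": intercept, "slope": slope, "F": f_value,
        "df": (df_model, df_resid), "r2": r2, "adj_r2": adj_r2,
    }

# Hypothetical explanatory and response values (illustrative only).
log_mass = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
response = [0.9, 1.4, 1.6, 2.3, 2.4, 3.1]

fit = simple_linear_regression(log_mass, response)
print(f"response = {fit['intercept']:.2f} + {fit['slope']:.2f} * log_mass")
print(f"F = {fit['F']:.2f}, df = {fit['df']}, adjusted R2 = {fit['adj_r2']:.2f}")
```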
When reporting a two-way analysis of variance, what should you include?
Reminder:
- A two-way analysis of variance compares the between-group variance and within-group variance for main effects (a single grouping factor) and interaction effects (two or more interacting grouping factors)
Basically…
Looking at multiple categorical groups (factors) and their interaction –> how much variation is explained by each term
Reporting
- What test was performed + justification
- Report results from the Two-way analysis of variance test –> Include F-Value, df and P-value.
Note - You can follow up with a Tukey HSD test to perform pairwise comparisons
When reporting an analysis of covariance, what should you include?
Reminder:
- An analysis of covariance compares the between-group variance and the within-group variance for a grouping factor of interest whilst controlling for a continuous covariate. Your primary interest is in the grouping variable, not the continuous covariate
- Note – You can apply an ANCOVA for additive models and interactive models
Reporting
- What test was performed + justification
- Report results from the ANCOVA –> F-Value, df and P-value –> Even though you are controlling for the continuous variable you can also refer to it.
- Follow up post-hoc test –> looking at the intricacies between the climatic variables
When reporting a Multiple Linear regression, what should you include?
Reminder:
- A multiple linear regression examines the effect of more than one variable on a continuous response variable –> effect of multiple explanatory variables on continuous response variable
- Can include categorical and continuous variables –> But emphasis is placed on continuous
Reporting
- What test was performed + justification
- Report results from the multiple linear regression
a) Did you obtain a significant regression equation? –> Report F-value, df, P-value and adjusted R2
b) If you did obtain a significant regression –> Outline the different regression equations - Interpretation/takeaway message –> Which regression equation showed the most effect on the response?
Example - The effect of mammal mass is therefore weaker in tropical than temperate climates
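Example (a sketch with entirely hypothetical coefficients, assuming a model with one continuous variable, one two-level factor and their interaction, with "temperate" as the reference level): the separate regression equations come from adding the group offsets to the reference intercept and slope. The coefficient names are illustrative, not output from any particular package.

```python
# Hypothetical coefficient table (illustrative values only).
coefficients = {
    "intercept": 1.20,                     # intercept for the reference group
    "log_mass": 0.75,                      # slope for the reference group
    "climate_tropical": 0.30,              # change in intercept for tropical
    "log_mass:climate_tropical": -0.25,    # change in slope for tropical
}

def group_equation(coefs, is_reference):
    """Return (intercept, slope) of the regression line for one group."""
    intercept = coefs["intercept"]
    slope = coefs["log_mass"]
    if not is_reference:
        intercept += coefs["climate_tropical"]
        slope += coefs["log_mass:climate_tropical"]
    return intercept, slope

temperate_line = group_equation(coefficients, is_reference=True)
tropical_line = group_equation(coefficients, is_reference=False)

for name, (b0, b1) in [("temperate", temperate_line), ("tropical", tropical_line)]:
    print(f"{name}: response = {b0:.2f} + {b1:.2f} * log_mass")
```

With these made-up numbers the tropical slope is smaller than the temperate slope, which is the kind of comparison the takeaway message should state in words.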
When reporting a Model selection/simplification process, what should you include?
Reminder
Model selection starts with a maximal model and produces a minimal adequate model.
Reporting
- Outline the simplification procedure –> stepwise backwards model selection
- Outline maximal model
- Report minimal adequate model obtained (F-value for model as support + R2) –> including regression equations
- Interpretation of the linear equations obtained from the minimal adequate model –> putting the numbers in biological context
In statistical reporting, what does spin refer to?
Spin
Reported results differ from statistical results –> distorted interpretation of non-significant results
What are some examples of poor statistical practices that can still be found in the literature?
- Selectively reporting outcomes
- Reporting statistical analyses that are NOT specified/different to previously specified analysis plans
- Graphs that can NOT be interpreted unambiguously (they should be interpretable in only one way) –> Graphs that are NOT clear
- Reported results differ from statistical results –> distorted interpretation of non-significant results (spin) –> interpreting p-value between 0.05 and 0.10 as significant
- Summarizing data variability using the standard error of the mean (SEM) or the standard deviation (SD).
- Not reporting exact P-Values for primary analyses + post-hoc tests
- Not plotting raw data to show variability
Why is using Standard error of mean (SEM) and Standard deviation (SD) a problem in papers?
Summarizing the data as mean and SE or SD often causes readers to wrongly infer that the data are normally distributed with no outliers.
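Example (a minimal sketch with hypothetical count data containing one outlier): the mean with SD or SEM hides the skew, while the median and range make it visible.

```python
import math
import statistics

# Hypothetical right-skewed counts with a single outlier (illustrative only).
counts = [1, 1, 2, 2, 2, 3, 3, 4, 5, 30]

mean = statistics.mean(counts)
sd = statistics.stdev(counts)
sem = sd / math.sqrt(len(counts))
median = statistics.median(counts)

print(f"mean, SD:  {mean:.1f}, {sd:.1f}")
print(f"mean, SEM: {mean:.1f}, {sem:.1f}")
print(f"median: {median}, range: {min(counts)} to {max(counts)}")

# mean - SD falls below zero here, although a count can never be negative:
# a sign that a symmetric, normal-looking summary does not fit these data.
print(f"mean - SD = {mean - sd:.1f}")
```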