Statistics - Summarising and Presenting Data Flashcards
Why is summarising data important in research?
- Clarity and understanding
- Hypothesis testing
- Efficient communication
- Resource management
- Meta analyses and generalisation
- Data integrity and reproducibility
- Development of theories
What is the role of statistics in data analysis?
Data are the raw material of knowledge
Statistics provides us with techniques for:
- summarising and presenting the information contained in a data set
- handling and quantifying variation in the data, to help us infer what they tell us about the underlying theory of interest
What are the types of data?
- Categorical
- Quantitative
- Interval data
- Ratio data
What are the types of categorical data?
Nominal:
Categories with no specific order
(blood types)
Ordinal:
Categories with a meaningful order but no consistent differences between categories
(stages of cancer)
What are the types of quantitative data?
Discrete:
Countable values, often integers
(Hospital visits)
Continuous:
Data that can take any value within a range
(Blood pressure measurements)
What is interval data?
It is numerical data where the intervals between values are equal and meaningful. However, it lacks a true zero point, so ratios between values are not meaningful.
Examples:
Temperature in °C or °F
Dates in a calendar
What is ratio data?
It is numerical data with equal intervals between values and a true 0 point, allowing for calculation of ratios
Examples:
- height and weight
- duration (e.g. time taken to complete a task)
What are the ways to summarise data?
- Measures of central tendency (mean, median, mode)
- Measures of spread (range and variance)
- Measures of shape
- Graphical summaries
- Summary tables
- Correlation and association
- Regression models
- Longitudinal data analysis
- Survival analysis
- Multivariate analysis
- Bayesian methods
- Advanced visualisation techniques
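As a minimal sketch of the first two items in the list above (central tendency and spread), the measures can be computed with Python's standard `statistics` module. The function name `summarise` and the hospital visit counts are made up for illustration, and the variance shown is the sample variance (dividing by n - 1), which is an assumption about which version is wanted.

```python
import statistics


def summarise(values):
    """Return basic measures of central tendency and spread."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        # Sample variance (divides by n - 1); use pvariance for a population.
        "variance": statistics.variance(values),
    }


# Hypothetical data: number of hospital visits for six patients.
visits = [1, 2, 2, 3, 4, 6]
summary = summarise(visits)
print(summary)
```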
How do you identify outliers in data?
Assess whether values fall within a set of bounds: the inner and outer fences
Outside the inner fences but within the outer fences - minor outlier
Outside the outer fences - major outlier
Inner fences: Q1 - 1.5 x IQR and Q3 + 1.5 x IQR, where IQR is the interquartile range (Q3 - Q1)
Outer fences: Q1 - 3 x IQR and Q3 + 3 x IQR
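The fence rule can be sketched in Python as below. The function name `classify_outliers` and the data are hypothetical. Quartile conventions differ between textbooks and software, so the choice of `method="inclusive"` in `statistics.quantiles` is an assumption, and other conventions may place the fences slightly differently.

```python
import statistics


def classify_outliers(values):
    """Split outliers into minor and major using inner and outer fences."""
    # Quartiles; the "inclusive" method is one convention among several.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr

    # Major outliers lie beyond the outer fences.
    major = [v for v in values if v < outer_low or v > outer_high]
    # Minor outliers lie beyond the inner fences but within the outer fences.
    minor = [
        v for v in values
        if (v < inner_low or v > inner_high) and outer_low <= v <= outer_high
    ]
    return minor, major


# Hypothetical measurements: Q1 = 13, Q3 = 18, so IQR = 5.
data = [10, 12, 13, 14, 15, 16, 18, 27, 60]
minor, major = classify_outliers(data)
print(minor, major)
```

With these values the inner fences are 5.5 and 25.5 and the outer fences are -2 and 33, so 27 is a minor outlier and 60 is a major outlier.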
What is transformation?
It is sometimes beneficial to transform data to a different scale to aid interpretation and/or statistical analysis
Reasons to transform:
- improved approximation to normality
- reducing skewness
- linearising the relationship between two variables
- making multiplicative relationships additive
Example:
A log transform stretches the scale at the lower end and compresses it at the upper end
Logs can only be taken of positive data
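A small Python sketch of a log transform, assuming base 10 (the base is a free choice, and natural logs are equally common). The function name `log_transform` and the values are hypothetical. It also enforces the restriction to positive data.

```python
import math


def log_transform(values):
    """Return the base-10 logarithm of each value."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive data")
    return [math.log10(v) for v in values]


# Hypothetical skewed data: each value is 10 times the previous one.
skewed = [1, 10, 100, 1000]
logged = log_transform(skewed)
print(logged)

# Differences between neighbouring values on the log scale.
steps = [b - a for a, b in zip(logged, logged[1:])]
print(steps)
```

On the original scale the gaps grow (9, 90, 900), while on the log scale each step is the same size. This is the sense in which a multiplicative relationship becomes additive and the upper end of the scale is compressed.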
Important points for displaying data in a spreadsheet
- check twice that coding is correct
- check for incorrectly entered numbers or information types
- check relevant research data matches your findings
- identify and develop methods for how you handle missing values
If all of this is checked, the data are ready for analysis