RESS I: Data Analysis #1 Flashcards
What are the 5 As of practicing EBM?
Ask, Acquire, Appraise, Apply and Assess
What is a variable?
A particular characteristic being studied.
What is a dataset?
A collection of variables and observations.
What is categorical data?
Data that can only be assigned to one of a number of distinct categories e.g. sex, blood type.
What is numerical data?
Data that can take a numerical value e.g. age, weight.
How can categorical data be subdivided?
Nominal: No natural ordering e.g. sex or blood type
Ordinal: Data can be ordered e.g. severity or disease stage.
How can numerical data be subdivided?
Continuous: Data can take any value e.g. weight.
Discrete: Whole values only e.g. number of hospital visits.
What type of data is:
- Weight
- Sex
- Number of children
- Symptoms
- Disease Stage
- Weight
- BMI
- Pain (measured as ‘absent’, ‘mild’ or ‘severe’)
- Numerical continuous
- Categorical nominal
- Numerical discrete
- Categorical nominal
- Categorical ordinal
- Numerical continuous
- Numerical continuous
- Categorical ordinal
What is quantitative data?
Numerical data. It is measurable data.
What is qualitative data?
Non-numerical data. It is descriptive rather than measured.
How do you graphically present categorical data?
Pie chart, Bar chart, Frequency distribution table
How do you graphically represent numerical data?
Histogram, Box and Whisker Plot
What are scatterplots used for?
To display the relationship between two numerical (continuous) variables.
What does positively skewed data look like on a histogram?
The bulk of the data sits on the left, and the longer tail stretches out to the right. The thinner ends of the distribution are called tails.
If one tail stretches out farther than the other, the histogram is skewed.
What does negatively skewed data look like on a histogram?
The bulk of the data sits on the right, and the longer tail stretches out to the left. The thinner ends of the distribution are called tails.
If one tail stretches out farther than the other, the histogram is skewed.
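A rough way to check the direction of skew without drawing the histogram is to compare the mean with the median. This is only a sketch based on the common rule of thumb that the mean is pulled toward the longer tail; the rule can fail for unusual datasets. The function name and the sample values are made up for illustration.

```python
from statistics import mean, median

def skew_direction(values):
    """Guess the skew direction by comparing mean and median (rule of thumb only)."""
    m = mean(values)
    md = median(values)
    if m > md:
        return "positive"   # longer tail on the right
    if m < md:
        return "negative"   # longer tail on the left
    return "roughly symmetric"

# Made-up samples for illustration.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
left_tailed = [1, 17, 18, 18, 19, 19, 19, 20]
symmetric = [1, 2, 3, 4, 5]

print(skew_direction(right_tailed))
print(skew_direction(left_tailed))
print(skew_direction(symmetric))
```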
What is the normal distribution?
A bell-shaped distribution that is symmetrical.
What is the explanatory variable?
The independent variable
What is the outcome variable?
The dependent variable
What type of descriptive statistics do you use on categorical data?
Frequency, Proportion and Percentages
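The three summaries for categorical data can be sketched in a few lines of Python. The blood-type sample and the function name are invented for illustration and are not part of the course material.

```python
from collections import Counter

# Made-up sample: blood types of 8 patients.
blood_types = ["A", "O", "O", "B", "A", "O", "AB", "O"]

def summarise_categorical(values):
    """Return {category: (frequency, proportion, percentage)}."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, count / total, 100 * count / total)
        for category, count in counts.items()
    }

summary = summarise_categorical(blood_types)
for category, (freq, prop, pct) in summary.items():
    print(f"{category}: frequency={freq}, proportion={prop:.3f}, percentage={pct:.1f}%")
```

Frequency is the raw count, proportion is the count divided by the total, and percentage is the proportion multiplied by 100.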
What type of descriptive statistics do you use on numerical data?
Mean, Median, Mode, Range (and IQR), Standard Deviation
What is the mean?
The average value. This is calculated by adding up all the values and dividing the sum by the total number of values.
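As a minimal sketch of that calculation (the ages are made-up example data):

```python
def mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)

# Made-up sample: ages of 5 patients.
ages = [23, 35, 41, 29, 52]
print(mean(ages))  # 180 / 5 = 36.0
```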
What is the median?
The mid-point of the measurement values.
Defined as the value above and below which half (50%) of the measurements lie.
To calculate the median:
- Sort observations in numerical order
- Find the mid-point
- If two values lie at the mid-point, average them
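The three steps above can be sketched as follows. The function name and the sample lists are illustrative only.

```python
def median(values):
    """Median by sorting, finding the mid-point, and averaging if needed."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)          # step 1: sort in numerical order
    n = len(ordered)
    mid = n // 2                      # step 2: find the mid-point
    if n % 2 == 1:
        return ordered[mid]           # odd count: a single middle value
    # step 3: even count, so two values lie at the mid-point; average them
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))      # sorted: 1, 5, 7 -> 5
print(median([4, 1, 3, 2]))   # sorted: 1, 2, 3, 4 -> (2 + 3) / 2 = 2.5
```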
What is the mode?
The most common value.
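A short sketch of finding the mode. Because two or more values can tie for most common, this version returns a list of every value that shares the highest count; that choice, the function name and the sample data are assumptions made for the example.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency, sorted."""
    if not values:
        raise ValueError("the mode of an empty list is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([1, 2, 2, 3]))      # [2]
print(modes([1, 1, 2, 2, 3]))   # [1, 2] (two values tie)
```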
What is the range?
The difference between the highest and lowest data values. This indicates the extremes within which all measurements lie.
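As a minimal sketch (the function name and the sample values are made up):

```python
def value_range(values):
    """Difference between the highest and lowest values."""
    if not values:
        raise ValueError("the range of an empty list is undefined")
    return max(values) - min(values)

# Made-up sample: number of hospital visits for 5 patients.
visits = [3, 9, 4, 15, 7]
print(value_range(visits))  # 15 - 3 = 12
```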