Biostatistics Flashcards
What is statistics (using information and data)
Statistics is a body of techniques and tools used in the collection, organization, analysis, interpretation and presentation of information that can be stated numerically
It is the collection, presentation, analysis and interpretation of numerical data.
Explain the two types of statistics
Descriptive:
Describes the group that was actually measured. Summarizes measurements.
Involves: frequencies, proportions, measures of central tendency, measures of dispersion/variation.
e.g. weight of final year medical students
Inferential:
Uses data from a sample to represent the population which the sample came from
e.g. weight of final year medical students used to represent the weight of medical students as a whole
Describe types of variables
Quantitative/Numerical:
Are just numbers, whether whole numbers/integers (discrete) or fractions (continuous). Anything that can be counted or measured.
Qualitative/Categorical
Describes data that fits into categories.
3 types: Binary, nominal and ordinal
Meaning of observations
Any subject that serves as the data source, e.g. people, schools
Meaning of variables
The characteristic that can be measured, e.g. blood pressure
Meaning of values
The actual result obtained from measuring a variable, e.g. 130/75 mmHg
How many people are at least 7 years old? Is this a qualitative or quantitative variable?
Quantitative because the people can be counted. Specifically discrete quantitative.
What are measures of central tendency/location
They are tools used to summarize entire quantitative datasets into the most likely value (basically like the average)
What are the 3 measures of central tendency
Mean
Median
Mode
Mean
Simply the arithmetic average
m = ∑x/n
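A minimal Python sketch of the formula above. The weights are made-up values for illustration only.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)


# Hypothetical sample of weights in kg (not real data)
weights_kg = [60, 65, 70, 75, 80]
print(mean(weights_kg))  # 70.0
```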
Mean for grouped data
Mean = ∑fx /n
Where f = frequency of each group or class
x = midpoint of the group (used as its representative value)
n = total number of observations (∑f)
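A short Python sketch of the grouped mean, assuming each group is represented by a single value x (its midpoint here) and that n is the sum of the frequencies. The groups and counts are invented for illustration.

```python
def grouped_mean(group_values, frequencies):
    """Mean for grouped data: sum of f*x divided by n, where n = sum of f."""
    n = sum(frequencies)
    total = sum(f * x for f, x in zip(frequencies, group_values))
    return total / n


# Hypothetical weight classes 50-60, 60-70, 70-80 kg, represented by their midpoints
midpoints = [55, 65, 75]
frequencies = [2, 5, 3]
print(grouped_mean(midpoints, frequencies))  # 66.0
```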
Median
The middle value of a series of data arranged in order
Position of the median = (n + 1)/2
Best for skewed data
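A small Python sketch of finding the median. It assumes the common convention that, when n is even, the median is the average of the two middle values (the (n + 1)/2 position then falls between them). The data are made up.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # position (n + 1)/2, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 7, 5, 9]))  # 5
print(median([3, 1, 7, 5]))     # 4.0
```

Note how one extreme value barely moves the median: median([3, 1, 7, 5, 900]) is still 5, which is why it suits skewed data.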
Mode
Most frequently occurring observation in a series
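A brief Python sketch of finding the mode. It returns a list because a series can have more than one most frequent value; that design choice is an assumption of this example, and the data are invented.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 3, 3, 5, 7, 3, 5]))  # [3]
print(modes([1, 1, 2, 2, 3]))        # [1, 2]
```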
4 common measures of variation/dispersion
Range
Interquartile range
Variance
Standard deviation
Range
This is the difference between the largest and smallest values.
For grouped data, it is the difference between the mid-points of the extreme categories
Interquartile range
This indicates the spread of the middle 50% of the data
IQR = upper quartile - lower quartile
Upper quartile and lower quartile
Find the median of the entire number series.
Find the median of the lower and upper halves, these two numbers are the lower and upper quartile respectively
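The median-of-halves steps can be sketched in Python as below. One assumption is labelled in the code: when n is odd, the overall median is left out of both halves. Other conventions exist and give slightly different quartiles. The ages are made up.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Lower and upper quartiles as the medians of the lower and upper halves.

    Assumption: for odd n the overall median is excluded from both halves.
    Needs at least 2 values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower_half), median(upper_half)


ages = [1, 3, 5, 7, 9, 11, 13]  # hypothetical data
q1, q3 = quartiles(ages)
iqr = q3 - q1
print(q1, q3, iqr)  # 3 11 8
```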
Variance
Variance = ∑(x - m)² / (n - 1)
Where x = each data point/value
m = mean
n = size of the sample
Standard deviation
aka root mean square deviation
It is the square root of the variance
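A minimal Python sketch of the sample variance (with the n - 1 divisor) and the standard deviation as its square root. The function names and the data are illustrative only.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


data = [1, 2, 3, 4, 5]  # made-up values; mean is 3
print(sample_variance(data))      # 2.5
print(round(sample_sd(data), 3))  # 1.581
```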
Which method of dispersion to use for skewed data
range and IQR
Formula for obtaining the standard deviation of grouped data
Square root of [(∑fx² - (∑fx)²/n) / (n - 1)]
Where f = frequency of each group, x = midpoint of the group, n = ∑f
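A hedged Python sketch of the grouped-data calculation. It assumes each group is represented by its midpoint x and weighted by its frequency f, so the sums are taken as ∑fx and ∑fx², with n = ∑f. The classes and counts are invented.

```python
import math


def grouped_sd(midpoints, frequencies):
    """Standard deviation for grouped data using the shortcut formula.

    Assumption: each group is represented by its midpoint, weighted by frequency.
    """
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))
    return math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))


# Hypothetical weight classes 50-60, 60-70, 70-80 kg
midpoints = [55, 65, 75]
frequencies = [2, 5, 3]
print(round(grouped_sd(midpoints, frequencies), 2))  # 7.38
```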
3 methods of data presentation
Text
Tables/charts
Graphs
When can data be used within a text
When there are only two data points being compared
Guidelines for drawing a good table
Should be able to stand on its own
Tables should be numbered in Arabic numerals in the order in which they appear (Table 1, Table 2, etc.)
Title should be informative and written above the table
Better to remove grid lines
Use footnotes to explain abbreviations or symbols
When can data be used within a table
When the data is more complex
When can data be used within a graph
Useful when there are few data points or categories. Also to show trends
Guidelines for drawing a good graph
Should have an informative title
Title should be written below the graph
Figures should be numbered in Arabic numerals according to the order in which they appear
Legends should be clear (a legend is the key that explains what each colour or symbol in the graph represents)
Use strong contrasting colours
Guidelines when making a pie chart
Ensure the wedges all add up to 100%
Begin at 12 o’clock position
Go clockwise from largest to smallest
Show no more than 7 wedges
Use distinct colours for wedges
What is the major difference between histograms and bar charts
Bar charts are used to represent discrete data, which is why there are gaps between the bars
Histograms are used to represent continuous data, so the bars touch
Mention 7 types of graphs used for data presentation
Line graph
Stem-and-leaf plot
Box and whisker plot
Frequency polygon
Histogram
Pie chart
Bar chart
What are scales of measurement
Ways in which variables are defined and categorized. It determines the type of statistical analysis that is done.
What are the 4 scales of measurement
Nominal
Ordinal
Interval
Ratio
Nominal scale
For unordered categorical data
Places people or objects in mutually exclusive categories
Eg. gender
Ordinal scale
For ordered categorical data
Ranks objects in order
Eg. Level of education
Interval scale
For discrete data
Units of measurement are equal throughout the full range of the scale, but the scale has no ‘true zero’ point
Zero does not represent the absolute lowest value
Addition and subtraction operations can be performed
Eg. Measurement of temperature in degrees Celsius or Fahrenheit
Ratio scale
For continuous data
Has a true zero point (No numbers exist below zero)
Can calculate ratios between scale values
All 4 mathematical operations (addition, subtraction, multiplication and division) can be performed
Eg. Height, serum Calcium
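A small Python illustration of why ratios only make sense on a scale with a true zero. Celsius is an interval scale, so 20 °C is not "twice as hot" as 10 °C. Converting to Kelvin, which has a true zero, gives the meaningful ratio. The temperatures are arbitrary example values.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to Kelvin (Kelvin has a true zero point)."""
    return celsius + 273.15


ratio_celsius = 20 / 10  # misleading: Celsius has no true zero
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(ratio_celsius)           # 2.0
print(round(ratio_kelvin, 3))  # 1.035
```

Differences, in contrast, are valid on both scales: 20 °C is 10 degrees warmer than 10 °C whichever unit is used.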