Section 1 (Pgs 1-20) Flashcards
What are descriptive statistics?
Summary statistics that describe features of the data
What is inference?
The act of drawing a conclusion about a population based on a sample
What is data?
The raw material for data analysis
What are the 2 main types of data?
Quantitative
Qualitative
What is another name for qualitative data?
Categorical
What are the types of quantitative data?
Discrete
Continuous
Describe discrete data.
Whole numbers e.g. counts
Describe continuous data.
Continuous measurement can take any value, depending on the accuracy of the recording instrument e.g. height
What is qualitative data?
Data where individuals or objects are classified into groups e.g. obese, overweight, normal weight, underweight or different types of diets
What are the types of qualitative data?
Ordinal
Nominal
Binary (binomial)
What type of qualitative data has a relationship between the categories meaning they can be ordered?
Ordinal
What type of qualitative data has no relationship between the categories?
Nominal
Name of the ordinal scale representing the degree of agreement with a statement?
Likert scale
What type of qualitative data is categorical with only 2 values?
Binary/binomial
What type of data is eye colour?
Nominal
What type of data is severity of disease?
Ordinal
What type of data is number of alcohol units?
Discrete
What type of data is waist to hip ratio?
Continuous
What type of data is time in hospital?
Discrete (when recorded as a whole number of days)
What type of data is mortality?
Binary
What is the normal structure of a table?
Row per case
Column per variable
What is it called when a variable is recorded several times e.g. at multiple clinic visits, often represented by multiple columns on a table?
Repeated variables
What should you always do to data before beginning a formal analysis?
Inspect it
What is a population?
All the members of the particular group under study
What 2 properties should a sample have?
Large enough to detect any differences that are of interest
Representative of the population
What do we have to be careful that a sample for a research study is not?
Biased
What are the 2 most common ways to prevent bias?
Random sampling
Stratified random sampling
What is a random sample?
One in which each member of the population has an equal, non-zero chance of being included
When is a stratified sample taken?
When there are categories in the population that must be represented
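The two sampling methods above can be sketched in a few lines. This is only an illustration: the population, the age-group strata, the 10% sampling fraction and the function names are all invented, and proportional allocation is just one way to take a stratified sample.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every member has the same non-zero chance of being chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(population, stratum_of, fraction, seed=0):
    """Take a random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 80 people aged under 65 and 20 aged 65 or over.
population = [("under 65", i) for i in range(80)] + [("65+", i) for i in range(20)]

srs = simple_random_sample(population, 10)
strat = stratified_random_sample(population, lambda person: person[0], 0.1)
print(len(srs), len(strat))
```

The simple random sample may by chance contain few or no people aged 65+, whereas the stratified sample always represents both groups.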
What non-ideal method of sampling is often done?
Convenience sampling
What is convenience sampling?
Sample is not chosen randomly but is all that is available e.g. those that present at a clinic
When using convenience sampling, what must be done?
Collection of full information to investigate for possible bias at the analysis stage
Sampling method used for a diabetologist’s research study of the long term effects of T2DM in his patients?
Convenience sample
Sampling method used for a government investigation into the proportion of children born with CP in the last year (no central register)?
Random sample
Types of sampling method?
Random
Stratified
Convenience
What are the 7 main ways of presenting data?
Tables
Bar charts
Pie charts
Histograms
Stem and leaf plots
Box and whisker plots
Scatter plots
What 2 ways are used to present qualitative data?
Bar charts (can also be used to represent discrete data)
Pie charts
What 4 ways are used to present quantitative (usually continuous) data?
Histograms
Stem and leaf plots
Box and whisker plots
Scatter plots
What do scatter plots do?
Display the relationship between 2, usually continuous, variables
What is the purpose of organising data into tables?
Helps to identify errors, trends and special cases
What type of table are categorical variables often summarised in?
A contingency table
What does a contingency table do?
Gives the number or percentage in each category (if giving a percentage, must also give the count and the denominator, as a percentage with no other information can be misleading)
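A minimal sketch of a contingency table that reports each percentage together with its count and denominator. The diet and weight records, and the helper names `contingency_table` and `format_cell`, are invented for illustration.

```python
from collections import Counter

def contingency_table(pairs):
    """Cross-tabulate (row category, column category) pairs into counts."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

def format_cell(count, denominator):
    """Show the count and denominator alongside the percentage."""
    return f"{count}/{denominator} ({100 * count / denominator:.0f}%)"

# Hypothetical (diet, weight category) records, one per case.
records = [
    ("vegetarian", "normal"), ("vegetarian", "normal"),
    ("vegetarian", "overweight"), ("vegetarian", "normal"),
    ("omnivore", "normal"), ("omnivore", "overweight"),
    ("omnivore", "overweight"), ("omnivore", "obese"),
]

table = contingency_table(records)
for diet, row in table.items():
    row_total = sum(row.values())
    cells = ", ".join(f"{cat}: {format_cell(n, row_total)}" for cat, n in row.items())
    print(f"{diet}: {cells}")
```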
If making a 3D pie chart, what should the volume represent?
The proportion
What is the frequency distribution function?
The relationship between the data values and their frequencies
How to calculate the relative frequency?
Subgroup count divided by total count
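As a rough illustration of the calculation, assuming a small made-up list of eye-colour observations (the data and the function name `relative_frequencies` are hypothetical):

```python
from collections import Counter

def relative_frequencies(values):
    """Return {category: subgroup count / total count}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

# Hypothetical observations, invented for illustration only.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]

rel_freq = relative_frequencies(eye_colours)
for colour, freq in rel_freq.items():
    print(f"{colour}: {freq:.3f}")
```

The relative frequencies always add up to 1, because every observation belongs to exactly one subgroup.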
What does the height of each column of a histogram represent?
Frequency
What does the width of a column on a histogram represent?
Grouping interval
What are the areas of the columns of a histogram proportional to?
The frequencies in each group
What is a relative frequency histogram?
One in which the height of the column is labelled with the relative frequency
What is the total area of the columns in a relative frequency histogram, and why?
1
It corresponds to the probability that a subject chosen at random has a value (e.g. a height) in any of the classes. This is a certainty, so the probability is 1
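A small check of the total-area result, assuming that each column's area (height multiplied by class width) carries the relative frequency of its class. The height data, the class boundaries and the function name are made up for illustration.

```python
def relative_frequency_histogram(values, edges):
    """Return (column heights, column areas) for classes [edges[i], edges[i+1])."""
    n = len(values)
    heights, areas = [], []
    for lo, hi in zip(edges, edges[1:]):
        count = sum(1 for v in values if lo <= v < hi)
        rel = count / n                   # relative frequency of the class
        areas.append(rel)                 # area of the column
        heights.append(rel / (hi - lo))   # height = area / class width
    return heights, areas

# Hypothetical heights in cm, with deliberately unequal class widths.
heights_cm = [152, 158, 161, 165, 167, 170, 171, 174, 178, 183]
edges = [150, 160, 165, 175, 190]

col_heights, col_areas = relative_frequency_histogram(heights_cm, edges)
total_area = sum(col_areas)
print(col_areas, total_area)
```

Every observation falls into exactly one class, so the areas add up to 1 whatever class widths are chosen.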
How is a frequency polygon constructed?
By joining the midpoints at the top of each column of the histogram
What does the probability density function provide information about?
The probability of each value of the variable
How is a probability density function represented graphically?
By a frequency polygon in which the vertical axis corresponds to relative frequency
What are the advantages of a stem and leaf diagram over a histogram? (2)
It is easier to construct
The individual values of the data are shown
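Both advantages show up in a short sketch. It assumes non-negative whole-number data, with the tens as the stem and the units as the leaf; the ages and the function name `stem_and_leaf` are invented for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens and above) to its sorted leaves (units digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical ages in years.
ages = [35, 23, 46, 31, 27, 35, 51, 38, 42]

plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each original value can be read back from its stem and leaf, which a histogram column does not allow.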
What do the whiskers on a box and whisker plot represent?
The minimum and maximum value (together form the range)
How much of the data does the box in a box and whisker plot contain?
50%
What do the upper and lower whiskers of a box and whisker plot sometimes represent?
The values above/below which 2.5% of values lie
What does a box-and-whisker plot provide information about?
The symmetry and variability of the distribution of the data
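The numbers behind a box and whisker plot can be sketched as below, assuming the whiskers run to the minimum and maximum and the box runs from the lower to the upper quartile. The data and the function name are hypothetical, and `method="inclusive"` is only one of several quartile conventions.

```python
import statistics

def five_number_summary(values):
    """Return the minimum, quartiles and maximum used to draw a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),   # end of the lower whisker
        "q1": q1,             # lower edge of the box
        "median": median,     # line inside the box
        "q3": q3,             # upper edge of the box
        "max": max(values),   # end of the upper whisker
    }

# Hypothetical measurements.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

summary = five_number_summary(data)
print(summary)
```

A median near the middle of the box with whiskers of similar length suggests a roughly symmetric distribution, and the length of the box (q3 minus q1) indicates the variability.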
What can be calculated from a scatter plot to assess the strength of the linear relationship?
Coefficient of correlation
What is the name of the line on a scatter plot?
Regression line
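A minimal sketch of both quantities, assuming Pearson's correlation coefficient and a least-squares regression line (the cards do not say which versions are meant). The paired data and the function names are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical paired measurements.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
slope, intercept = least_squares_line(xs, ys)
print(f"r = {r:.3f}, line: y = {slope:.2f}x + {intercept:.2f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship, and a value close to 0 indicates a weak one.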