RDS statistics Flashcards
what is social accountability
re-orienting teaching and research towards addressing the needs of the community in which the institution is based.
what are the components of social accountability
involve patients, community members
who is benefiting from your research
addressing priority health and social needs
sustainability and best use of resources
define hypothesis
A prediction of what the answer to the research question may be
(stated at the outset of a piece of work)
types of hypothesis
Null hypothesis
Alternative hypothesis (or several alternative hypotheses)
what is null hypothesis
states that there is no dependent relationship between two variables
e.g. Staying up all night before an exam will not affect exam performance
what is alternative hypothesis
predicts a specific and reproducible relationship between variables
e.g. Staying up all night before an exam will reduce exam performance
how is the hypothesis analysed
A statistical test is used to determine whether we can reject, or fail to reject, the null hypothesis.
what is the p value
the probability that the observed difference (or something more extreme) would arise by chance alone, i.e. if the null hypothesis were true.
what is the alpha value
the probability of rejecting a null hypothesis when it is true (the significance level).
For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference. That is to say, rejecting the null hypothesis when it should not be rejected.
when is there a significant difference
Most authors consider a p-value to indicate a significant difference when it is less than α = 0.05 (less than a one-in-twenty chance of concluding that a difference exists when there is none)
A small p-value (p ≤ 0.05) indicates…..
strong evidence against the null hypothesis, so you reject the null hypothesis.
A large p-value (p > 0.05) indicates…..
weak evidence against the null hypothesis, so you fail to reject the null hypothesis.
If the p-value is equal to the α-value
we reject the null hypothesis
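The decision rule on the cards above can be sketched in Python. This is only an illustration of the rule, not a statistical test: the function name `decide` is hypothetical, and the p-value is assumed to have been computed elsewhere.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule from the cards: reject the null hypothesis when p <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(0.03))  # p below alpha
print(decide(0.05))  # p equal to alpha
print(decide(0.20))  # p above alpha
```

Note that the rule never "accepts" the null hypothesis; a large p-value only means the evidence against it is weak.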
give an example of primary and secondary research outcomes with regards to Sipuleucel-T Immunotherapy for Castration-Resistant Prostate Cancer
primary outcome: effect of sipuleucel-T on overall survival among men with castration-resistant prostate cancer.
secondary outcome: effect of sipuleucel-T on disease progression.
which statistical tests can be used to assess for differences
e.g. “a difference exists between male and female systolic blood pressure”
Chi squared test
ANOVA
T-tests
Kruskal-Wallis
Wilcoxon
Mann-Whitney U-test
which statistical tests can be used to assess for similarities (links)
e.g. taller people have larger feet
Chi squared test
Pearson
Spearman Rank
define quantitative data
numerical information about quantities
Magnitude, occurrence, association
quantitative data can be subdivided into:
Continuous data: information that can be measured and has continuous dimensions
Discrete data: information that can be counted but is not continuous
define Qualitative data
information about qualities; that is information that can’t actually be measured.
deals with descriptive information
Who? What? Why? When? How?
define and give examples of continuous data
data can be divided and reduced to finer and finer levels (measured)
height, temperature, blood pressure etc
define and give examples of discrete data
a count that cannot be made more precise (counted)
number of children in a family or the number of patients in a clinic
define Categorical data
in-between quantitative and qualitative data because the ordinal aspects can be easily converted into numerical data.
give an example of categorical data
For example, a happiness scale (ordinal) can be given in numbers instead of words, whereas nominal categorical data is more like qualitative data, except that it consists of individual terms rather than sentences.
There are some variables that could be measured quantitatively or qualitatively such as….
eye colour can be measured quantitatively by assessing the RGB scale or qualitatively by categorising into blue, brown or green etc.
Categorical data can also be subdivided into two types:
nominal and ordinal
what is nominal data
items that are assigned individual named categories that do not have an implicit value.
e.g. gender (male or female) or fracture incidence (yes or no).
what is Ordinal data
items which are assigned to categories that do have some kind of implicit order, such as ‘small, medium, or large’
often used to describe a patient’s characteristics e.g. stage of hypertension, pain level, and satisfaction.
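As a rough sketch of the nominal/ordinal distinction: ordered categories can be mapped to numbers and sorted, whereas nominal categories can only be counted. The sample values and the 1-3 coding below are illustrative assumptions.

```python
from collections import Counter

# Ordinal: the categories have an implicit order, so a numeric code is meaningful.
SIZE_ORDER = {"small": 1, "medium": 2, "large": 3}  # illustrative coding

sizes = ["medium", "small", "large", "small"]
sorted_sizes = sorted(sizes, key=SIZE_ORDER.get)
print(sorted_sizes)

# Nominal: named categories with no implicit order, so we can only count them.
eye_colours = ["blue", "brown", "brown", "green"]
counts = Counter(eye_colours)
print(counts.most_common(1))
```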
what does normality measure
how the data are distributed, in terms of central tendency and dispersion; it is used to decide how to describe the properties of large data-sets
normal distribution is often described as….
‘bell curved’ or ‘Gaussian’
what is skewed data
data which is asymmetric, with many data points at the high or low end of the range and an uneven tail (long on one side and short on the other)
describe left skew
long left tail (referred to as -ve skew)
The mean and median are also to the left of the peak
describe right skew
long right tail (referred to as +ve skew)
The mean and median are also to the right of the peak
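A small check of the skew cards, using made-up data sets and the mode as a stand-in for "the peak" (an assumption that suits these small samples):

```python
from statistics import mean, median, mode

right_skewed = [1, 2, 2, 2, 3, 3, 4, 9, 19]        # long right tail
left_skewed = [1, 11, 16, 17, 17, 18, 18, 18, 19]  # long left tail

for name, data in (("right skew", right_skewed), ("left skew", left_skewed)):
    print(name, "| peak:", mode(data), "| median:", median(data), "| mean:", mean(data))
```

In the right-skewed sample the median and mean sit above the peak, and in the left-skewed sample they sit below it, with the mean pulled furthest towards the long tail.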
what is kurtosis
a measure of whether data are heavy-tailed or light-tailed relative to a normal distribution
Data sets with high kurtosis tend to have….
heavy tails, or outliers that create a very wide distribution
Data sets with low kurtosis tend to have…..
light tails, or lack of outliers that create a very narrow distribution.
Normality can be visually assessed by…
evaluating a frequency bar-chart or histogram
The statistical tests of normality are:
Shapiro-Wilk test: used to test for normality with small sample sizes (n<50)
Kolmogorov-Smirnov: used to test for normality with large sample sizes (n>50)
how is the p value used to assess normality
p-value <0.05 is considered to indicate a violation of normality
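The two normality cards can be summarised as a pair of helper functions. Both names are hypothetical, and the cards do not say what to do at exactly n = 50, so treating that case as a large sample is an assumption.

```python
def pick_normality_test(n):
    """Choose a normality test from the sample size, following the cards above."""
    if n < 1:
        raise ValueError("sample size must be positive")
    # The cards cover n < 50 and n > 50; treating n == 50 as "large" is an assumption.
    return "Shapiro-Wilk" if n < 50 else "Kolmogorov-Smirnov"


def violates_normality(p_value, alpha=0.05):
    """A p-value below alpha is taken to indicate a violation of normality."""
    return p_value < alpha


print(pick_normality_test(20), violates_normality(0.01))
print(pick_normality_test(200), violates_normality(0.40))
```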
median equation
(n + 1) / 2 = the position of the median in the ordered list (for an even n, average the two values either side of that position)
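The position rule can be sketched as follows. The helper name `median_by_position` is hypothetical; the standard-library `statistics.median` is used only as a cross-check.

```python
from statistics import median


def median_by_position(values):
    """Find the median by looking up position (n + 1) / 2 in the ordered list."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the ordered list
    if position.is_integer():
        return ordered[int(position) - 1]
    # Even n: the position falls between two values, so average the neighbours.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


odd_sample = [7, 1, 3, 9, 5]        # n = 5, position 3
even_sample = [7, 1, 3, 9, 5, 11]   # n = 6, position 3.5
print(median_by_position(odd_sample), median(odd_sample))
print(median_by_position(even_sample), median(even_sample))
```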
define range
difference between largest data value and smallest data value
define Variance
a measure of the spread of the numbers away from the mean value.
It is calculated by working out the average of the squared differences from the mean
define Standard deviation
square root of the variance. Measures the spread of a set of data.
define Interquartile range
calculated by subtracting the value of the lower quartile from the value of the upper quartile
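The four measures of spread defined above can be computed with Python's standard `statistics` module. The data set is made up. Two choices here are assumptions rather than something the cards specify: `pvariance`/`pstdev` match the "average of the squared differences" definition (many packages instead report the sample variance, which divides by n - 1), and quartiles use the `"inclusive"` method (other conventions give slightly different quartiles).

```python
from statistics import pstdev, pvariance, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values, mean = 5

data_range = max(data) - min(data)   # largest value minus smallest value
variance = pvariance(data)           # average of squared differences from the mean
std_dev = pstdev(data)               # square root of the variance
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                        # upper quartile minus lower quartile

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
print("interquartile range:", iqr)
```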
what are paired observations (dependent)
arise from measuring the same variable in the same subject at different time-points
longitudinal study
what are unpaired observations (independent)
seen when comparing two groups with no common factors
cross sectional study
give examples of parametric tests
t-test (2 groups paired/unpaired)
ANOVA (3 or more groups unpaired/paired)
when is parametric statistics used
quantitative data which is normally distributed.
give examples of Non-parametric tests
Mann-Whitney (2 groups, unpaired)
Wilcoxon signed rank test (2 groups, paired)
when are non parametric tests used
when quantitative data is not normally distributed.
which test type is better parametric or non parametric
Parametric tests are easier to understand, the analyses are more powerful, and they are less likely to incorrectly reject or fail to reject a hypothesis.
types of ANOVA test
paired = Repeated-measures, one-way ANOVA
unpaired = One-way ANOVA
what are 2 non-parametric equivalents of the one-way ANOVA
Kruskal-Wallis test (3 groups or more, unpaired)
Friedman test (3 groups or more, paired)
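The test-selection cards above amount to a three-way lookup: normal or not, paired or unpaired, two groups or more. A minimal sketch, with a hypothetical function name and only the tests named on the cards:

```python
def choose_test(normally_distributed, paired, groups):
    """Look up the test named on the cards for a given study design."""
    if groups < 2:
        raise ValueError("comparing groups needs at least two groups")
    if normally_distributed:
        # Parametric tests
        if groups == 2:
            return "paired t-test" if paired else "unpaired t-test"
        return "repeated-measures one-way ANOVA" if paired else "one-way ANOVA"
    # Non-parametric equivalents
    if groups == 2:
        return "Wilcoxon signed rank test" if paired else "Mann-Whitney test"
    return "Friedman test" if paired else "Kruskal-Wallis test"


print(choose_test(normally_distributed=True, paired=False, groups=2))
print(choose_test(normally_distributed=False, paired=True, groups=3))
```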
if data is normally distributed and you want to assess correlation what test would you use
Pearson’s correlation
if data is NOT normally distributed and you want to assess correlation what test would you use
Spearman rank correlation
A Pearson correlation coefficient will give you an r-value, which tells you….
how strong the relationship is. It will vary between -1 (represents a perfect negative correlation) and +1 (represents perfect positive correlation).
(±) 0-0.2: very low correlation
(±) 0.2-0.4: low correlation
(±) 0.4-0.6: reasonable correlation
(±) 0.6-0.8: high correlation
(±) 0.8-1.0: very high correlation
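To make the r-value concrete, here is a from-scratch sketch of the Pearson coefficient plus a helper that applies the strength bands above. The function names and data are illustrative. The bands share their edge values (e.g. 0.2 appears in two bands), so assigning an edge value to the higher band is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample has no spread")
    return sxy / sqrt(sxx * syy)


def describe_strength(r):
    """Map |r| onto the bands from the card (edge values go to the higher band)."""
    size = abs(r)
    if size < 0.2:
        return "very low"
    if size < 0.4:
        return "low"
    if size < 0.6:
        return "reasonable"
    if size < 0.8:
        return "high"
    return "very high"


x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3), describe_strength(r))
```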
A Pearson correlation may also give you a p-value. The p-value in this case tells you…
how reliable the r-value is. The smaller the p-value, the more reliable the r-value.
The r²-value can also be reported from a Pearson correlation. This represents……….
how closely your data is fitted to the correlation line
what are r and p equivalents of spearman’s test
A correlation coefficient (Spearman’s rho, denoted by ρ) is the equivalent of the Pearson r-value.
The p-value, once again, tells you how reliable the rho-value is.
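Spearman's rho works on ranks rather than raw values. A minimal sketch using the textbook shortcut formula ρ = 1 − 6Σd² / (n(n² − 1)), which is only valid when there are no tied values; this sketch therefore rejects ties. Function names and data are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; this sketch does not handle ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's rho from rank differences (no-ties shortcut formula)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


x = [10, 20, 30, 40, 50]        # made-up data
y = [1, 3, 2, 5, 4]
print(spearman_rho(x, y))

# A curved but always-increasing relationship still has perfectly matching ranks.
curve = [1, 4, 9, 16, 25]
print(spearman_rho([1, 2, 3, 4, 5], curve))
```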
define correlation
indicates the strength of the relationship between two variables
define Regression
quantifies the association between the two variables i.e. it tells us the impact that changing one variable will have on the other variable.
regression equation
It is defined by a simple equation: y = a + bx
Where:
a= the y-axis intercept value
b= the gradient of the line, i.e. the regression coefficient
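A sketch of how a and b might be estimated from data. The cards do not name a fitting method, so ordinary least squares is assumed here; `fit_line` is a hypothetical name and the data are made up.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    b = sxy / sxx             # gradient (regression coefficient)
    a = mean_y - b * mean_x   # y-axis intercept
    return a, b


x = [1, 2, 3, 4]
y = [3, 5, 7, 9]   # lies exactly on y = 1 + 2x
a, b = fit_line(x, y)
print("intercept a:", a, "gradient b:", b)
print("predicted y at x = 5:", a + b * 5)
```

Here b says that each one-unit increase in x is associated with a two-unit increase in y.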
When to use the chi square test
used when you have counted data (frequencies in categories) and you want to see if there is an association/difference between your data-sets
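A minimal sketch of the chi-squared calculation for a table of counts: each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected. The counts are made up, no continuity correction is applied, and 3.841 is the standard table value for 1 degree of freedom at α = 0.05.

```python
def chi_squared(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


observed = [[10, 20],
            [20, 10]]            # made-up 2x2 counts
statistic, dof = chi_squared(observed)
CRITICAL_DF1_ALPHA_05 = 3.841    # table value for 1 degree of freedom, alpha = 0.05

print("chi-squared:", round(statistic, 3), "df:", dof)
print("significant at 0.05:", statistic > CRITICAL_DF1_ALPHA_05)
```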
when to use pie chart
used to show the relative sizes of the parts of a whole
when to use bar chart
representing comparative data when one of the comparators is categorical (nominal or ordinal) and the other is numerical
difference between histogram and bar chart
a histogram can have continuous numerical data on both of the axes but a bar-chart can only have continuous numerical data on one of the axes
what is an advantage of Dot-plot over bar chart
provides a better visual representation of the data dispersion but it is more suitable for smaller data-sets.
what does Box & whiskers plot show
5-number summary of a data-set
lower extreme
lower quartile
median
upper quartile
upper extreme
what kind of data sets are box and whiskers plots used for
used to summarise a single data-set
primarily used for non-parametric numerical data
what is scatter plot used for
A scatter-plot is used to identify similarities/relationships between two data-sets
what graphs can be used to determine normal distributions
histograms or box & whiskers plots ONLY
examples of qualitative research
Individual interviews
Focus groups
Observations take place in natural settings and involve the researcher either videoing, photographing and/or taking lengthy and descriptive notes
Action research (or participatory action research)
examples of Quantitative research
Surveys or questionnaires, observation (counting the number of times a specific phenomenon occurs), Document screening, Experiments
what are the 6 key steps of qualitative research
Familiarisation
Focus
Code (categorise/organise data)
Mapping (looks for similarities, contradictions of codes)
Themes (studying the commonalities, contrasts, importance)
Interpret and explain
appraising qualitative research
Was the sample used in the study appropriate to its research question?
Were the data collected appropriately?
Were the data analysed appropriately?
Can I transfer the results of this study to my own setting?
Does the study adequately address potential ethical issues?
Overall: is what the researchers did clear?