HRSS Flashcards
What type of scale is IQ score?
Interval scale
What type of scale is time?
ratio scale
What type of scale is weight?
ratio scale
what are potential threats to the internal validity of a research project?
- the presence of confounding variables
- attrition
In a meta-analysis it is possible to combine data if studies are homogeneous in terms of the research question, methods, treatment, and outcome measures. True or False?
True
In systematic reviews, publication bias refers to…
negative studies being less likely to be published than positive studies
what is a variable?
a variable is a measurable quantity which can assume any of a prescribed set of values.
definition: quantitative variables…
take numeric values
- discrete variable (assumes only isolated values)
- continuous variable (it can take any value within a given range)
Definition: qualitative variables
not measured numerically (categorical)
- nominal (just labels, no natural order)
- ordinal (some natural order)
Type 1 Error
error of rejecting H0 when the H0 is true
Type 2 error
error of accepting H0 when H0 is false
Definition of P-value
P-value = the probability of obtaining the present test result if the null hypothesis is true
- calculated to see if the results occurred by chance
if the p-value is small (< 0.05), the result is statistically significant
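A minimal sketch of how a p-value is produced in practice, using hypothetical data and assuming SciPy is available:

```python
# Hypothetical example: two-sample t-test p-value with SciPy.
from scipy import stats

treatment = [12.1, 11.4, 13.0, 12.8, 11.9]  # hypothetical outcome scores
control = [10.2, 11.0, 10.8, 11.5, 10.1]

t_stat, p_value = stats.ttest_ind(treatment, control)
# p < 0.05: reject H0 (a Type 1 error if H0 is actually true);
# p >= 0.05: fail to reject H0 (a Type 2 error if H0 is actually false).
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```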
non-parametric tests
compare medians or mean ranks rather than means
- suitable for non normal data
- when the distribution is unknown with a small sample size.
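A sketch of one common non-parametric test, the Mann-Whitney U, on hypothetical data (assuming SciPy):

```python
# Hypothetical example: Mann-Whitney U test, which compares mean ranks
# rather than means and does not assume normality.
from scipy import stats

group_a = [3, 5, 4, 6, 2]
group_b = [7, 8, 6, 9, 7]

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```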
Quantitative Research design
7 distinct characteristics
- attempts to verify theory and is deductive
- has a predetermined design structure
- uses data derived from scores on standard scales
- uses probability sampling
- maintains the independent role of the researcher
- clearly defined structure
- employs statistical analysis
definition: dependent variable
measured for any change resulting from manipulation (eg treatment) by the experimenter
(outcome, response)
definition: independent variable
is the variable that is manipulated by the experimenter
(explanatory, predictor)
definition: discrete data
takes only a finite number of values, eg number of children
definition: continuous data
all possible values within a given range
eg weight, height
nominal scale
eg blood type or gender
ordinal scale
observations that are ranked
eg severity ratings
interval scale
equal units of measurement (no absolute zero)
eg IQ, temperature (°C)
Ratio scale
like interval but with absolute zero
eg height, weight
Types of probability sampling
- random
- systematic sampling
- stratified random sampling
- disproportional sampling
- cluster sampling
Random Sampling
(probability sampling)
each member of the population has equal chance of selection
- randomly selected from population and randomly assigned into experimental/control group
Systematic Sampling
(probability sampling)
- every nth record selected from a list
Stratified Random Sampling
(probability sampling)
- considered superior to random sampling
- sampling is based on population characteristics
- eg population = 300 (200 F, 100 M); sample = 60 (40 F, 20 M)
- not always relevant
Disproportional Sampling
(probability sampling)
used when groups are unequal in size, causing inadequate sample sizes for comparison
eg if 90% female 10% male
still pick 10 female and 10 male
Cluster sampling
(probability sampling)
- sampling from a series of random units in a population
eg pick states, 10 hospitals in each, 2 physios from each hospital …
Types of Non-probability sampling
- convenience
- quota
- purposive
- snowball sampling
Convenience Sampling
(nonprobability sampling)
- subjects chosen on the basis of their availability
Quota Sampling
(non-probability sampling)
- researcher guides the sampling process until the participant quota is met
Purposive Sampling
(non-probability sampling)
- hand picked participants based on criteria
Snowball Sampling
(non-probability sampling)
- used when desired characteristics are rare
- relies on original participants identifying or referring other people
experimental study designs
- true experimental, ie RCT
- quasi-experimental (no random assignment to groups)
- experimenter has some control over independent variables
- experimental conditions constructed
- involve some form of treatment/intervention
- aim: to show the IV is the cause of changes in the DV by controlling for other possible influences
Observational study designs
- observed in their normal state
- groups that are compared are self selected
- subjects may be measured and tested but there is no treatment or intervention
- can be prospective or retrospective
Types of RCTs
- Parallel groups (2 groups are studied concurrently)
- Cross-over design (the order in which Txs are given is randomised, with wash-out periods)
- Within-group comparison (Txs investigated concurrently in the same pt; used for Txs that can be given independently to different body parts)
- Sequential design (parallel groups; the trial continues until there is clear benefit of one Tx, or until it is clear there is no benefit)
- Factorial design (several factors compared at the same time)
internal validity
for a study to have it, it must clearly demonstrate that a specific intervention or manipulation causes an effect.
ie - the effect found is not due to some other factor
threats - chance, bias, confounding
External validity
how applicable the results are to the target population from which the sample was drawn
Literature Reviews
can describe previous work, can be a mixture of evidence and opinions
Systematic reviews
Definition: a formal identification, assessment and synthesis of all primary research evidence to answer a specific research question using reproducible methods
Meta-analyses
quantitative summary of results of several studies
Why do systematic reviews?
- pools large amounts of information from multiple individual studies
- clarifies the status of existing research to inform decision making
Advantages and Disadvantages of Systematic reviews
Advantages:
- improves the ability to study consistency of results and findings
- when results conflict, SRs provide an overall picture of what is happening
- if sample sizes are small, some SRs can pool data, increasing power to detect effects
Disadvantages:
- improved power to detect effects can also magnify effects of bias
Steps in conducting a systematic review
- Define the research question
- Create plan
- Search the literature for potential eligible studies
- Apply eligibility criteria to select studies
- Assess the risk of bias of selected studies
- Extract data from the selected studies
- Synthesise the data
- Interpret and report the results
- Update and review in the future
Two types of synthesis
- Meta-analysis - statistical synthesis of data from individual studies (depends on heterogeneity/homogeneity)
- Narrative synthesis - synthesis of key information from individual studies using descriptive narrative rather than statistics
Dichotomous Outcomes: Odds Ratio
A ratio of the odds of the event occurring. Calculated by dividing the odds (of a specific event) in the treated group by the odds of that event in the control group
Odds ratios measure the strength of association between variables
OR = 1 suggests no association between the variables under study
the further the OR is from 1, the stronger the association b/w the variables
An association is not significant if the confidence interval for the OR contains 1
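A minimal sketch of the calculation on a hypothetical 2x2 table, with a 95% CI on the log-odds scale (Woolf method):

```python
import math

# Hypothetical 2x2 counts: event / no event in each group.
treated_event, treated_no_event = 20, 80
control_event, control_no_event = 40, 60

odds_ratio = (treated_event / treated_no_event) / (control_event / control_no_event)

# Standard error of log(OR), then back-transform the 95% CI.
se = math.sqrt(1 / treated_event + 1 / treated_no_event
               + 1 / control_event + 1 / control_no_event)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se)
print(f"OR = {odds_ratio:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
# If the CI contains 1, the association is not statistically significant.
```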
Reliability
reliability in measurement refers to the consistency of measurement
a reliable measure should give the same answer when used to measure the same variable, in the same manner, in the same subject, time after time.
Inter-rater reliability
agreement between two or more raters
Intra- rater reliability
consistency of ratings by one rater
Test-retest definition and considerations
- consistency of test scores after a predetermined period
considerations:
- learning effects
- carry-over effects
- fluctuating characteristics
- environmental variables
- time of day
- motivation level
validity
in measurement refers to the extent to which the tool measures what it is intended to measure, for a specific purpose
4 types:
- face validity
- content validity
- concurrent validity
- construct validity
Face Validity
- validity may be obvious if the measured characteristics are concrete
- eg weight, height, range of motion
- must be able to directly observe
Content Validity
- is a measure of how well an instrument measures the content of a particular trait or body of knowledge
- usually addressed by a panel of experts
- determine the universe of items related to the construct and select an adequate sample of items for the test
Concurrent Validity
- uses another measuring instrument (with known validity) as a criterion to assess whether the new instrument is measuring what it is meant to
- the two measurements are taken at the same time
- usually a ‘gold standard’ is used
Construct Validity
- aims not only to validate the test but also the theory behind the test
- not just reliant on a panel of experts (content) but also includes hypothesis testing
Other validity considerations…
- sensitivity
- specificity
- positive predictive value
- negative predictive value
- likelihood ratio
sensitivity
- measures the proportion of actual positives which are correctly identified
- sensitivity = TP / (TP + FN)
specificity
- measures the proportion of actual negatives which are correctly identified
- specificity = TN / (TN + FP)
Positive Predictive Value
measures the proportion of participants with a positive result who are correctly diagnosed
PPV = TP / (TP + FP)
(predictive values depend on the prevalence of the disease)
Negative predictive value
- measures the proportion of participants with a negative result who were correctly diagnosed
NPV = TN / (TN + FN)
(predictive values depend on the prevalence of the disease)
we use these to predict the chances of someone having the diagnosis
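The four formulas can be sketched together on a hypothetical 2x2 table:

```python
# Hypothetical diagnostic 2x2 table.
TP, FN = 45, 5   # diseased: test positive / test negative
FP, TN = 10, 90  # healthy:  test positive / test negative

sensitivity = TP / (TP + FN)  # actual positives correctly identified
specificity = TN / (TN + FP)  # actual negatives correctly identified
ppv = TP / (TP + FP)          # depends on disease prevalence
npv = TN / (TN + FN)          # depends on disease prevalence
print(sensitivity, specificity, ppv, npv)  # 0.9, 0.9, ~0.82, ~0.95
```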
Likelihood Ratio
is the ratio of the probability of the specific test result in people who do have the disease to the probability in people who do not
- there are positive and negative likelihood ratios
- independent of disease prevalence and do not vary in different populations or settings
- can be used to find the probability of disease for an individual patient
- LRs measure the power of a test to change the pre-test probability into the post-test probability of a disease being present
- the further the LRs are from 1, the stronger the evidence for the presence or absence of the disease
- LR > 1 = the result is associated with the presence of the disease
- LR < 1 = the result is associated with the absence of the disease
LRs summarise how much more (or less) likely pts with the disease are to have a particular test result than pts without the disease
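Using the sensitivity and specificity from the hypothetical table above, the two LRs follow directly:

```python
# Hypothetical values carried over from the 2x2 sketch above.
sensitivity, specificity = 0.90, 0.90

lr_positive = sensitivity / (1 - specificity)  # > 1: supports presence of disease
lr_negative = (1 - sensitivity) / specificity  # < 1: supports absence of disease
print(f"LR+ = {lr_positive:.1f}, LR- = {lr_negative:.2f}")  # LR+ = 9.0, LR- = 0.11
```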
Reproducibility, repeatability, reliability - all mean that the results of a particular test or measure are…
identical or closely similar each time it is administered
Reliability of clinicians ratings can be assessed using
- Kappa (Cohens) if ratings are on a nominal scale
- Weighted kappa if ratings are on an ordinal scale
variations may arise because of…
variations in procedures, observers, or changing conditions of test subjects; a test may not consistently provide the same results when repeated.
intra-subject
intra - observer
inter- observer
Reliability in categorical rating
when ratings come from a nominal (eg present, absent) or ordinal (eg none, mild, moderate, severe) scale, subjects are assigned a code that classifies them as belonging to a particular category - here reliability is measured as the extent of agreement
- Percent Agreement
- Kappa
Percent Agreement
- simplest way to estimate reliability
- calculate the proportion of observed agreement
- it doesn't take into account agreement by chance
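A minimal sketch with hypothetical ratings from two raters:

```python
# Hypothetical nominal ratings from two raters on five subjects.
rater_a = ["present", "absent", "present", "present", "absent"]
rater_b = ["present", "absent", "absent", "present", "absent"]

agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(f"percent agreement = {agreement:.0%}")  # 80%, but ignores chance agreement
```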
Bland-Altman Analysis
a graphical representation of the differences between the two methods plotted against their averages, with computation of the 95% limits of agreement
- useful to look for any systematic bias, possible outliers, or any relationship between the differences between measures
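A sketch of the plot itself, assuming matplotlib and hypothetical paired measurements:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired measurements from two methods.
method_1 = np.array([10.2, 11.5, 9.8, 12.1, 10.9, 11.2])
method_2 = np.array([10.0, 11.9, 9.5, 12.4, 11.1, 11.0])

diffs = method_1 - method_2
means = (method_1 + method_2) / 2
bias = diffs.mean()
loa = 1.96 * diffs.std(ddof=1)  # half-width of the 95% limits of agreement

plt.scatter(means, diffs)
plt.axhline(bias, linestyle="--")       # systematic bias
plt.axhline(bias + loa, linestyle=":")  # upper limit of agreement
plt.axhline(bias - loa, linestyle=":")  # lower limit of agreement
plt.xlabel("mean of the two methods")
plt.ylabel("difference between methods")
plt.show()
```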
Reliability in Continuous Ratings
reliability is the ratio of the variability of the true score to the total (true + error) score
the less measurement error, the more reliable the measurement.
- Intraclass Correlation (ICC)
- Standard Error of Measurement (SEM)
Intraclass Correlation (ICC)
for continuous ratings; assesses rating reliability by computing the ratio of the between-subject variability to the total variability
- within-subject variance represents measurement error
- ICCs are highly dependent upon sample heterogeneity: the greater the heterogeneity in the sample, the greater the magnitude of the ICC
- the same instrument can be judged 'reliable' or 'unreliable' depending on the population in which it is assessed
- helps practitioners to know whether the instrument is able to discriminate between patients in the sample
- ICC alone is not sufficient for analysis of reliability and should be complemented by other statistics, eg Bland-Altman, SEM
Interpretation of ICCs
- values range from 0-1
- ICC approaches 1 when the within subject variability comes close to 0
- ICC provides small values when within subject variability is much greater than the between subject variability
- high value ICC doesn’t always mean high reliability
ICC models
1. each subject is assessed by a different set of randomly selected raters (ICC 1,1)
2. each subject is assessed by each rater and the raters are randomly selected (ICC 2,1)
- use if the aim is general application in clinical practice
3. each subject is assessed by each rater, but the raters are the only raters of interest (ICC 3,1)
- use if testing by only a small number of raters (no generalizability)
unlikely to use ICC 1,1 in clinical practice
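As an illustration only, ICC(2,1) can be computed from two-way ANOVA mean squares (the standard Shrout-Fleiss form); the ratings below are hypothetical:

```python
import numpy as np

# Hypothetical ratings: rows = subjects, columns = raters.
scores = np.array([[9.0, 7.0, 8.0],
                   [6.0, 5.0, 6.0],
                   [8.0, 7.0, 9.0],
                   [4.0, 3.0, 4.0],
                   [10.0, 9.0, 9.0]])
n, k = scores.shape

grand = scores.mean()
ms_subjects = k * ((scores.mean(axis=1) - grand) ** 2).sum() / (n - 1)
ms_raters = n * ((scores.mean(axis=0) - grand) ** 2).sum() / (k - 1)
resid = scores - scores.mean(axis=1, keepdims=True) - scores.mean(axis=0) + grand
ms_error = (resid ** 2).sum() / ((n - 1) * (k - 1))

icc_2_1 = (ms_subjects - ms_error) / (
    ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n)
print(f"ICC(2,1) = {icc_2_1:.2f}")
```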
Standard Error of Measurement (SEM)
- a measure of discrepancy between scores, expressed in the same unit as the original measurement
- SEM quantifies the precision of individual scores on a test and thereby indicates whether a change in a score is a real change or due to measurement error
- the smaller the SEM, the more reliable the measurements
- SEM% can be used to compare tests that use different units of measurement
- SEM is affected by sample heterogeneity
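A common estimate is SEM = SD × √(1 − reliability); the values in the sketch below are hypothetical:

```python
import math

sd_scores = 4.5      # SD of the observed scores, in the measurement's own unit
reliability = 0.85   # eg an ICC from a reliability study

sem = sd_scores * math.sqrt(1 - reliability)
print(f"SEM = {sem:.2f}")  # ~1.74, same unit as the original measurement
```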
Smallest Real Difference
a way to evaluate clinically important changes
- a low SRD indicates adequate sensitivity to detect a real change
- SRD is crucial in determining whether the magnitude of the effect is clinically important.
- when the change is greater than SRD, the change is considered to be true.
- A change must be larger than the size of measurement error before being considered as important or meaningful
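At the 95% level the SRD is usually taken as 1.96 × √2 × SEM; a sketch using the hypothetical SEM from above:

```python
import math

sem = 1.74  # hypothetical SEM carried over from the sketch above
srd = 1.96 * math.sqrt(2) * sem
print(f"SRD = {srd:.2f}")  # a change must exceed this to be considered real
```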
Power calculations
can be done to find out how many participants would be needed to have a good chance of detecting a significant effect
information needed to work out the number of patients needed…
- significance level (usually chosen as 0.05)
- Power = probability of correctly accepting the alternative hypothesis when it is true
- minimal clinically relevant effect size for each type of outcome measure
- Standard Deviation of outcome measure
- drop out rate
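A sketch of the standard normal-approximation sample-size formula for comparing two means; every input value below is hypothetical:

```python
import math
from scipy.stats import norm

alpha, power = 0.05, 0.80
sigma = 10.0    # SD of the outcome measure
delta = 5.0     # minimal clinically relevant difference
dropout = 0.15  # expected drop-out rate

z_alpha = norm.ppf(1 - alpha / 2)
z_beta = norm.ppf(power)
n_per_group = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
n_recruited = math.ceil(n_per_group / (1 - dropout))
print(f"{math.ceil(n_per_group)} per group, {n_recruited} after allowing for drop-outs")
```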
what would you need to review when critically appraising a paper?
- were there specific inclusion and exclusion criteria
- were the experimental and control groups similar in key outcome measures and characteristics at baseline
- what was the dropout rate for each group
logistic regression is appropriate when the dependent (outcome) variable is….
categorical
smallest real difference (SRD) is crucial in determining whether…
the magnitude of effect is clinically important
what is important in understanding the external validity of a clinical trial?
- the patient characteristics, such as age, sex, as well as the type of disability, illness or injury
- the intervention (what it is, as well as how, where, and by whom it was delivered) can be performed in a usual practice setting
- the outcome measures reflect what is clinically relevant and are measured over a timeframe that is also clinically appropriate.
- the elimination or reduction of the possibility of bias within the trial design
assumptions related to logistic regression…
- linearity in the logit
- independence of errors
- reasonable ratio of cases to variables
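An illustrative fit for a binary outcome, assuming scikit-learn and hypothetical data; exponentiated coefficients can be read as odds ratios:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical predictors (age, sex) and a binary outcome (improved or not).
X = np.array([[55, 1], [62, 0], [47, 1], [70, 1], [51, 0], [66, 0]])
y = np.array([1, 0, 0, 1, 0, 1])

model = LogisticRegression().fit(X, y)
print(np.exp(model.coef_))  # odds ratio per unit change in each predictor
```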
Cohen’s Kappa
calculates agreement between observers over and above what might be expected by chance alone
- also called chance corrected agreement
- Kappa (Cohen's) is used if ratings are on a nominal scale
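A minimal sketch using scikit-learn's cohen_kappa_score on hypothetical nominal ratings; the same function's weights option gives weighted kappa for ordinal scales:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical nominal ratings from two observers.
rater_a = ["present", "absent", "present", "present", "absent"]
rater_b = ["present", "absent", "absent", "present", "absent"]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")  # 1 = perfect agreement, 0 = chance-level agreement
# For ordinal scales, weights="linear" or "quadratic" gives weighted kappa.
```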