Stat Flashcards

1
Q

What is Berkson bias?

A

Admission rate bias: results from a difference in the rates of admission of cases and controls due to the influence of the exposure. For example, in a case-control study of smoking and dementia, the association will tend to be weaker (or even absent) if controls are selected from a hospital population rather than from the community, because smoking causes many diseases that result in hospitalisation.

2
Q

Neyman bias

A

Incidence-prevalence bias. When ascertaining causation, one must look for an association between a risk factor and incidence, not prevalence. If a case-control study evaluates a risk factor that makes a person die quickly, then this risk factor will be under-represented if ‘prevalent cases’ are studied instead of ‘incident cases’.

3
Q

Response bias

A

Occurs when persons who respond to an invitation to participate in a study differ systematically from those who do not respond. The ‘healthy volunteers’ are often healthier than the general population. This is particularly relevant when evaluating screening tests.

4
Q

Lead time bias

A

Lead-time is defined as the difference in time between the date of diagnosis with screening and the date of diagnosis without screening. Unless lead time is accounted for, survival time should not be compared to an unscreened control group of patients. Otherwise, the increase in survival time due solely to the advanced date of diagnosis will result in lead-time bias.
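
A small worked example may make this concrete. The numbers below are made up for illustration: the patient dies at the same time whether or not screening takes place, yet survival measured from diagnosis looks longer in the screened scenario by exactly the lead time.

```python
# Hypothetical timeline for one patient, in years from disease onset.
screen_diagnosis = 2.0    # detected early by screening
clinical_diagnosis = 5.0  # would otherwise be detected when symptoms appear
death = 8.0               # same date of death in both scenarios

lead_time = clinical_diagnosis - screen_diagnosis
survival_screened = death - screen_diagnosis
survival_unscreened = death - clinical_diagnosis

# The apparent gain in survival equals the lead time; life was not prolonged.
apparent_gain = survival_screened - survival_unscreened
print(lead_time, survival_screened, survival_unscreened, apparent_gain)
```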

5
Q

Diagnostic purity bias

A

Diagnostic purity bias refers to the exclusion of comorbidities resulting in a non-representative sample, especially problematic in RCTs.

6
Q

Types of measurement bias (6-8)

A

a. Recall bias: Subjects often recall risk factors differently depending on their disease status. Case-control studies are particularly vulnerable to this type of bias.
b. Reporting bias: Results when a larger percentage of either case or control subjects are reluctant to report an exposure due to attitudes, perceptions or other concerns.
c. Observer bias: Can occur whenever a researcher, knowingly or unknowingly, evaluates a variable differently depending on the status of the individual under study. For example, when the research observer knows that a subject is on placebo, he may rate that subject higher on depression in a trial.
d. Surveillance bias: Disease may be better ascertained in a monitored population than in the general population.
e. Work-up bias (verification bias): During assessment of the validity of a diagnostic test, the execution of the gold standard test may be influenced by the results of the new instrument being assessed; e.g. the reference test may be performed less frequently when the test result is negative.
f. Misclassification bias: In extreme cases measurement bias may lead to misclassification. Cases may be misclassified as controls, or the ‘exposed group’ may be misclassified as ‘non-exposed’. Such misclassification produces a spurious association only if it is differential, i.e. one-sided. Errors in measurement instruments may lead to non-differential misclassification (both sides are affected equally), which often dilutes the observed association (pulls it towards the null) rather than exaggerating it.
g. Desirability bias: Patients may choose socially desirable answers during data collection, distorting the true picture. This leads to reporting bias.
h. Hawthorne effect: Respondents who know they are being observed change their behaviour, minimising perceived deviation from the norm. Occurs especially in cross-sectional surveys using questionnaires.

7
Q

Which methods are used for fixed effect analysis?

A

Mantel-Haenszel and Peto methods

MH: useful even when wide differences exist between individual studies in the ratio of the sizes of the two groups

Peto: used for RCTs

8
Q

What is a fixed effect analysis?

A

Inference is restricted to the included set of studies. It assumes that only random error within studies could explain observed differences.
It ignores between-study variation.

9
Q

Random effects analysis

A

Each study shows a different effect; these effects are normally distributed around the true mean.
This assumption gives proportionally greater weight to smaller studies.

10
Q

How to calculate heterogeneity

A

Cochran’s Q statistic.

11
Q

Heterogeneity can be judged graphically via

A

Forest Plot & L’Abbé plot

12
Q

What is Cochran’s Q?

A

Calculated as the weighted sum of squared differences between individual study effects and the pooled effect across studies.
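
As a rough sketch, Q can be computed as below, assuming inverse-variance weights (1 / variance of each study's effect) and a fixed-effect pooled estimate. The effect sizes and variances are invented for illustration. Under the null hypothesis of homogeneity, Q is usually compared against a chi-square distribution with (number of studies - 1) degrees of freedom.

```python
def cochran_q(effects, variances):
    """Return (pooled_effect, Q) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations of each study from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, q

# Made-up log odds ratios and their variances for three studies.
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.04, 0.04]
pooled, q = cochran_q(effects, variances)
print(round(pooled, 3), round(q, 3))
```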

13
Q

How to detect publication bias

A

Funnel plot

Fail-safe N

14
Q

What is a blobbogram?

A

Forest plot

presents the effect (point estimate) from each individual study as a blob or square (the measured effect), with a horizontal line (usually the 95% confidence interval, indicating the precision) across the blob.

15
Q

Retrospective study advantages

A

It is mostly useful to study outcomes which are rare

It is mostly useful in diseases where exposure is common

16
Q

Work up bias (verification bias)

A

systematic error in the assessment of the validity of a diagnostic test. When the execution of the gold standard is influenced by the results of the assessed test, especially when the reference test is less frequently performed when the assessed test result is negative, then this will influence the number of false negatives correctly identified in the exercise. This bias is specific to assessment of diagnostic tests and so not seen in ordinary case-control studies where cases and controls are determined before estimates of exposure begin.

17
Q

Relative risk is a

Attributable risk

A

A ratio; it can have values less than 1, but not less than 0.

Attributable risk is an absolute risk difference; it can be less than zero when the risk in the exposed is less than the risk in the non-exposed. Both attributable risk and relative risk are measures of differences between groups: the former is an absolute difference, the latter is a ratio. The odds ratio is a cross produc
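
To illustrate the ranges of the two measures, here is a minimal sketch using a hypothetical 2x2 table in which the exposure is protective. The function name and the counts are assumptions made for illustration only.

```python
def risk_measures(a, b, c, d):
    """2x2 table: a = exposed with disease, b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    relative_risk = risk_exposed / risk_unexposed       # a ratio, never below 0
    attributable_risk = risk_exposed - risk_unexposed   # a difference, can be negative
    return relative_risk, attributable_risk

# Risk is 10% in the exposed and 20% in the unexposed.
rr, ar = risk_measures(10, 90, 20, 80)
print(round(rr, 2), round(ar, 2))
```

Here the relative risk is below 1 but still positive, while the attributable risk is below zero.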

18
Q

minimisation schemes

A

The next allocation depends on the characteristics of those already allocated. Allocation of each participant aims to ensure a balance of prognostic factors between groups. The disadvantage is that this method is inferior to proper randomisation, as allocation is somewhat exposed and ‘controlled’ manually.

19
Q

Hill’s criteria for causality

A

consistency, specificity, temporality and biological gradient (dose-response relationship). Consistency refers to the association being repeatedly observed in studies performed by different persons, in different settings, among different populations and using different methods. If a specific exposure can be isolated from others and associated with a specific disease, then such specificity supports causality. This is perhaps the most difficult criterion to fulfill because in practice many exposures (e.g., cigarettes or radiation) are associated with multiple effects and specific diseases often have more than one cause. Temporality refers to time relationship between cause and effect: the factor believed to have caused the disease must have occurred prior to disease development.

20
Q

Which of the following can be used to demonstrate the validity of a qualitative study?

A

The degree of reflexivity in a qualitative study is used as a method of assessing the validity of the study. Other methods include triangulation, respondent checking and deviant case analysis.

21
Q

What is ethnography?

A

involves immersing oneself in a particular social group.

22
Q

Reflexivity

A

process of “benign introspection” which involves thinking about how the researcher’s own experiences may have influenced the data collection and interpretation in a qualitative study.

23
Q

Weighting

A

significance attached to each study based on sample size, precision, external validity (the extent to which results are generalisable) and methodological quality.

24
Q

In which of the following situations is a random effects analysis indicated in a meta-analysis?

A

A fixed effects model assumes that all the studies share the same common treatment effect, while a random effects model assumes that they do not. In fixed effect analysis the inference is restricted to the included set of studies. It assumes that only random error within studies could explain observed differences. It ignores between-study variation (i.e. heterogeneity), so it can be applied only if heterogeneity can be safely excluded by testing for it. Random effects analysis assumes that each study shows a different effect and that these effects are normally distributed around the true mean. This assumption gives proportionally greater weight to smaller studies. Hence this model is more susceptible to publication bias and results in wider, less precise confidence intervals.

The correct answer is: Presence of statistical heterogeneity
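
The shift in weight towards smaller studies can be sketched numerically. This assumes inverse-variance weighting, with the between-study variance (here called `tau2`, an assumed value rather than one estimated from data) added to each study's variance in the random effects case.

```python
def relative_weights(variances, tau2=0.0):
    """Share of total weight per study; tau2 = 0 gives fixed effect weights."""
    raw = [1.0 / (v + tau2) for v in variances]
    total = sum(raw)
    return [w / total for w in raw]

variances = [0.01, 0.25]  # one large (precise) study, one small study
fixed = relative_weights(variances)
random_fx = relative_weights(variances, tau2=0.10)  # tau2 is an assumed value

print([round(w, 3) for w in fixed])
print([round(w, 3) for w in random_fx])
```

The small study's share of the weight rises once between-study variance is included, which is why small-study effects such as publication bias matter more under this model.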

25
Q

Who coined the term meta-analysis?

A

Gene Glass

1976

26
Q

How is effect size measured in meta-analyses?

A

Difference in outcome between intervention and control groups in terms of standard deviation of the outcome in the population
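
One common way to express this is the standardised mean difference. The sketch below assumes Cohen's d with a pooled sample standard deviation as the estimate of the population SD; the scores are invented.

```python
import math
import statistics

def standardised_mean_difference(treated, control):
    """Cohen's d: difference in means divided by the pooled SD."""
    n1, n2 = len(treated), len(control)
    v1, v2 = statistics.variance(treated), statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd

treated = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = standardised_mean_difference(treated, control)
print(round(d, 3))
```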

27
Q

Cost minimisation

A

used when both interventions produce same outcome; hence just the costs are compared

28
Q

Cost benefits

A

When two different interventions with different clinical outcomes are compared wherein all outcomes are translated into monetary (£s) terms for comparison, then this study is called cost-benefit study

29
Q

Cost utility

A

A cost-utility study is used when, instead of a clinical effect (measured using scales, questionnaires, bed occupancy etc.) as an outcome, one holistically considers the disease burden, side effects, social impact etc. Here the outcome is measured in QALYs or DALYs.

30
Q

sensitivity analyses

A

Sensitivity analysis can be done in various ways, such as one-way, worst-case scenario, Monte Carlo and bootstrapping.

31
Q

Standard gamble

A

method of establishing the utility of a specified health state. For chronic health states, people are asked to choose between the certainty of the specified health state for a given period of time or a gamble that involves a probability (p) of restoration to full health and a complementary probability (1-p) of immediate death.

32
Q

Pareto chart

A

The bars are reorganised to show the categories with the most frequent events on the left and the least frequent on the right. Sometimes a line showing cumulative frequency is also superimposed. A Pareto chart helps us to visualise the high-yield events which, when focussed upon, can result in maximum improvement. (This charting follows the 80/20 economic principle: 80% of the income of a nation generally goes to only 20% of the population. By focusing on the 20% of high-impact circumstances, nearly 80% of problems can be managed.)
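
The data behind such a chart can be prepared as in the sketch below: categories sorted from most to least frequent, with a running cumulative percentage for the superimposed line. The incident categories and counts are hypothetical.

```python
def pareto_table(counts):
    """Return (category, count, cumulative %) rows, most frequent first."""
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(counts.values())
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows

incidents = {
    "lost notes": 7,
    "falls": 45,
    "other": 3,
    "medication errors": 30,
    "missed appointments": 15,
}
rows = pareto_table(incidents)
for row in rows:
    print(row)
```

In this made-up example the two most frequent categories already account for 75% of all incidents.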

33
Q

Which of the following is a consensus checklist recommended when reporting formal studies of healthcare quality improvement?

A

The Standards for Quality Improvement Reporting Excellence (SQUIRE statement) consists of a 19-item checklist recommended when reporting formal studies of healthcare quality improvement (Ogrinc et al., 2008).

34
Q

Lean Thinking

A

management philosophy focussed on 4 features (1) preserving value by identifying the value stream (2) reducing resource consumption by enabling process and value flow (3) reducing waste and developing pull systems (4) improving overall user satisfaction by pursuing perfection. Several retail and customer service industries have adopted this philosophy successfully in times of financial constraints.

35
Q

Variance

A

Variance = Sum of squared differences of individual observations from mean / (number of observations-1)

36
Q

Standard deviation

A

Square root of variance

37
Q

Coefficient of variation

A

coefficient of variation is obtained by dividing the standard deviation by the mean and expressing this as a percentage. It is a measure of the ‘relative’ spread of the data.

38
Q

standard error of the mean SE

A

The standard error of the mean (SE) is the standard deviation divided by the square root of the sample size. So a larger sample gives a smaller SE and vice versa.
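
The four measures of spread defined on the preceding cards (variance, standard deviation, coefficient of variation and standard error) can be computed together. This is a minimal sketch on an invented data set, using the sample (n - 1) form of the variance.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean = statistics.mean(data)
variance = statistics.variance(data)  # sum of squared deviations / (n - 1)
sd = math.sqrt(variance)              # square root of the variance
cv_percent = 100 * sd / mean          # SD as a percentage of the mean
se = sd / math.sqrt(n)                # SD / square root of sample size

print(mean, round(variance, 3), round(sd, 3), round(cv_percent, 1), round(se, 3))
```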

39
Q

Standard normal distribution

A

Standard Normal Distribution refers to a normal distribution whose mean is zero, and SD is 1 unit. Standard Normal deviate is an expression denoted by z.

40
Q

What is a positive skew?

Where does most of the data lie?
Where are the mean, median and mode?
Where is the tail of the distribution?

A

https://www.google.com.sg/search?q=positively+skewed+distribution&rlz=1C9BKJA_enMY707JP708&oq=positively+skewed+&aqs=chrome.1.69i57j0l3.10322j1j4&hl=en-GB&sourceid=chrome-mobile&ie=UTF-8#imgrc=hbuli7yJOt6nZM:

In a positively skewed distribution, most of the data will fall to the left of the mean, while the “tail” of the distribution will be on the right. Also, the mean is to the right of the median, and the mode is to the left of the median.
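
A quick check on a small, made-up right-skewed data set shows the ordering mode < median < mean.

```python
import statistics

# Invented lengths of stay (days): most values are low, a few are very high.
lengths_of_stay = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

mode = statistics.mode(lengths_of_stay)
median = statistics.median(lengths_of_stay)
mean = statistics.mean(lengths_of_stay)

print(mode, median, mean)
```

The long right tail (9 and 19) pulls the mean upwards, while the median and mode stay with the bulk of the data.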

41
Q

Which of the following measures of central tendency is the most sensitive to change to any of the individual values in a data set?

Central value

A

Mean

42
Q

What is variance?

A

Square of standard deviation

43
Q

What is the coefficient of variation used for?

A

The coefficient of variation enables comparison of variations of two (or more) different variables. The coefficient of variation is truly meaningful only for values measured on a ratio scale.

The coefficient of variation is defined as the sample standard deviation divided by the sample mean of the data set.

44
Q

What is the Central Limit Theorem?

A

The Central Limit Theorem states that for sufficiently large sample sizes, the sample means will be approximately normally distributed (the normal or Gaussian distribution) regardless of the shape of the original distribution. Put loosely, the larger the sample, the more normal the distribution of its means.
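
A small simulation can illustrate the theorem. The sketch below assumes an exponential population (strongly right-skewed, with mean 1 and SD 1) and a sample size of 50; both choices, and the fixed seed, are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential distribution with mean 1."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(50) for _ in range(2000)]
mean_of_means = statistics.mean(means)
sd_of_means = statistics.stdev(means)

# The means cluster around the population mean (1.0) with a spread close to
# SD / sqrt(n) = 1 / sqrt(50), even though the raw data are skewed.
print(round(mean_of_means, 2), round(sd_of_means, 2))
```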

45
Q

As the sample size increases, the confidence interval for the population mean will

A

Decrease in width (the interval becomes narrower)

46
Q

In a trial of a novel peripherally acting drug for symptoms of anticipatory anxiety in panic disorder/agoraphobia, the sample size required was calculated to be 300 patients for an RCT. Unfortunately, due to managerial changes in the trial administration board, the trial could include only 175 patients. Which error will be worsened?

A

Type 2 error. Imagine that you are proposing that a species of blue rabbit exists in England. Suppose the truth is that there are no blue rabbits. Type 1 error refers to the situation where you get hold of a dirty rabbit and claim that it is blue, i.e. a ‘false positive result’. The chance of type 1 error is not directly dependent on sample size: you can examine just three rabbits in a bush and claim one is blue. In contrast, type 2 error refers to not finding an effect that is present. Suppose the truth is that there are a few blue rabbits in Shropshire. If one does not examine enough rabbits, one may conclude that blue rabbits do not exist, i.e. a ‘false negative result’. So type 2 error depends on sample size. BMJ 2008;337:a2957

47
Q

What happens when the sample size is increased?

A

When the sample size is increased, precision will increase and the standard error will reduce. There will be no effect on bias (systematic error) and, mostly, no effect on standard deviation.
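
The effect on precision can be sketched numerically, assuming a fixed standard deviation of 10 and the usual 1.96 multiplier for an approximate 95% confidence interval of a mean. Quadrupling the sample size halves the standard error.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for a given SD and sample size."""
    return sd / math.sqrt(n)

sd = 10.0  # assumed SD; it does not shrink as n grows
for n in (25, 100, 400):
    se = standard_error(sd, n)
    half_width = 1.96 * se  # approximate 95% CI half-width
    print(n, se, round(half_width, 2))
```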

48
Q

correct pair of statistical measure and value of no difference with respect to confidence intervals:

A

With respect to confidence intervals of ratios, ‘one’ is the value of no difference. For confidence intervals of differences in means, it is ‘zero’. For reciprocal measures such as NNT, it is infinity.
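
The NNT case follows from its definition as the reciprocal of the absolute risk reduction: when the two event rates are equal, the risk difference is zero and its reciprocal is infinite. A minimal sketch with hypothetical event rates:

```python
import math

def nnt(control_event_rate, treated_event_rate):
    """Number needed to treat = 1 / absolute risk reduction."""
    arr = control_event_rate - treated_event_rate
    # No difference between groups: the NNT is infinite.
    return math.inf if arr == 0 else 1.0 / arr

print(nnt(0.20, 0.10))  # treatment halves the event rate
print(nnt(0.15, 0.15))  # no difference between groups
```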

49
Q

Multiple regression is used when

A

Used when there is one dependent (Y) variable and many independent (X) variables. The purpose of multiple regression is to find an equation that best predicts the Y variable as a linear function of the X variables. It is a multivariate test that requires parametric assumptions to be satisfied.

50
Q

Correlation regression

Name 3 facts.

A

Correlation is a prerequisite for regression. If A and B are not correlated, then we cannot predict A from B or vice versa. Regression is used to predict A from B. Correlation can be positive or negative to produce regression. The values range from -1 to +1. Pearson correlation is used for parametric data while Spearman’s is used for non-parametric data.

51
Q

What is the perfect kappa value while testing agreement between two observers for a categorical measure?

A

1
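
For reference, Cohen's kappa compares observed agreement with the agreement expected by chance. The sketch below uses made-up ratings; identical ratings give a kappa of 1.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Kappa = (observed - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

rater_1 = ["yes", "yes", "no", "no", "yes", "no"]
rater_2 = ["yes", "no", "no", "no", "yes", "yes"]

perfect = cohens_kappa(rater_1, rater_1)  # identical ratings
partial = cohens_kappa(rater_1, rater_2)  # agreement on 4 of 6 cases
print(perfect, round(partial, 3))
```

Note that the formula is undefined when chance agreement is exactly 1 (both raters always use one and the same category); the sketch does not handle that case.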

52
Q

Which one of the following is a measure of internal consistency of a test?

A

Split-half reliability is a measure of internal consistency reliability. The items on the scale are divided into two halves, and the resulting half scores are correlated in reliability analysis. High correlations between the halves indicate consistency in reliability analysis.

53
Q

Math: the degrees of freedom to use for chi-square statistics

A

For chi-square, df = (number of rows - 1) × (number of columns - 1)
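
A short sketch of the calculation for an r x c table of observed counts, using the standard Pearson chi-square statistic; the table values are invented.

```python
def chi_square(table):
    """Return (statistic, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

stat, df = chi_square([[20, 30], [30, 20]])
print(stat, df)  # a 2 x 2 table has (2 - 1) x (2 - 1) = 1 degree of freedom
```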

54
Q

What is a Cox proportional hazards test?

A

statistical technique for exploring the relationship between the survival (time to event function) and several explanatory variables (including the predictors and covariates). This can help to test and find out the variable that has the most important impact on time to the discharge of a patient.

55
Q

Which tests are used for paired data?

A

Wilcoxon signed-rank, sign and paired t tests. If the differences are Normally distributed, the t-test is the most powerful test. The Wilcoxon test is useful for non-Normal data from large samples. The sign test is similar in power to the Wilcoxon for very small samples, but as the sample size increases the Wilcoxon test becomes much more powerful. In the Wilcoxon test, the difference between each pair of data is determined and the absolute differences are then ranked. One assigns a positive or negative sign to each rank, depending on the sign of the original difference. The positive and negative ranks are summed separately and are used to determine the level of statistical significance using a special table.
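
The ranking procedure described above (the Wilcoxon signed-rank test) can be sketched as follows. Two conventions are assumed here: zero differences are dropped, and tied absolute differences share the average of their ranks. The paired scores are invented, and looking up significance in a table is left out.

```python
def signed_rank_sums(before, after):
    """Return (W_plus, W_minus) for paired data."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)  # rank by absolute difference
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for ties
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

before = [10, 12, 9, 14, 11]
after = [12, 11, 13, 14, 14]
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)  # differences 2, -1, 4, 3 give W+ = 9.0, W- = 1.0
```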

56
Q

What can we do to compare survival curves?

A

A statistical hypothesis test called the log-rank test. It is used to test the null hypothesis that there is no difference between the population survival curves.

57
Q

Mann-Whitney U test

A

nonparametric test that is used to determine whether two sets of data are significantly “different.” As noted in the question, the variable being measured is ordinal. The two sets of data are assumed to be independent and randomly drawn (males vs. females). The statistics of interest in this test are the medians of both sets of data, and the test determines whether or not the difference in the medians is statistically significant at a given level.

58
Q

Chi-square criteria

A

These criteria are: (1) the expected value in every cell must be ≥1 (i.e. non-zero values); (2) at least 80% of total cells must have an expected value of ≥5.
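
These conditions can be checked from the expected counts, which depend only on the row and column totals. The sketch below reads the thresholds as "at least 1" in every cell and "at least 5" in 80% or more of cells; the function name and tables are hypothetical.

```python
def expected_counts_ok(table):
    """Check the two expected-count criteria for a chi-square test."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [r * c / total for r in row_totals for c in col_totals]
    all_at_least_1 = all(e >= 1 for e in expected)
    share_at_least_5 = sum(e >= 5 for e in expected) / len(expected)
    return all_at_least_1 and share_at_least_5 >= 0.8

print(expected_counts_ok([[20, 30], [30, 20]]))  # every expected count is 25
print(expected_counts_ok([[1, 2], [2, 30]]))     # one expected count is below 1
```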

59
Q

Kruskal-Wallis

A

A non-parametric equivalent of ANOVA, used for 3 or more groups.

60
Q

What does Cronbach’s alpha measure?

A

Internal consistency (a type of reliability). It can take any value between -infinity and +1.
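
A minimal sketch of the usual formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The item scores are invented to show both ends of the range.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    item_variances = sum(statistics.variance(scores) for scores in items)
    totals = [sum(person) for person in zip(*items)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

consistent = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
opposed = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 2]]  # items pull in opposite directions

alpha_high = cronbach_alpha(consistent)
alpha_low = cronbach_alpha(opposed)
print(round(alpha_high, 3), round(alpha_low, 1))
```

Perfectly consistent items give an alpha of 1, while items that are negatively related can push alpha well below zero.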

61
Q

Three methods for performing regression

A

Stepwise regression
Forward selection
Backwards elimination

Three methods are described for performing regression:
1. Stepwise regression: Calculates coefficient of regression and starts with most significant to least significant independent variable and fits them in a stepwise fashion into regression equation. But a disadvantage is sometimes the statistically significant variables may not be clinically significant.
2. Forward selection: A confounding factor by definition is associated with both independent variable and dependent variable. As one may not know which is a confounding variable in some occasions (this is what one often examines with multiple regression), these are treated as covariates. While constructing a multiple regression equation, if the regression coefficient of a previously added variable changes then either one of the covariates is a confounder; so these are retained in the equation irrespective of statistical significance. The latter added covariate is discarded if no change occurs in regression coefficients.
3. Backward elimination: This starts with the final model - the full equation and tries to discard covariates one by one according to changes that occur in correlation coefficients.

62
Q

The steps in evidence-based practice according to Glasziou (2000) are
Name 4

A

1. Formulate an answerable question.
2. Track down the best evidence of outcomes available.
3. Critically appraise the evidence (to find out how good it is and what it means).
4. Apply the evidence (integrate the results with clinical expertise and patient values).

63
Q

Which of the following provides a consensus statement for reporting systematic reviews?

How about clinical trials?

A

QUOROM (systematic reviews)

CONSORT (clinical trials)

64
Q

Which one of the following is considered as a structured way to formulate a question when practising Evidence-Based Medicine ?

A

PICO

Specify the population, intervention, comparators and outcome, which together are known as the PICO format. This is a useful guideline for treatment studies but may not fit all questions.

PECO is a mnemonic used to describe the four elements of a good clinical foreground question in EBM. P: Patient; I: Intervention or E: Exposure; C: Comparison; O: Outcome.