Definitions from scratch Flashcards
Two types of variable
Metric variables
Categorical variables
Categorical variables can be
Nominal: relates to named things i.e. it is NOT numeric
It is categorical because we allocate each data point to a category e.g. male or female
Ordinal
Nominal data
= categorical nominal variable
Nominal: relates to named things
Category: each data point is placed in a category
Properties of nominal data:
- They do not have units of measurement
- The ordering of categories is arbitrary i.e. does not matter
Example:
Males: 45
Females: 72
Properties of nominal data
Categorical nominal variable
- They do not have units
- The ordering of categories is arbitrary
Example:
Males: 43
Females: 52
OR
Females: 52
Males: 43
Ordinal data
=categorical ordinal data
Ordinal data is categorical but it can be ordered in a meaningful way i.e. smallest to largest
e.g. Glasgow coma scale
If person A has a GCS of 5, and person B has a GCS of 10, we can conclude person A’s consciousness is lower BUT we CANNOT conclude by how much i.e. we CANNOT say it is half as much
The difference between adjacent scores is not constant
The seemingly numeric values are NOT numbers, but labels
Properties of ordinal data
Categorical ordinal data
- Does not have units
- CAN be ordered in a meaningful way
- Nearly always integers
- Assessed rather than measured
NOTE: they do not have a numeric value; they seemingly have numeric values but these are actually labels i.e. a GCS of 3 is saying that they fit into a category called GCS 3
What you shouldn’t do with ordinal data
YOU SHOULD NOT TREAT THEM AS NUMBERS
i.e. for ordinal data you should not add, divide, or average it
Ordinal data = number labels
Metric variables can be
Discrete: values occur in discrete intervals i.e. 1, 2, 3, 4, 5
- comes from counting i.e. number of operations
- difference between each count is constant (in comparison to ordinal data)
- 4 operations is twice as many as 2 operations
Continuous:
Properties of discrete data
Discrete metric data
- Has units
- Discrete variables come from counting, so they take integer values
Continuous data
Continuous metric data
- Values form a continuum
- Real numbers
- Has units
Frequency table
Used to illustrate descriptive statistics
Frequency distribution
Illustrates the number of events in each category
Relative frequency
= percentages
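A frequency table and its relative frequencies can be sketched in a few lines of Python (the sample values below are made up for illustration):

```python
from collections import Counter

# Hypothetical sample of categorical (nominal) observations
observations = ["male", "female", "female", "male", "female"]

counts = Counter(observations)            # frequency distribution
total = sum(counts.values())
relative = {k: round(100 * v / total, 1)  # relative frequency as a percentage
            for k, v in counts.items()}

print(counts)    # frequencies per category
print(relative)  # {'male': 40.0, 'female': 60.0}
```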
Contingency table
Cross tabulations
Illustrate association between two variables in a single population
Rows hold the categories of one variable; columns hold the categories of the other
Ranking data
Allows assessment of non-parametric data
Order data into size
Starting with the largest value, rank it with a value of 1
Next value rank as 2
Equal values are tied and given the average of the ranks they would occupy, e.g. 7 8 5 5 5 3 1
8: 1
7: 2
5: =4
5: =4
5: =4 (3, 4, 5: average = 4)
3: 6
1: 7
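The tied-ranking rule above can be sketched as a small Python function (a hand-rolled illustration, not a library routine):

```python
def rank_descending(values):
    """Rank values from largest (rank 1) downward, giving tied
    values the average of the ranks they would have occupied."""
    ordered = sorted(values, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # positions i..j-1 are tied; average their 1-based ranks
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [ranks[v] for v in values]

print(rank_descending([7, 8, 5, 5, 5, 3, 1]))
# -> [2.0, 1.0, 4.0, 4.0, 4.0, 6.0, 7.0]
```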
Ogive
Pronounced ojive
Cumulative frequency curve with continuous metric data
Curved (no step) chart
Measures of shape (skew)
Skewness:
-skewness coefficient defined from -1 to +1
Kurtosis:
Left skew
= negative skew
Lots of large values
Negative –> peak is further away from y-axis
Right skew
=positive skew
Lots of small values
“Right skew, close to you”
Distributions
Symmetric: classic one humped distribution
Bimodal: two peaks
Multimodal: multiple peaks
Kurtosis
Measure of distribution
Distributions with large kurtosis exhibit tail data exceeding the tails of the normal distribution (e.g., five or more standard deviations from the mean).
Skewness differentiates extreme values in one versus the other tail, kurtosis measures extreme values in either tail.
If you hold the area the same, decreasing the kurtosis makes the peak flatter and broader (platykurtic); increasing it gives heavier tails and a sharper peak (leptokurtic)
Kurtosis value of normal distribution
=3
(excess kurtosis value = 0 i.e. the excess version subtracts 3 from the calculation)
(uniform distribution = 1.8)
Mode
Useful in categorical data
Useless in continuous data, where no two values are likely to be the same
Median
CAN BE USED FOR ORDINAL AND CONTINUOUS DATA
Discards a lot of information
Not as affected by skew vs mean
Not as affected by outliers vs mean
Therefore median is a stable measure
Mean
Uses all the data - each value is included
Therefore subjected to effect from outliers and skew
Cannot be performed on ordinal data
Percentiles
Values that divide a data set into 100 equal-sized groups
To find percentile, multiply percentage in decimals by (n+1)
Where n is equal to number of data points
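A minimal sketch of the (n + 1) position method, with linear interpolation between adjacent values when the position is fractional (the data below are hypothetical):

```python
def percentile(data, pct):
    """Percentile by the (n + 1) position method:
    position = pct/100 * (n + 1), interpolating between neighbours."""
    s = sorted(data)
    n = len(s)
    pos = pct / 100 * (n + 1)
    lo = int(pos)
    frac = pos - lo
    if lo < 1:
        return s[0]
    if lo >= n:
        return s[-1]
    return s[lo - 1] + frac * (s[lo] - s[lo - 1])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(percentile(data, 50))  # position 0.5 * (9 + 1) = 5 -> 5th value = 5
```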
Properties of using range
Lowest to highest value
Not affected by skew
Sensitive to outliers which may misguide range
Interquartile range
Removes 25% from each end
Reduces effect of outliers
Affected by skewed distributions
Limitations of interquartile range
Discards 50% of the data!
Definition of standard deviation
Average distance of the data values from their collective mean
Uses each data point, i.e. uses all the data (unlike IQR)
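In Python's standard library, statistics.pstdev gives the population SD and statistics.stdev the sample SD (n - 1 denominator); strictly, the SD is the root of the mean squared distance from the mean rather than the plain average distance. The data are a made-up example:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

mean = statistics.mean(data)             # 5.0
sd_population = statistics.pstdev(data)  # population SD: 2.0
sd_sample = statistics.stdev(data)       # sample SD (n - 1 denominator)

print(mean, sd_population)
```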
Measuring spread with ordinal data
Range and IQR
Not standard deviation, which can only be used on metric (continuous) data
Why not use median and SD
If you have continuous data, SD is the measure of choice.
However, if you use a median value, that suggests your data are skewed. Therefore, you shouldn’t use SD.
Mean +/- 1 SD
68% of values in range
*for normally distributed data
Mean +/- 2 SD
95% of values in range
*for normally distributed data
Mean +/- 3 SD
99.7% of values in range
*for normally distributed data
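The 68/95/99.7 coverage can be illustrated by simulation (a sketch; the mean, SD, sample size and seed are arbitrary choices):

```python
import random

# Simulate draws from a normal distribution to illustrate the
# 68 / 95 / 99.7 rule
random.seed(42)
mu, sigma, n = 0.0, 1.0, 100_000
sample = [random.gauss(mu, sigma) for _ in range(n)]

def within(k):
    """Proportion of the sample within k standard deviations of the mean."""
    return sum(abs(x - mu) <= k * sigma for x in sample) / n

print(within(1), within(2), within(3))  # roughly 0.68, 0.95, 0.997
```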
Testing for normal distribution
Shapiro-Wilk test
- if fewer than 2000 values
- provides p-value with null hypothesis set for normal distribution
Kolmogorov-Smirnov test
- if more than 2000 values
- provides p-value with null hypothesis set for normal distribution
Transforming data
Make it normally distributed
log (to the base 10) = most common
square-root
1 over value
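A sketch of the three transforms applied to a made-up right-skewed sample, with a simple moment-based skewness coefficient to show the skew shrinking:

```python
import math

# A small right-skewed sample (hypothetical values)
skewed = [1, 2, 2, 3, 4, 5, 8, 15, 40]

log10_t = [math.log10(x) for x in skewed]  # log transform (base 10)
sqrt_t = [math.sqrt(x) for x in skewed]    # square-root transform
recip_t = [1 / x for x in skewed]          # reciprocal ("1 over value")

def skewness(xs):
    """Simple moment-based (population) skewness coefficient."""
    n = len(xs)
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sum((x - m) ** 3 for x in xs) / (n * sd ** 3)

print(round(skewness(skewed), 2), round(skewness(log10_t), 2))
```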
Incidence rate
Is actually the crude incidence rate
Number of new cases of a disease or event in a defined population over a given time period
= number of new cases / number at risk
(same time period)
Incidence rate ratio
ratio of two incidence rates
Prevalence
Number of cases in a given population at a given point in time
Crude mortality rate
= number of deaths over a period of time (usually 1 year) divided by population at mid-point of that time duration
MULTIPLY by 1000
gives crude mortality per 1000 per year
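The calculation above as a one-line function (figures are hypothetical):

```python
def crude_mortality_per_1000(deaths, mid_period_population):
    """Crude mortality rate per 1000 per period:
    deaths / mid-period population, multiplied by 1000."""
    return deaths / mid_period_population * 1000

# Hypothetical figures: 560 deaths in one year, mid-year population 80,000
print(crude_mortality_per_1000(560, 80_000))  # about 7.0 per 1000 per year
```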
Case fatality rate
Number of deaths from a disease in a given time period divided by total number with the disease over that time period
Standardised mortality rate
Observed deaths in the study population divided by the deaths expected if it had the mortality rates of a standard reference population
Properties of a confounding variable:
A confounding variable:
- is associated (causally or not) with the exposure
- causally related to the outcome
- must not be part of the exposure-outcome pathway
Positive confounding
Leads to effect of exposure being inflated
Negative confounding
Leads to effect of exposure being reduced
Confounding by indication
Occurs when the clinical indication for selecting a particular treatment (eg, severity of the illness) also affects the outcome.
Indication for exposure leads to disease outcome
Not exposure itself
Residual confounding
Residual confounding is the distortion that remains after controlling for confounding in the design and/or analysis of a study. There are three causes of residual confounding:
There were additional confounding factors that were not considered, or there was no attempt to adjust for them, because data on these factors was not collected.
Control of confounding was not tight enough. For example, a study of the association between physical activity and age might control for confounding by age by a) restricting the study population to subjects between the ages of 30-80 or b) matching subjects by age within 20-year categories. In either event there might be persistent differences in age among the groups being compared. Residual confounding might also occur in a randomised clinical trial if the sample size was small. In a stratified analysis or in a regression analysis there could be residual confounding because data on the confounding variable was not precise enough, e.g. age was simply classified as “young” or “old”.
There were many errors in the classification of subjects with respect to confounding variables.
Controlling for confounding at design stage
- Restriction
- exclude all those with the confounding exposure
- limits generalisability of evidence e.g. if you exclude all smokers then results are unlikely to generalise to the whole population
- Matching
- choice of method in case-control studies
e.g. frequency matching (same proportions)
e.g. propensity score matching
- Randomisation
- choice of method in RCTs
- controls for known and unknown confounding
Controlling for confounding at analysis
- Stratification
- divides into strata, with and without exposure
- essentially restriction but after the event
- Adjustment
- regression
Reverse causality
The exposure-disease process is reversed; in other words, the disease causes the apparent exposure.
Example: lower employment status appears to cause depression.
It may well be that depression causes lower employment status.
Descriptive cross-sectional studies
Do not infer any causality; usually measure one variable (e.g. prevalence) but can measure multiple
- Generally not subject to confounding if only measuring prevalence
- If measuring multiple things, will need to adjust for potential confounding
Analytical cross-sectional studies
Attempts to assess potential links between two or more variables at a given time point
- Does not infer causality
- Need to be adjusted for confounding variables
Cross-sectional studies
- Take one set of measurements from each participant at a SINGLE point in time
- Used to investigate associations between variables but NOT causality or direction
- Not useful if condition is rare
- If used to assess opinions or attitudes, referred to as surveys
Cohort studies
Pros
- Main purpose is to identify if exposures or risk factors cause a certain disease
- Several outcomes can be studied for single exposure
- Temporal relationship can be established
- ->therefore adds to causality
- Suited for rare exposures
- Less subject to bias and confounding than case-control
Cons
- Sampling bias
- Not suited for rare diseases
- Long follow-up: leads to attrition and bias
- Recall bias in retrospective studies
- Data quality in retrospective studies
Problems with case-control studies
- Recall bias
- Suitable cases can be difficult to find
- Difficult to match patients for each variable
- Sampling bias of cases +/- controls
- Definition of a case e.g. GOLD 1 COPD is unlikely to clarify much
Ecological studies
Make large-scale comparisons between two groups of people
Statistical inference
Data from the sample will inform conclusions about the target population
Using sample statistics, we are inferring about the population
Sample statistics
Variable measured in a sample
=sample statistics
This is used to inform inferences regarding population parameters
Sample error
Deviation from true value of a parameter in sampled population
-usually unknown
Rules of probability
Chance of an event occurring lies between 0 - 1
1 is absolute certainty of an event e.g. everyone will die some day
0 is an impossible outcome e.g. rolling an 8 on a dice labelled 1- 6
If an event is equally likely to happen as to not happen, the probability would be 0.5
If p is the probability of an event happening, the probability of the vents not happening is 1 - p
Proportional frequency
Used to calculate probability in clinical settings when outcomes do not all have an equal chance of occurring
i.e. any clinical setting
Proportional frequency states that the probability of an event occurring is equal to the proportion of times that outcome would have occurred if we repeated the experiment a large number of times
Techniques for randomisation
Simple randomisation
Block randomisation
-ensures at any given point there are roughly equal numbers in each group
Stratification
-ensures balanced strata of variables across each group
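Block randomisation can be sketched as shuffling balanced blocks (a toy illustration; real trials use dedicated software and often variable block sizes):

```python
import random

def block_randomise(n_participants, block_size=4, arms=("A", "B")):
    """Block randomisation sketch: shuffle balanced blocks so the
    arms stay roughly equal at any point during recruitment."""
    assert block_size % len(arms) == 0
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * (block_size // len(arms))
        random.shuffle(block)  # random order within each balanced block
        allocation.extend(block)
    return allocation[:n_participants]

random.seed(1)
alloc = block_randomise(20)
print(alloc.count("A"), alloc.count("B"))  # -> 10 10
```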
Reducing placebo or response bias
Blinding of participant
Problems with cross-over trials
- Participants may have undergone change between treatment 1 and treatment 2
- Does not work for treatments that require a long time to take effect
- Does not work in self-resolving or acute illness that responds to therapy immediately
- “Carry over” effect despite washout periods
Hawthorne effect
Change in behaviour after knowledge of being observed
Some trials do not recruit separate controls if data collection would not differ between groups, since the Hawthorne effect itself changes outcomes
Intention to treat
The process of analysing the data as if participants are still in the original group allocation despite loss or changeover of participants
- Maintains baseline characteristics
- Prevents attrition bias
- Reflects real-world practice
- Keeps sample size and power the same
Cons
- requires imputation
- can sometimes underestimate effect size
Per protocol analysis
Analysis performed as per treatments received by participant
- protocol deviation has taken place
- balanced baseline characteristics lost
- attrition bias now in action
- subject to confounding
- loss of power
- likely to overestimate effect size e.g. those most unwell are least likely to tolerate the side effects of new drug, hence only moderate disease is analysed in treatment group vs full-spectrum of severity in control group
Cluster sampling
Overcomes need for sampling frame
Common sampling technique in randomised-controlled trials
Units represent GP surgeries, hospitals, schools, clinics etc.
Sampling units are a likely place to find spectrum of participants
However, clusters are not a sampling frame - they do not include everyone eligible, hence sampling bias and then selection bias will be introduced
Example: 75 GP surgeries identified as eligible sampling units
Randomly select 25 as your cluster sample
-People who aren’t registered with a GP have no chance of being included, hence not equal sampling probability
Probability density function
Used when calculating probabilities for continuous variables
The pdf gives the probability that a continuous random variable will lie between two values
THIS IS BECAUSE: continuous variables have an infinite number of possible outcomes, hence the probability of any single exact outcome = 0
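Written as a formula, with f the pdf of a continuous random variable X:

```latex
P(a \le X \le b) = \int_{a}^{b} f(x)\,dx,
\qquad
P(X = a) = \int_{a}^{a} f(x)\,dx = 0
```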
Absolute risk
Probability of an outcome occurring in a population with exposure
Relative risk
Risk exposed divided by risk unexposed
Same as risk ratio (decimal)
Risk ratios
Risk exposed / risk unexposed
Decimal
Can over-inflate the apparent risk (a large ratio may reflect a tiny absolute risk)
Relative risk reduction
= absolute risk reduction / risk in unexposed
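The risk measures above, computed from hypothetical trial counts (treated arm taken as "exposed", control as "unexposed"):

```python
# Hypothetical trial counts: events / totals in each arm
treated_events, treated_total = 20, 200  # treated ("exposed") arm
control_events, control_total = 30, 200  # control ("unexposed") arm

risk_exposed = treated_events / treated_total    # absolute risk, exposed: 0.10
risk_unexposed = control_events / control_total  # absolute risk, unexposed: 0.15

relative_risk = risk_exposed / risk_unexposed             # risk ratio
abs_risk_reduction = risk_unexposed - risk_exposed        # ARR
rel_risk_reduction = abs_risk_reduction / risk_unexposed  # RRR

print(relative_risk, abs_risk_reduction, round(rel_risk_reduction, 3))
```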