Statistics Flashcards

Question 1

Q

what is the internal validity of a study?

Answer

A

the extent to which a study establishes a trustworthy cause and effect

Question 2

Q

what is the external validity of a study?

Answer

A

the extent to which the results of the study can be applied to real life

Question 3

Q

what are three things that can affect the validity of a RCT?

Answer

A

bias - different types of bias
confounding factors
chance

Question 4

Q

what is selection bias?

Answer

A

bias when assigning individuals to groups which may lead to differences that can affect the outcome. There are three types.

Question 5

Q

what are the three types of selection bias?

Answer

A

sampling bias - subjects are not representative of the population

volunteer bias - people who volunteer may differ systematically from those who do not (e.g. people with the condition may be less willing to volunteer)

non-responder bias - some populations may be less likely to respond to the study so are less represented

Question 6

Q

what is prevalence/incidence bias?

Answer

A

when a study investigates a condition characterised by early fatalities, cases that die early may be missed from the calculations

Question 7

Q

what is recall bias?

Answer

A

difference in accuracy of the recollections retrieved by study participants, possibly due to whether they have a disorder or not

Question 8

Q

what studies does recall bias typically affect?

Answer

A

case-control studies

Question 9

Q

what is publication bias?

Answer

A

failure to publish results from valid studies, often because they showed negative or uninteresting results

Question 10

Q

what is work up bias (verification bias)?

Answer

A

in studies that compare a new diagnostic test to the gold standard, work up bias can arise because the clinician may be reluctant to order the gold standard test unless the new test is positive, owing to the invasiveness or price of the gold standard test

Question 11

Q

what is expectation bias?

Answer

A

observers may subconsciously measure or report data in a way that favours the expected study outcome - only affects non-blinded trials

Question 12

Q

what is the hawthorne effect?

Answer

A

describes a group changing its behaviour due to knowledge that it is being studied

Question 13

Q

what is late look bias?

Answer

A

gathering information at an inappropriate time e.g. studying a fatal disease many years later when many patients may have died already

Question 14

Q

what is procedure bias?

Answer

A

occurs when subjects in different groups receive different treatments

Question 15

Q

what is lead time bias?

Answer

A

occurs when two tests for a disease are compared: the new test diagnoses the disease earlier but there is no actual difference in the outcome of the disease

Question 16

Q

what are the two different ways of sampling patients for a study?

Answer

A

probability sampling - everyone in the population has a known (usually equal) probability of being chosen

non-probability sampling - not everyone has an equal probability of being chosen

Question 17

Q

what are 4 methods for probability sampling?

Answer

A

simple random sampling
systematic sampling
stratified sampling
clustered sampling

Question 18

Q

what is an example of simple random sampling?

Answer

A

using a random number generator to assign each patient a random number, then randomly assigning these numbers to groups

Question 19

Q

what is an example of systematic sampling?

Answer

A

every 5th patient assigned

Question 20

Q

what is an example of stratified sampling?

Answer

A

split the group into male and female and select equal participants

Question 21

Q

what is clustered sampling?

Answer

A

select subgroups within the population - useful in primary care

e.g. divide all GP practices in the city in clusters, then randomly select a few GP practices (clusters), then include all the patients from the selected GP practices in the study

Question 22

Q

how do clustered sampling and stratified sampling differ?

Answer

A

clustered sampling - allocating participants based on clusters (natural groups e.g. GP practices, school, hospitals) - this is logistically easier

stratified sampling - allocating participants based on strata defined by characteristics (e.g. age, gender, ethnicity) - this is when you want a proportional representation of different subgroups in the sample
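The four probability sampling methods above can be sketched in a few lines. This is only an illustration using Python's standard library; the function names, the fixed seeds and the toy data structures are assumptions made for the example, not part of the cards.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Simple random sampling: every member has the same chance of selection."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, step, start=0):
    """Systematic sampling: take every `step`-th member (e.g. every 5th patient)."""
    return population[start::step]

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Stratified sampling: split by a characteristic, then sample within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=0):
    """Cluster sampling: randomly pick whole clusters and include everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

# Hypothetical example: patients numbered 1 to 20, every 5th one selected.
patients = list(range(1, 21))
every_fifth = systematic_sample(patients, step=5, start=4)
```

Note the difference in what gets randomised: stratified sampling draws individuals from every stratum, whereas cluster sampling draws whole groups and keeps all of their members.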

Question 23

Q

what are 4 examples of non-probability sampling?

Answer

A

convenience sampling
quota sampling
judgement/purposive sampling
snowball sampling

Question 24

Q

what is convenience sampling?

Answer

A

first come first serve - i.e. ask people to sign up for a study and just take the participants that come forwards

Question 25

Q

what is quota sampling?

Answer

A

the population is divided into subgroups by age, gender or ethnicity and a quota is set for filling each subgroup. The participants are then selected non-randomly until the quota is filled (i.e. by convenience, first come first served)

Question 26

Q

what is purposive sampling?

Answer

A

participants are chosen based on specific criteria i.e. you would specifically contact patients with a disease, because you want to study it - rather than randomly sampling the general population

Question 27

Q

what is snowball sampling?

Answer

A

participants recruit other participants - good for hard to reach or isolated groups

Question 28

Q

what is a cohort study?

Answer

A

take a cohort of people and study them over time - this is observational and prospective. Two or more groups are selected based on their exposure to a particular agent (e.g. toxin, smoking) and are studied over time to see how many develop a disease or other outcome.

Question 29

Q

what measure is usually used to measure the outcome of a cohort study?

Answer

A

relative risk - as you compare the two groups

Question 30

Q

what is a case-control study?

Answer

A

participants with a particular condition are matched with controls. This is observational and retrospective. Data is then collected on the past to identify a possible causal agent for the condition.

Question 31

Q

what outcome measure is usually measured in a case-control study?

Answer

A

odds ratio

Question 32

Q

what are the positives of a case-control study?

Answer

A

inexpensive
produces quick results
useful for rare conditions

Question 33

Q

what are the negatives of a case-control study?

Answer

A

usually prone to confounding factors

Question 34

Q

what is a cross sectional study?

Answer

A

simply provides a snapshot in time - sometimes called prevalence studies

provides weak evidence of cause and effect

Question 35

Q

what is a crossover trial?

Answer

A

participants experience both the experimental arm and the placebo - useful if it is unethical to deprive patients of a particular treatment (e.g. cancer treatments)

Question 36

Q

what is a quasi experimental study?

Answer

A

participants are chosen who have ALREADY been exposed to the exposure under study, where it would be unethical to expose them deliberately, e.g. children playing violent video games

Question 37

Q

what are the pros of a cohort study?

Answer

A

helpful when the exposure is unethical - as participants already have the exposure
can measure multiple outcomes
cheap
can analyse risk

Question 38

Q

what are the cons of a cohort study?

Answer

A

participants can be lost to follow up
can be affected by recall bias if retrospective
confounding variable

Question 39

Q

what is the incidence?

Answer

A

rate at which new cases occur in a population over time i.e. 10 new cases in 1000 per year

Question 40

Q

what is the prevalence?

Answer

A

total no of cases of a disease that currently exist at any given time i.e. currently 50,000 people with asthma
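A small worked sketch of the two definitions, using the figures from the cards. The function names and the population of 1,000,000 used for the asthma example are assumptions for illustration only.

```python
def incidence_rate(new_cases, population_at_risk, years=1.0):
    """New cases per person per year in the population at risk."""
    return new_cases / (population_at_risk * years)

def point_prevalence(existing_cases, population):
    """Proportion of the population that has the disease at a given time."""
    return existing_cases / population

# 10 new cases in 1000 people over one year
incidence = incidence_rate(10, 1000)            # 0.01, i.e. 1% per year

# 50,000 people with asthma in a hypothetical population of 1,000,000
prevalence = point_prevalence(50_000, 1_000_000)  # 0.05, i.e. 5%
```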

Question 41

Q

what study is used to measure prevalence?

Answer

A

cross sectional study

Question 42

Q

what study is used to measure incidence?

Answer

A

cohort study

Question 43

Q

which is the best type of study that is considered gold standard?

Answer

A

meta-analysis / systematic review

Question 44

Q

what is grounded theory?

Answer

A

in qualitative research - it is a method used to generate a new theory about a phenomenon of interest from the collection of new data. The new theory needs to be grounded or rooted in the observations made - hence the name.

It is a complex process, which begins by raising questions that help guide research but are not static or confining and then over time core theoretical concepts are identified.

Question 45

Q

what is ethnography?

Answer

A

the aim is to study an ENTIRE culture, through the researcher becoming immersed in the culture as an active participant and recording field notes.

Question 46

Q

what is phenomenology?

Answer

A

the goal of phenomenology is to describe the real “lived experience” of a phenomenon

Question 47

Q

what are the 4 types of sampling for qualitative data?

Answer

A

1 - convenience
2 - purposive
3 - snowballing
4 - case study - select a single individual

Question 48

Q

what are 4 ways of assessing the validity of a qualitative study?

Answer

A

1 - triangulation
2 - respondent validation (aka member checking)
3 - bracketing
4 - reflexivity

Question 49

Q

what is triangulation?

Answer

A

comparing the results of two or more different methods of data collection (for example - interviews and observation)

Question 50

Q

what is respondent validation?

Answer

A

techniques where the investigator's account is compared to the participants' accounts in order to check the level at which they correspond

Question 51

Q

what is bracketing?

Answer

A

deliberately putting aside one's own beliefs about the phenomenon under investigation

Question 52

Q

what is reflexivity?

Answer

A

sensitivity to the ways in which the researcher and research process have shaped the collected data

Question 53

Q

what are consensus methods in qualitative research?

Answer

A

the way in which the researchers aim to gain a general agreement around a topic

Question 54

Q

what are two methods of consensus in qualitative research?

Answer

A

delphi method
nominal group technique

Question 55

Q

what is the delphi method?

Answer

A

aims to gather opinions from experts in a particular area. Occurs in 3 stages:
stage 1 - open ended questionnaires sent to participants to generate statements about the topic
stage 2 - participants then asked to rank all of the statements produced in stage 1
stage 3 - statements are further refined and re-ranked to achieve consensus

if consensus not achieved in stage 3 then that stage can be repeated

Question 56

Q

what is nominal group method of consensus?

Answer

A

group of highly structured meetings with a controlled discussion
members independently record ideas and opinions, which are then re-presented to the group and used to clarify and categorise ideas
group members are then asked at the end to rank the ideas to achieve consensus

Question 57

Q

what are the two types of qualitative data that can be collected?

Answer

A

nominal data - data is placed into named categories - there is no hierarchy given to these categories, you can count but not order them (e.g. birthplace)

ordinal data - observed values can be put into categories which can be ordered (e.g. NYHA classification of heart failure symptoms)

Question 58

Q

what are the 4 types of quantitative data?

Answer

A

discrete - values are finite whole numbers i.e. number of asthma exacerbations per year

continuous - data can take any value i.e. weight

binomial - data can have two values (i.e. biological sex)

interval - the difference between two values is meaningful but there is no true zero, e.g. temperature in degrees Celsius (0 does not mean "no temperature", so ratios are not meaningful)

Question 59

Q

what is the null hypothesis?

Answer

A

prediction of no relationship between the two variables being tested

Question 60

Q

what is the alternate hypothesis?

Answer

A

predicts a relationship does exist between the two variables being tested

Question 61

Q

what is a type 1 error?

Answer

A

the null hypothesis is rejected when it is true (i.e. showing that there is a difference between groups, when actually there is not - false positive)

this is determined against a preset significance level of alpha

Question 62

Q

what is a type 2 error?

Answer

A

the null hypothesis is accepted when it is false (i.e. saying there is no difference between groups when actually there is - false negative)

this is termed a beta error

Question 63

Q

what is the power of a study?

Answer

A

the power is the probability of correctly rejecting the null hypothesis when it is false

Question 64

Q

how is the power of a study calculated?

Answer

A

1 - the probability of a type II error (i.e. beta) - so can also be calculated as

1 - beta
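To make "1 - beta" concrete, here is a minimal sketch that computes beta and power for a two-sided one-sample z-test. The choice of a z-test with a known standard deviation, the function name and the example numbers are all assumptions for illustration; real power calculations depend on the test actually used.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample z-test with known SD."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value, about 1.96 for alpha = 0.05
    shift = effect * sqrt(n) / sd       # true effect measured in standard errors
    # beta = probability the test statistic lands inside the non-rejection region
    beta = z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)
    return 1 - beta

# Hypothetical example: true effect of 5 units, SD of 10 units
power_small_sample = z_test_power(effect=5, sd=10, n=10)
power_large_sample = z_test_power(effect=5, sd=10, n=50)
```

With the effect size and alpha held fixed, the larger sample gives the higher power, and with no true effect the "power" collapses to alpha (the type 1 error rate).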

Answer 65

A

by increasing the sample size

Answer 66

A

measure of whether the independent variable (cause) has an impact on the dependent variable (effect)

Answer 67

A

situation where two phenomena occur together - these could either be related or by chance

Answer 68

A

spurious - relationship between the variables occurs purely due to chance

indirect - relationship between the two variables is due to a confounding factor

direct - there is a true association between the two variables

Answer 69

A

bradford hill criteria

Answer 70

A

research method used to measure the relationship between two variables - measured as a correlation coefficient “r”, where r = 0 is no correlation and r = 1 (or -1) is perfect positive (or negative) correlation

Answer 71

A

data that follows a normal distribution

Answer 72

A

data that does not follow a normal distribution

Answer 73

A

Student's t-test
Pearson's coefficient

Answer 74

A

is the consistency of the data - can it be replicated consistently to produce similar results

Answer 75

A

whether a test accurately measures what it is supposed to measure

Answer 76

A

sum of all values / total number of values

Answer 77

A

sort all the values into order and select the middle value

Answer 78

A

most common value appearing in the data set
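The three averages described in the cards above (sum of values / number of values, the middle value, and the most common value) can be checked with Python's standard `statistics` module. The helper name and the sample data are made up for illustration.

```python
from statistics import mean, median, mode

def summarise(values):
    """Return the three averages described in the cards."""
    return {
        "mean": mean(values),      # sum of all values / total number of values
        "median": median(values),  # middle value once sorted (average of the two middle values if even)
        "mode": mode(values),      # most common value
    }

# Hypothetical data set
summary = summarise([2, 3, 3, 5, 7, 10])
# mean = 30 / 6 = 5, median = (3 + 5) / 2 = 4.0, mode = 3
```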

Answer 79

A

positive skew

Answer 80

A

negative skew

Answer 81

A

normally distributed

Answer 82

A

68.3% lies within 1 SD of the mean
95.4% lies within 2 SD of the mean
99.7% lies within 3 SD of the mean
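These three percentages follow directly from the normal distribution and can be verified with the standard library. The function name is an assumption for this sketch.

```python
from statistics import NormalDist

def proportion_within(k_sd):
    """Proportion of a normal distribution lying within k_sd standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k_sd) - z.cdf(-k_sd)

within_1 = round(proportion_within(1) * 100, 1)  # 68.3
within_2 = round(proportion_within(2) * 100, 1)  # 95.4
within_3 = round(proportion_within(3) * 100, 1)  # 99.7
```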

Answer 83

A

cohort study

Answer 84

A

no of events/total no in the group

Answer 85

A

no of events in experimental group /
total no in the experimental group

Answer 86

A

no of events in the control group / total no of participants in the control group

Answer 87

A

EER / CER

Answer 88

A

CER - EER

Answer 89

A

EER - CER

Answer 90

A

( CER - EER ) / CER

OR 1- RR
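A worked sketch of the formulas on the preceding cards. The labels used here (relative risk, absolute risk reduction, relative risk reduction) are the standard names for EER / CER, CER - EER and (CER - EER) / CER; they are an assumption in the sense that the cards themselves do not show them. The trial numbers are invented.

```python
def risk_measures(events_exp, n_exp, events_ctrl, n_ctrl):
    """Compute EER, CER and the risk measures derived from them."""
    eer = events_exp / n_exp      # experimental event rate
    cer = events_ctrl / n_ctrl    # control event rate
    return {
        "EER": eer,
        "CER": cer,
        "RR": eer / cer,             # relative risk
        "ARR": cer - eer,            # absolute risk reduction
        "RRR": (cer - eer) / cer,    # relative risk reduction, equal to 1 - RR
    }

# Hypothetical trial: 10/100 events on treatment, 20/100 events on control
result = risk_measures(10, 100, 20, 100)
# EER = 0.1, CER = 0.2, RR = 0.5, ARR = 0.1, RRR = 0.5
```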

Answer 91

A

( EER - CER ) / EER

Answer 92

A

case control studies

Answer 93

A

no of people with event / no of people without the event

Answer 94

A

odds of exposure in cases / odds of exposure in controls
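The odds and odds ratio formulas can be sketched from a 2x2 table as below. The function and parameter names, and the case-control counts, are hypothetical; the sketch assumes the usual case-control reading, where the ratio compares the odds of exposure in cases with the odds of exposure in controls.

```python
def odds(with_event, without_event):
    """Odds = number with the event / number without the event."""
    return with_event / without_event

def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return odds(cases_exposed, cases_unexposed) / odds(controls_exposed, controls_unexposed)

# Hypothetical study: 30 of 100 cases exposed, 10 of 100 controls exposed
ratio = odds_ratio(30, 70, 10, 90)
# (30 / 70) / (10 / 90) = 2700 / 700, roughly 3.86
```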

Answer 95

A

range or interval of values in which the “true” value is likely to lie - e.g. with a 95% confidence interval you are 95% confident that the true result lies in the range, with a 5% chance that it lies outside of this range
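As a rough illustration, a confidence interval for a mean can be computed as mean +/- z x standard error. This sketch assumes a normal approximation (for small samples a t-distribution would normally be used instead); the function name and the data are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    centre = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical measurements
readings = [4.8, 5.1, 5.0, 4.9, 5.2]
low_95, high_95 = mean_confidence_interval(readings)
low_99, high_99 = mean_confidence_interval(readings, level=0.99)
```

Asking for more confidence (99% rather than 95%) widens the interval, since a wider range is needed to be more sure of capturing the true value.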

Answer 96

A

t-test - compares the means of two samples only

ANOVA - compares the means of two or more samples (i.e. if you had groups of 20-30yrs, 30-40yrs, 40-50yrs ANOVA would be used to compare the means across these different groups)

Answer 97

A

ordinal, interval, or ratio scales or unpaired data

Answer 98

A

compares two sets of observations on a single sample i.e. before and after test on the same population following an intervention

Answer 99

A

used to compare proportions or percentages across patients following two different interventions

Answer 100

A

correlation between two variables

Answer 101

A

forest plot

Answer 102

A

funnel plot

Answer 103

A

y = a + bx

a = point at which the line crosses y axis where x = 0

b = regression coefficient (the gradient of the line)

x= chosen value on x axis
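The equation y = a + bx can be fitted to data with ordinary least squares. The sketch below is one common way of estimating a and b, written out by hand so each step is visible; the function names and data points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and gradient b in y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # gradient: change in y per unit change in x
    a = mean_y - b * mean_x    # intercept: value of y where x = 0
    return a, b

def predict(a, b, x):
    """Predicted y for a chosen value of x."""
    return a + b * x

# Hypothetical data lying exactly on y = 1 + 2x
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
y_at_5 = predict(a, b, 5)   # 1 + 2 * 5 = 11
```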

Answer 104

A

phase 0 - exploratory studies - very small no of participants to explore the effect of the drug in the human body

phase I - safety assessment - determines side effects prior to larger studies, conducted on healthy volunteers

phase II - assess efficacy - involves a small number of patients affected by the disease

phase III - assess effectiveness - thousands of participants, RCT

phase IV - monitoring for long term SE and effectiveness

Answer 105

A

correlation is a calculation of how closely one variable relates to another variable.

linear regression is then used to predict how much one variable may change when a second variable is changed. this is when you use the formula y= a+ bx

Answer 106

A

parametric data - pearsons
non-parametric data - spearmans
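A minimal sketch of the difference between the two coefficients: Pearson's works on the raw values, while Spearman's applies the same calculation to their ranks. The helper names are hypothetical, and the ranking function assumes there are no tied values (ties need averaged ranks, which are not handled here).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient on the raw values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman's coefficient: Pearson's coefficient applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: y rises with x, but along a curve rather than a line
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
r = pearson_r(xs, ys)       # below 1, because the relationship is not linear
rho = spearman_rho(xs, ys)  # 1, because the ordering is perfectly preserved
```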

Answer 107

A

screening tool correctly identifies the patient as having the disease

Answer 108

A

screening tool correctly identifies the patient as not having the disease

Answer 109

A

the screening tool incorrectly identifies the patient as having the disease, when in fact they do not

Answer 110

A

the screening tool incorrectly identifies the patient as not having the disease, when in fact they do

Answer 111

A

proportion of patients with the disease who have a POSITIVE result

Answer 112

A

people with the disease (TP + FN)

Answer 113

A

proportion of patients without the disease who have a negative result

Answer 114

A

people without the disease (TN + FP)

Answer 115

A

the probability that a person with a positive test result actually has the disease

Answer 116

A

TP / (TP + FP)

Answer 117

A

the probability that a person with a negative test result actually does not have the disease

Answer 118

A

TN / (TN+FN)
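The screening-test cards above can be tied together with one worked 2x2 table. The measure names used here (sensitivity, specificity, positive and negative predictive value) are the standard labels for these definitions and are an assumption, as the cards only show the answers; the counts are invented.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Summary measures from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among people with the disease
        "specificity": tn / (tn + fp),  # negatives among people without the disease
        "ppv": tp / (tp + fp),          # diseased among people who test positive
        "npv": tn / (tn + fn),          # disease-free among people who test negative
    }

# Hypothetical screening results: 100 people with the disease, 100 without
measures = diagnostic_measures(tp=90, fp=20, fn=10, tn=80)
# sensitivity = 90/100 = 0.9, specificity = 80/100 = 0.8
# ppv = 90/110 (about 0.82), npv = 80/90 (about 0.89)
```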

Answer 119

A

CEA compares a number of interventions by relating costs to a single clinical measure of effectiveness (e.g. symptom reduction, improvement in activities of daily living).

Answer 120

A

total cost / unit of effectiveness

Answer 121

A

CBA is a technique in which all the costs and benefits of an intervention are measured in terms of money. A CBA is used to establish which of the alternatives has the greatest net benefit.

Answer 122

A

CUA is a special form of CEA in which health benefits / outcomes are measured in broader, more generic ways enabling comparisons between treatments for different diseases and conditions - i.e. using QALYs.

Answer 123

A

QALYs are a composite measure of gains in life expectancy and health-related quality of life. One QALY is equal to 1 year of life in perfect health.
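A simple sketch of the arithmetic: QALYs are commonly calculated as years of life multiplied by a quality-of-life weight. The 0 to 1 weighting scale, the function name and the example figures are assumptions for illustration.

```python
def qalys(years, utility):
    """Years of life weighted by quality of life.

    utility is an assumed weight from 0.0 (death) to 1.0 (perfect health).
    """
    return years * utility

one_year_perfect_health = qalys(1, 1.0)   # 1.0 QALY, as stated in the card
four_years_half_quality = qalys(4, 0.5)   # 2.0 QALYs
```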

Answer 124

A

CUA offers something that CEA cannot, which is to compare across treatments for different conditions. In principle, it is possible to compare treatments for, say, cancer with, say, schizophrenia to determine which is the most efficient at producing health gain in the form of QALYs.

Answer 125

A

Direct - those associated directly with the healthcare intervention (e.g. staff time, medical supplies, cost of travel for the patient, childcare costs for the patient, costs falling on other social sectors such as domestic help from social services)

Indirect - those incurred by the reduced productivity of the patient (e.g. time off work, reduced work productivity, time spent caring for the patient by relatives)

Intangible - those that are difficult to measure (e.g. pain or suffering on the part of the patient)

Answer 126

A

how much the odds of the disease increase when a test is positive

Answer 127

A

sensitivity / (1-specificity)

Answer 128

A

how much the odds of a disease decrease when a test is negative

Answer 129

A

(1-sensitivity) / specificity
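The two ratio formulas above can be worked through with example values. The function name and the sensitivity and specificity figures are assumptions chosen for the sketch.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive likelihood ratio, negative likelihood ratio)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

# Hypothetical test with sensitivity 0.9 and specificity 0.8
lr_pos, lr_neg = likelihood_ratios(0.9, 0.8)
# lr_pos = 0.9 / 0.2 = 4.5, lr_neg = 0.1 / 0.8 = 0.125
```

A positive likelihood ratio well above 1 means a positive result raises the odds of disease; a negative likelihood ratio well below 1 means a negative result lowers them.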

Answer 130

A

a placebo that produces prominent side effects

Answer 131

A

P value - is the probability of obtaining a result by chance at least as extreme as the one that was actually observed, assuming that the null hypothesis is true