Statistics Flashcards

0
Q

Statistic of central tendency for nominal data

A

Mode

1
Q

Yes or no data

A

Nominal

2
Q

Standard error of mean from std dev

A

Standard error of the mean = standard deviation / square root of sample size
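A minimal sketch of the formula above, assuming the standard deviation is the sample (n - 1) standard deviation. The function name and the sample values are made up for illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Return SD / sqrt(n), using the sample (n - 1) standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical measurements, not taken from any study.
weights = [2.0, 4.0, 4.0, 6.0]
print(standard_error_of_mean(weights))  # roughly 0.82
```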

3
Q

Test for categorical (nominal) variables

A

Fisher’s exact test

4
Q

Odds ratio interpretation for OR of 1.18 (95% CI 1.04, 1.33)

A

Odds of the event are elevated by 18%, with a 95% CI of 4% to 33%, and the result is statistically significant (the CI doesn’t include 1)

5
Q

Type of study in which the odds ratio is the measure of association

A

Case control, sometimes cross sectional or cohort with some modifications

6
Q

Calculate odds ratio

A

OR = (a/c) / (b/d) = ad/bc

Where
a = exposed cases
b = exposed non cases
c = unexposed cases
d = unexposed non cases
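A short sketch of the calculation, using the a, b, c, d cells defined on this card. The confidence interval uses the log (Woolf) method, which is an assumption added here and not something the card specifies. The counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    return (a * d) / (b * c)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI on the log scale (Woolf method, an assumption)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts for illustration only.
print(odds_ratio(20, 80, 10, 90))      # 2.25
print(odds_ratio_ci(20, 80, 10, 90))   # interval spans 1, so not significant
```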
7
Q

Continuous data

A

Data along an infinite or finite continuum that can be broken down into an infinite degree of detail (weight, temperature, etc)

8
Q

When would you use the Kruskal-Wallis test?

A

Nonparametric test for ordinal data in 3 or more groups that are not matched or paired

9
Q

PANSS represents which type of data?

A

Continuous, even though made up of multiple ordinal scales

10
Q

Central tendency stat for ordinal data (ranked in order)

A

Median (mean not appropriate since data are categorical and not to be treated as continuous)

11
Q

Types of continuous variables with examples

A

Interval and ratio

Interval - eg temperature degrees Celsius - equal intervals and zero is arbitrary

Ratio - like interval but there is a true zero - ex: weight, blood pressure

12
Q

Test to see if data are normally distributed

A

Kolmogorov-Smirnov

13
Q

2 discrete probability distributions

A

Binomial - only two different outcomes like heads or tails

Poisson - another probability distribution when you count a number of events across times - ex: number of ADRs from drug x over a time
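A small sketch of the two distributions named on this card, written from their standard probability formulas. The function names and the example numbers (10 coin flips, an average of 2 ADRs per period) are assumptions for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n two-outcome trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, rate):
    """Probability of counting exactly k events when `rate` are expected."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Chance of exactly 5 heads in 10 fair coin flips.
print(binomial_pmf(5, 10, 0.5))
# Chance of seeing no ADRs in a period where 2 are expected on average.
print(poisson_pmf(0, 2.0))
```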

14
Q

Kurtosis

A

How peaked or flat a distribution is - normal distribution = 3

15
Q

Skewness - symmetry of distribution - is data clustered at low end positively or negatively skewed?

A

Low end - positively skewed - outliers on the high end pull mean in higher direction so mean is higher than median

High end - negatively skewed - low numbers pull mean down so mean is lower than median
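A quick illustration of the mean vs median rule above. Both data sets are invented: one is clustered at the low end with a single high outlier, the other is the mirror case.

```python
import statistics


def mean_and_median(values):
    """Return (mean, median) so the two can be compared directly."""
    return statistics.mean(values), statistics.median(values)


# Clustered at the low end, one high outlier: positively skewed.
low_clustered = [1, 2, 2, 3, 3, 3, 4, 20]
# Clustered at the high end, one low outlier: negatively skewed.
high_clustered = [1, 17, 18, 18, 18, 19, 19, 20]

print(mean_and_median(low_clustered))   # mean 4.75 is above median 3
print(mean_and_median(high_clustered))  # mean 16.25 is below median 18
```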

16
Q

Standard error of the mean

A

Different than sd - doesn’t tell you how values compare to the mean; tells you how this sample’s mean compares to other samples from the same population

  • for more than 1 sample studies
  • is sd/sq root of n
17
Q

Non parametric test criteria

A

Non-normally distributed data
Eg nominal or ordinal variables with sample size under 30
Also ordinal scales with fewer than 12 categories, eg PANSS

18
Q

Definition of beta

A

Probability of making a type II error

Usually < 0.2, pref < 0.10

19
Q

Definition of alpha

A

Prob of type I error

Inversely related to beta

20
Q

Continuous variable parametric test

- compare two means?

A

If independent samples - Student’s t test

Paired or matched data - paired t test

21
Q

Comparison of 3 or more groups

A

One way ANOVA - helps avoid type I error

  • avoids the inflated type I error rate that comes from performing multiple t tests
22
Q

Anova detects what?

A

A difference among the 3 or more groups

  • then, a multiple comparison method must be employed to detect which groups differ
    • Dunnett, Bonferroni, Tukey, etc
  • repeated measures ANOVA - subjects in these are paired and serve as own control (participate in >1 treatment group)
23
Q

Nominal variables (nonparametric tests)

A

Chi square test

  • ex: test diff of baseline characteristics sex, smoking status, alcohol, yes/no variables like this
  • tests observed vs expected frequencies
  • must be larger samples
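A rough sketch of the observed vs expected comparison for a 2x2 table, assuming the plain Pearson chi square statistic with no continuity correction. The counts (for example smokers vs non-smokers in two study arms) are hypothetical.

```python
def chi_square_2x2(table):
    """Return (statistic, expected counts) for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    statistic = 0.0
    expected = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            # Expected count if the row and column variables were unrelated.
            expected[i][j] = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    return statistic, expected


# Hypothetical baseline characteristic in two groups.
observed = [[30, 20], [20, 30]]
statistic, expected = chi_square_2x2(observed)
print(statistic)  # 4.0, above the 3.84 cutoff for 1 degree of freedom at alpha 0.05
print(expected)   # every expected cell is 25.0, comfortably above 5
```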
24
Q

Nominal variables (non parametric) besides chi square

A

Fisher’s exact - when sample size is < 20 or an expected cell count in the 2x2 table is less than 5

McNemar - similar to chi sq but for paired or matched data

Mantel-Haenszel - to see if one factor is influencing the results - uses separate contingency tables

25
Q

Ordinal data (non parametric test) for 2 groups

A

Mann-Whitney U - non para equiv to Student’s t
- groups are not paired

Sign test - matched or paired data - tells whether pos or neg difference

Wilcoxon signed rank test
- determines magnitude of diff and rank order of differences

26
Q

Ordinal non parametric test with 3 or more groups

A

Kruskal-Wallis one way ANOVA
- data not matched or paired

Friedman two way anova
- data are paired or matched

27
Q

Correlation - which test is for parametric and which is for ordinal

A

Pearson corr coeff - for parametric (continuous) data, ranges from -1 through 0 to 1

Spearman rank corr coeff - for ordinal data, computes the correlation on the ranks of the values

28
Q

Regression - when to use logistic vs linear

A

Linear regression - continuous variables (parametric)

Logistic regression - ordinal or nominal data - non parametric

29
Q

Survival analysis notes

A

Censoring - takes into acct that some subjects leave study for different reasons and can enter study at different time points

Actuarial method - counts number subjects who reach a certain point
- ex- pt who dies at 5 months 29 days isn’t included in the 6 month analysis

Kaplan Meier - measures time to endpoint
- produces life table and survival curve

Cox proportional hazards - allows researcher to adjust for differences in study groups (age, comorbidities)
- produces hazard ratio and CI
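A minimal sketch of how a Kaplan Meier estimate handles censoring, written from the standard product-limit formula. The function name and the follow-up data are hypothetical, and this simple version ignores confidence intervals.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time an endpoint occurs.

    times  - follow-up time for each subject
    events - True if the endpoint was observed, False if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        endpoints = 0
        leaving = 0
        # Group every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            endpoints += int(data[i][1])
            leaving += 1
            i += 1
        if endpoints:
            survival *= 1 - endpoints / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering survival.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; False marks a censored subject.
months = [2, 3, 3, 5, 8]
reached_endpoint = [True, True, False, True, False]
for t, s in kaplan_meier(months, reached_endpoint):
    print(t, round(s, 2))  # 2 0.8, then 3 0.6, then 5 0.3
```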

30
Q

Incidence

A

Number of new cases that occur in a popn in a specified time (number of new cases can trend over time)

31
Q

Prevalence

A

Number of cases in the population who HAVE disease in a specific time frame

32
Q

2x2 table

A

       Dz+   Dz-   Total
RF+    A     B     A+B
RF-    C     D     C+D

33
Q

Relative risk

A

Actual or true risk
Used in prospective and experimental studies
RR = (A/(A+B)) / (C/(C+D))

Ex: prospective cohort study to evaluate subj taking antipsychotics and development of dm - take subj with and without antipsychotic use and calculate RR to see if dm associated with antipsychotic use
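A sketch of the RR calculation for a cohort like the one in the example, using the A, B, C, D cells of the 2x2 table. The counts are invented and are not results from any antipsychotic study.

```python
def relative_risk(a, b, c, d):
    """a, b = exposed with / without disease; c, d = unexposed with / without disease."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 30 of 100 exposed and 15 of 100 unexposed develop dm.
print(relative_risk(30, 70, 15, 85))  # about 2, ie twice the risk in the exposed group
```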

34
Q

Odds ratio

A

Estimates Relative risk
Used in case control and cross sectional studies

OR = (A/C) / (B/D) = AD/BC
Study subjects are selected on the basis of disease status, so it is not possible to calculate the rate of development of the disease given presence or absence of exposure - thus, OR is used to approximate RR or estimate risk

35
Q

OR and RR interpretation

similarities

A

Both used to determine magnitude of association between exposure to risk factor and disease

Same scale - > 1 means exposure is associated with development of dz, < 1 means protection, and = 1 means no association

If 95%CI includes 1 => not stat sig

36
Q

Relative risk reduction

A

Estimates the % of risk that is reduced as a result of the intervention

= 1 - RR
Overestimates the true risk reduction because it is expressed relative to the control group outcome rate. So often the benefit of a treatment is overstated!

37
Q

Absolute risk reduction

A

Rate in control group minus rate in intervention group (the absolute difference in event rates)
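A small sketch contrasting relative and absolute risk reduction for the same trial. The event rates are hypothetical, and the sign convention (control minus intervention, so a benefit is positive) is an assumption of this example.

```python
def risk_reductions(control_rate, intervention_rate):
    """Return (relative risk reduction, absolute risk reduction)."""
    rr = intervention_rate / control_rate
    rrr = 1 - rr
    arr = control_rate - intervention_rate
    return rrr, arr


# Hypothetical trial: 2% of controls and 1% of treated patients have the event.
rrr, arr = risk_reductions(0.02, 0.01)
print(rrr)  # 0.5, a "50% reduction" in relative terms
print(arr)  # 0.01, only 1 percentage point in absolute terms
```

The same 50% relative reduction would appear whether the control rate were 2% or 40%, which is why the relative figure alone can overstate the benefit.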

38
Q

2x2 table for diagnostic test accuracy

A

          Dz+   Dz-
Test+     TP    FP
Test-     FN    TN

39
Q

Sensitivity

A

Probability that a true pos test occurs in an individual pos for dz

Sensitivity = TP/(TP+FN) *100
= true pos / people who have disease

SNOUT - highly sensitive test, when negative, rules out dz

40
Q

Specificity

A

Probability that a true negative test result occurs - neg test in neg pt

Specificity = TN/(FP+TN)
= true negatives / people without the disease

SPIN - highly specific test, when positive, rules in (confirms) disease

41
Q

Positive predictive value

A

PPV = TP/(TP+FP) *100

PPV = proportion of individuals who have disease when test is positive = likelihood a person with pos test has disease

42
Q

Negative predictive value

A

NPV = TN/(FN + TN)

Proportion of persons with a negative test who are disease free - likelihood that a person with negative test doesn’t have disease
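The four diagnostic accuracy measures can be sketched together from the TP, FP, FN, TN cells. The function name and the counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the four accuracy measures as proportions (multiply by 100 for %)."""
    return {
        "sensitivity": tp / (tp + fn),  # true pos / people who have disease
        "specificity": tn / (fp + tn),  # true neg / people without disease
        "ppv": tp / (tp + fp),          # true pos / all positive tests
        "npv": tn / (fn + tn),          # true neg / all negative tests
    }


# Hypothetical screening results: 100 with disease, 400 without.
for name, value in diagnostic_metrics(tp=90, fp=40, fn=10, tn=360).items():
    print(name, round(value, 3))
```

In this made-up example sensitivity and specificity are both 0.9, yet PPV is only about 0.69 because the false positives come from the much larger disease-free group.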

43
Q

Case control

A

Analytical observational study
- retrospective -
Use in new diseases or outbreaks
Measure of association = odds ratio

44
Q

Cohort study

A

Analytical observational study
Strongest observational study design
Usually prospective
- relative risk is measure of association

45
Q

Cross sectional

A

Aka prevalence study
Descriptive obs study
- used to gather info on risk factors and outcomes of interest
- generate hypotheses

46
Q

Case report / case series

A

Descriptive obs study
Generate a case definition
Determine adverse effects, generate new info

47
Q

Effect size

A

Effect size = d = Cohen’s d
= (mean experimental grp - mean control grp) / st dev

Interpretation - tells us how many st dev of difference there are between exp and control. Eg if d = 0.25, there is a quarter of a st dev of difference

  • 0.2 = small effect size
  • 0.5 = medium
  • 0.8 = large
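A short sketch of the effect size calculation. The card does not say which st dev to divide by, so the pooled standard deviation of the two groups is assumed here. The scores are invented.

```python
import math
import statistics


def cohens_d(experimental, control):
    """Difference in means divided by the pooled standard deviation (an assumption)."""
    n1, n2 = len(experimental), len(control)
    var1 = statistics.variance(experimental)
    var2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(experimental) - statistics.mean(control)) / pooled_sd


# Hypothetical scores for two small groups.
d = cohens_d([6, 7, 8, 9, 10], [5, 6, 7, 8, 9])
print(round(d, 2))  # 0.63, between the medium and large benchmarks
```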
48
Q

Analytic studies vs descriptive

A

Analytic - case control and cohort - involve more comprehensive data

Descriptive - cross sectional and case report/series - compare disease frequency in populations, generate hypotheses

49
Q

Internal validity

A

Does the study measure what it was designed to measure?
Does it address biases, confounders?
** if you do not have internal validity, you won’t have external validity

50
Q

External validity

A

Assumes internal validity (measures what intended, addresses biases confounders and outcomes)
- external validity means outcomes can be generalized to other groups or patients, including your clinic population

51
Q

Selection bias

A

Selection of study participants
- includes sampling bias - researcher chooses study participants based on convenience rather than representativeness

Detection bias - individuals who have risk factors have more medical encounters, which increases the probability that dz is identified

Admission rate bias (Berkson’s)
- specific to studies using inpatients as cases and controls - the combination of exposure and the disease being studied leads to a higher exposure rate among hospital cases than controls. Example: OC use leads to DVT - higher referral rate to hospitals

Response bias - individuals who participate are different than those who decline to participate

52
Q

How to minimize selection bias

A
  • Define study cases in a detailed and objective manner
  • Enroll a representative study sample in the study

53
Q

Information bias

A

Inaccuracy in collecting data

Recall bias- different memory of past events

- people w disease recall more detail than healthy people - case control and retrospective cohort are most vulnerable 

Interviewer bias - differences in obtaining info from subjects

54
Q

How to minimize recall bias

A
  • Confirm pt response through medical records
  • Use a control group w disease other than that being studied

55
Q

How to minimize interviewer bias (type of information bias)

A

Detailed training of interviewers
Directions to study staff conducting interviews and surveys
Supervision of data collection process

56
Q

Follow up or attrition bias

A
  • study participants lost to follow-up
  • prospective study most vulnerable
  • difficult to minimize but assess reasons for loss
57
Q

Misclassification bias

A

Inaccuracy in measurement or placement of study participants
- mismeasurement, or if someone was thought to have disease on study entrance but does not

Sources of misclassification bias:

  • variation among study observers and instruments
  • variation in underlying characteristics
  • misunderstanding of questions by study subjects (interview or questionnaire)
  • incomplete medical record data
58
Q

Compliance or adherence bias

A

One treatment that pts adhere to better than another

59
Q

How to address bias

A

Proper study design
Conduct of study - selection of pts, procedures, supervision and training

statistical analysis:

  • difficult to accomplish because no stat test can correct for bias or fix study flaw
  • using appropriate stat procedures for data analysis can help with bias
60
Q

Confounding variables

A
  • falsely concluding that a rf is associated with a disease without adjusting for other rf that are either known or unknown
    1) confounders have the potential to influence study results
    2) researchers may not account for these, or even be aware of their existence!
61
Q

Controlling for confounders

A

1) randomization - ensures confounders are evenly distributed
- not done in epidemiology studies like case control, retro, and cohort
2) restriction - restrict admission to the study to a certain category of confounders
- matching - equal representation of subjects with certain confounders among study groups
- over matching - matching on a variable so strongly associated with the variable of interest that it decreases the ability to find a result. Do not match on factors affected by disease or exposure, eg signs and sx

3) analysis - stratification - data are split into non-overlapping groups called strata, each containing one level of a specific factor, to see if that factor may contribute to the effect
- multivariate regression analysis - can control for a number of confounders at same time without losing power
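One way to sketch stratification in the analysis step is a Mantel-Haenszel pooled odds ratio, which combines the 2x2 table from each stratum. Choosing this estimator is an assumption of the example, and the strata (say, younger and older subjects) and counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled OR across strata; each stratum is (a, b, c, d) with
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical data split by a suspected confounder.
younger = (10, 20, 5, 40)   # stratum OR = 4.0
older = (30, 10, 20, 25)    # stratum OR = 3.75
print(mantel_haenszel_or([younger, older]))

# Crude OR from the collapsed table, ignoring the confounder.
a, b, c, d = (x + y for x, y in zip(younger, older))
print((a * d) / (b * c))
```

A pooled estimate that differs noticeably from the crude one suggests the stratifying factor was distorting the unadjusted result.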

62
Q

Criteria to establish causality and not just association or relationship

A
Strength of association
    A) the stronger the association, the less likely it is due to chance alone
    B) but, just because the magnitude is low doesn’t mean there is no cause and effect
Reproducibility
- different populations, different times
Temporal sequence
- exposure has to happen before the outcome
Biological plausibility
Dose response relationship
- can be present, but not necessarily
Coherence of relationship
63
Q

Study design strength from strongest to weakest re what can be concluded for results and causality

A
RCT- strongest design for cause effect and differences in tx effect 
Cohort
Case control
Case series
Case report - weakest causality
64
Q

Observational study - appropriate?

A

Case control, cohort, cross sectional
- appropriate for studying natural history of disease, accuracy of dx test, or public health policy - program planning etc

65
Q

Hypothesis evaluation

A
  • is it an answerable question?
  • sufficiently narrow and objective?
  • use SMART criteria
  • biological, temporal, and time frame plausibility
66
Q

Subjective vs objective outcomes - what is PANSS?

A

It measures subjective outcomes (psych sx), but validation and standardization minimize variability

67
Q

Primary vs secondary data sources

A

Primary - measured directly by researcher for purpose of ongoing study - rct, cohort, case control, cross sectional

Secondary- from databases or pt medical records. Data is already collected, researcher gains permission to access for study (retrospective cohort, case control and cross sectional)

  • advantage: not as costly and doesn’t take as much time to acquire data
  • disadvantage: missing data can impact accuracy of results and data may be miscoded, eg ICD codes done wrong
68
Q

Data analysis and interpretation

A

Obs studies
Case control study - Odds ratio

Cohort study - relative risk w CI

69
Q

Post hoc analyses - what is it good for

A

Generating hypotheses

70
Q

Quality of evidence

A

US Preventive Services Task Force
Level 1: evidence obtained from at least one well designed RCT
Level 2-1: well designed controlled trials without randomization
Level 2-2: well designed cohort or case control studies, pref from >1 center
Level 2-3: evidence obtained from multiple time series with or without intervention
Level 3: opinion of experts, case reports or series

71
Q

Efficacy vs effectiveness

A

Efficacy is narrow term used to describe outcomes in studies

Effectiveness is broad and defines a real world outcome

72
Q

Survival analysis - psych trial issues

A
  • usually presented graphically, without confidence intervals
  • doesn’t explain the impact of drop outs on power in studies

73
Q

Generalizability of studies

A

One factor:

- would exclusion criteria for the study exclude pts in our practice?

74
Q

Relative risk

A

Incidence in group a divided by incidence in group b

75
Q

Odds ratio function

A

Estimates relative risk in retrospective studies

76
Q

When does odds ratio overestimate risk?

A

When the incidence is > 10%, the odds ratio overestimates risk

When the incidence is < 10%, the odds ratio is a decent estimate
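A quick numerical check of this rule, comparing OR and RR for a rare and a common outcome. Both tables are hypothetical and are built so that the true RR is 2 in each.

```python
def or_and_rr(a, b, c, d):
    """Return (odds ratio, relative risk) for one 2x2 table.
    a, b = exposed with / without outcome; c, d = unexposed with / without outcome."""
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    return odds_ratio, relative_risk


# Rare outcome: 2% vs 1% incidence. OR stays close to the RR of 2.
print(or_and_rr(2, 98, 1, 99))
# Common outcome: 40% vs 20% incidence. OR is about 2.67 against an RR of 2.
print(or_and_rr(40, 60, 20, 80))
```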

77
Q

Odds ratio calc

A

Exposed cases / unexposed cases
Divided by
Exposed non-cases / unexposed non-cases

78
Q

Relative risk calculation

A
For prospective study
RR =
a/(a+b)
Divided by
c/(c+d)
79
Q

Observer bias

A

Minimized by blinding esp double blinding

80
Q

Allocation bias

A

Can occur in experimental studies at the point where patients are allocated to treatment groups, eg when randomization is inadequate

81
Q

Information bias

A

Occurs in observational studies where must rely on existing sources of information

82
Q

Misclassification bias

A

Occurs when there are inaccuracies in measurement or in placement of study patients in specific groups. Most vulnerable: case control and retrospective studies

83
Q

Ordinal test equivalent to student t test

A

Mann Whitney U test

Used when a comparison is being made between 2 non paired groups, which don’t have to be of equal size - non paired means subjects don’t have to participate in all treatments (don’t have to serve as own controls)

84
Q

Two different tests for comparing nonparametric nominal data

A

Chi square for large samples (mnemonic: Chicago = large)

Fisher’s exact for sample size less than 20