Statistics Flashcards

1
Q

precision with which a characteristic is measured

A

scale of measurement

2
Q

simplest level of measurement with qualitative/categorical observations described in proportions or percentages
dichotomous, binary

A

nominal

3
Q

scale of measurement with inherent order among categories, the difference between 2 adjacent categories is not the same throughout the scale, described in proportions or percentages

A

ordinal

4
Q

scale of measurement described with measure of central tendency and spread

A

continuous/numerical

5
Q

arithmetic average used for numerical data sensitive to extreme values

A

mean

6
Q

mean is best for what data

A

numerical, symmetric

7
Q

50th percentile; arrange observations from smallest to largest and take the middle value (the average of the two middle values when the number of observations is even), for numerical and ordinal data

A

median

8
Q

median is best for what data

A

ordinal or numerical, skewed

9
Q

value occurring most frequently, used for numerical data to describe frequent observations in a large data set

A

mode

10
Q

mode is best for what data

A

bimodal distribution

11
Q

difference between largest and smallest observation

A

range

12
Q

identify the 25th and 75th percentile and find the difference to find the

A

interquartile range

13
Q

measures spread of data around the mean, statistic of interest, can determine skewness

A

standard deviation

14
Q

a percentage of a distribution that is ≤ a certain number

A

percentiles

15
Q

measures relative spread in data (SD divided by the mean), used to compare variability across data sets, unitless

A

coefficient of variation
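
The measures of central tendency and spread on the cards above can be tried out with a short sketch. It uses Python's standard `statistics` module; the data set and the helper name `describe` are made up for illustration, and the quartiles follow the module's default ("exclusive") method, which may differ slightly from other textbooks.

```python
import statistics

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    # Quartiles: 25th, 50th and 75th percentiles (default 'exclusive' method).
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    mean = statistics.mean(ordered)
    sd = statistics.stdev(ordered)  # sample SD (n - 1 in the denominator)
    return {
        "mean": mean,
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "sd": sd,
        "cv": sd / mean,  # coefficient of variation, unitless
    }

# Hypothetical data with one extreme value (40).
data = [2, 3, 3, 4, 5, 6, 40]
summary = describe(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The single extreme value pulls the mean well above the median, which is why the median is preferred for skewed data.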

16
Q

the area between the mean and 1 SD above or below the mean

A

data distribution

17
Q

parametric tests assume data is

A

normally distributed

18
Q

types of parametric tests

A

Student’s t-test, ANOVA, ANCOVA

19
Q

types of nonparametric tests

A

sign test, Wilcoxon rank-sum test, Mann-Whitney test, Kruskal-Wallis test, Spearman’s rank correlation

20
Q

type of test used for non-normally distributed data, and for nominal and ordinal data

A

non-parametric test

21
Q

type of distribution that applies only to continuous data achieved by picking subjects randomly, creating equal sample sizes, and obtaining large samples

A

normal (Gaussian)

22
Q

type of distribution where data falls to the left or right of the mean

A

skewed

23
Q

graphs for 2 characteristics for continuous data, every point is displayed, and displays exact distribution of the data

A

scatter plot

24
Q

graph that shows continuous data using 5 values:
min, 1st quartile, median, 3rd quartile, max

A

box plot (box and whisker plot)

25
Q

graph that involves mean and SD data

A

error bar plot

26
Q

graph where frequency is represented by bars, used with nominal or ordinal data, the measure of interest on the x-axis, number or percentage of observations on the y-axis

A

histogram

27
Q

line graph like a histogram used to compare 2 distributions on the same graph

A

frequency polygon

28
Q

graph with sum of frequencies accumulated up to a specified boundary

A

cumulative frequency graph

29
Q

a measure of the relationship between 2 numerical characteristics

A

correlation coefficient (r)

30
Q

what do correlation coefficients -1, 0, and +1 say about the relationship

A

-1 = perfect negative linear
0 = no linear
+1 = perfect positive linear

31
Q

describes the relationship between two ordinal characteristics, or 1 ordinal and 1 numerical

A

Spearman’s rank correlation
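
To see how the two correlation measures on the last three cards differ, here is a minimal sketch in plain Python. The function names and data are illustrative only, and the `ranks` helper assumes there are no tied values (real rank correlation handles ties with averaged ranks).

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
cubes = [1, 8, 27, 64, 125]  # rises with xs, but not in a straight line

print(pearson_r(xs, [2, 4, 6, 8, 10]))  # perfect positive linear
print(pearson_r(xs, [10, 8, 6, 4, 2]))  # perfect negative linear
print(pearson_r(xs, cubes))             # strong, but below +1
print(spearman_rho(xs, cubes))          # order is perfectly preserved
```

Because Spearman's version only looks at order, it reports a perfect relationship for any strictly increasing pattern, while r drops below +1 as soon as the pattern is not a straight line.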

32
Q

compares probabilities of developing an outcome in the presence or absence of a treatment or risk factor

A

relative risk

33
Q

compares odds of developing an outcome in the presence versus absence of a risk factor

A

odds ratio

34
Q

comparison of hazard rates

A

hazard ratio

35
Q

quantifies likelihood of benefit, # of patients needed to be treated to avoid 1 outcome

A

number needed to treat

36
Q

refers to risk of detrimental effect, # of patients needed to be treated to cause an adverse event in 1 patient

A

number needed to harm
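
Relative risk, odds ratio, and number needed to treat can all be read off the same 2x2 table. The sketch below uses made-up trial counts and a hypothetical helper `risk_measures`; it assumes the treatment lowers risk, so the absolute risk reduction is positive.

```python
import math
from fractions import Fraction

def risk_measures(events_treated, n_treated, events_control, n_control):
    """Relative risk, odds ratio, absolute risk reduction and NNT."""
    risk_t = Fraction(events_treated, n_treated)
    risk_c = Fraction(events_control, n_control)
    odds_t = Fraction(events_treated, n_treated - events_treated)
    odds_c = Fraction(events_control, n_control - events_control)
    arr = risk_c - risk_t  # absolute risk reduction (assumed > 0 here)
    return {
        "relative_risk": risk_t / risk_c,
        "odds_ratio": odds_t / odds_c,
        "absolute_risk_reduction": arr,
        "nnt": math.ceil(1 / arr),  # rounded up to whole patients
    }

# Hypothetical trial: 10 of 100 treated and 20 of 100 controls have the outcome.
result = risk_measures(10, 100, 20, 100)
for name, value in result.items():
    print(f"{name}: {value}")
```

Number needed to harm follows the same arithmetic, using the increase in adverse-event risk in the treated group in place of the risk reduction.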

37
Q

number of times an event occurred / total number of trials

A

probability

38
Q

number of times we checked to see if an event would occur

A

trials

39
Q

number of times an event occurred / number of times an event did not occur

A

odds
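
Probability and odds use the same counts but different denominators. A small sketch with invented numbers (the function names are illustrative):

```python
from fractions import Fraction

def probability(events, trials):
    """Times the event occurred / total number of trials."""
    return Fraction(events, trials)

def odds(events, trials):
    """Times the event occurred / times it did not occur."""
    return Fraction(events, trials - events)

def odds_from_probability(p):
    """Convert a probability into odds."""
    return p / (1 - p)

# Hypothetical: the event occurred in 20 of 100 trials.
p = probability(20, 100)
o = odds(20, 100)
print(p, o, odds_from_probability(p))
```

For rare events the two values are close; they drift apart as the event becomes more common.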

40
Q

large set or collection of items that have something in common

A

population

41
Q

subset of the population selected to be representative of the population

A

sample

42
Q

group that the sample is meant to represent

A

target population

43
Q

population from which a sample was actually drawn

A

sampled population

44
Q

population values

A

parameters

45
Q

sample values

A

statistics

46
Q

characteristic of interest in a study that varies between subjects

A

variable

47
Q

variables in a study in which the subjects are randomly selected

A

random variables

48
Q

summary of random variables in a frequency distribution based on probabilities

A

probability distribution

49
Q

variability among individuals

A

standard deviation

50
Q

variability among samples, usually approximated with sample SD instead of population SD

A

standard error

51
Q

provides a single value estimate for something that is a value of interest

A

point estimates

52
Q

indicates variability of an estimate

A

confidence interval
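
The last few cards (standard deviation, standard error, point estimate, confidence interval) fit together in one calculation. This is a rough sketch with made-up measurements; it assumes a normal approximation (z of about 1.96 for 95%), whereas a small sample like this one would usually call for a t-based interval.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Point estimate (mean), standard error, and a z-based interval."""
    n = len(sample)
    mean = statistics.mean(sample)        # point estimate
    sd = statistics.stdev(sample)         # variability among individuals
    se = sd / math.sqrt(n)                # variability among sample means
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical measurements.
sample = [98, 102, 101, 97, 100, 103, 99, 100]
mean, se, (low, high) = mean_confidence_interval(sample)
print(f"mean={mean}, SE={se:.3f}, 95% CI=({low:.2f}, {high:.2f})")
```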

53
Q

permits generalizations from a sample to the population from which it came, the likelihood of results obtained in a sample

A

hypothesis test

54
Q

distribution of individual observations is different than distribution of means (a sampling distribution)

A

sampling distribution

55
Q

methods used to draw conclusions from a sample and make inferences to the entire population.

A

inferential statistics

56
Q

2 events that cannot happen at the same time

A

mutually exclusive
P(A or B) = P(A) + P(B)

57
Q

events that have no connection to each other at all

A

independent events
P(A and B) = P(A) x P(B)

58
Q

chance of something happening when you know another event is going to happen

A

conditional probability
P(A|B) = probability of A given B

58
Q

events that can happen at the same time

A

non-mutually exclusive events
P(A or B) = P(A) + P(B) - P(A and B)
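
The probability rules on the last four cards can be checked by counting outcomes of a fair six-sided die. The events and the `prob` helper below are illustrative choices, not part of the cards.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # one roll of a fair die

def prob(event):
    """Probability of an event given as a set of die faces."""
    return Fraction(len(event & outcomes), len(outcomes))

one, two = {1}, {2}            # mutually exclusive
even, high = {2, 4, 6}, {5, 6}  # can happen together (a roll of 6)

# Mutually exclusive: P(A or B) = P(A) + P(B)
p_or_exclusive = prob(one) + prob(two)

# Not mutually exclusive: P(A or B) = P(A) + P(B) - P(A and B)
p_or_overlap = prob(even) + prob(high) - prob(even & high)

# Conditional: P(A|B) = P(A and B) / P(B)
p_even_given_high = prob(even & high) / prob(high)

# Independent (two separate rolls): P(A and B) = P(A) x P(B)
p_two_sixes = prob({6}) * prob({6})

print(p_or_exclusive, p_or_overlap, p_even_given_high, p_two_sixes)
```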

59
Q

select sample with subjects having equal probability of being selected

A

simple random sampling

60
Q

every nth item is selected

A

systematic random sampling

61
Q

divide population into relevant strata and sample randomly from each stratum

A

stratified random sampling

62
Q

divide population into clusters and randomly select clusters for inclusion

A

cluster random sampling
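
The four random sampling schemes above can be sketched with Python's `random` module. The population of 100 numbered subjects, the strata, and the cluster size are all invented for the example.

```python
import random

rng = random.Random(42)           # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subjects numbered 1..100

# Simple random sampling: every subject has an equal chance.
simple = rng.sample(population, 10)

# Systematic random sampling: random start, then every 10th subject.
step = 10
start = rng.randrange(step)
systematic = population[start::step]

# Stratified random sampling: sample randomly within each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = {name: rng.sample(members, 5) for name, members in strata.items()}

# Cluster random sampling: pick whole clusters at random.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen = rng.sample(clusters, 2)
cluster_sample = [subject for cluster in chosen for subject in cluster]

print(simple, systematic, stratified, cluster_sample, sep="\n")
```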

63
Q

probability that a subject is selected is unknown and may reflect selection bias

A

non-random sampling

64
Q

distribution type for when the event is a binary outcome, gives the probability that a specified event occurs in a given number of independent trials

A

binomial

65
Q

which values are needed to describe binomial distribution

A

n (number of trials) and π (probability of the event on each trial)

66
Q

distribution used to determine the probability of rare events

A

Poisson

67
Q

distribution type used for continuous and symmetric data used commonly in statistical analysis

A

Gaussian (normal)

68
Q

parameter needed for Poisson

A

λ

69
Q

parameters needed for Gaussian

A

mean and SD

70
Q

what are the mean and SD for the standard normal distribution

A

mean = 0
SD = 1
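
Each of the three distributions is fully described by the parameters named on the cards above. A minimal sketch, with example values chosen only for illustration:

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, pi):
    """P(exactly k events in n independent trials), event probability pi."""
    return math.comb(n, k) * pi ** k * (1 - pi) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k rare events) when lam events are expected on average."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Normal distribution with mean 0 and SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

p_three_heads = binomial_pmf(3, 10, 0.5)  # 3 heads in 10 fair coin flips
p_no_events = poisson_pmf(0, 2.0)         # no events when 2 are expected
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(p_three_heads, p_no_events, within_one_sd)
```

The last value is the area within 1 SD either side of the mean, roughly 68% of the distribution.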

71
Q

what effect does the size of the change (difference observed) have on statistical significance

A

increase –> more sig
decrease –> less sig

72
Q

what effect does variation have on statistical significance

A

increase –> less sig
decrease –> more sig

73
Q

what effect does sample size have on statistical significance

A

increase –> more sig
decrease –> less sig
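
The three cards above (change, variation, sample size) all act through the same test statistic. The sketch below uses a simple z-test with a known SD as a stand-in; that choice, the function name, and the numbers are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def two_sided_p(change, sd, n):
    """Two-sided p-value for a mean change, using a z-test with known SD."""
    z = change / (sd / math.sqrt(n))  # change relative to its standard error
    return 2 * (1 - NormalDist().cdf(abs(z)))

baseline = two_sided_p(change=2, sd=10, n=50)
bigger_change = two_sided_p(change=4, sd=10, n=50)    # more significant
more_variation = two_sided_p(change=2, sd=20, n=50)   # less significant
larger_sample = two_sided_p(change=2, sd=10, n=200)   # more significant

print(baseline, bigger_change, more_variation, larger_sample)
```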

74
Q

what is the primary assumption for using t-tests

A

assumes data is normally distributed

75
Q

error that occurs when we reject a true null hypothesis

A

type I
conventionally limited to 5% (alpha = 0.05)

76
Q

type of error that occurs when we do not reject a false null hypothesis

A

type II
conventionally limited to 20% (beta = 0.2)

77
Q

probability of a type I error

A

alpha

78
Q

probability of a type II error

A

beta

79
Q

probability of rejecting the null hypothesis when it is indeed false (1 - beta)

A

power

80
Q

probability of obtaining a result at least as extreme as the one observed through chance alone (assuming the null hypothesis is true)

A

p-value

81
Q

which tests can increase our ability to detect differences by decreasing variability

A

paired tests

82
Q

test that compares paired binary outcomes (binomial data)

A

McNemar test

83
Q

test similar to McNemar but with continuous data

A

Sign Test
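
As a rough illustration of the sign test, the sketch below counts how many paired differences are positive or negative and asks how surprising that split would be if each direction were equally likely. The data and the function name are hypothetical; zero differences are dropped, which is the usual convention.

```python
import math

def sign_test_p(differences):
    """Exact two-sided sign test p-value for paired differences."""
    nonzero = [d for d in differences if d != 0]  # ties carry no sign
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)  # size of the smaller group
    # Binomial tail with probability 0.5 for each sign.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical after-minus-before differences for 8 paired subjects.
diffs = [1.2, 0.8, -0.3, 2.1, 1.5, 0.9, 1.1, 0.4]
p_value = sign_test_p(diffs)
print(p_value)
```

Only the direction of each change is used, not its size, which is why rank-based alternatives that also use the size of the change are generally more powerful.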

84
Q

test that assumes a symmetric distribution of differences around the mean, uses continuous data, is almost as powerful as the t-test, and is calculated by ranking the absolute value of the change in each pair

A

Wilcoxon Signed Rank

85
Q

typical alpha value

A

0.05

86
Q

typical power value

A

80% (β = 0.2)

87
Q

impact on sample size needed for change

A

larger –> fewer sub
smaller –> more sub

88
Q

impact on sample size needed for variation

A

larger –> more sub
smaller –> fewer sub

89
Q

impact on sample size needed for alpha

A

larger –> fewer sub
smaller –> more sub

90
Q

impact on sample size needed for power

A

larger –> more sub
smaller –> fewer sub
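
The four sample-size cards above can be checked against a common textbook approximation for comparing two means, n per group = 2 x ((z for alpha + z for power) x SD / change)^2. The sketch below assumes a two-sided test and equal group sizes; the function name and input values are illustrative.

```python
import math
from statistics import NormalDist

def sample_size_per_group(change, sd, alpha=0.05, power=0.80):
    """Approximate subjects per group to detect a mean difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sd / change) ** 2)

baseline = sample_size_per_group(change=5, sd=10)
larger_change = sample_size_per_group(change=10, sd=10)     # fewer subjects
larger_variation = sample_size_per_group(change=5, sd=20)   # more subjects
larger_alpha = sample_size_per_group(change=5, sd=10, alpha=0.10)  # fewer
larger_power = sample_size_per_group(change=5, sd=10, power=0.90)  # more

print(baseline, larger_change, larger_variation, larger_alpha, larger_power)
```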