Statistics Flashcards

1
Q

precision with which a characteristic is measured

A

scale of measurement

2
Q

simplest level of measurement with qualitative/categorical observations described in proportions or percentages
dichotomous, binary

A

nominal

3
Q

scale of measurement with inherent order among categories, the difference between 2 adjacent categories is not the same throughout the scale, described in proportions or percentages

A

ordinal

4
Q

scale of measurement described with measure of central tendency and spread

A

continuous/numerical

5
Q

arithmetic average, used for numerical data; sensitive to extreme values

A

mean

6
Q

mean is best for what data

A

numerical, symmetric

7
Q

50th percentile; arrange observations from smallest to largest and take the middle value; used for numerical and ordinal data

A

median

8
Q

median is best for what data

A

ordinal or numerical, skewed

9
Q

value occurring most frequently, used for numerical data to describe frequent observations in a large data set

A

mode

10
Q

mode is best for what data

A

bimodal distribution
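
Worked example for the mean, median, and mode cards: a minimal sketch using Python's standard `statistics` module. The data set is hypothetical and includes one extreme value (40) to show how the mean is pulled away from the median.

```python
import statistics

# Hypothetical observations; 40 is a deliberate extreme value.
values = [2, 3, 3, 4, 5, 5, 5, 40]

mean_value = statistics.mean(values)      # arithmetic average, pulled upward by 40
median_value = statistics.median(values)  # 50th percentile of the sorted data
mode_value = statistics.mode(values)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

The mean lands well above most of the observations, while the median stays near the center of the data, which is why the median is preferred for skewed data.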

11
Q

difference between largest and smallest observation

A

range

12
Q

identify the 25th and 75th percentile and find the difference to find the

A

interquartile range

13
Q

measures spread of data around the mean, statistic of interest, can determine skewness

A

standard deviation

14
Q

a percentage of a distribution that is ≤ a certain number

A

percentiles

15
Q

measures relative spread in data; used to compare variability across data sets; unitless

A

coefficient of variation
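
Worked example for the spread cards (range, interquartile range, standard deviation, coefficient of variation): a sketch on hypothetical data. It assumes the sample SD and the default quartile method of `statistics.quantiles`; other quartile conventions give slightly different IQR values.

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # hypothetical observations

# Range: largest minus smallest observation.
data_range = max(values) - min(values)

# Quartiles: 25th, 50th, and 75th percentiles (default 'exclusive' method).
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Sample standard deviation: spread of the data around the mean.
sd = statistics.stdev(values)

# Coefficient of variation: SD relative to the mean, so it has no units.
cv = sd / statistics.mean(values)

print(data_range, iqr, round(sd, 2), round(cv, 2))
```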

16
Q

the area between the mean and 1 SD above or below the mean

A

data distribution

17
Q

parametric tests assume data is

A

normally distributed

18
Q

types of parametric tests

A

Student’s t-test, ANOVA, ANCOVA

19
Q

types of nonparametric tests

A

sign test, Wilcoxon rank-sum test, Mann-Whitney test, Kruskal-Wallis test, Spearman’s rank correlation

20
Q

type of test used for non-normally distributed data, and for nominal and ordinal data

A

non-parametric test

21
Q

type of distribution that applies only to continuous data achieved by picking subjects randomly, creating equal sample sizes, and obtaining large samples

A

normal (Gaussian)

22
Q

type of distribution where data falls to the left or right of the mean

A

skewed

23
Q

graphs for 2 characteristics for continuous data, every point is displayed, and displays exact distribution of the data

A

scatter plot

24
Q

graph that shows continuous data using 5 values:
min, 1st quartile, median, 3rd quartile, max

A

box plot (box and whisker plot)

25
graph that involves mean and SD data
error bar plot
26
graph where frequency is represented by bars, used with nominal or ordinal data, the measure of interest on the x-axis, number or percentage of observations on the y-axis
histogram
27
line graph like a histogram used to compare 2 distributions on the same graph
frequency polygon
28
graph with sum of frequencies accumulated up to a specified boundary
cumulative frequency graph
29
a measure of the relationship between 2 numerical characteristics
correlation coefficient (r)
30
what do correlation coefficients -1, 0, and +1 say about the relationship
-1 = perfect negative linear; 0 = no linear; +1 = perfect positive linear
31
describes the relationship between two ordinal characteristics, or 1 ordinal and 1 numerical
Spearman's rank correlation
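
Worked example for the correlation cards: a minimal sketch of the correlation coefficient (r) and of Spearman's rank correlation computed as r on the ranks. The helper names are hypothetical, the data are made up, and the ranking helper assumes there are no tied values.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient r between two numerical characteristics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rank correlation: r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # rises with x, but not in a straight line

print(round(pearson_r(x, y), 3), spearman_rho(x, y))
```

Because y always increases with x, the rank correlation is perfect, while r falls slightly short of +1 because the relationship is not linear.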
32
compares the probability of developing an outcome in the presence versus absence of a treatment or risk factor
relative risk
33
compares the odds of developing an outcome in the presence versus absence of a risk factor
odds ratio
34
comparison of hazard rates
hazard ratio
35
quantifies likelihood of benefit; # of patients who need to be treated to avoid 1 outcome
number needed to treat
36
refers to risk of detrimental effect, # of patients needed to be treated to cause an adverse event in 1 patient
number needed to harm
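
Worked example for the relative risk, odds ratio, and number needed to treat cards: a sketch on hypothetical trial counts, using exact fractions. It assumes the usual definition of number needed to treat as 1 divided by the absolute risk reduction, rounded up. Number needed to harm would use the same calculation on adverse event rates.

```python
import math
from fractions import Fraction

# Hypothetical counts: outcome events out of subjects in each group.
treated_events, treated_total = 10, 100
control_events, control_total = 20, 100

risk_treated = Fraction(treated_events, treated_total)
risk_control = Fraction(control_events, control_total)

# Relative risk: ratio of the two probabilities.
relative_risk = risk_treated / risk_control

# Odds ratio: ratio of the two odds (events / non-events).
odds_treated = Fraction(treated_events, treated_total - treated_events)
odds_control = Fraction(control_events, control_total - control_events)
odds_ratio = odds_treated / odds_control

# Number needed to treat: 1 / absolute risk reduction, rounded up.
absolute_risk_reduction = risk_control - risk_treated
number_needed_to_treat = math.ceil(1 / absolute_risk_reduction)

print(relative_risk, odds_ratio, number_needed_to_treat)
```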
37
number of times an event occurred / total number of trials
probability
38
number of times we checked to see if an event would occur
trials
39
number of times an event occurred / number of times an event did not occur
odds
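
Worked example for the probability and odds cards: a short sketch with hypothetical counts showing that the two measures use different denominators.

```python
from fractions import Fraction

events, trials = 20, 100  # hypothetical: event occurred 20 times in 100 trials

# Probability: times the event occurred / total number of trials.
probability = Fraction(events, trials)

# Odds: times the event occurred / times it did not occur.
odds = Fraction(events, trials - events)

# The two are linked by odds = p / (1 - p).
odds_from_probability = probability / (1 - probability)

print(probability, odds)
```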
40
large set or collection of items that have something in common
population
41
subset of the population selected to be representative of the population
sample
42
group that the sample is meant to represent
target population
43
population from which a sample was actually drawn
sampled population
44
population values
parameters
45
sample values
statistics
46
characteristic of interest in a study that varies between subjects
variable
47
variables in a study in which the subjects are randomly selected
random variables
48
summary of random variables in a frequency distribution based on probabilities
probability distribution
49
variability among individuals
standard deviation
50
variability among samples, usually approximated with sample SD instead of population SD
standard error
51
provides a single value estimate for something that is a value of interest
point estimates
52
indicates variability of an estimate
confidence interval
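
Worked example for the standard deviation, standard error, point estimate, and confidence interval cards: a sketch on a hypothetical sample. It assumes a 95% interval built with the normal approximation (about 1.96 standard errors); for a sample this small a t-based multiplier would usually be used instead.

```python
import math
import statistics

sample = [98, 102, 101, 97, 100, 103, 99, 100]  # hypothetical measurements
n = len(sample)

# Point estimate: a single value for the quantity of interest.
point_estimate = statistics.mean(sample)

# Standard deviation: variability among individuals.
sd = statistics.stdev(sample)

# Standard error: variability among sample means, approximated with the sample SD.
se = sd / math.sqrt(n)

# 95% confidence interval (normal approximation, an assumption of this sketch).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = point_estimate - z * se, point_estimate + z * se

print(point_estimate, round(sd, 2), round(se, 2), round(ci_low, 2), round(ci_high, 2))
```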
53
permits generalizations from a sample to the population from which it came, the likelihood of results obtained in a sample
hypothesis test
54
distribution of individual observations is different than distribution of means (a sampling distribution)
sampling distribution
55
methods used to draw conclusions from a sample and make inferences to the entire population.
inferential statistics
56
2 events that cannot happen at the same time
mutually exclusive events: P(A or B) = P(A) + P(B)
57
events that have no connection to each other at all
independent events: P(A and B) = P(A) × P(B)
58
chance of something happening when you know another event has happened
conditional probability: P(A|B) = probability of A given B
58
events that can happen at the same time
non-mutually exclusive events: P(A or B) = P(A) + P(B) - P(A and B)
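
Worked example for the probability rule cards: a sketch that checks each rule on a fair six-sided die. The die and the chosen events are illustrative assumptions, and `prob` is a hypothetical helper.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # faces of a fair six-sided die

def prob(event):
    """Probability of an event (a set of faces) on one roll."""
    return Fraction(len(event & outcomes), len(outcomes))

one, six = {1}, {6}   # cannot both happen: mutually exclusive
even = {2, 4, 6}
low = {1, 2, 3}       # overlaps with even at 2: not mutually exclusive
one_or_two = {1, 2}   # independent of even on a single fair roll

# Mutually exclusive: P(A or B) = P(A) + P(B)
exclusive_sum = prob(one) + prob(six)

# Non-mutually exclusive: P(A or B) = P(A) + P(B) - P(A and B)
overlap_sum = prob(even) + prob(low) - prob(even & low)

# Conditional: P(A|B) = P(A and B) / P(B)
p_even_given_low = prob(even & low) / prob(low)

# Independent: P(A and B) = P(A) x P(B)
independent_product = prob(even) * prob(one_or_two)

print(exclusive_sum, overlap_sum, p_even_given_low, independent_product)
```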
59
select sample with subjects having equal probability of being selected
simple random sampling
60
every nth item is selected
systematic random sampling
61
divide population into relevant strata and sample randomly from each stratum
stratified random sampling
62
divide population into clusters and randomly select clusters for inclusion
cluster random sampling
63
probability that a subject is selected is unknown and may reflect selection bias
non-random sampling
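
Worked example for the random sampling cards: a sketch of the four random sampling schemes using Python's `random` module. The population of 100 subject IDs, the two strata, the cluster size, and the fixed seed are all hypothetical choices.

```python
import random

rng = random.Random(0)             # fixed seed so the sketch is repeatable
population = list(range(1, 101))   # hypothetical subject IDs 1..100

# Simple random sampling: every subject has an equal chance of selection.
simple = rng.sample(population, 10)

# Systematic random sampling: random start, then every 10th item.
step = 10
start = rng.randrange(step)
systematic = population[start::step]

# Stratified random sampling: sample randomly within each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [s for group in strata.values() for s in rng.sample(group, 5)]

# Cluster random sampling: randomly pick whole clusters and include them.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
cluster_sample = [s for cluster in rng.sample(clusters, 2) for s in cluster]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
```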
64
distribution type for when the event is a binary outcome, gives the probability that a specified event occurs in a given number of independent trials
binomial
65
which values are needed to describe binomial distribution
n and π (number of trials and probability of the event)
66
distribution used to determine the probability of rare events
poisson
67
distribution type used for continuous and symmetric data used commonly in statistical analysis
gaussian (normal)
68
parameter needed for poisson
λ
69
parameter needed for gaussian
mean and SD
70
what are the mean and SD for the standard normal distribution
mean = 0, SD = 1
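
Worked example for the distribution cards: a sketch of the binomial and Poisson probability formulas and of the standard normal distribution. The function names and the example inputs are hypothetical; the formulas are the standard probability mass functions.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, pi):
    """P(exactly k events in n independent trials), each with probability pi."""
    return math.comb(n, k) * pi**k * (1 - pi) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events) when the expected count is lam (lambda)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Binomial needs n and pi: chance of exactly 2 heads in 4 fair coin flips.
p_two_heads = binomial_pmf(2, 4, 0.5)

# Poisson needs lambda: chance of no events when 2 are expected on average.
p_no_events = poisson_pmf(0, 2.0)

# Gaussian needs mean and SD; the standard normal uses mean 0 and SD 1.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(p_two_heads, round(p_no_events, 4), round(within_one_sd, 4))
```

About 68% of a normal distribution lies within 1 SD of the mean, so roughly 34% lies between the mean and 1 SD on either side.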
71
what effect does change (size of the difference) have on statistical significance
increase --> more significant; decrease --> less significant
72
what effect does variation have on statistical significance
increase --> less significant; decrease --> more significant
73
what effect does sample size have on statistical significance
increase --> more significant; decrease --> less significant
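
Worked example for the three statistical significance cards: a sketch using a one-sample z statistic (change divided by its standard error) as a stand-in for a test statistic in general. The choice of statistic and the numbers are assumptions; a larger statistic means a more significant result.

```python
import math

def z_statistic(change, sd, n):
    """Observed change divided by its standard error, sd / sqrt(n)."""
    return change / (sd / math.sqrt(n))

baseline = z_statistic(change=5, sd=10, n=25)

bigger_change = z_statistic(change=10, sd=10, n=25)   # larger change
more_variation = z_statistic(change=5, sd=20, n=25)   # larger variation
larger_sample = z_statistic(change=5, sd=10, n=100)   # larger sample size

print(baseline, bigger_change, more_variation, larger_sample)
```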
74
what is the primary assumption for using t-tests
assumes data is normally distributed
75
error that occurs when we reject a true null hypothesis
type I error (conventionally limited to 5%)
76
type of error that occurs when we do not reject a false null hypothesis
type II error (conventionally limited to 20%)
77
probability of a type I error
alpha
78
probability of a type II error
beta
79
probability of rejecting the null hypothesis when it is indeed false (1 - beta)
power
80
probability of obtaining a result at least as extreme as the one observed through chance alone (that is, if the null hypothesis is true)
p-value
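
Worked example for the alpha and p-value cards: a sketch of an exact one-sided p-value for a hypothetical experiment, 9 heads in 10 coin flips, under the null hypothesis that the coin is fair. The function name and the scenario are illustrative assumptions.

```python
import math
from fractions import Fraction

def upper_tail_p_value(successes, trials):
    """P(X >= successes) when X ~ Binomial(trials, 1/2), the fair-coin null."""
    favorable = sum(math.comb(trials, k) for k in range(successes, trials + 1))
    return Fraction(favorable, 2**trials)

alpha = Fraction(5, 100)                # accepted probability of a type I error
p_value = upper_tail_p_value(9, 10)     # results as extreme as 9 or 10 heads
reject_null = p_value < alpha

print(p_value, float(p_value), reject_null)
```

Rejecting the null here would be a type I error if the coin really were fair, and alpha caps how often that happens.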
81
which tests can increase our ability to detect differences and decrease variability
paired tests
82
test that compares binary outcomes (binomial data)
McNemar test
83
test similar to McNemar but with continuous data
Sign Test
84
test that assumes a symmetric distribution of differences around the mean, using continuous data; almost as powerful as the t-test; calculated by ranking the absolute value of the change within each pair
Wilcoxon Signed Rank
85
typical alpha value
0.05
86
typical power value
80% (β = 0.2)
87
impact on sample size needed for change
larger --> fewer subjects; smaller --> more subjects
88
impact on sample size needed for variation
larger --> more subjects; smaller --> fewer subjects
89
impact on sample size needed for alpha
larger --> fewer subjects; smaller --> more subjects
90
impact on sample size needed for power
larger --> more subjects; smaller --> fewer subjects
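
Worked example for the four sample size cards: a sketch using a common normal-approximation formula for comparing two means with a two-sided test, n per group = 2 × ((z for alpha + z for power) × SD / change)². The formula choice, the function name, and the inputs are assumptions made for illustration, not taken from the cards.

```python
import math
from statistics import NormalDist

def subjects_per_group(change, sd, alpha=0.05, power=0.80):
    """Approximate subjects per group to detect `change` between two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sd / change) ** 2)

base = subjects_per_group(change=5, sd=10)

larger_change = subjects_per_group(change=10, sd=10)             # fewer subjects
larger_variation = subjects_per_group(change=5, sd=20)           # more subjects
larger_alpha = subjects_per_group(change=5, sd=10, alpha=0.10)   # fewer subjects
larger_power = subjects_per_group(change=5, sd=10, power=0.90)   # more subjects

print(base, larger_change, larger_variation, larger_alpha, larger_power)
```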