research stats midterm Flashcards

1
Q

what is biostatistics?

A

the statistics of medicine, health sciences and public health

2
Q

define target population

A

larger population to which results will need to be generalized

3
Q

define accessible population

A

actual population of subjects available

4
Q

define sample

A

subgroup of accessible population which allows results to be generalized

5
Q

define parameter

A

statistical characteristic of population

6
Q

define statistic

A

statistical characteristic of sample

7
Q

define descriptive statistic

A

describes sample shape, central tendency, variability

8
Q

define inferential statistic

A

used to make inferences about a population

9
Q

define central tendency

A

the central value
best representative value of target population
single value

10
Q

define variability

A

spread of the data

11
Q

define frequency distribution

A

the pattern of frequencies of a variable

12
Q

3 measures of central tendency

A

mean - average
median - middle value; splits the ordered data into two equal halves
mode - most frequent score
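
As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The scores below are made-up values chosen so that one extreme score pulls the mean above the median and mode; they are not from the course material.

```python
import statistics

# Hypothetical scores with one extreme value (19)
scores = [2, 3, 3, 4, 5, 6, 19]

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

print(mean_score, median_score, mode_score)
```

Here the mean lands above the median, which lands above the mode, the pattern expected when a high extreme value is present.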

13
Q

describe skewed to the right

A

tail faces right
positive skew
mean > median/mode

14
Q

describe skewed to the left

A

tail faces left
negative skew
mean < median/mode

15
Q

when is mean best to use?

A

numeric, symmetric data

not good for skewed

16
Q

when is median best to use?

A

skewed data
not affected by extremes

17
Q

when is mode best to use?

A

nominal or ordinal
common in surveys

18
Q

advantages to mean

A

easy to calculate and interpret
don't need to arrange values
all values represented
all algebraic formulas possible

19
Q

disadvantages to mean

A

can't be used with categorical data
can't calculate if data missing
affected by extremes

20
Q

advantages to median

A

easy to calculate
not affected by extremes
can be used with ranked data

21
Q

disadvantages to median

A

tedious in large data set
problematic with even number of observations
doesn't account for all values

22
Q

advantages of mode

A

easy to understand and find
not affected by extremes
easy to ID in data set and in frequency distribution
mode is useful for categorical data

23
Q

disadvantages of mode

A

not defined if no repeats
not based on all values
unstable when data has small number of values
sometimes could have 2+ or no modes

24
Q

when would you choose median over mode?

A

distribution is skewed
researcher is using ordinal data

25
Q

define range, percentiles, quartiles

A

R - max-min
P - divides into 100 parts
Q - four parts

26
Q

define interquartile range

A

difference between 25th and 75th percentile
used with median

27
Q

describe box plot

A

min
1st quartile
median
3rd quartile
max
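
A minimal sketch of the five values a box plot draws, plus the range and interquartile range, using Python's standard `statistics` module. The data are hypothetical. Quartile conventions differ between textbooks and software, so other tools may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical ordered data
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# quantiles(n=4) returns the three cut points: Q1, median, Q3
q1, median, q3 = statistics.quantiles(data, n=4)

five_number = (min(data), q1, median, q3, max(data))
iqr = q3 - q1                       # 75th percentile minus 25th percentile
data_range = max(data) - min(data)  # max minus min

print(five_number, iqr, data_range)
```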

28
Q

define standard deviation

A

measure of spread around the mean (square root of the variance)
reported in the same units as raw scores
reported as mean +/- SD

29
Q

define variance

A

square of SD

30
Q

coefficient of variation

A

used for interval and ratio data only
expressed as percentage
unitless so good for comparing scales
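
The relationship between standard deviation, variance, and coefficient of variation can be sketched as follows. The values are made up, and the sample (n - 1) formulas are assumed, since the cards do not say which version the course uses.

```python
import statistics

# Hypothetical measurements on a ratio scale
values = [4, 8, 6, 5, 7]

variance = statistics.variance(values)  # sample variance (n - 1 denominator)
sd = statistics.stdev(values)           # square root of the variance
cv_percent = sd / statistics.mean(values) * 100  # SD as a percentage of the mean

print(variance, round(sd, 3), round(cv_percent, 1))
```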

31
Q

constant and predictable characteristics of the normal distribution

A

68% within +/- 1 SD
95% within +/- 2 SD
99.7% within +/- 3 SD
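
These proportions can be checked against a standard normal curve. The sketch below uses `statistics.NormalDist` from Python's standard library. The helper name `share_within` is just an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within +/- k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1, within_2, within_3 = (share_within(k) for k in (1, 2, 3))

# Roughly 68%, 95%, and 99.7%
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```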

32
Q

define a z-score

A

standardized score based on normal distribution
z = SD units

z = (score - mean) / SD
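
A one-line sketch of the z-score calculation. The subtraction has to happen before the division. The numbers are hypothetical.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in SD units."""
    return (score - mean) / sd

# Hypothetical example: score of 85, mean of 70, SD of 10
z = z_score(85, 70, 10)
print(z)  # 1.5 SD above the mean
```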

33
Q

define sampling error

A

sample mean will not equal the population mean. the difference is called sampling error
how well does the sample represent the population?

34
Q

z scores for CI calculations

A

90% = z 1.65
95% = z 1.96
99% = z 2.58
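
These critical values come from the standard normal distribution. As a sketch, they can be recovered with `statistics.NormalDist` by leaving half of alpha in each tail. The function name `z_critical` is an illustrative choice. The 90% value comes out near 1.645, which the card rounds to 1.65.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed z value that leaves (1 - confidence) / 2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

z_90, z_95, z_99 = (z_critical(c) for c in (0.90, 0.95, 0.99))
print(round(z_90, 3), round(z_95, 3), round(z_99, 3))
```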

35
Q

central limit theorem

A

the distribution of sample means approaches a normal distribution centered on the population mean as N increases

36
Q

define point estimate

A

single value that is the best estimate of the parameter

37
Q

define confidence interval

A

range of values that we are confident contains parameter

38
Q

how would you increase precision (narrow) in CI?

A

larger sample size
less variance (lower SD)
lower selected level of confidence to 90%

39
Q

CI equation

A

CI = mean +/- (z) SEM
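
A minimal sketch of this equation, assuming SEM is the standard error of the mean, SD / sqrt(n) (a standard definition, though not spelled out on the card). The numbers are hypothetical. The second call shows a larger sample narrowing the interval.

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """CI = mean +/- z * SEM, with SEM = SD / sqrt(n)."""
    sem = sd / math.sqrt(n)
    margin = z * sem
    return mean - margin, mean + margin

# Hypothetical sample: mean 100, SD 15
low, high = confidence_interval(mean=100, sd=15, n=36)             # about 95.1 to 104.9
low_big_n, high_big_n = confidence_interval(mean=100, sd=15, n=144)  # narrower interval

print(round(low, 2), round(high, 2))
print(round(low_big_n, 2), round(high_big_n, 2))
```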

40
Q

define null hypothesis

A

no difference or relationship

we will either reject or fail to reject it

41
Q

define alternative hypothesis

A

there is a difference or relationship

42
Q

error: liar or blind

A

type 1: liar (false positive), probability set by alpha
type 2: blind

43
Q

if p value is less than or equal to alpha,

A

reject the null

44
Q

if p value is greater than alpha,

A

fail to reject the null

45
Q

what happens if we fail to reject the null?

A

attribute any observed difference to sampling error only

46
Q

what p value and CI are analogous to each other?

A

95% CI
.05 p value

47
Q

significance of type 1 error

A

mistakenly finding difference
alpha sets the probability of this error when the null is true

48
Q

significance of type 2 error

A

mistakenly finding no difference
statistical power = 1-B
power is the probability of correctly rejecting a false null

49
Q

critical values for two tailed test

A

critical region is split: 2.5% in each tail (at alpha = .05)
nondirectional hypothesis

50
Q

critical values of one tailed test

A

all 5% of critical region on the side hypothesis supports
directional hypothesis

51
Q

which (one or two tailed) is more powerful

A

one tailed

52
Q

define statistical power

A

probability of finding a statistically significant difference if such difference exists in the real world

53
Q

what are the four powers of power?

A

alpha
effect size
variance
sample size

54
Q

best way to increase power

A

increase sample size

55
Q

determinants of statistical power

A

p = power
a = alpha level
n = sample size
e = effect size

56
Q

what is A priori

A

before data collection

57
Q

what is Post hoc

A

after data collection
only an issue if you fail to reject the null

58
Q

CI analysis

A

if upper boundary excludes important benefit of treatment, trial is definitively negative

if CI includes important benefit, treatment might still be worthwhile

59
Q

define parametric statistics

A

assumes that sample data comes from population that follows a probability distribution based on a fixed set of parameters

60
Q

what are the 4 assumptions of parametric tests?

A

scale data - ratio or interval
random sampling
equal variance - roughly equivalent before starting
normality - normal distribution

61
Q

what does a t-test do?

A

determines whether the difference in the sample represents a real difference in the population or is just sampling error

62
Q

what are examples of two levels of one independent variable?

A

two different groups
one single group with two interventions
one single group with pre and posttest measurements

63
Q

conceptual basis of comparing means

A

sample means will be different
variance comes from two sources
~the IV and everything else

64
Q

conceptual basis with independent groups

A

t= difference between means / variability within groups

65
Q

conceptual basis with repeated measures

A

t = mean of differences between pairs / standard error of the difference scores

66
Q

what if t > 1?

A

you have a greater difference between groups

67
Q

what if t< 1?

A

you have more variability within groups

68
Q

what is the most simple t test equation?

A

t = (treatment effect + error) / error

69
Q

what are degrees of freedom?

A

the number of independent pieces of information that went into calculating the estimate

number of values that are free to vary

70
Q

independent (unpaired) t-test

A

numerator is difference between group means
denominator represents the variance within groups
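
The numerator and denominator can be sketched by hand. This version assumes equal variances and uses the pooled-variance formula, one common form of the test. The function name `independent_t` and the data are illustrative.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Denominator: variability within groups, pooled across both samples
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    se_difference = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # Numerator: difference between group means
    mean_difference = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_difference / se_difference, df

# Hypothetical scores for two separate groups
t_stat, df = independent_t([5, 7, 9], [1, 3, 5])
print(round(t_stat, 3), df)
```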

71
Q

assumptions for unpaired t-tests

A

data from interval or ratio
samples are randomly drawn from populations
homogeneity of variance - equal variances
population is normally distributed

72
Q

are unequal variances an issue?

A

not a major issue when sample sizes are equal

73
Q

effect size for t-test

A

use cohen’s d

small d = 0.20
medium d = 0.50
large d = 0.80
extra large d = 1.0 or 1.1
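
A short sketch of Cohen's d for two independent groups, assuming the pooled standard deviation as the standardizer (a common choice, although other variants exist). The data are made up.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Mean difference expressed in pooled-SD units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical groups: means differ by 4, pooled SD is 2
d = cohens_d([5, 7, 9], [1, 3, 5])
print(d)
```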

74
Q

paired t-test

A

numerator is mean of paired difference scores
denominator is standard error of difference scores
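
The same ratio, sketched by hand for hypothetical pre/post scores. The standard error of the difference scores is taken as their SD divided by sqrt(n), and the function name `paired_t` is illustrative.

```python
import math
import statistics

def paired_t(pre, post):
    """Paired t statistic and degrees of freedom from difference scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    # Denominator: standard error of the difference scores
    se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    # Numerator: mean of the paired difference scores
    return statistics.mean(diffs) / se_diff, len(diffs) - 1

# Hypothetical pretest and posttest scores for the same four subjects
t_stat, df = paired_t([10, 12, 9, 11], [12, 15, 10, 13])
print(round(t_stat, 3), df)
```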

75
Q

3 assumptions for paired t-test

A

data from ratio or interval
samples are randomly drawn from populations
population is normally distributed

76
Q

what is an inappropriate use of multiple t-tests

A

to compare more than 2 means within the same sample
“family wise error”
increase chance of type I error

77
Q

which t test is used for independent groups with one IV

A

independent

78
Q

which t test is used for repeated measures with one IV

A

paired

79
Q

levene’s test

A

for equal variances for independent groups
tests the null: no significant difference in variance between groups

80
Q

what statistic does the ANOVA use?

A

the F statistic

81
Q

(ANOVA) if variance between samples is small,

A

F will be small

82
Q

(ANOVA) if variance within samples is small,

A

F will be large
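
A hand-computed sketch of the F ratio for a one-way ANOVA with independent groups: variance between group means divided by variance within groups. The function name `one_way_f` and the data are hypothetical.

```python
import statistics

def one_way_f(groups):
    """F = mean square between groups / mean square within groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Between: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within: how far each score sits from its own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical data: three groups of three scores
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2))
```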

83
Q

what is an ANOVA for?

A

compare 3+ groups

84
Q

one way ANOVA

A

one IV with 3+ levels

85
Q

one way repeated measures ANOVA

A

one IV with 3+ levels, measured on the same subjects

86
Q

comparison of group means in ANOVA

A

looks at distance of each group from the grand mean

87
Q

what is the F test called?

A

omnibus test
will tell that a difference exists, but not where

88
Q

what tells where a difference exists?

A

multiple comparison tests

89
Q

ANOVA effect size small

A

eta squared: .01
cohen’s f: .10

90
Q

ANOVA effect size medium

A

eta squared: .06
cohen’s f: .25

91
Q

ANOVA effect size large

A

eta squared: .14
cohen’s f: .40

92
Q

increased power in RM ANOVA

A

less variance

93
Q

define sphericity

A

homogeneity of variance of differences
test with mauchly’s test

94
Q

what is another name for multiple comparison tests?

A

pairwise comparisons

95
Q

describe post hoc MCT

A

performed after ANOVA
most common
test every difference

96
Q

describe planned comparisons MCT

A

instead of ANOVA
focused on specific comparisons

97
Q

what is the goal of MCT

A

decrease family wise error rate

98
Q

what is a solution to family wise error?

A

bonferroni correction
divide alpha by the number of statistical tests
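
A small sketch of the correction. The family-wise error formula shown, 1 - (1 - alpha)^m, assumes the m tests are independent. That is a simplification added here for illustration and does not appear on the card.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after dividing by the number of tests."""
    return alpha / n_tests

def familywise_error(alpha, n_tests):
    """Chance of at least one type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Hypothetical case: three pairwise comparisons at alpha = .05
adjusted = bonferroni_alpha(0.05, 3)
unadjusted_fwer = familywise_error(0.05, 3)
adjusted_fwer = familywise_error(adjusted, 3)

print(round(adjusted, 4), round(unadjusted_fwer, 4), round(adjusted_fwer, 4))
```

With three unadjusted tests the family-wise rate rises to about .14. After the correction it stays just under .05.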

99
Q

describe fisher’s least significant difference

A

essentially unadjusted t-tests (LSD)
least conservative
most power

100
Q

describe tukey’s honestly significant difference

A

IG only
middle of the road in terms of risk
most common
best balance of type I and II error

101
Q

describe bonferroni t-test

A

divides alpha by # of comparisons
most conservative
high type II error

102
Q

describe sidak

A

RM
adjusted alpha
good balance of type I and II error
most common