Week 1&2 (Descriptive/Foundations/Experimental Designs/Comparing 2 Means/Inferential Tables/Statistical Software) Flashcards

1
Q

What are the types of biostatistics

A

descriptive statistics
probability
estimate population parameters
hypothesis testing

2
Q

Types of population

A

target and accessible

3
Q

target population definition

A

the LARGER population to which results of a study will be generalized

4
Q

accessible population definition

A

the ACTUAL population of subjects available to be chosen for a study

5
Q

Sample definition

A

a subgroup of the population of interest

6
Q

parameter

A

statistical characteristics of population

7
Q

statistic

A

statistical characteristic of sample

8
Q

descriptive statistic

A

used to describe a sample’s shape, central tendency, and variability

9
Q

inferential statistic

A

used to make inferences about a population (t-test, ANOVA, Pearson’s r)

10
Q

measures of central tendency

A

mean, median, and mode
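
A minimal sketch of the three measures, assuming Python's standard `statistics` module and a made-up sample (the numbers are illustrative only, not course data):

```python
import statistics

# Hypothetical sample, for illustration only
sample = [2, 3, 3, 4, 5, 7, 18]

mean = statistics.mean(sample)      # arithmetic average
median = statistics.median(sample)  # middle value once the data are sorted
mode = statistics.mode(sample)      # most frequent value

print(mean, median, mode)
```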

11
Q

what is central tendency

A

central value, BEST representative value of the target population

12
Q

what is variability

A

the “spread” of the data
small variability: tall, narrow curve (spike-like)
large variability: wide, flat curve (wave-like)

13
Q

frequency definition

A

the number of times a value appears in a data set

14
Q

frequency distribution

A

the pattern of frequencies of a variable
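
A minimal sketch of frequency and a frequency distribution, assuming Python's standard library and a made-up list of scores (illustrative only):

```python
from collections import Counter

# Hypothetical scores, for illustration only
scores = [3, 5, 5, 2, 3, 5, 4, 3, 5]

# Frequency: how many times each value appears
frequency = Counter(scores)

# Frequency distribution: the frequencies listed across all values, in order
distribution = sorted(frequency.items())

for value, count in distribution:
    # Text "histogram": one star per occurrence
    print(value, "*" * count)
```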

15
Q

methods of displaying frequency distributions

A

histogram & stem and leaf plots

16
Q

skewed to the left (image)

A
17
Q

skewed to the right (image)

A
18
Q

normal “skewed” (image)

A
19
Q

different shapes of distributions

A

normal (B)
skewed to right (A)
skewed to left (C)

20
Q

Skewed to right (words)

A

“tail” faces right; the bulk of the curve lies to the left
AKA “positive skew”
mean > median/mode

21
Q

Skewed to left (words)

A

“tail” faces left
AKA “negative skew”
mean < median/mode
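
The mean-versus-median comparison on the skew cards can be sketched as a rough rule of thumb. The function name and data below are hypothetical, and the check is only a quick heuristic, not a formal skewness statistic:

```python
import statistics

def skew_direction(values):
    """Rough heuristic: compare the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right (positive) skew"
    if mean < median:
        return "left (negative) skew"
    return "roughly symmetric"

# Hypothetical data: one extreme high score drags the mean upward
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Mirror case: one extreme low score drags the mean downward
left_skewed = [1, 17, 18, 18, 19, 19, 20]

print(skew_direction(right_skewed))
print(skew_direction(left_skewed))
```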

22
Q

Measures of Central Tendency: best choice for MEAN

A

best choice for numeric data
(not good for skewed data)

23
Q

Measures of Central Tendency: best choice for MEDIAN

A

best for non-symmetrical data

24
Q

Measures of Central Tendency: best choice for MODE

A

limited utility; nominal or ordinal data
common in surveys

25
Q

Mean: Advantages

A

easy to calculate, values don’t have to be arranged in order, can be used in further statistical formulas

26
Q

Mean: Disadvantages

A

can’t be used with categorical data, affected by extreme values

27
Q

Median: advantages

A

easy, can be used with “ranked” data

28
Q

Median: disadvantages

A

tedious in a large data set
should be used with ordinal

29
Q

mode: advantages

A

easy to understand and calculate

30
Q

Mode: disadvantages

A

not based on all values
unstable when the data consist of a small number of values
sometimes the data has 2+ modes or no modes at all

31
Q

common measures of variability

A

range, interquartile range, standard deviation, variance, coefficient of variation
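
A minimal sketch of several of these measures on a made-up sample, assuming Python's standard `statistics` module. The sample (n - 1) formulas are used, and the coefficient of variation is taken here as SD / mean × 100:

```python
import statistics

# Hypothetical sample, for illustration only
sample = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(sample) - min(sample)  # highest minus lowest score
variance = statistics.variance(sample)   # sample variance (divides by n - 1)
sd = statistics.stdev(sample)            # sample standard deviation
cv = sd / statistics.mean(sample) * 100  # coefficient of variation, in percent

print(value_range, round(variance, 2), round(sd, 2), round(cv, 1))
```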

32
Q

range

A

difference between highest and lowest score

33
Q

percentiles of range

A

a score’s position within the distribution (divides into 100 parts)

34
Q

quartiles of range

A

divides distribution into 4 equal parts

35
Q

interquartile range (IQR)

A

difference between 25th and 75th percentile
often used with median

36
Q

What is a box plot?

A

five-number summary of data set
(minimum, 1st quartile, median, 3rd quartile, maximum)
box = interquartile range
horizontal line at median
“whiskers” = minimum and maximum scores
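
The numbers behind a box plot can be sketched as follows, assuming Python's `statistics.quantiles` with the "inclusive" method and made-up scores. Statistical packages compute quartiles in slightly different ways, so results may differ a little from other software:

```python
import statistics

# Hypothetical scores, for illustration only
scores = [7, 1, 9, 3, 5, 2, 8, 4, 6]

# quantiles() sorts the data and returns the three cut points for n=4
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number_summary = (min(scores), q1, median, q3, max(scores))
iqr = q3 - q1  # height of the "box"

print(five_number_summary, iqr)
```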

37
Q

coefficient of variation

A

used for interval and ratio data only
unitless
helpful comparing variability between two distributions on different scales

38
Q

what shape is normal distribution?

A

bell-shaped

39
Q

constant and predictable characteristics of normal distribution

A

about 68% of scores fall within 1 SD of the mean
about 95% of scores fall within 2 SD of the mean
about 99.7% of scores fall within 3 SD of the mean
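
These proportions can be checked with a short sketch, assuming Python's `statistics.NormalDist` (the helper name `share_within` is hypothetical). The exact values round to roughly 68%, 95%, and 99.7%:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k) * 100, 1), "%")
```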

40
Q

z-scores

A

a standardized score based on the normal distribution
allows for the interpretation of a single score in relation to the distribution of scores
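
A minimal sketch, using the standard formula z = (score - mean) / SD with made-up values (the function name is hypothetical):

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Hypothetical example: a score of 85 when the mean is 70 and the SD is 10
z = z_score(85, mean=70, sd=10)
print(z)
```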

41
Q

probability definition

A

the likelihood that any one event will occur, given all the possible outcomes
“what is likely to happen”

42
Q

sampling error

A

difference between sample mean and population mean

43
Q

what is sampling error measured by

A

standard error of the mean (SEM)

44
Q

standard error of the mean equation

A

SEM = SD / square root of sample size (n)

45
Q

what happens to the SEM if we increase our sample size?

A

decrease in error

46
Q

What happens to the SEM if we increase our standard deviation?

A

increase in error
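
The SEM equation and the two effects above can be sketched together. The function name and numbers are hypothetical:

```python
import math

def standard_error(sd, n):
    """SEM = SD / square root of sample size."""
    return sd / math.sqrt(n)

baseline = standard_error(sd=10, n=25)    # starting point
bigger_n = standard_error(sd=10, n=100)   # larger sample, smaller SEM
bigger_sd = standard_error(sd=20, n=25)   # larger SD, larger SEM

print(baseline, bigger_n, bigger_sd)
```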

47
Q

what is the standard error of the mean

A

the standard deviation of the sampling distribution of the mean
allows us to estimate population parameters

48
Q

90% confidence interval (mean ± z × SEM) = z-score of what

A

1.65

49
Q

95% confidence interval (mean ± z × SEM) = z-score of what

A

1.96

50
Q

99% confidence interval (mean ± z × SEM) = z-score of what

A

2.58
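
The three z-values (1.65, 1.96, 2.58) are rounded two-sided critical values of the standard normal distribution. A sketch of where they come from, assuming Python's `statistics.NormalDist` (the helper name is hypothetical):

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical z for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```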

51
Q

point estimate

A

a single value that represents the best estimate of the population value

52
Q

confidence interval

A

a range of values that we are confident contains the population parameter

53
Q

increased precision (narrowed) by…

A

larger sample size
less variance (lower s)
lower selected level of confidence (90% vs 95%)
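
A sketch of how each factor narrows the interval, assuming a z-based confidence interval for a mean (mean ± z × SEM). Small samples would normally use t instead of z, and all names and numbers here are hypothetical:

```python
import math
from statistics import NormalDist

def ci_width(sd, n, confidence=0.95):
    """Full width of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    sem = sd / math.sqrt(n)
    return 2 * z * sem

baseline = ci_width(sd=10, n=25, confidence=0.95)
larger_n = ci_width(sd=10, n=100, confidence=0.95)   # larger sample size
smaller_sd = ci_width(sd=5, n=25, confidence=0.95)   # less variance
lower_conf = ci_width(sd=10, n=25, confidence=0.90)  # lower confidence level

print(round(baseline, 2), round(larger_n, 2), round(smaller_sd, 2), round(lower_conf, 2))
```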

54
Q

null hypothesis means

A

there is no difference

55
Q

type I error

A

alpha
“liar”
we say there is a difference but there is no difference
reject the null but the null is true

56
Q

type II error

A

beta
“blind”
we say there is no difference but there is a difference
do not reject the null but the null is false

57
Q

normal value of alpha

A

.05

58
Q

p-value

A

the probability of obtaining a result at least as extreme as the one observed, if the null hypothesis is true

59
Q

if p-value < alpha

A

reject the null
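
A sketch of the decision rule, assuming a z test statistic and a two-tailed test. The function names and z-values are hypothetical, and the p-value is computed as the probability of a result at least this extreme when the null is true:

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Compare the p-value with alpha."""
    return "reject the null" if p_value < alpha else "fail to reject the null"

print(decide(two_tailed_p(2.5)))  # p is about .012
print(decide(two_tailed_p(1.0)))  # p is about .317
```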

60
Q

if p-value > alpha

A

fail to reject the null

61
Q

if we “fail to reject” the null, we attribute any observed difference to

A

sampling error only

62
Q

if a confidence interval does not have 0 it means

A

there is a statistically significant difference (for a confidence interval around a difference between means)

63
Q

if a confidence interval does have 0 it means

A

there is no statistically significant difference (for a confidence interval around a difference between means)

64
Q

mistakenly finding a difference

A

false-positive

65
Q

mistakenly finding no difference

A

false-negative

66
Q

statistical power formula

A

1 - beta

67
Q

critical values for a two-tailed test

A

±1.96 (at alpha = .05)

68
Q

one-tailed test is for

A

directional hypothesis

69
Q

two-tailed test is for

A

nondirectional hypothesis

70
Q

statistical power

A

the probability of finding a statistically significant difference if such a difference exists in the real world
the probability that the test correctly rejects the null hypothesis

71
Q

four pillars of power

A

alpha, effect size, variance, sample size

72
Q

to increase power

A

higher alpha, large effect size, LOW variance, large sample size
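
A rough sketch of these relationships, assuming a two-tailed comparison of two equal-sized groups and a normal approximation (not an exact t-based power calculation). The function name is hypothetical. Variance enters through d, since d is the mean difference divided by the SD, so lower variance means a larger d:

```python
import math
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power for a two-group, two-tailed test (normal approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n_per_group / 2)  # expected size of the test statistic
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

base = approx_power(d=0.5, n_per_group=64)                    # about .80
more_alpha = approx_power(d=0.5, n_per_group=64, alpha=0.10)  # higher alpha
bigger_d = approx_power(d=0.8, n_per_group=64)                # larger effect size
bigger_n = approx_power(d=0.5, n_per_group=128)               # larger sample size

print(round(base, 2), round(more_alpha, 2), round(bigger_d, 2), round(bigger_n, 2))
```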

73
Q

decreased power

A

lower alpha, small effect size, HIGHER variance, smaller sample size

74
Q

determinants of statistical power

A

Power (1 - beta), Alpha level of significance, N (sample size), Effect size
PANE

75
Q

A priori

A

before data collection
before study

76
Q

Post hoc

A

after data collection
after study

77
Q

A priori analysis standard effect sizes: small

A

d = .20

78
Q

A priori analysis standard effect sizes: medium

A

d = .50

79
Q

A priori analysis standard effect sizes: large

A

d = .80

80
Q

True experimental design

A

RCT = gold standard
IV manipulated by researcher
at least 2 groups
randomly assigned

81
Q

Quasi-experimental designs

A

may lack randomization
may lack comparison group
may lack both

82
Q

does a posttest-only control group give us all the information we need?

A

no, we don’t have all the information (there is no pretest, so no baseline measurement)

83
Q

same people in each level of the IV

A

within-subject design

84
Q

single-factor (one-way) repeated measures design

A

no control group, subjects act as their own controls

85
Q

examples of parametric statistics tests

A

t-tests, ANOVA, ANCOVA, Correlation, Regression

86
Q

Assumptions of Parametric Test

A

scale data (ratio or interval), random sampling, equal variance, normality

87
Q

t-test

A

comparing 2 means
2 different groups

88
Q

variance (differences) comes from 2 sources:

A

IV and everything else (error variance)

89
Q

comparing means for INDEPENDENT groups

A

difference between means / variability within groups
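
A sketch of this ratio as the pooled-variance (equal variance) independent t statistic, assuming Python's standard library and made-up groups. The function name is hypothetical:

```python
import math
import statistics

def independent_t(group_a, group_b):
    """t = difference between means / standard error of that difference."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: weighted average of the within-group variances
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, n_a + n_b - 2  # t statistic and degrees of freedom

# Hypothetical scores for two separate groups
t_stat, df = independent_t([6, 7, 8, 9, 10], [4, 5, 6, 7, 8])
print(round(t_stat, 2), df)
```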

90
Q

comparing means for REPEATED measures

A

mean of differences between pairs / Std error of the difference scores
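
A sketch of the paired version, assuming made-up before/after scores for the same five subjects. The function name is hypothetical:

```python
import math
import statistics

def paired_t(before, after):
    """t = mean of the difference scores / standard error of the difference scores."""
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / se_diff, len(diffs) - 1  # t statistic and degrees of freedom

# Hypothetical repeated measures on the same subjects
paired_stat, paired_df = paired_t(before=[10, 12, 9, 11, 13], after=[12, 15, 10, 14, 14])
print(round(paired_stat, 2), paired_df)
```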

91
Q

if t > 1 then

A

the difference between groups is greater than the variability within groups

92
Q

if t < 1 then

A

the variability within groups is greater than the difference between groups

93
Q

comparing means formula:

A

t = (treatment effect + error) / error

94
Q

degrees of freedom definition

A

the number of independent pieces of information that went into calculating the estimate

95
Q

degrees of freedom equation

A

df = n - 1 (one sample, or paired t-test where n = number of pairs)
df = n1 + n2 - 2 (independent t-test)

96
Q

assumptions of unpaired t-tests

A

data from ratio or interval scales
samples are randomly drawn from populations
homogeneity of variance
population is normally distributed

97
Q

Effect size for t-test

A

the measure of effect the IV has on the DV

98
Q

effect size: small

A

d = .20

99
Q

effect size: medium

A

d = .50

100
Q

effect size: large

A

d = .80
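
A sketch of how d can be computed, assuming the common pooled-SD form of Cohen's d (difference between means / pooled SD) and made-up groups. The function name is hypothetical:

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical scores for two groups
d = cohens_d([6, 7, 8, 9, 10], [4, 5, 6, 7, 8])
print(round(d, 2))
```

With these made-up scores d comes out above .80, so it would count as a large effect by the benchmarks on these cards.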

101
Q

assumptions of paired t-tests

A

data from ratio or interval scales
samples are randomly drawn from populations
population is normally distributed

102
Q

what is the best way to DECREASE the width of the CI?

A

decrease the percentage associated with the confidence interval