Lesson 1-10 Flashcards

1
Q

Defining who or what is going to be studied means defining the

A

population

2
Q

is a smaller set or a subset of the population

A

sample

3
Q

occurs when certain members of the population are chosen so that the sample systematically misrepresents the population

A

biased sample

4
Q

must be created where respondents are
listed and assigned a unique number.

A

sampling frame

5
Q

Each subject in the population has the same chance of being selected

A

Simple random sampling

6
Q

The sampling frame is divided into subgroups or strata and simple random samples are
conducted within the strata.

A

Stratified random sampling

7
Q

The sampling frame is ordered, and a number s is selected so that every s-th subject is
selected to be in the sample.

A

Systematic random sampling
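
The selection rule on this card can be sketched in a few lines of Python. This is only an illustration: the frame of 100 numbered subjects, the function name `systematic_sample`, and the random starting position are assumptions, not part of the deck.

```python
import random

def systematic_sample(frame, s, seed=None):
    """Return every s-th subject from an ordered sampling frame.

    Assumption of this sketch: the first subject is picked at random
    from the first s entries, then every s-th subject follows.
    """
    rng = random.Random(seed)
    start = rng.randrange(s)   # random start position in 0..s-1
    return frame[start::s]     # every s-th subject from that start

frame = list(range(1, 101))    # hypothetical frame: subjects numbered 1..100
sample = systematic_sample(frame, s=10, seed=1)
print(sample)
```

With 100 subjects and s = 10, the sample always has 10 subjects spaced exactly 10 positions apart.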

8
Q

is how information on the subjects will be collected.

A

Study design

9
Q

Subjects are identified and followed for a specific period of time.

A

Prospective study

10
Q

a type of medical research used to investigate the causes of disease and to establish links between risk factors and health outcomes.

A

Cohort study

11
Q

An outcome is identified, after the data have already been collected.

A

Retrospective study

12
Q

Study where previously collected
data are reviewed to determine whether any characteristics impacted the outcome.

A

Retrospective study

13
Q

Study where existing data are obtained to determine what factors were
related to subjects becoming either a case or a control.

A

Case control study

14
Q

those having the outcome

A

Case subjects

15
Q

those not having the
outcome

A

control subjects

16
Q

Data are collected at a particular time point and represent a cross-section of time.

A

Cross-sectional study

17
Q

Variables whose measurements represent a limited set of possible values.

A

discrete variables

18
Q

The values of a discrete variable can be expressed as?

A

Numbers, characters, words

19
Q

These are variables with different levels or categories whose order matters. Examples
include pain scores, stages of cancer, and educational attainment

A

Ordinal

20
Q

These are categorical variables with different levels or categories whose order does
not matter. Examples are tooth color, marital status, and political affiliation.

A

Nominal

21
Q

These are variables that can have only two levels.

A

Dichotomous

22
Q

True or false: Sex is an example of a dichotomous variable.

A

True

23
Q

Variables whose measurements represent an unlimited set of possible values.

A

Continuous

24
Q

These variables can take on only positive, whole number values.

A

Count

25
True or false: Continuous variables can have only numeric values.
True
26
The total number of subjects with a particular category or level
Counts
27
is simply the count for a category divided by the total number of subjects.
Proportions
28
is the proportion times 100
Percentages
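
Cards 26 to 28 chain together: a count, divided by the total, gives a proportion, and the proportion times 100 gives a percentage. A minimal sketch, using made-up blood-type responses as hypothetical data:

```python
from collections import Counter

responses = ["A", "B", "A", "O", "A", "B", "O", "A"]  # hypothetical data

counts = Counter(responses)                 # count per category
n = len(responses)                          # total number of subjects
proportions = {k: c / n for k, c in counts.items()}          # count / total
percentages = {k: p * 100 for k, p in proportions.items()}   # proportion * 100

for category in counts:
    print(category, counts[category], proportions[category], percentages[category])
```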
29
It provides a description of the average response
measure of center
30
It provides a description of how varied the responses are
measure of spread
31
This is commonly used to describe the center of the responses.
Mean
32
True or false: when extremely large or small values are present, the mean is a better measure of the center.
False, median is a better measure
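
To see why the median is preferred when extreme values are present, the sketch below adds one very large value to a small set of hypothetical numbers. The data are invented for illustration.

```python
from statistics import mean, median

values = [30, 32, 35, 38, 40]        # hypothetical responses
with_outlier = values + [1000]       # same data plus one extreme value

print(mean(values), median(values))              # center without the outlier
print(mean(with_outlier), median(with_outlier))  # center with the outlier
```

The mean is pulled far toward the extreme value, while the median barely moves.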
33
These are numerical summaries that describe the population.
Parameters
34
are the numerical summaries that an investigator wants but cannot obtain directly because collecting data on the entire population is not feasible.
Parameters
35
These are numerical summaries that describe the sample.
Statistics
36
What are the basic sciences of public health?
Epidemiology and biostatistics
37
is about the understanding of disease development and the methods used to uncover the etiology, progression, and treatment of the disease.
Epidemiology
38
is collected to investigate a question
Information (data)
39
For a variable, this consists of a summary of the possible values the variable can have and the number of subjects with each of these values.
distribution
40
distribution that uses counts to describe the number of subjects with a particular value
frequency distribution
41
distribution that uses proportions to describe the share of subjects with a particular value
probability distribution
42
Two types of graphs are used to summarize categorical variables
pie charts and bar graphs.
43
can be presented using frequencies or proportions
Pie charts
44
describes how the pieces relate to the whole
Pie charts
45
They demonstrate how the categories within a variable relate to each other
Bar graphs
46
are used to describe the distributions of categorical variables.
Bar graphs
47
are used when the data include a variable with two options.
Binomial distributions
48
Binomial distributions describe what type of variable?
dichotomous
49
best describe the distribution of a continuous variable
Histograms
50
is a graphical representation of a variable in which the observed values are categorized, a bar is drawn for each category, and the number of participants in each category is represented by the height of the bar.
Histograms
51
It provides a quick picture of the distribution of a variable and it can be presented with counts or proportions of participants.
Histograms
52
They provide information about how spread out the responses are, which responses are common, which responses are in the center, and the overall shape of the distribution.
Histograms
53
can be folded in half so that each half is close to a mirror image of the other
Symmetric distributions
54
This distribution has one mode or one most common value
unimodal
55
A distribution with two peaks can be
bimodal
56
When the histogram is bell-shaped, unimodal, and symmetric, with the mean, median, and most common value at the center at the peak, the data come from a _____
normal distribution.
57
can be used to determine if observations are common or extreme
empirical rule
58
A distribution is ___ skewed when the distribution has a tail that extends longer to the left, that is, there is a set of observations with lower values than those of the majority of the observed responses.
left
59
A distribution is ___ skewed when the distribution has a tail that extends longer to the right, that is, there is a set of observations with higher values than those of the majority of the observed responses.
right
60
is a discrete probability distribution whose possible values are whole numbers from 0 to infinity.
Poisson distribution
61
are percentages of all the observations that are less than the value of interest.
Percentiles
62
It is used to determine whether a particular value is common or rare.
Percentiles
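
A percentile as defined on card 61 can be computed directly: count the observations below the value of interest and express that count as a percentage. The function name `percentile_of` and the data are hypothetical.

```python
def percentile_of(value, observations):
    """Percentage of observations strictly less than `value`."""
    below = sum(1 for x in observations if x < value)
    return 100 * below / len(observations)

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # hypothetical observations
print(percentile_of(8, scores))            # 7 of the 10 values are below 8
```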
63
in measurements occurs when multiple measurements are taken on the same subject.
Variability
64
If there is little measurement variability, the measurement has?
reliability
65
The idea that samples may be different
sampling variability
66
The values of a statistic and the number of times each value occurs across all possible samples are known as the?
distribution of samples or the sampling distribution
67
It provides a description of all possible statistics obtained from samples
sampling distribution
68
is the characterization of all sample means
central limit theorem
69
According to this theorem, the distribution of the means obtained from all possible samples will result in a normally shaped distribution, in which the center of the distribution is the true parameter and one standard deviation of the sampling distribution is the standard error of the mean.
central limit theorem
70
This theorem holds true for large sample sizes.
central limit theorem
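
The claims on cards 68 to 70 can be checked with a small simulation. This is a sketch under assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (deliberately not bell-shaped), the sample size is 50, and 2,000 samples stand in for "all possible samples".

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)     # fixed seed so the run is repeatable
n = 50                     # sample size (assumed "large")
num_samples = 2000         # stand-in for "all possible samples"

# Draw many samples from a skewed population and keep each sample mean.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

center = mean(sample_means)            # should sit near the true mean, 1.0
spread = stdev(sample_means)           # empirical standard error of the mean
theoretical_se = 1.0 / sqrt(n)         # population SD / sqrt(n)

print(center, spread, theoretical_se)
```

The center of the sample means lands close to the population mean, and their standard deviation lands close to the standard error, population SD divided by the square root of n.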
71
is a basic and commonly used type of predictive analysis.
Linear regression
72
It may be called an outcome variable, criterion variable, endogenous variable, or regressand.
dependent variable
73
It can be called exogenous variables, predictor variables, or regressors
independent variables
74
is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
Coefficient of Determination
75
It is often useful to represent data with the equation of this straight line in order to predict values that may not be displayed on the plot.
line of best fit
76
determined by the correlation between the two variables on a scatter plot
line of best fit
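
One common way to obtain a line of best fit is ordinary least squares; the deck does not name a method, so treat that choice as an assumption. The function name `best_fit_line` and the data points are hypothetical.

```python
def best_fit_line(xs, ys):
    """Least-squares slope and intercept for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4]          # hypothetical independent variable
ys = [3, 5, 7, 9]          # hypothetical dependent variable
slope, intercept = best_fit_line(xs, ys)

# Predict a value that is not on the plot.
predicted_at_10 = slope * 10 + intercept
print(slope, intercept, predicted_at_10)
```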
77
is a statistical technique that can show whether and how strongly pairs of variables are related.
Correlation
78
If the correlation is greater than 0, then the variables are
positively correlated.
79
If the correlation is less than 0, then the variables are said to be
negatively correlated
80
If the correlation is exactly 0, such as for birthweight and birthday, then the variables are said to be
uncorrelated
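
The sign rules on cards 78 to 80 can be tried out with the Pearson correlation coefficient, sketched here from its textbook definition. The function name `pearson_r` and all data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, tol=1e-9):
    """Label a correlation using the rules from the cards."""
    if r > tol:
        return "positively correlated"
    if r < -tol:
        return "negatively correlated"
    return "uncorrelated"

print(describe(pearson_r([1, 2, 3, 4], [2, 4, 6, 8])))   # rises together
print(describe(pearson_r([1, 2, 3, 4], [8, 6, 4, 2])))   # one rises, one falls
print(describe(pearson_r([1, 2, 3], [1, 2, 1])))         # no linear trend
```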
81
exists when high scores in one variable are associated with high scores in the second variable or low scores in one variable are associated with low scores in the other
POSITIVE CORRELATION
82
exists when high scores in one variable are associated with low scores in the second or vice versa.
NEGATIVE CORRELATION
83
exists when the points on the scatter diagram are spread in a random manner
ZERO CORRELATION
84
all points lie on a straight line
PERFECT CORRELATION
85
True or false: A key thing to remember when working with correlations is never to assume a correlation means that a change in one variable causes a change in another
True
86
It seeks to find the relationship between two variables.
Correlation
87
is commonly used for testing relationships between categorical variables.
Chi Square statistic
88
The _______ of the Chi-Square test is that no relationship exists between the categorical variables in the population; they are independent.
null hypothesis
89
The Chi-Square statistic is most commonly used to evaluate _________ when using a crosstabulation (also known as a bivariate table).
Tests of Independence
90
________ presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table.
Crosstabulation
91
The ___________ assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells to the pattern that would be expected if the variables were truly independent of each other
Test of Independence
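
The comparison of observed and expected cell counts can be sketched as follows. The expected count for a cell is taken as row total times column total divided by the grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The function name and the crosstabulation values are hypothetical.

```python
def chi_square_statistic(table):
    """Chi-Square statistic for a crosstabulation given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical crosstabulation: rows = student status, columns = outcome.
observed = [[10, 20],
            [20, 10]]
print(chi_square_statistic(observed))
```

A statistic of 0 means the observed pattern matches the independence pattern exactly; larger values indicate a larger departure from it.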
92
Is student status (in-state versus out-of-state) associated with one’s eventual graduation outcome (graduating versus not graduating)? Answer: Chi-Square test of _____ _ ________
Independence
93
To test a theory that people have no preference among four different outdoor activities, you ask 100 people to select among jogging, bicycling, hiking, or swimming. Answer: Chi-Square test of _____ _ ________
Goodness of fit
94
A biostatistician would like to determine whether the mix of blood types stored for transfusions should be different in Hawaii from that on the mainland. She collected a sample of blood types from 10,000 people in Hawaii and from 100,000 people on the mainland. She wishes to see if the breakdown of blood types (A, B, AB, and O) is the same for both populations. Answer: Chi-Square test of _____ _ ________
Homogeneity
95
A researcher wants to determine if scoring high or low on an artistic ability test depends on being right or left-handed. Answer: Chi-Square test of _____ _ ________
Independence
96
A national organization wants to compare the distribution of level of highest education completed (high school, college, masters, doctoral) for Republicans versus Democrats. Answer: Chi-Square test of _____ _ ________
Homogeneity
97
A preservation society has the percentages of five main types of fish in the river from 10 years ago. After noticing an imbalance recently, they add some fish from hatcheries to the river. How can they determine if they restored the ecosystem from a new sample of fish? Answer: Chi-Square test of _____ _ ________
Goodness of fit
98
is a way to find out if survey or experiment results are significant. In other words, it helps you figure out whether to reject the null hypothesis in favor of the alternative hypothesis
ANOVA test
99
is used to compare the means of two or more independent (unrelated) groups using the F-distribution
one way ANOVA
100
The null hypothesis for the one way ANOVA is that the ______
group means are all equal
101
True or false: One way ANOVA will tell you that at least two groups were different from each other and which groups were different.
False, it won’t tell you which groups were different
102
If the computed F value is greater than the tabulated F value, then the null hypothesis is
rejected
103
If the computed F value is less than the tabulated F value, then the null hypothesis is
not rejected
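
The decision rule on cards 102 and 103 can be sketched by computing F by hand for a one way ANOVA. The three groups are hypothetical data, and the tabulated value 5.14 is the critical F for 2 and 6 degrees of freedom at the 0.05 level, as read from a standard F table.

```python
def one_way_anova_f(groups):
    """Computed F value: between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)         # df between = k - 1
    ms_within = ss_within / (n_total - k)     # df within = N - k
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]    # hypothetical data, 3 groups
computed_f = one_way_anova_f(groups)
tabulated_f = 5.14                            # F table, df = (2, 6), alpha = 0.05

reject_null = computed_f > tabulated_f
print(computed_f, reject_null)
```

Here the computed F is smaller than the tabulated F, so the null hypothesis of equal means is not rejected.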
104
is used when the research question involves the comparisons of means from more than two independent groups.
ANOVA
105
It provides a statistical test for determining whether there is enough evidence to reject the null hypothesis that all the means are equal.
ANOVA
106
It is the probability of the occurrence of a disease or other health outcome of interest during a specified period, usually one year
Risk
107
is calculated by dividing the number who got the disease during the defined period by the total population of interest during that period.
Risk
108
is the calculated ratio of incidence rates of a health condition or outcome in two groups of people, those exposed to a factor of interest and those not exposed.
Relative risk
109
used to determine if exposure to a specific risk factor is associated with an increase, decrease, or no change in the disease or outcome rate when compared to those without the exposure.
Relative risk
110
is a statistical measure of the strength of the association between a risk factor and an outcome.
Relative risk
111
The fundamental comparison of rates using a ratio in epidemiology is known as the
rate ratio
112
When the rates being compared are incidence rates, epidemiologists call those comparisons ____
risk ratios
113
A risk ratio is also referred to as
relative risk (RR)
114
is a measure of association that provides the strength of association between exposure and outcome in a population
relative risk
115
True or false: Relative risk is not a flexible tool.
False
116
When the relative risk is above 1, the interpretation is that those in the exposed group are __________ the outcome than those in the nonexposed group
more likely to have
117
The larger the number, the _______ the relationship between being exposed and having the outcome.
stronger
118
Relative Risk = 1
Null value; No relationship exists
119
Relative Risk > 1
Positive association; more likely to have the outcome
120
Relative Risk < 1
Negative association; less likely to have the outcome
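
Risk, relative risk, and the three interpretations above fit together as in this sketch. The function names and the counts (30 of 100 exposed and 10 of 100 unexposed developing the outcome) are hypothetical.

```python
from math import isclose

def risk(cases, population):
    """Number who got the disease divided by the population of interest."""
    return cases / population

def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return risk(exposed_cases, exposed_total) / risk(unexposed_cases, unexposed_total)

def interpret(rr):
    """Apply the interpretation rules from the cards."""
    if isclose(rr, 1.0):
        return "Null value; no relationship exists"
    if rr > 1:
        return "Positive association; more likely to have the outcome"
    return "Negative association; less likely to have the outcome"

rr = relative_risk(30, 100, 10, 100)   # hypothetical counts
print(round(rr, 2), interpret(rr))
```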
121
is a measure of association that provides strength of association between exposure and outcome in a population.
RELATIVE RISK
122
is a measure of association that provides the strength and direction of the association between exposure and outcome in a population.
odds ratio
123
odds ratio greater than 1 indicates a ______ between exposure and outcome
positive association
124
odds ratio less than 1 indicates a _____ between exposure and outcome.
negative association
125
The exposure odds ratio compares the ___ odds in those with the outcome to the exposure odds in those without the outcome.
Exposure
126
The outcome odds ratio compares the ___ odds in those with exposure to the outcome odds in those without exposure.
Outcome
127
first way that the odds ratio can be calculated
Exposure Odds Ratio
128
Formula for exposure OR
$\frac{a/c}{b/d}$
129
Formula for Outcome OR
$\frac{ad}{bc}$
130
second way that the odds ratio can be calculated
Outcome Odds Ratio
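
Both ways of calculating the odds ratio give the same number, which a short sketch can confirm. The cell labels are an assumption based on the usual 2x2 layout: a = exposed with the outcome, b = exposed without the outcome, c = unexposed with the outcome, d = unexposed without the outcome. The counts are hypothetical.

```python
def exposure_odds_ratio(a, b, c, d):
    """Exposure odds among those with the outcome (a/c) over
    exposure odds among those without the outcome (b/d)."""
    return (a / c) / (b / d)

def outcome_odds_ratio(a, b, c, d):
    """Outcome odds among the exposed (a/b) over outcome odds among
    the unexposed (c/d), which simplifies to ad / bc."""
    return (a * d) / (b * c)

a, b, c, d = 20, 10, 80, 90      # hypothetical 2x2 table counts
print(exposure_odds_ratio(a, b, c, d))
print(outcome_odds_ratio(a, b, c, d))
```

An odds ratio above 1, as with these counts, indicates a positive association between exposure and outcome.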
131
measure of association that provides strength and direction of the association between existing exposure and outcome in the population.
Prevalence Ratio
132
a measure of association between exposure and outcome, provides strength and direction using two incidence densities
Incidence Density Ratio
133
a measure of association that provides the strength and direction of the association between exposure and outcome in a population.
Odds ratio
134
is another tool used for testing a population mean when the variance is unknown and/or the sample size is small (n < 30).
T-test
135
is used to test the hypothesis involving the mean of a study.
T-test
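
For a single small sample, the test statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error estimated from the sample. A minimal sketch of the one-sample case; the function name, the data, and the hypothesized mean of 5 are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)   # s estimates the unknown SD
    return (mean(sample) - mu0) / standard_error

sample = [5, 6, 7, 8, 9]            # hypothetical small sample (n < 30)
t_value = one_sample_t(sample, mu0=5)
degrees_of_freedom = len(sample) - 1

print(round(t_value, 3), degrees_of_freedom)
```

The computed t would then be compared with a tabulated t value for n - 1 degrees of freedom.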