Lesson 1-10 Flashcards by KYNA MAE MERILLES

Defining who or what is going to be studied means defining the

population

is a smaller set or a subset of the population

sample

occurs when certain members of the population are chosen so that the sample systematically misrepresents the population

biased sample

must be created where respondents are
listed and assigned a unique number.

sampling frame

Each subject in the population has the same chance of being selected

Simple random sampling

The sampling frame is divided into subgroups or strata and simple random samples are
conducted within the strata.

Stratified random sampling

The sampling frame is ordered, and a number s is selected so that every s-th subject is
selected to be in the sample.

Systematic random sampling
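
The three sampling methods above can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 20 ID numbers, the two strata "A" and "B", the function names, and the fixed seed are all assumptions made for the example, not part of the lesson.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every subject in the frame has the same chance of being selected."""
    return rng.sample(frame, n)

def stratified_random_sample(strata, n_per_stratum, rng):
    """Run a simple random sample inside each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(frame, s, rng):
    """Pick a random start among the first s subjects, then take every s-th one."""
    start = rng.randrange(s)
    return frame[start::s]

rng = random.Random(0)                       # fixed seed so the sketch is repeatable
frame = list(range(1, 21))                   # hypothetical sampling frame: IDs 1..20
strata = {"A": frame[:10], "B": frame[10:]}  # hypothetical strata

srs = simple_random_sample(frame, 5, rng)
strat = stratified_random_sample(strata, 2, rng)
syst = systematic_sample(frame, 4, rng)
```

With 20 subjects and s = 4, the systematic sample always contains 5 subjects whose ID numbers are 4 apart.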

is how information on the subjects will be collected.

Study Designs

Subjects are identified and followed for a specific period of time.

Prospective study

a type of medical research used to investigate the causes of disease and to establish links between risk factors and health outcomes.

Cohort study

An outcome is identified, after the data have already been collected.

Retrospective study

Study where previously collected
data are reviewed to determine whether any characteristics impacted the outcome.

Retrospective study

Study where existing data are obtained to determine what factors were
related to subjects becoming either a case or a control.

Case control study

those having the outcome

Case subjects

those not having the
outcome

control subjects

Data are collected at a particular time point and represent a cross-section of time.

Cross-sectional study

Variables whose measurements represent a limited set of possible values.

discrete variables

The values of a discrete variable can be expressed as?

Numbers, characters, words

These are variables with different levels or categories whose order matters. Examples
include pain scores, stages of cancer, and educational attainment

Ordinal

These are categorical variables with different levels or categories whose order does
not matter. Examples are tooth color, marital status, and political affiliation.

Nominal

These are variables that can have only two levels.

Dichotomous

True or false: Sex is an example of a dichotomous variable.

True

Variables whose measurements represent an unlimited set of possible values.

Continuous

These variables can take on only nonnegative whole-number values.

Count

True or false: Continuous variables can have only numeric values.

True

The total number of subjects with a particular category or level

Counts

is simply the count for a category divided by the total number of subjects.

Proportions

is the proportion times 100

Percentages
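
The three summaries above (counts, proportions, percentages) can be computed as in this small sketch. The list of blood types is hypothetical data made up for the example.

```python
from collections import Counter

responses = ["A", "O", "O", "B", "A", "O", "AB", "O"]  # hypothetical categorical data

counts = Counter(responses)              # number of subjects in each category
n = len(responses)                       # total number of subjects
proportions = {level: c / n for level, c in counts.items()}
percentages = {level: p * 100 for level, p in proportions.items()}
```

Here 4 of the 8 subjects are type O, so the proportion is 0.5 and the percentage is 50.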

It provides a description of the average response

measure of center

It provides a description of how varied the responses are

measure of spread

This is commonly used to describe the center of the responses.

Mean

True or false: when extremely large or small values are present, the mean is a better measure of the center.

False, median is a better measure
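
A quick way to see why the median is preferred when extreme values are present, using made-up numbers (the values below are hypothetical):

```python
import statistics

values = [30, 32, 35, 36, 38]      # hypothetical responses
with_outlier = values + [500]      # same data plus one extremely large value

mean_before = statistics.mean(values)
median_before = statistics.median(values)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)
```

The single extreme value pulls the mean from about 34 to above 100, while the median only moves from 35 to 35.5.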

These are numerical summaries that describe the population.

Parameters

are the numerical summaries that an investigator wants but cannot obtain directly because collecting data on the entire population is not feasible.

Parameters

These are numerical summaries that describe the sample.

Statistics

What are the basic sciences of public health?

Epidemiology and biostatistics

is about the understanding of disease development and the methods used to uncover the etiology, progression, and treatment of the disease.

Epidemiology

is collected to investigate a question

Information (data)

The _____ of a variable consists of a summary of the possible values the variable can have and the number of subjects with each of these values.

distribution

distribution that uses counts to describe the number of subjects with a particular value

frequency distribution

distribution that uses proportions to describe the number of the subjects with a particular value

probability distribution

Two types of graphs are used to summarize categorical variables

pie charts and bar graphs.

can be presented using frequencies or proportions

Pie charts

describes how the pieces relate to the whole

Pie charts

They demonstrate how the categories within a variable relate to each other

Pie charts

are used to describe the distributions of categorical variables.

Bar graphs

are used when the data have a variable with only two options.

Binomial distributions

Binomial distributions describe what type of variable?

dichotomous

best describe the distribution of a continuous variable

Histograms

is a graphical representation of a variable in which the observed values are categorized, a bar is drawn for each category, and the number of participants in each category is represented by the height of the bar.

Histograms

It provides a quick picture of the distribution of a variable and it can be presented with counts or proportions of participants.

Histograms

They provide information about how spread out the responses are, which responses are common, which responses are in the center, and the overall shape of the distribution.

Histograms

can be folded in half so that each half is close to a mirror image of the other

Symmetric distributions

This distribution has one mode or one most common value

unimodal

A distribution with two peaks is called

bimodal

When the histogram is bell-shaped, unimodal, and symmetric, with the mean, median, and most common value at the center at the peak, the data come from a _____

normal distribution.

can be used to determine if observations are common or extreme

empirical rule
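
For a normal distribution, the empirical rule says that roughly 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean. The sketch below checks those figures with Python's standard library and flags a value as extreme when it lies more than 2 standard deviations from the mean. The 2-SD cutoff and the helper name `is_extreme` are assumptions for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution (mean 0, standard deviation 1)

# Share of observations within k standard deviations of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

def is_extreme(x, mean, sd, k=2):
    """Hypothetical helper: True if x is more than k standard deviations from the mean."""
    return abs(x - mean) > k * sd
```

For example, with a mean of 170 and a standard deviation of 8, a value of 190 is flagged as extreme, while 175 is common.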

A distribution is ___ skewed when the distribution has a tail that extends longer to the left, that is, there is a set of observations with lower values than those of the majority of the observed responses.

left

A distribution is ___ skewed when the distribution has a tail that extends longer to the right, that is, there is a set of observations with higher values than those of the majority of the observed responses.

right

is a discrete probability distribution whose possible values are whole numbers from 0 to infinity.

Poisson distribution

are percentages of all the observations that are less than the value of interest.

Percentiles

It is used to determine whether a particular value is common or rare.

Percentiles
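
Following the definition on the card above (the percentage of observations that are less than the value of interest), a percentile can be computed as below. The function name and the list of weights are hypothetical.

```python
def percentile_of(value, observations):
    """Percentage of observations strictly less than the value of interest."""
    below = sum(1 for x in observations if x < value)
    return 100 * below / len(observations)

weights = [2.5, 2.9, 3.1, 3.2, 3.4, 3.5, 3.6, 3.8, 4.0, 4.4]  # hypothetical data

middle = percentile_of(3.5, weights)  # 5 of 10 observations are below 3.5
high = percentile_of(4.4, weights)    # 9 of 10 observations are below 4.4
```

A value near the 50th percentile is common; one near the 90th percentile or above is comparatively rare.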

_____ in measurements occurs when multiple measurements are taken on the same subject.

Variability

If there is little measurement variability, the measurement has?

reliability

The idea that samples may be different

sampling variability

The value of the statistics and the number of times the statistics occur from all the possible samples is known as the?

distribution of samples or the sampling distribution

It provides a description of all possible statistics obtained from samples

sampling distribution

is the characterization of all sample means

central limit theorem

According to this theorem, the distribution of the means obtained from all possible samples will result in a normally shaped distribution, in which the center of the distribution is the true parameter and one standard deviation of the sampling distribution is the standard error of the mean.

central limit theorem

This theorem holds true for large sample sizes.

central limit theorem
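
The central limit theorem can be checked by simulation. This is a rough sketch under stated assumptions: the skewed population (exponential), the sample size of 50, the 2,000 repeated samples, and the seed are all choices made for the example.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Hypothetical skewed population, so any normal shape comes from averaging
population = [rng.expovariate(1.0) for _ in range(20000)]
mu = statistics.fmean(population)       # true parameter (population mean)
sigma = statistics.pstdev(population)   # population standard deviation

n = 50  # sample size
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(2000)]

center = statistics.fmean(sample_means)  # center of the sampling distribution
spread = statistics.stdev(sample_means)  # its standard deviation
standard_error = sigma / n ** 0.5        # standard error of the mean
```

The center of the simulated sampling distribution lands close to the population mean, and its standard deviation lands close to the standard error, sigma divided by the square root of n.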

is a basic and commonly used type of predictive analysis.

Linear regression

It may be called an outcome variable, criterion variable, endogenous variable, or regressand.

dependent variable

It can be called exogenous variables, predictor variables, or regressors

independent variables

is the portion of the total variation in the dependent variable that is explained by variation in the independent variable

Coefficient of Determination

is a straight line whose equation is often used to represent the data in order to predict values that may not be displayed on the plot.

line of best fit

determined by the correlation between the two variables on a scatter plot

line of best fit
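
A least-squares line of best fit and its coefficient of determination can be computed by hand as in this sketch. Least squares is the usual fitting method but is an assumption here, since the cards do not name one, and the x and y values are hypothetical.

```python
import statistics

x = [1, 2, 3, 4, 5]                # hypothetical independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical dependent variable

mean_x = statistics.fmean(x)
mean_y = statistics.fmean(y)

# Least-squares slope and intercept
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Coefficient of determination: share of the variation in y explained by x
predicted = [intercept + slope * a for a in x]
ss_res = sum((b - p) ** 2 for b, p in zip(y, predicted))
ss_tot = sum((b - mean_y) ** 2 for b in y)
r_squared = 1 - ss_res / ss_tot

# Predict a value that is not displayed on the plot
y_at_6 = intercept + slope * 6
```

For these numbers the fitted line is approximately y = 0.05 + 1.99x, and nearly all of the variation in y is explained by x.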

is a statistical technique that can show whether and how strongly pairs of variables are related.

Correlation

If the correlation is greater than 0, then the variables are

positively correlated.

If the correlation is less than 0, then the variables are said to be

negatively correlated

If the correlation is exactly 0, such as for birthweight and birthday, then the variables are said to be

uncorrelated

exists when high scores in one variable are associated with high scores in the second variable or low scores in one variable are associated with low scores in the other

POSITIVE CORRELATION

exists when high scores in one variable are associated with low scores in the second or vice versa.

NEGATIVE CORRELATION

exists when the points on the scatter diagram are spread in a random manner

ZERO CORRELATION

all points lie on a straight line

PERFECT CORRELATION
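
The correlation cases above can be illustrated with the Pearson correlation coefficient. Pearson's r is assumed here because the cards do not name a specific coefficient, and the data and the helper `describe` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def describe(r):
    """Hypothetical helper that labels the sign of a correlation."""
    if r > 0:
        return "positively correlated"
    if r < 0:
        return "negatively correlated"
    return "uncorrelated"

x = [1, 2, 3, 4]
r_pos = pearson_r(x, [2, 4, 6, 8])     # points on a rising straight line
r_neg = pearson_r(x, [8, 6, 4, 2])     # points on a falling straight line
r_zero = pearson_r(x, [1, -1, -1, 1])  # no linear trend
```

The first two cases are perfect correlations (all points lie on a straight line), so r is +1 and -1. Even a strong r says nothing about whether one variable causes the other.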

True or false: A key thing to remember when working with correlations is never to assume a correlation means that a change in one variable causes a change in another

True

It seeks to find the relationship between two variables.

Correlation

is commonly used for testing relationships between categorical variables.

Chi Square statistic

The _______ of the Chi-Square test is that no relationship exists on the categorical variables in the population; they are independent.

null hypothesis

The Chi-Square statistic is most commonly used to evaluate _________ when using a crosstabulation (also known as a bivariate table).

Tests of Independence

________ presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table.

Crosstabulation

The ___________ assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells to the pattern that would be expected if the variables were truly independent of each other

Test of Independence
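
The comparison of observed and expected cell counts can be sketched as follows. The 2x2 table of counts is hypothetical, and the function name is an assumption for the example.

```python
def chi_square_independence(observed):
    """Chi-square statistic, degrees of freedom, and expected counts for a crosstab."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Count expected in each cell if the two variables were independent
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical crosstabulation: rows are two groups, columns are two outcomes
observed = [[30, 10],
            [20, 40]]
chi2, df, expected = chi_square_independence(observed)

critical_value = 3.841                # tabulated chi-square value for df = 1, alpha = 0.05
reject_null = chi2 > critical_value   # True means the variables appear to be related
```

Here the statistic is about 16.67 on 1 degree of freedom, well above 3.841, so the null hypothesis of independence would be rejected for this made-up table.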

Is student status (in-state versus out-of-state) associated with one’s eventual graduation outcome (graduating versus not graduating)? Answer: Chi-Square test of _____ _ ________

Independence

To test a theory that people have no preference among four different outdoor activities, you ask 100 people to select among jogging, bicycling, hiking, or swimming. Answer: Chi-Square test of _____ _ ________

Goodness of fit
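
For the outdoor-activity card above, "no preference" means each of the four activities is expected to be chosen by 25 of the 100 people. A goodness-of-fit statistic can then be computed as below. The observed counts are hypothetical.

```python
# Hypothetical observed choices: jogging, bicycling, hiking, swimming
observed = [30, 20, 28, 22]
n = sum(observed)                          # 100 people
expected = [n / len(observed)] * len(observed)  # 25 per activity under no preference

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

critical_value = 7.815                # tabulated chi-square value for df = 3, alpha = 0.05
reject_null = chi2 > critical_value
```

With these counts the statistic is 2.72, below 7.815, so there is not enough evidence to reject the theory of no preference.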

A biostatistician would like to determine if the ratio of blood types in storage for transfusions should be different in Hawaii from the mainland. She collected a sample of blood types of 10,000 people in Hawaii and that of 100,000 people on the mainland. She wishes to see if the breakdown of blood types (A, B, AB, and O) is the same for both populations. Answer: Chi-Square test of _____ _ ________

Homogeneity

A researcher wants to determine if scoring high or low on an artistic ability test depends on being right or left-handed. Answer: Chi-Square test of _____ _ ________

Independence

A national organization wants to compare the distribution of level of highest education completed (high school, college, masters, doctoral) for Republicans versus Democrats. Answer: Chi-Square test of _____ _ ________

Homogeneity

A preservation society has the percentages of five main types of fish in the river from 10 years ago. After noticing an imbalance recently, they add some fish from hatcheries to the river. How can they determine if they restored the ecosystem from a new sample of fish? Answer: Chi-Square test of _____ _ ________

Goodness of fit

is a way to find out if survey or experiment results are significant. In other words, it helps you figure out whether to reject the null hypothesis in favor of the alternative hypothesis.

ANOVA test

is used to compare the means of two or more independent (unrelated) groups using the F-distribution

one way ANOVA

The null hypothesis for the one-way ANOVA test is that the ______

group means are all equal

True or false: A one-way ANOVA will tell you that at least two groups were different from each other and which groups were different.

False, it won’t tell you which groups were different

If the computed F value is greater than the tabulated F value, then the null hypothesis is

rejected

If the computed F value is less than the tabulated F value, then the null hypothesis is

not rejected (we fail to reject it)

is used when the research question involves the comparisons of means from more than two independent groups.

ANOVA

It provides a statistical test for determining whether there is enough evidence to reject the null hypothesis that all the means are equal.

ANOVA
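
The computed F value for a one-way ANOVA is the between-group mean square divided by the within-group mean square. A minimal sketch, with hypothetical data for three groups and a hypothetical function name:

```python
import statistics

def one_way_anova_f(groups):
    """Return the computed F value and its degrees of freedom (between, within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]  # hypothetical measurements
f_value, df_between, df_within = one_way_anova_f(groups)

f_critical = 5.14                 # tabulated F for df (2, 6) at alpha = 0.05
reject_null = f_value > f_critical
```

The computed F of 12 exceeds the tabulated 5.14, so the null hypothesis that all means are equal is rejected. The test does not say which groups differ.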

It is the probability of the occurrence of a disease or other health outcome of interest during a specified period, usually one year

Risk

is calculated by dividing the number who got the disease during the defined period by the total population of interest during that period.

Risk

is the calculated ratio of incidence rates of a health condition or outcome in two groups of people, those exposed to a factor of interest and those not exposed.

Relative risk

used to determine if exposure to a specific risk factor is associated with an increase, decrease, or no change in the disease or outcome rate when compared to those without the exposure.

Relative risk

is a statistical measure of the strength of the association between a risk factor and an outcome.

Relative risk

fundamental comparison of rates using a ratio in epidemiology is known as the

rate ratio

When the rates being compared are incidence rates, epidemiologists call those comparisons ____

risk ratios

The risk ratio is also referred to as

relative risk (RR)

is a measure of association that provides the strength of association between exposure and outcome in a population

relative risk

True or false: Relative risk is not a flexible tool.

False

When the relative risk is above 1, the interpretation is that those in the exposed group are __________ the outcome than those in the nonexposed group

more likely to have

The larger the number, the _______ the relationship between being exposed and having the outcome.

stronger

Relative Risk = 1

Null value; No relationship exists

Relative Risk > 1

Positive association; more likely to have the outcome

Relative Risk < 1

Negative association; less likely to have the outcome
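
Risk and relative risk can be computed from a 2x2 table as in this sketch. The cell labels are an assumed convention (a = exposed with the outcome, b = exposed without, c = unexposed with the outcome, d = unexposed without), and the counts and function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = a / (a + b)      # number who got the disease / total exposed
    risk_unexposed = c / (c + d)    # number who got the disease / total unexposed
    return risk_exposed / risk_unexposed

def interpret(rr, tol=1e-9):
    """Hypothetical helper mapping a relative risk to the interpretations above."""
    if abs(rr - 1) <= tol:
        return "null value; no relationship"
    if rr > 1:
        return "positive association"
    return "negative association"

rr = relative_risk(30, 70, 10, 90)  # hypothetical counts: 30/100 exposed vs 10/100 unexposed
label = interpret(rr)
```

Here the exposed group has about 3 times the risk of the unexposed group, a positive association.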

is a measure of association that provides strength of association between exposure and outcome in a population.

RELATIVE RISK

is a measure of association that provides the strength and direction of the association between exposure and outcome in a population.

odds ratio

odds ratio greater than 1 indicates a ______ between exposure and outcome

positive association

odds ratio less than 1 indicates a _____ between exposure and outcome.

negative association

The ________ odds ratio compares the exposure odds in those with the outcome to the exposure odds in those without the outcome.

Exposure

The ________ odds ratio compares the outcome odds in those with the exposure to the outcome odds in those without the exposure.

Outcome

first way that the odds ratio can be calculated

Exposure Odds Ratio

Formula for exposure OR

(𝑎/𝑐) / (𝑏/𝑑)

Formula for Outcome OR

𝑎𝑑/𝑏𝑐

second way that the odds ratio can be calculated

Outcome Odds Ratio
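
Both ways of calculating the odds ratio give the same number, the cross-product ad/bc. The sketch below assumes the usual 2x2 layout (a = exposed with the outcome, b = exposed without, c = unexposed with the outcome, d = unexposed without). The counts are hypothetical.

```python
a, b, c, d = 30, 70, 10, 90  # hypothetical 2x2 table counts

# Exposure odds ratio: exposure odds among those with the outcome (a/c)
# divided by exposure odds among those without the outcome (b/d)
exposure_or = (a / c) / (b / d)

# Outcome odds ratio: outcome odds among the exposed (a/b)
# divided by outcome odds among the unexposed (c/d)
outcome_or = (a / b) / (c / d)

cross_product = (a * d) / (b * c)  # ad/bc
```

All three values are about 3.86 for these counts. An odds ratio above 1 indicates a positive association between exposure and outcome.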

measure of association that provides strength and direction of the association between existing exposure and outcome in the population.

Prevalence Ratio

a measure of association between exposure and outcome, provides strength and direction using two incidence densities

Incidence Density Ratio

a measure of association that provides the strength and direction of the association between exposure and outcome in a population.

Odds ratio

is another tool used for testing a population mean when the variance is unknown and/or the sample size is small (n < 30).

T-test

is used to test the hypothesis involving the mean of a study.

T-test
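
A one-sample t statistic divides the difference between the sample mean and the hypothesized mean by the estimated standard error. This is a minimal sketch: the sample values and the hypothesized mean of 3 are made up for the example.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical small sample (n < 30)
mu_0 = 3                           # hypothesized population mean

n = len(sample)
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)  # sample SD, since the population variance is unknown

t_value = (sample_mean - mu_0) / (sample_sd / math.sqrt(n))
df = n - 1

t_critical = 2.365                 # tabulated two-sided t for df = 7, alpha = 0.05
reject_null = abs(t_value) > t_critical
```

For these numbers t is about 2.65 on 7 degrees of freedom, which exceeds 2.365, so the hypothesized mean of 3 would be rejected at the 5% level.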

Lesson 1-10 Flashcards

(135 cards)