Midterm 1 Flashcards

1
Q

Statistics

A

science of collecting, organizing, and analyzing data

2
Q

What do biostatisticians look to achieve?

A

attempt to gain insight and draw conclusions using data

3
Q

Can stats lie?

A

No, but they can be wrong

4
Q

What are some ways to chart categorical data?

A

bar graphs and pie charts

5
Q

What are some ways to chart quantitative data?

A

histograms and scatterplots

6
Q

What are some methods to organize and summarize raw data?

A

Graphically, numerically, and exploratory data analysis

7
Q

Variable?

A

Characteristic of an individual

8
Q

Classifying variables?

A

Questions to ask when designing or reviewing an experiment

9
Q

Categorical variable?

A

An individual is placed into a category; arithmetic operations cannot be applied to these data

10
Q

Quantitative variable?

A

things that arithmetic operations can be performed on

11
Q

What does a pie chart represent?

A

How one categorical variable breaks down into components

12
Q

What does a bar graph represent?

A

Each category is represented by a bar whose height shows its count or percent

13
Q

What does a histogram represent?

A

Summary graph of the distribution of a single quantitative variable

14
Q

What does a dot plot represent?

A

Raw data. Used to describe patterns in variability

15
Q

What does a time plot represent?

A

Time goes on the horizontal axis. The line connecting the points shows how the variable changes over time.

16
Q

What does the vertical axis represent in a histogram?

A

Frequency or relative frequency

17
Q

What is an extreme point known as?

A

Outlier

18
Q

What is the mean?

A

The arithmetic average: the sum of the observations divided by
the number of observations. It is a measure of location
(central tendency), i.e. a way of measuring center.

19
Q

What is the median?

A

midpoint of the distribution such that half of the
numbers are smaller and the other half are larger

20
Q

What is the median if n is even?

A

mean of the two center numbers

21
Q

What is the mode?

A

the most common or frequent value - a list can have more than
one mode

22
Q

Is the median resistant to outliers?

A

yes

23
Q

Is the mean resistant to outliers?

A

No
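
A minimal sketch of the last few cards (mean, median, mode, and resistance to outliers). The data values are made up for illustration; only Python's standard `statistics` module is assumed.

```python
from statistics import mean, median, multimode

# Hypothetical data set, invented for illustration.
values = [2, 3, 3, 4, 5, 6]
with_outlier = values + [60]  # same data plus one extreme point

mean_before, mean_after = mean(values), mean(with_outlier)
median_before, median_after = median(values), median(with_outlier)
modes = multimode(values)  # list of the most frequent value(s)

# n is even for `values`, so the median is the mean of the two center numbers.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("mode(s):", modes)
```

The single outlier pulls the mean up by several units while the median barely moves, which is what "resistant" means here.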

24
Q

Quartiles?

A

The first quartile (Q1) is the median of the observations below
the overall median; the third quartile (Q3) is the median of the
observations above it.

25
Q

What is the five number summary?

A

Lowest number, Q1, median, Q3, and largest number

26
Q

What is a graph with the five number summary?

A

Box plot

27
Q

What is interquartile range?

A

Distance between first and third quartiles.
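
A small sketch of the five-number summary and IQR. It assumes the "median of each half" convention for quartiles (the overall median is left out of both halves when n is odd); software packages often use slightly different quartile rules. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For odd n, skip the middle observation when forming the upper half.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [1, 3, 4, 7, 8, 10, 12, 15, 20]  # made-up values
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1  # distance between the first and third quartiles

print(minimum, q1, med, q3, maximum)
print("IQR:", iqr)
```

These five numbers are exactly what a box plot draws.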

28
Q

What is the standard deviation?

A

Measures variation around the mean
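
As a rough illustration, the sample standard deviation can be computed by hand and checked against the standard library. The data are made up, and the n - 1 divisor (sample standard deviation) is an assumption about which version the course uses.

```python
from math import sqrt
from statistics import mean, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
xbar = mean(data)

# Average squared deviation from the mean (dividing by n - 1), then square root.
squared_deviations = [(x - xbar) ** 2 for x in data]
s_manual = sqrt(sum(squared_deviations) / (len(data) - 1))

s_library = stdev(data)  # the same quantity from the statistics module

print(xbar, s_manual, s_library)
```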

29
Q

How do you organize a statistical problem?

A

State, plan, solve, conclude

30
Q

What is a density curve?

A

Smooth curve drawn through a histogram to describe the overall pattern of the distribution

31
Q

Is a density curve generalizable?

A

Yes, it is an idealized description that ignores outliers and minor irregularities

32
Q

What do bars of histograms represent?

A

Area

33
Q

What is the area under a density curve always equal to?

A

1

34
Q

Median of the density curve?

A

the point where half the observations lie above and half
below, i.e. the point with equal areas to the left and right
of the median line

35
Q

Mean of the density curve?

A

the balance point of the curve if it were made out of
a solid material

36
Q

What Greek letters represent mean and standard deviation?

A

mu, μ (mean) and sigma, σ (standard deviation)

37
Q

What are Normal distributions (curves)?

A

Bell-shaped curves

38
Q

Why are Normal distributions important?

A

1) Good descriptions for some distributions of real data.
2) Good approximation to many chance outcomes.
3) Many statistical inference procedures are based on the
Normal distribution.

39
Q

What is the distance of 1 deviation on a bell curve?

A

The distance from the mean to the point at which the curvature changes (the inflection point)

40
Q

What is the 68-95-99.7 rule?

A

About 68% of all observations are within 1 standard deviation (σ) of the mean (μ).
About 95% of all observations are within 2σ of the mean μ.
Almost all (99.7%) observations are within 3σ of the mean μ.
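
The three percentages in this rule can be checked numerically. This is a quick sketch using `NormalDist` from Python's standard `statistics` module; the variable names are only illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Proportion of a Normal distribution within k standard deviations of the mean:
# area to the left of +k minus area to the left of -k.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(f"within {k} sigma: {proportion:.4f}")
```

The same proportions hold for any N(μ, σ), because standardizing maps every Normal distribution onto the standard one.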

41
Q

What is the shorthand for distribution curves?

A

N(mean, standard deviation)
(standardization)

42
Q

What does the z score represent?

A

indicates how many standard deviations the observation falls
from the mean, and in which direction

43
Q

How are x and z related?

A

When x is larger than the mean, z is positive.
When x is smaller than the mean, z is negative.

44
Q

What does the Standard normal table show?

A

Area under standard normal curve to the LEFT of the z value

45
Q

What is cumulative proportion?

A

proportion of observations that lie at or below x
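
A short sketch tying the last few cards together: standardize x to a z-score with z = (x - μ) / σ, then read off the cumulative proportion (the area to the LEFT of z), which is what a standard Normal table lists. The N(100, 15) distribution and the value x = 130 are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # hypothetical N(100, 15) distribution
x = 130.0

# Standardize: number of standard deviations x lies from the mean.
z = (x - mu) / sigma

# Area to the left of z under the standard Normal curve (the table value).
below_via_z = NormalDist().cdf(z)

# Same cumulative proportion computed directly on the original scale.
below_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(below_via_z, 4), round(below_direct, 4))
```

Here z is positive because x is above the mean.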

46
Q

What is a Normal quantile plot?

A

z-scores on the x axis and the observed data values on the y axis. Use technology to obtain these. If the points lie close to a straight line, the data are approximately Normal.

47
Q

Response variable?

A

dependent variable (y axis)

48
Q

Explanatory Variable?

A

independent variable (x axis)

49
Q

Bivariate Data

A

data on two variables, used to study the relationship between them

50
Q

What is a common way to visualize the relationship between 2 variables?

A

Scatter plot (2 dimensions)

51
Q

Where do either of the variables go on a scatter plot axis?

A

Response variable on vertical axis
Explanatory variable on horizontal axis

52
Q

What three factors are there to look for in a scatter plot?

A

Form, direction, and strength (and outliers)

53
Q

What measurement is important for strength and direction on a scatterplot?

A

Correlation coefficient

54
Q

What is strength of a scatterplot dependent on?

A

The apparent strength depends on the scale of the axes

55
Q

What does r represent in a scatterplot?

A

The sign (+/-) gives the direction; the closer r is to 1 or -1, the stronger the linear relationship

56
Q

Facts about r (correlation).

A

1) Correlation does not distinguish between explanatory and
response variables.
2) Both variables need to be quantitative.
3) r has no unit of measurement so for any given data set,
when the units of measure change, r does not.
4) Positive r indicates positive association between the
variables; Negative r indicates negative association.
5) r is always a number between -1 and +1. Values near 0
indicate a weak linear relationship. Values of 1 or -1 indicate
a perfect linear relationship.
6) r is not resistant - greatly affected by outliers - use with
caution with outliers.
7) r only measures strength of linear relationships - not
curved relationships.
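
A few of these facts can be checked with a small sketch. The `correlation` helper and the height/weight values below are made up for illustration; the formula used is the average product of standardized values, with an n - 1 divisor.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - xbar) / sx) * ((y - ybar) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

heights_cm = [150, 160, 165, 172, 180]  # hypothetical data
weights_kg = [52, 58, 63, 70, 77]

r = correlation(heights_cm, weights_kg)

# Fact 1: swapping explanatory and response variables does not change r.
r_swapped = correlation(weights_kg, heights_cm)

# Fact 3: changing units (cm to inches) does not change r.
heights_in = [h / 2.54 for h in heights_cm]
r_inches = correlation(heights_in, weights_kg)

print(round(r, 4), round(r_swapped, 4), round(r_inches, 4))
```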

57
Q

What is the straight line fitted to a scatterplot known as?

A

Regression line

58
Q

What does a regression line explain?

A

How y changes in terms of x

59
Q

What method is used to have the best-fit regression line?

A

Least-squares method

60
Q

What is the least squares regression line?

A

The line that makes the sum of the squared vertical distances of the data points from the line as small as possible

61
Q

What does a slope in a regression line represent?

A

Rate of change: the predicted change in y for each one-unit increase in x

62
Q

What does an intercept in a regression line represent?

A

the predicted value of y when x = 0
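
A minimal sketch of a least-squares fit, showing the slope and intercept described on the cards above. The function names and data are hypothetical; the formulas used are slope = Sxy / Sxx and intercept = ȳ - slope · x̄.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (intercept, slope) of the line y = intercept + slope * x."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx                  # predicted change in y per unit of x
    intercept = ybar - slope * xbar    # predicted y when x = 0
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]                 # made-up explanatory values
ys = [2.1, 3.9, 6.2, 7.8, 10.0]      # made-up responses

intercept, slope = least_squares(xs, ys)
best = sum_squared_errors(xs, ys, intercept, slope)
nudged = sum_squared_errors(xs, ys, intercept, slope + 0.1)

print(round(intercept, 2), round(slope, 2))
print(best < nudged)  # any other line has a larger sum of squares
```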

63
Q

Should you use a regression line with an extreme outlier?

A

No

64
Q

What is the coefficient of determination?

A

The correlation coefficient squared, r^2

65
Q

What does r^2 represent?

A

the fraction of variance in y that can be explained by the regression model

66
Q

What are residuals?

A

The vertical distances between the data points and the regression line (observed y minus predicted y)

67
Q

What are the vertical distances from the data points to the regression line called?

A

Residuals

68
Q

What does the +/- with residuals indicate?

A

The residual is positive if the data point lies above
the regression line.
The residual is negative if the data point lies below
the regression line.
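
To make the residual and r^2 cards concrete, here is a small sketch on made-up data. It fits a least-squares line, computes each residual as observed y minus predicted y, and computes r^2 as 1 - SSE/SST, one common way to express "fraction of variance explained".

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]                 # hypothetical explanatory values
ys = [2.1, 3.9, 6.2, 7.8, 10.0]      # hypothetical responses

# Least-squares slope and intercept.
xbar, ybar = mean(xs), mean(ys)
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
intercept = ybar - slope * xbar

# Residual = observed y - predicted y (positive when the point is above the line).
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

sse = sum(e ** 2 for e in residuals)        # variation left unexplained
sst = sum((y - ybar) ** 2 for y in ys)      # total variation in y
r_squared = 1 - sse / sst

print([round(e, 2) for e in residuals])
print(round(r_squared, 4))
```

Plotting `residuals` against `xs` would give the residual plot, with the regression line flattened to the horizontal line at zero.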

69
Q

What is a residual plot?

A

A scatterplot of the residuals against the explanatory variable. The regression line becomes the horizontal line at zero, which makes the residuals easier to compare.

70
Q

What is an influential individual?

A

An observation (often an outlier) that, if removed, would markedly change the regression line

71
Q

Extrapolation?

A

Using the regression line to predict beyond the range of the data. Do not do this! Such predictions are unreliable.

72
Q

Lurking variable?

A

a variable that has an important
effect on the relationship but is not among
the variables studied.

73
Q

Observational study?

A

Observing individuals and events as they occur, without imposing a treatment. Results can be confounded by lurking variables.

74
Q

Experiment?

A

Observation plus deliberate manipulation of variables. Can establish a cause-and-effect relationship.

75
Q

What is a sample?

A

The part of the population we actually examine
and for which we do have data

76
Q

What is probability sampling?

A

individuals or units are randomly
selected; the sampling process is unbiased

77
Q

What is convenience sampling?

A

individuals or units are chosen because they are easiest
to reach, not at random; the sampling process is often biased

78
Q

What is simple random sampling?

A

Every individual has an equal chance of being selected, and every possible sample of the chosen size is equally likely

79
Q

What is a probability sample?

A

a sample chosen by chance

80
Q

Stratified random sampling?

A

population divided into groups of similar individuals
called strata; a separate random sample is then taken
from each stratum.
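
A rough sketch contrasting a simple random sample with a stratified random sample. The population, the "urban"/"rural" strata, and the sample sizes are invented for illustration; a fixed seed is used only so the run is repeatable.

```python
import random

rng = random.Random(42)  # seeded generator, for a repeatable illustration

# Hypothetical population of 100 people: 60 urban, 40 rural.
population = [(f"person{i}", "urban" if i < 60 else "rural")
              for i in range(100)]

# Simple random sample: every individual is equally likely to be chosen.
srs = rng.sample(population, 10)

# Stratified random sample: split into strata, then sample within each.
strata = {}
for person in population:
    strata.setdefault(person[1], []).append(person)

stratified = []
for label, members in strata.items():
    stratified.extend(rng.sample(members, 5))

print(len(srs), len(stratified))
print(sum(1 for _, s in stratified if s == "rural"), "rural in stratified sample")
```

The simple random sample may contain any mix of urban and rural individuals, while the stratified sample is guaranteed five from each stratum.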

81
Q

What is inference?

A

using the sample to infer something about the
population

82
Q

What are cohort studies?

A

enlist individuals of common demographic and
keep track of them over a long period of time (“prospective”).
Individuals who later develop a condition are compared to those
who don’t develop the condition

83
Q

What are case-control studies?

A

start with 2 random samples of individuals
with different outcomes, and look for exposure factors in the
subjects’ past (“retrospective”)

84
Q

What is an experimental unit?

A

The individuals on which an experiment is performed

85
Q

What is a factor?

A

Explanatory variable (independent variable)

86
Q

What is a treatment?

A

specific experimental condition

87
Q

What is a confounding factor?

A

an explanatory (independent) variable
that affects or distorts the relationship between another
explanatory variable and its response (dependent) variable
since it is related to both

88
Q

What is a control group?

A

A treatment to which the other treatments are compared to
eliminate the effects of lurking variables on the
experimental outcome.

89
Q

What is a placebo?

A

A fake (inactive) treatment given to the control group. Helps to make the experiment double-blind.

90
Q

What do randomized comparative experiments use?

A

Comparison and randomization

91
Q

Why are Randomized comparative experiments considered the best designed experiments?

A

give good evidence that the treatments actually cause the
differences observed in the response.

92
Q

What are the factors that create an ideal experiment?

A

Control, randomization, and adequate sample size

93
Q

What is something a well-designed experiment can result in that other types of studies cannot?

A

A causation statement. In a well-designed experiment, an observed association can be taken as evidence of causation: A causes B.

94
Q

What is realism?

A

Purpose of experiments. Discovering how the universe and world around us works.

95
Q

What is a block design?

A

subjects are divided into blocks (groups
sharing a given characteristic) before the randomization, in
order to account for possible differences between the
blocks. Blocking lets us choose how many individuals of
each block will receive each treatment.
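
One way to picture a block design is the sketch below: subjects are grouped into blocks first, and randomization to treatments happens separately inside each block. The function name, the blocking variable, and the two treatments are hypothetical.

```python
import random

def randomize_within_blocks(subjects, treatments, seed=0):
    """Randomly assign treatments separately inside each block.

    subjects: list of (name, block_label) pairs.
    Returns {name: (block_label, treatment)}.
    """
    rng = random.Random(seed)  # seeded only to make the example repeatable
    blocks = {}
    for name, block in subjects:
        blocks.setdefault(block, []).append(name)

    assignment = {}
    for block, names in blocks.items():
        shuffled = names[:]
        rng.shuffle(shuffled)  # randomization happens within the block
        for i, name in enumerate(shuffled):
            # Cycle through treatments so each gets an equal share per block.
            assignment[name] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical subjects, blocked by a made-up two-level characteristic.
subjects = [(f"s{i}", "block A" if i % 2 == 0 else "block B")
            for i in range(12)]
assignment = randomize_within_blocks(subjects, ["drug", "placebo"])

for name, (block, treatment) in sorted(assignment.items()):
    print(name, block, treatment)
```

Because assignment is balanced within each block, differences between blocks cannot masquerade as a treatment effect.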

96
Q

What is a main outcome measure?

A

Most important result from experiment

97
Q

What is a matched pairs design?

A

Combines randomization and matching

98
Q

What is the placebo effect?

A

People improve because they believe a treatment is helping them, even though it has no active effect

99
Q

What is a double-blind experiment?

A

Neither the patients nor experimenters know who is getting a placebo and who is getting the real thing

100
Q

What does bimodal mean?

A

Two peaks (two modes)