non highlighted Flashcards

1
Q

categorical variable

A

a categorical variable places an individual into one of several groups or categories

2
Q

quantitative variable

A

a quantitative variable has numerical values and it makes sense to find the average value

3
Q

association

A

there is an association between two variables if knowing the value of one variable helps predict the value of the other

4
Q

mean

A

average value of the observations

5
Q

median

A

midpoint of the values, also called Q2

6
Q

first and third quartiles

A

Q1 has about one-fourth of the observations below it, and Q3 has about three-fourths of the observations below it

7
Q

interquartile range

A

IQR is the range of the middle 50% of the observations: IQR = Q3 - Q1
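
As a rough illustration of the median, quartile, and IQR cards, the sketch below finds Q1 and Q3 as the medians of the lower and upper halves of the sorted data. That is one common textbook convention; software often interpolates differently and may report slightly different quartiles. The function name `quartiles` and the data are made up for this example.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                 # observations below the median position
    upper = data[half + 1:] if n % 2 else data[half:]   # observations above it
    return median(lower), median(data), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # made-up observations
q1, q2, q3 = quartiles(data)
iqr = q3 - q1                          # range of the middle 50%
print(q1, q2, q3, iqr)
```

For these nine values the function gives Q1 = 3.5, median = 7, Q3 = 11, so IQR = 7.5.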

8
Q

standard deviation

A

measures the typical distance of the values in a distribution from the mean

9
Q

variance

A

average squared deviation
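
A minimal sketch of how variance and standard deviation relate, assuming deviations are measured from the mean. The card says "average", and whether that average divides by n or by n - 1 depends on the course convention, so the hypothetical helper `variance` supports both. The data are made up.

```python
import math

def variance(values, sample=True):
    """Average squared deviation from the mean.

    sample=True divides by n - 1 (the usual sample variance);
    sample=False divides by n (the population variance).
    """
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(x - mean) ** 2 for x in values]
    return sum(squared_devs) / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up observations with mean 5
pop_var = variance(data, sample=False)   # divides by n
pop_sd = math.sqrt(pop_var)              # standard deviation = sqrt(variance)
samp_var = variance(data)                # divides by n - 1
print(pop_var, pop_sd, samp_var)
```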

10
Q

shape

A

typical shapes of a distribution are roughly symmetric, skewed left and skewed right

11
Q

center

A

mean for roughly symmetric distributions, median for skewed distributions
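
One way to see why the median is preferred for skewed data: a single extreme value pulls the mean but barely moves the median. The two small data sets below are invented for illustration.

```python
from statistics import mean, median

symmetric = [1, 2, 3, 4, 5]     # roughly symmetric: mean and median agree
skewed = [1, 2, 3, 4, 100]      # one large value skews the data to the right

sym_mean, sym_median = mean(symmetric), median(symmetric)
skew_mean, skew_median = mean(skewed), median(skewed)

print(sym_mean, sym_median)     # both measures describe the center well
print(skew_mean, skew_median)   # the mean is dragged toward the extreme value
```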

12
Q

spread

A

standard deviation for roughly symmetric distributions, IQR for skewed distributions. Range = max - min as a last resort

13
Q

transforming data by adding/subtracting a

A

measures of center (median and mean) and location (quartiles and percentiles) change by a; measures of spread don't change

14
Q

transforming data by multiplying/dividing by b

A

measures of center and location are multiplied (or divided) by b; measures of spread are multiplied (or divided) by |b|
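
The two transformation cards can be checked numerically. This sketch uses made-up data and arbitrary constants a = 10 and b = 3, and uses the population standard deviation (`pstdev`) as the measure of spread; the same pattern holds for the sample version.

```python
from statistics import mean, median, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up observations
a, b = 10, 3                      # arbitrary constants for the illustration

shifted = [x + a for x in data]   # add a to every value
scaled = [x * b for x in data]    # multiply every value by b

# Adding a moves center by a and leaves spread alone.
print(mean(data), mean(shifted), pstdev(data), pstdev(shifted))
# Multiplying by b scales both center and spread by b.
print(median(data), median(scaled), pstdev(data), pstdev(scaled))
```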

15
Q

Density curve mean and median

A

the mean is the balance point of the curve. The median divides the area under the curve in half

16
Q

uniform distribution

A

a distribution that takes constant height over some interval of values
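
Because the total area under a density curve is 1, a uniform density on an interval has height 1 / (width of the interval), and probabilities are rectangle areas. The helper `uniform_probability` and the interval [2, 6] are assumptions made for this sketch.

```python
def uniform_probability(low, high, a, b):
    """P(a <= X <= b) for X uniform on [low, high]: rectangle width times height."""
    height = 1 / (high - low)                 # constant height so total area is 1
    left, right = max(a, low), min(b, high)   # clip the event to the interval
    return max(0.0, right - left) * height

p_middle = uniform_probability(2, 6, 3, 5)    # half of the interval
p_total = uniform_probability(2, 6, 2, 6)     # the whole interval
print(p_middle, p_total)
```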

17
Q

68-95-99.7 rule

A

approximate percent of observations that lie within one, two, and three standard deviations of the mean in a normal distribution
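
The rule's three percentages can be recovered from the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) and computes the area within k standard deviations of the mean for k = 1, 2, 3.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} sd of the mean: {p:.4f}")
```

The areas come out near 0.68, 0.95, and 0.997, which is where the rule gets its name.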

18
Q

normal probability plot

A

if the normal probability plot is roughly linear, then the data are approximately normal
if the normal probability plot is not roughly linear, then the data are not approximately normal

19
Q

scatterplot

A

displays the relationship between two quantitative variables measured on the same individuals

20
Q

explanatory variable, factor, response variable

A

if we think that a variable x may help explain, predict, or even cause changes in another variable y, we call x an explanatory variable and y a response variable

21
Q

correlation r

A

measures the direction and strength of the linear relationship between two quantitative variables

r has no units, is between -1 and +1, and is not the value of the slope
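
A small sketch of the correlation calculation, written out from sums of deviations rather than taken from a library. The function name `correlation` and the data are invented; the second call rescales x by 100 to show that r has no units.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]   # made-up explanatory values
y = [2, 4, 5, 4, 5]   # made-up response values

r = correlation(x, y)
r_rescaled = correlation([100 * v for v in x], y)  # change the units of x
print(round(r, 4), round(r_rescaled, 4))
```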

22
Q

correlation and causation

A

correlation does not imply causation, no matter how strong the correlation is; there may be confounding variables

23
Q

least squares regression line

A

the straight line y hat = a + bx that minimizes the sum of the squares of the vertical distances of the observed points from the line
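
For a single explanatory variable, the minimizing slope and intercept have closed forms: b is the sum of cross-deviations divided by the sum of squared x-deviations, and a = (mean of y) - b * (mean of x). The sketch below applies them to made-up data; `least_squares_line` is a hypothetical helper name.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y hat = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx        # slope
    a = my - b * mx      # intercept: the line passes through (mean x, mean y)
    return a, b

x = [1, 2, 3, 4, 5]      # made-up explanatory values
y = [2, 4, 5, 4, 5]      # made-up response values
a, b = least_squares_line(x, y)
print(round(a, 3), round(b, 3))
```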

24
Q

slope b

A

b is the predicted change in y when x increases by 1 unit, in context

25
Q

y intercept a

A

predicted response value y hat when the explanatory variable x equals 0, in context

26
Q

extrapolation

A

avoid extrapolation: the use of a regression line for prediction using values of the explanatory variable outside the range of the data

27
Q

residual

A

y - y hat: the difference between the observed and predicted values of y
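
To make the residual definition concrete, the sketch below takes a hypothetical fitted line y hat = 2.2 + 0.6x (assumed here to be the least squares line for the made-up data) and subtracts each prediction from the observed y. For a least squares line the residuals sum to zero, up to rounding.

```python
def predict(x, a=2.2, b=0.6):
    """Predicted response from the hypothetical line y hat = a + b * x."""
    return a + b * x

x = [1, 2, 3, 4, 5]   # made-up explanatory values
y = [2, 4, 5, 4, 5]   # made-up observed responses

# residual = observed y - predicted y
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]
print([round(e, 2) for e in residuals])
print(round(sum(residuals), 10))
```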

28
Q

influential points

A

outliers that substantially change the correlation or the regression line’s slope or y intercept

29
Q

census

A

census collects data from every individual in the population

30
Q

convenience sample

A

choose individuals who are easiest to reach

31
Q

voluntary response sample

A

individuals choose to join the sample in response to an open invitation. Key terms: phone-in survey, TV survey

32
Q

simple random sample

A

an SRS uses a chance process to give every possible sample of a given size the same chance to be chosen. Choose an SRS by labeling the members of the population and using slips of paper, technology, or a random digits table to select the sample
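
The "technology" route can be sketched with Python's standard `random` module, whose `sample` method draws without replacement so that every subset of size k is equally likely. The population labels and the seed are made up; the fixed seed only makes the example reproducible.

```python
import random

# Label the members of a made-up population of 30 students.
population = [f"student_{i:02d}" for i in range(1, 31)]

rng = random.Random(42)            # fixed seed only so the run is repeatable
srs = rng.sample(population, k=5)  # 5 distinct members, chosen by chance
print(srs)
```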

33
Q

stratified random sample

A

divide the population into strata, groups of individuals that are similar in some way that might affect their responses. Then choose a separate SRS from each stratum and combine these SRSs to form the sample
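
A minimal sketch of stratified sampling, assuming the strata are already known: take a separate SRS inside each stratum and combine the results. The region names, member labels, and the helper `stratified_sample` are all invented for the example.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Take an SRS of size per_stratum from each stratum and combine them."""
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, per_stratum))
    return combined

# Made-up population split into three strata of 10 individuals each.
strata = {
    name: [f"{name}_{i}" for i in range(10)]
    for name in ("north", "south", "east")
}

rng = random.Random(1)  # fixed seed only for reproducibility
stratified = stratified_sample(strata, per_stratum=2, rng=rng)
print(stratified)
```

Every stratum is guaranteed to be represented, which an SRS of the whole population does not promise.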

34
Q

cluster sample

A

divide the population into clusters, groups of individuals that are located near each other. Randomly select some of these clusters. All the individuals in the chosen clusters are included in the sample
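
By contrast with stratified sampling, cluster sampling randomizes over whole groups. In this sketch the clusters are hypothetical homerooms; two are chosen at random and everyone in them is sampled.

```python
import random

# Made-up clusters: six homerooms with five students each.
clusters = {
    f"homeroom_{c}": [f"homeroom_{c}_student_{i}" for i in range(1, 6)]
    for c in "ABCDEF"
}

rng = random.Random(7)                           # fixed seed only for reproducibility
chosen = rng.sample(sorted(clusters), k=2)       # randomly select whole clusters
cluster_sample = [s for c in chosen for s in clusters[c]]  # keep everyone in them
print(chosen, len(cluster_sample))
```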

35
Q

undercoverage

A

when some members of the population cannot be chosen to be in the sample

36
Q

response bias

A

occurs when there is a systematic pattern of inaccurate answers

37
Q

nonresponse bias

A

when people can’t be contacted or refuse to answer

38
Q

wording bias

A

when confusing or leading questions introduce strong bias

39
Q

observational study

A

gathers data on individuals as they are

40
Q

experiment

A

deliberately imposes treatments on experimental units

41
Q

experimental units

A

each of the individuals to which treatments are applied. Human experimental units are called subjects

42
Q

confounding

A

variables are confounded when their effects on a response variable can’t be distinguished from those of the explanatory variable

43
Q

completely randomized design

A

all experimental units are assigned to the treatments completely by chance
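
One simple way to carry out a completely randomized design is to shuffle the experimental units and deal them out to the treatments in turn. The unit labels, treatment names, and the helper `completely_randomized` below are assumptions for illustration.

```python
import random

def completely_randomized(units, treatments, rng):
    """Assign every unit to a treatment purely by chance, in equal-sized groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)                     # chance decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

units = [f"subject_{i}" for i in range(1, 13)]   # 12 made-up subjects
treatments = ["new_drug", "placebo"]             # hypothetical treatments

rng = random.Random(3)  # fixed seed only for reproducibility
groups = completely_randomized(units, treatments, rng)
print({t: len(members) for t, members in groups.items()})
```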

44
Q

placebo

A

a fake treatment for the control group. That prevents confounding due to the placebo effect, in which some patients get better because they expect the treatment to work.

45
Q

double blind experiment

A

neither the subjects nor those interacting with them and measuring their responses know who is receiving which treatment. If one party knows and the other doesn’t, the experiment is single blind

46
Q

randomized block design

A

use blocks of experimental units that are similar with respect to a variable that is expected to affect the response. Treatments are assigned at random within each block. Responses are then compared within each block and combined with the responses of other blocks after accounting for the differences between the blocks

47
Q

matched pairs design

A

in some matched pairs designs, each subject receives both treatments in a random order. In others, two very similar subjects are paired, and the two treatments are randomly assigned within each pair

48
Q

mutually exclusive and independence

A

if two events with nonzero probabilities are mutually exclusive, they cannot also be independent
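
A quick check with one roll of a fair die, using exact fractions. Events A and B are chosen for this example so that they share no outcomes: P(A and B) is then 0, while independence would require P(A and B) = P(A) * P(B), which is positive.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die

def prob(event):
    """Probability of an event (a set of outcomes) under equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}   # roll is 1 or 2
B = {3, 4}   # roll is 3 or 4

p_both = prob(A & B)             # no shared outcomes
product = prob(A) * prob(B)      # what independence would require

mutually_exclusive = p_both == 0
independent = p_both == product
print(p_both, product, mutually_exclusive, independent)
```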

49
Q

probability distribution

A

the probability distribution of a random variable gives its possible values and their probabilities. A discrete random variable takes a fixed set of possible values with gaps between them

50
Q

continuous random variable

A

a continuous random variable x takes all values in an interval of numbers. The probability distribution of x is described by a density curve. The probability of any event is the area under the density curve and above the values of x that make up the event

51
Q

population parameter / sample statistic

A

a parameter is a number that describes a population. To estimate an unknown parameter, use a statistic calculated from a sample

52
Q

sampling distribution

A

the sampling distribution of a statistic describes the values taken by the statistic in all possible samples of the same size from the same population

53
Q

unbiased estimator

A

a statistic is an unbiased estimator if the center (mean) of its sampling distribution is equal to the true value of the parameter
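
For a tiny population, the sampling distribution can be listed in full. This sketch enumerates every sample of size 2 (without replacement) from a made-up population of five values, then compares the center of each statistic's sampling distribution with the parameter it estimates. The sample mean turns out to be unbiased for the population mean here, while the sample maximum underestimates the population maximum.

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]                  # made-up population
samples = list(combinations(population, 2))    # all 10 possible samples of size 2

sample_means = [mean(s) for s in samples]      # sampling distribution of the mean
sample_maxes = [max(s) for s in samples]       # sampling distribution of the max

center_of_means = mean(sample_means)   # compare with the population mean
center_of_maxes = mean(sample_maxes)   # compare with the population maximum
print(center_of_means, mean(population))
print(center_of_maxes, max(population))
```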