Exam 1 Flashcards

1
Q

the science of collecting, describing, and analyzing data

A

statistics

2
Q

subjects/objects we obtain information about in a data set

A

cases/units

3
Q

any characteristic recorded for each case (columns in the data table)

A

variable

4
Q

divides the cases into groups, placing each case into exactly one of two or more categories

A

categorical variable

5
Q

measures or records a numerical quantity for each case

A

quantitative variable

6
Q

helps explain or predict values of other variables

A

explanatory variable

7
Q

the variable whose values are explained or predicted by the explanatory variable

A

response variable

8
Q

what is a lurking or confounding variable?

A

a third variable, associated with both the explanatory and response variables, that is not accounted for in the study
ex: age of children not considered in the reading level/cavity data

9
Q

includes individuals or objects of interest

A

population

10
Q

subset of the population

A

sample

11
Q

n =

A

sample size

12
Q

process of using data from a sample to gain information about the population

A

statistical inference

13
Q

method of selecting a sample causes sample to differ from the population in some relevant way

A

sampling bias

14
Q

each unit of a population has an equal chance of being selected, regardless of the other units chosen for the sample

A

simple random sample
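
a minimal python sketch of drawing a simple random sample. the population of 100 unit ids, the function name, and the seed are made up for illustration; `random.sample` picks without replacement, so every subset of the requested size is equally likely.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Return n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(list(population), n)


units = list(range(1, 101))  # hypothetical population: unit ids 1..100
sample = simple_random_sample(units, 10, seed=1)
print(sample)
```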

15
Q

difference between sampling bias and bias?

A

sampling bias impacts the sample
bias impacts the actual method of data collection

16
Q

values of one variable tend to be related to the values of another variable

A

association

17
Q

how do association and causation relate?

A

association does NOT imply a cause and effect relationship

18
Q

changing the value of one variable influences the value of the other variable

A

causation/causally associated

19
Q

_____ implies a particular direction and relationship holds an overall trend

A

causation

20
Q

a study in which the researcher actively controls one or more of the explanatory variables

A

experiment

21
Q

a study in which the researcher does not actively control the value of any variable but simply observes the values as they naturally exist

A

observational study

22
Q

what does the word “improve” imply in a study?

A

causality, which cannot be established from an observational study

23
Q

a causal relationship can only be determined in what type of study?

A

experiment

24
Q

the value of the explanatory variable for each unit is determined randomly, before the response variable is measured

A

randomized experiment

25
Q

randomly assign cases to different treatment groups and then compare results on the response variables

A

randomized comparative experiment

26
Q

each case gets both treatments in random order, and the individual differences in the response variable between the 2 treatments are examined

A

matched pairs experiment

27
Q

a summary statistic that helps describe a categorical variable

A

proportion

28
Q

how to determine a proportion in a category =

A

number in that category / total number
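
a small python sketch of the formula above (count in the category divided by the total). the five answers are made-up data and `proportion` is just an illustrative helper name.

```python
from collections import Counter


def proportion(values, category):
    """Number of cases in `category` divided by the total number of cases."""
    counts = Counter(values)
    return counts[category] / len(values)


answers = ["agree", "disagree", "agree", "agree", "disagree"]  # made-up sample
p_hat = proportion(answers, "agree")
print(p_hat)  # 0.6
```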

29
Q

proportion for a sample is denoted:

A

p-hat

30
Q

p-hat =

A

proportion for a sample

31
Q

proportion for a population is denoted:

A

p

32
Q

p =

A

proportion for a population

33
Q

used to show the relationship between 2 categorical variables

A

two-way table

34
Q

an observed value that is notably distinct from the other values in a data set

A

outlier

35
Q

a numerical average of the data values

A

mean

36
Q

mean of a sample is denoted:

A

x-bar

37
Q

x-bar =

A

mean of a sample

38
Q

mean of a population is denoted:

A

mu

39
Q

mu =

A

mean of a population

40
Q

the middle entry of an ordered list if the list contains an odd number of entries (the average of the two middle entries if the number is even)

A

median

41
Q

median is denoted:

A

m

42
Q

m =

A

median

43
Q

a statistic that is relatively unaffected by extreme values

A

resistant statistic

44
Q

is median resistant to outliers?

A

yes

45
Q

is mean resistant to outliers?

A

no
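
a quick python check of the two cards above, using made-up numbers: adding one extreme value pulls the mean a long way but barely moves the median.

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6]        # made-up values
with_outlier = data + [100]   # same values plus one extreme value

print(mean(data), median(data))                  # 4 4
print(mean(with_outlier), median(with_outlier))  # 20 4.5
```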

46
Q

measures the spread of the data in a sample

A

standard deviation

47
Q

the larger the standard deviation, the ____ variability there is in the data and the _____ spread out the data are

A

more
more

48
Q

standard deviation of a sample is denoted:

A

s

49
Q

s =

A

standard deviation of a sample

50
Q

standard deviation of a population is denoted:

A

σ

51
Q

σ =

A

standard deviation of a population

52
Q

what is the 95% rule?

A

if a distribution of data is approximately symmetric and bell-shaped, about 95% of the data should fall within 2 standard deviations of the mean
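
a rough python illustration of the 95% rule on simulated bell-shaped data. the mean of 100, standard deviation of 15, seed, and helper name are assumptions chosen for the sketch; the printed share should land close to 0.95 but is not exactly 0.95.

```python
import random
from statistics import mean, stdev


def share_within_two_sd(values):
    """Fraction of values within 2 sample standard deviations of the sample mean."""
    x_bar, s = mean(values), stdev(values)
    inside = [v for v in values if abs(v - x_bar) <= 2 * s]
    return len(inside) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
bell_shaped = [rng.gauss(100, 15) for _ in range(10_000)]  # simulated, not real data
print(round(share_within_two_sd(bell_shaped), 3))
```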

53
Q

tells how many standard deviations the value is from the mean and is independent of the unit of measurement

A

z-score

54
Q

z-score =

A

(x - x-bar) / s
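
a minimal python sketch of the z-score formula, where the value is compared with the sample mean x-bar and divided by the sample standard deviation s. the numbers 70, 60, and 5 are made up.

```python
def z_score(x, x_bar, s):
    """Number of standard deviations that x lies from the mean."""
    if s <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - x_bar) / s


# made-up example: a value of 70 when the sample mean is 60 and s is 5
print(z_score(70, 60, 5))  # 2.0
```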

55
Q

the value of a quantitative variable which is greater than p percent of the data

A

percentile

56
Q

what is the 5 number summary?

A

q0 = minimum
q1 = first quartile (25%)
q2 = median
q3 = third quartile (75%)
q4 = maximum

57
Q

range =

A

maximum - minimum

58
Q

interquartile range =

A

q3-q1

59
Q

is range resistant to outliers?

A

NO

60
Q

is interquartile range resistant to outliers?

A

YES

61
Q

is standard deviation resistant to outliers?

A

NO

62
Q

the start of a box in a box plot is at

A

q1

63
Q

the end of a box in a box plot is at

A

q3

64
Q

the line that divides the box in a box plot is

A

the median

65
Q

the lines (whiskers) on a box plot extend

A

out to the most extreme data values that are not outliers

66
Q

if the data is skewed left, median _____ mean

A

median greater than the mean

67
Q

if the data is symmetric, median _____ mean

A

equal

68
Q

if the data is skewed right, median _____ mean

A

median smaller than the mean

69
Q

a graph of the relationship between 2 quantitative variables

A

scatterplot

70
Q

for a scatterplot, the _____ variable is on the x axis and the _____ variable is on the y axis

A

explanatory
response

71
Q

a measure of the strength and direction of linear association between 2 quantitative variables

A

correlation

72
Q

correlation of a sample denoted:

A

r

73
Q

correlation of a population denoted:

A

ρ
“rho”

74
Q

correlations closer to 1 or -1 are _____

A

stronger
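
a python sketch of computing a sample correlation r as the sum of products of z-scores divided by n - 1. the two tiny data sets are made up, and `correlation` is an illustrative helper, not a library call. r always falls between -1 and 1, and the sign gives the direction.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


xs = [1, 2, 3, 4]                     # made-up values
print(correlation(xs, [2, 4, 6, 8]))  # perfectly increasing line: r is 1 (up to rounding)
print(correlation(xs, [8, 6, 4, 2]))  # perfectly decreasing line: r is -1 (up to rounding)
```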

75
Q

for the linear regression line equation y-hat = b0 + b1 x
what is y-hat?

A

predicted value

76
Q

for the linear regression line equation y-hat = b0 + b1 x
what is b0?

A

y-intercept

77
Q

for the linear regression line equation y-hat = b0 + b1 x
what is b1?

A

slope

78
Q

for the linear regression line equation y-hat = b0 + b1 x
add in where response and explanatory variables would be

A

predicted response = b0 + b1(explanatory)

79
Q

difference between the observed and predicted values of the response variable

A

residual

80
Q

equation for residual:

A

observed - predicted
y - y-hat

81
Q

what does a residual represent on a scatterplot?

A

vertical distance from a data point to the line

82
Q

line that minimizes the sum of the squared residuals

A

least squares line
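
a python sketch of fitting a least squares line and checking its residuals (observed - predicted). the five data points are made up; the slope and intercept come from the standard closed-form least squares formulas, and `least_squares_line` is an illustrative helper name.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (b0, b1) for the line that minimizes the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values
b0, b1 = least_squares_line(xs, ys)

# residual = observed - predicted, one per data point
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(round(b0, 3), round(b1, 3))        # 2.2 0.6
print([round(r, 3) for r in residuals])
```

for a least squares line with an intercept, the residuals sum to zero (up to rounding), which is a handy sanity check.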

83
Q

do outliers influence regression line?

A

YES

84
Q

data from the principality of andorra were used to determine that 98.9% of andorrans have access to the Internet, the highest rate of any country.
what are the cases in the data from andorra?
what variable is used?
is it categorical or quantitative?

A

cases - people in Andorra
variable - internet access
categorical

85
Q

an online poll conducted on biblegateway.com asked, “how often do you talk about the bible in your normal course of conversation?” over 5000 people answered the question, and 78% of respondents chose the most frequent option: multiple times a week.
can we infer that 78% of people talk about the bible multiple times a week? why or why not?

A

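no
respondents chose to answer a poll on a bible-focused website, so the sample suffers from sampling bias and does not represent people in general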

86
Q

state whether the sentence implies no association between the variables, association without implying causation, or association with causation:
studies show that taking a practice exam increases your score on an exam.

A

association w/ causation

87
Q

state whether the sentence implies no association between the variables, association without implying causation, or association with causation:
families with many cars tend to also own many television sets.

A

association without implying causation

88
Q

state whether the sentence implies no association between the variables, association without implying causation, or association with causation:
sales are the same even with different levels of spending on advertising.

A

no association

89
Q

state whether the sentence implies no association between the variables, association without implying causation, or association with causation:
taking a low-dose aspirin a day reduces the risk of heart attacks.

A

association with causation

90
Q

state whether the sentence implies no association between the variables, association without implying causation, or association with causation:
goldfish who live in large ponds are usually larger than goldfish who live in small ponds.

A

association without implying causation

91
Q

state whether the sentence implies no association between the variables, association without implying causation, or association with causation:
putting a goldfish into a larger pond will cause it to grow larger.

A

association with causation

92
Q

a nationwide US telephone survey conducted by the pew foundation asked 2625 adults ages 18 and older, “some people say there is only one true love for each person. do you agree or disagree?” In addition to finding out the proportion who agree with the statement, the pew foundation also wanted to find out if the proportion who agree is different between males and females, and whether the proportion who agree is different based on level of education (no college, some college, or college degree). the survey participants were selected randomly, by landlines and cell phones.
what are the cases in the survey about one true love?
what are the variables?
are the variables categorical or quantitative?
how many rows and how many columns would the data table have?

A

cases - 2625 people
variables:
do you agree? - categorical
gender - categorical
level of education - categorical
2625 rows, 3 columns

93
Q

give the notation for the mean:
for a random sample of 50 seniors from a large high school, the average SAT score was 582 on the math portion of the test.

A

x-bar = 582

94
Q

give the notation for the mean:
about 1.67 million students in the class of 2014 took the SAT, and the average score overall on the math portion was 513.

A

mu = 513

95
Q

the five number summary for the mammal longevity data in table 2.21 on page 73 is (1, 8, 12, 16, 40). find the range and interquartile range for this dataset.

A

range: 40-1 = 39
IQR: 16-8 = 8
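
the same arithmetic as a short python sketch, using the five number summary (1, 8, 12, 16, 40) from the card; `range_and_iqr` is an illustrative helper name.

```python
def range_and_iqr(five_number_summary):
    """Return (range, IQR) from (min, q1, median, q3, max)."""
    minimum, q1, _median, q3, maximum = five_number_summary
    return maximum - minimum, q3 - q1


print(range_and_iqr((1, 8, 12, 16, 40)))  # (39, 8)
```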

96
Q

use the regression line to predict the tip of a bill that is $59.33
tip = -0.292 + 0.182 (bill)

A

$10.51

97
Q

use the regression line to predict the tip of a bill that is $9.52
tip = -0.292 + 0.182 (bill)

A

$1.44

98
Q

use the regression line to predict the tip of a bill that is $23.70
tip = -0.292 + 0.182 (bill)

A

$4.02
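
a short python sketch that plugs each bill from the last three cards into the regression line tip = -0.292 + 0.182 (bill); `predict_tip` is an illustrative helper name.

```python
def predict_tip(bill, b0=-0.292, b1=0.182):
    """Predicted tip from the regression line tip = b0 + b1 * bill."""
    return b0 + b1 * bill


for bill in (59.33, 9.52, 23.70):
    print(f"${bill:.2f} bill -> predicted tip ${predict_tip(bill):.2f}")
# $59.33 bill -> predicted tip $10.51
# $9.52 bill -> predicted tip $1.44
# $23.70 bill -> predicted tip $4.02
```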