stats Flashcards

1
Q

Continuous variable

A

can take an infinite number of possible values within a range, such as the average rainfall in a region

2
Q

Discrete variable

A

takes a countable number of distinct values (e.g., heads or tails)

3
Q

to determine probability distribution

A

the probability of each x value must be between 0 and 1, and the probabilities of all x values must sum to 1
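
The two rules can be checked mechanically. A minimal sketch, assuming the distribution is given as a plain list of probabilities; the function name `is_valid_distribution` and the example tables are made up for illustration:

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a small tolerance because floating-point sums are inexact
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

fair_coin = [0.5, 0.5]   # heads, tails
bad_table = [0.7, 0.4]   # each value is in range, but the sum is 1.1

print(is_valid_distribution(fair_coin))
print(is_valid_distribution(bad_table))
```

The first table passes both rules; the second fails only the sum rule, which is why both checks are needed.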

4
Q

population

A

entire group

5
Q

sample

A

specific group you collect data from

6
Q

statistic

A

number describing a sample

7
Q

parameter

A

number describing the whole population

8
Q

accuracy

A

the method measures what it was intended to measure: on average, the statistic correctly estimates the population parameter

9
Q

precision

A

if the method is repeated, the estimates are very consistent; every statistic is nearly the same

10
Q

sampling methods that create bias

A

convenience sampling
voluntary response sampling

11
Q

preferred method

A

simple random sampling

12
Q

what are the properties of the sampling distribution

A

Mean of the sampling distribution (μ_X̄) = population mean (μ)
Standard deviation of the sampling distribution (standard error) = σ/√n
A sampling distribution is described by its:
shape
central tendency
variability
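
A quick simulation can make these properties concrete. This is only a sketch: the skewed population, the sample size of 40, and the 5,000 repetitions are all assumptions chosen for illustration, not part of the card.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values from a skewed (exponential) distribution
population = [random.expovariate(1 / 50) for _ in range(100_000)]
mu = statistics.fmean(population)        # population mean
sigma = statistics.pstdev(population)    # population standard deviation

n = 40  # assumed sample size
# Draw many simple random samples and record each sample mean
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(5_000)
]

mean_of_means = statistics.fmean(sample_means)   # center of the sampling distribution
observed_se = statistics.stdev(sample_means)     # spread of the sampling distribution
theoretical_se = sigma / n ** 0.5                # sigma divided by the square root of n

print(f"population mean {mu:.2f}, mean of sample means {mean_of_means:.2f}")
print(f"observed SE {observed_se:.2f}, theoretical SE {theoretical_se:.2f}")
```

The mean of the sample means should land close to the population mean, and the observed spread should land close to σ/√n, even though the population itself is skewed.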

13
Q

example of measurement bias (leading question)

A

Do you believe that Obama’s horrible beliefs deserve another term in order to ruin our lives?

14
Q

example of measurement bias (confusing question)

A

Do you not disagree with the not-recent slight changes to American culture?

15
Q

example of a nonresponse bias

A

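A survey asks, "Do you currently have an STD?" Many people will refuse to answer a question this sensitive, so those who do respond may not represent the population.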

16
Q

example of voluntary response bias

A

An internet poll asks its visitors whether they prefer cats or dogs.

17
Q

example of a sample bias (nonrandom sample)

A

Someone asks their Twitter followers how they feel about the recent changes to Congress.

18
Q

how do you measure precision

A

by using standard error

19
Q

as population size increases, do accuracy and precision change?

A

no, both are unaffected

20
Q

as sample size increases, do accuracy and precision change?

A

accuracy is unaffected, but precision improves (the standard error gets smaller).
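
To see the effect of sample size on precision, here is a small sketch using the standard error of a sample proportion, sqrt(p(1 − p)/n). The value p = 0.37 and the sample sizes are assumptions picked for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.37  # assumed population proportion
for n in (100, 400, 1600):
    # Each step quadruples n, which cuts the standard error in half
    print(n, round(standard_error(p, n), 4))
```

Note that the population size does not appear in the formula at all, which matches the previous card: only the sample size drives precision.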

21
Q

what does it mean to say that p-hat is a random variable

A

repeated sampling will result in different p-hat values.

22
Q

Suppose a statistician is interested in determining the percentage of Americans who prefer Burger King to McDonald’s. She surveys 100 randomly chosen Americans and finds that of those surveyed, 37% prefer Burger King.
Identify…
a. the population
b. the sample
c. the parameter
d. the statistic/estimator of the study

A

a. all Americans.
b. the 100 Americans surveyed.
c. the proportion of all Americans who prefer Burger King to McDonald’s.
d. 37%, the sample proportion (p-hat).

23
Q

An analyst wants to know if there is a connection between time spent watching TV per day (in hours) and fat intake per day (in grams). He performs a regression using time spent watching TV as the independent variable and fat intake as the dependent variable and finds that r = 0.5 and the regression line is given by: y = 45.8 + 10.3x
a). Explain what the correlation and regression line mean in the context of the data.
b). Predict the fat intake of someone who watches 3 hours of TV a day.
c). Predict y when x = -2.
d). Which prediction is more reasonable?

A

a. The correlation means there is a moderate positive linear association between time spent watching TV and fat intake.
The regression line means that for each additional hour of TV someone watches, we predict their fat intake will increase by 10.3 grams (slope), and the predicted fat intake of someone who watches no TV is 45.8 grams (intercept).
b. 76.7 grams
c. 25.2 grams
d. b is more reasonable because you can’t watch a negative number of hours of TV in a day
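
The arithmetic for parts b and c can be reproduced in a few lines. A minimal sketch; the function name `predict_fat_intake` is made up for illustration, while the intercept and slope come from the card.

```python
INTERCEPT = 45.8  # predicted grams of fat at 0 hours of TV
SLOPE = 10.3      # predicted extra grams of fat per additional hour of TV

def predict_fat_intake(tv_hours):
    """Apply the regression line y = 45.8 + 10.3x."""
    return INTERCEPT + SLOPE * tv_hours

print(round(predict_fat_intake(3), 1))   # part b: 76.7
print(round(predict_fat_intake(-2), 1))  # part c: 25.2
```

The formula happily returns a number for x = -2, which is the point of part d: the line can be evaluated anywhere, but only inputs that make sense in context give meaningful predictions.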

23
Q

what are the 4 requirements of the central limit theorem

A

A random and independent sample; a population at least 10 times the sample size; np ≥ 10; n(1 − p) ≥ 10. If you don’t know p, use p-hat.
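
The numeric conditions are easy to check in code. This is a sketch only: the function name, the dictionary keys, and the example population size are assumptions, and the random-and-independent condition cannot be computed, so it is passed in as a flag.

```python
def clt_conditions_met(n, p, population_size, random_and_independent=True):
    """Check the four conditions for sample proportions; pass p-hat if p is unknown."""
    checks = {
        # Must be judged from how the data were collected
        "random_and_independent": random_and_independent,
        # Population at least 10 times the sample size
        "big_population": population_size >= 10 * n,
        # At least 10 expected successes and 10 expected failures
        "expected_successes": n * p >= 10,
        "expected_failures": n * (1 - p) >= 10,
    }
    return all(checks.values()), checks

# Burger King survey: n = 100, p-hat = 0.37, population size is a rough assumption
ok, details = clt_conditions_met(100, 0.37, 330_000_000)
print(ok, details)
```

Returning the individual checks alongside the overall result makes it clear which condition failed when the answer is no.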

24
A pollster is trying to determine whether Candidate X will win an upcoming election (assume Candidate X needs 51% of the vote to win). The pollster takes a random sample and determines that p-hat is 0.49 and that the 90% confidence interval is (0.45, 0.53). a. What is the margin of error? b. Can the pollster be confident that Candidate X will lose?
a. 0.04, or 4% b. No. The confidence interval includes values of 51% and above (it extends up to 53%), so we can't be sure.
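
Both answers follow directly from the interval endpoints. A minimal sketch using the numbers on the card; the variable names are made up for illustration.

```python
lower, upper = 0.45, 0.53   # 90% confidence interval from the card
threshold = 0.51            # share of the vote Candidate X needs to win

p_hat = (lower + upper) / 2    # the interval is centered on p-hat
margin = (upper - lower) / 2   # margin of error is half the interval width

# We can only be confident X loses if the whole interval sits below the threshold
can_rule_out_win = upper < threshold

print(round(p_hat, 2), round(margin, 2), can_rule_out_win)
```

Because the upper endpoint (0.53) is above 0.51, a win is still plausible at this confidence level.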
25
For each situation, state the appropriate null and alternative hypotheses. a. Ohio claims that 23% of high school seniors are enrolled in at least one AP class. A principal wants to know whether the proportion of seniors enrolled in an AP class is higher. b. The Cleveland Metro Police claim that 8% of Cleveland residents were the victims of a robbery or attempted robbery last year. A statistician believes that this number is too high.
a. H0: p = 0.23 vs Ha: p > 0.23 b. H0: p = 0.08 vs Ha: p < 0.08
26
What should be done to create a confidence interval for a population proportion?
Add and subtract the margin of error to/from the sample proportion
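
As a worked sketch of that recipe, the code below builds a 95% interval from the Burger King survey numbers (p-hat = 0.37, n = 100). The function name is made up, and the margin of error is taken to be z* times the estimated standard error, which assumes the Central Limit Theorem conditions hold.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper) as p_hat minus/plus the margin of error."""
    # Critical value z* that leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Estimated standard error, using p-hat in place of the unknown p
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin

low, high = proportion_confidence_interval(0.37, 100)
print(f"({low:.3f}, {high:.3f})")
```

Raising the confidence level widens the interval, and raising n narrows it, since n only enters through the standard error.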
27
Which of the following does the confidence level measure?
The success rate of the method of finding confidence intervals
28
Which of the following conditions regarding sample size must be met to apply the Central Limit Theorem for Samples?
The sample size is large enough that we expect at least 10 successes and at least 10 failures in the sample
29
When taking samples from a population and computing the proportions of each sample, which of the following is always the same?
The population proportion
30
What is the standard deviation of the sampling distribution called?
Standard error
31
A researcher has designed a survey in which the questions asked do not produce a true answer. What is this an example of?
Measurement Bias
32
In a confidence interval, what does the margin of error provide?
How far the estimate is likely to be from the population value, at most, at the stated confidence level
33
Which of the following statements are false? I) The precision of an estimator does not depend on the size of the population II) The precision of an estimator does not depend on the size of the sample III) Surveys based on larger sample sizes have larger standard errors
Both II and III are false
34
When applying the Central Limit Theorem for sample proportions, which of the following can be substituted for p when calculating the standard error if the value of p is unknown?
The value of the sample proportion
35
If the conditions of a survey sample satisfy those required by the Central Limit Theorem, then there is a 95% probability that a sample proportion will fall within how many standard errors of the population proportion?
2 standard errors
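
The "2 standard errors" rule is a rounded version of a standard normal calculation, which can be checked with Python's `statistics.NormalDist`. A short sketch; the variable names are made up for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of landing within 2 standard errors of the center
within_two = z.cdf(2) - z.cdf(-2)

# Number of standard errors that captures exactly 95%
exact_cutoff = z.inv_cdf(0.975)

print(round(within_two, 4), round(exact_cutoff, 2))
```

The probability within 2 standard errors is slightly above 95%, and the exact 95% cutoff is slightly below 2, so "2" is a convenient approximation.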
36
The null hypothesis is always a statement about what?
population parameter
37
In hypothesis testing, the null hypothesis is best described by which of the following statements?
The null hypothesis always gets the benefit of the doubt and is assumed to be true throughout the hypothesis testing procedure
38
In a hypothesis test, what is true the farther the test statistic is from 0?
The more the null hypothesis is discredited
39
In hypothesis testing, what does an extreme value for the test statistic indicate?
The data are strong evidence that the null hypothesis is not true
40
In hypothesis testing, when should the null hypothesis be rejected?
When the p-value is less than the significance level (α)
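
Here is a sketch of that decision rule as a one-proportion z-test for H0: p = 0.23 vs Ha: p > 0.23. The sample (60 of 200 seniors enrolled in an AP class) and the significance level of 0.05 are hypothetical values invented for illustration; they do not come from the cards.

```python
import math
from statistics import NormalDist

p0 = 0.23        # null value from H0: p = 0.23
n = 200          # hypothetical sample size
successes = 60   # hypothetical number of seniors in at least one AP class
alpha = 0.05     # assumed significance level

p_hat = successes / n
# Standard error computed under the assumption that H0 is true
se_null = math.sqrt(p0 * (1 - p0) / n)
# Test statistic: how many standard errors p-hat is from the null value
z = (p_hat - p0) / se_null
# Right-tail p-value because Ha says p is greater than p0
p_value = 1 - NormalDist().cdf(z)

reject_null = p_value < alpha
print(round(z, 2), round(p_value, 4), reject_null)
```

With these made-up numbers the test statistic is more than 2 standard errors above 0, the p-value falls below 0.05, and the null hypothesis is rejected. A sample proportion closer to 0.23 would give a larger p-value and no rejection.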
41
What do we assume when applying the CLT?
random sampling
42
CLT conditions to create a valid confidence interval
1. the sample must be random and independent 2. the population is normally distributed, or the sample is large (a common rule of thumb is n ≥ 30) 3. the sample is not more than 10% of the population