ALL INFERENCE U5, U6 and U7 MIXED Flashcards

1
Q

How do you find df in 2 samples?

A

EASY WAY: smaller sample size minus one.

this is a conservative guess

or the hard way

you have to run an interval or a test on your TI and read the output (unless you want to use the equation...)
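For the curious, here is a minimal Python sketch of both ways. It assumes "the equation" is the Welch-Satterthwaite approximation (the card does not name it), and the function names and sample numbers are made up for illustration.

```python
def conservative_df(n1, n2):
    """Easy way: smaller sample size minus one."""
    return min(n1, n2) - 1


def welch_df(s1, n1, s2, n2):
    """Hard way: Welch-Satterthwaite approximation (assumed to be 'the equation')."""
    v1 = s1 ** 2 / n1  # variance of the first sample mean
    v2 = s2 ** 2 / n2  # variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical samples: s1 = 2, n1 = 10 and s2 = 2, n2 = 10
print(conservative_df(10, 10))
print(welch_df(2, 10, 2, 10))
```

The easy way never gives more df than the equation, which is why it counts as the conservative guess.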

2
Q

How can you decrease alpha and beta at the same time?

A

increase sample size.

this will also increase power

3
Q

What if you want more confidence with the same size interval?

A

increase your sample size

4
Q

For the following output for the association between test score and amount of time studied, What is the “5.82?”

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

That is the slope.

5
Q

how are 2 samp t and paired t different?

A

With a 2 samp t you are looking at a difference between 2 averages from 2 distinct samples. With a paired test you make a SINGLE LIST OF DIFFERENCES (L3) from each pair, then look at the AVERAGE DIFFERENCE, the average of a bunch of differences

(generally 2 measurements on just ONE sample).

6
Q

How do you POOL with PROPORTIONS?

A

You combine the two samples into one big sample.

TOTAL # RED BEADS / OVERALL TOTAL OF ALL BEADS
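A quick Python sketch of the pooling arithmetic. The bead counts and the function name are hypothetical, just to show the calculation.

```python
def pooled_proportion(successes1, n1, successes2, n2):
    """Combine both samples into one big sample and find the overall proportion."""
    return (successes1 + successes2) / (n1 + n2)


# Hypothetical bead counts: 30 red out of 100, and 45 red out of 150
p_pooled = pooled_proportion(30, 100, 45, 150)
print(p_pooled)
```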

7
Q

What is statistical inference?

A

Using a statistic to infer something about a parameter.. Basically, using a sample to say something about a population.

8
Q

what happens to t models as n gets larger?

A

The models look more like the normal model. An infinite sample size would turn a t model into the normal model.

9
Q

For the following output for the association between test score and amount of time studied, Create and interpret a 95% confidence interval for slope.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

STAT +- CRIT (SE)

5.82 +- CRIT (2.66)

crit is just INVT(.025, 73)

df is n-2 for regression
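Here is a rough Python sketch of finishing the arithmetic. The critical value of about 1.993 is an assumption (the positive t crit for 95% confidence with df = 73, read from a t table or invT); it is not given on the card, and the function name is made up.

```python
def slope_interval(slope, se, t_crit):
    """STAT +- CRIT (SE) for a regression slope."""
    margin = t_crit * se
    return slope - margin, slope + margin


n = 75
df = n - 2  # 73 for regression
# Assumed critical value: about 1.993 for 95% confidence with df = 73.
t_crit = 1.993
low, high = slope_interval(5.82, 2.66, t_crit)
print(df, round(low, 2), round(high, 2))
```

With those numbers the interval runs from roughly 0.5 to 11.1, so one way to interpret it: we are 95% confident that each extra hour of studying goes with somewhere between about 0.5 and 11.1 more points on the test, on average.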

10
Q

What is alpha?

A

It is the rejection threshold. You reject p-values below it.. It is how willing you are to make a Type 1 error.

alpha = P(Type I error)

11
Q

For the following output for the association between test score and amount of time studied, interpret the S in context.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

S is the standard deviation of the residual, or the typical residual. It is how far off we expect our actual data value to be from the model (from the predicted value). In context: We can expect our actual test score to be about 7.7 points off from the test score predicted by our model, based on the amount of time we studied.

12
Q

What are the mean and standard deviation of a sampling distribution for a mean?

A

mean is mu and standard deviation is sigma/root n

(look at formula sheet) N(mu, sigma/rootn)

13
Q

How do you calculate SAMPLE SIZE with proportions?

A

n = Z^2 pq / ME^2
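A small Python sketch of that formula. The target (95% confidence, 3% margin of error) and the guess p = 0.5 are hypothetical inputs, and the function name is made up; rounding up is the usual convention since you can't sample part of a person.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(confidence, margin_of_error, p=0.5):
    """n = z^2 * p * q / ME^2, rounded up to a whole number."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z crit
    q = 1 - p
    return ceil(z ** 2 * p * q / margin_of_error ** 2)


# Hypothetical target: 95% confidence, 3% margin of error, p guessed as 0.5
print(sample_size_for_proportion(0.95, 0.03))
```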

14
Q

what does 95% confidence mean when we make an interval?

A

It means if we took a ton of samples, and made confidence intervals from each of them, ABOUT 95% of the intervals would contain the parameter, 5% would not.
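You can watch this happen with a small simulation. This is only a sketch: the population (mean 50, s.d. 10), the sample size, and the number of trials are all made up, and sigma is treated as known so a z crit can be used.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N = 50.0, 10.0, 30  # hypothetical population and sample size
Z_CRIT = 1.96                  # 95% confidence
TRIALS = 2000                  # "a ton of samples"

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    margin = Z_CRIT * SIGMA / sqrt(N)  # sigma treated as known to keep it simple
    if xbar - margin <= MU <= xbar + margin:
        hits += 1  # this interval caught the parameter

coverage = hits / TRIALS
print(coverage)
```

The printed fraction should land near 0.95, not exactly on it, which is the "ABOUT 95%" in the answer.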

15
Q

What is a confidence interval?

A

it is a parameter catcher.. Like a fishing net. We stand at our statistic and reach up and down a margin of error (WE ARE NOT IN THE MIDDLE OF THE PILE!!!) and hope to CATCH the parameter.. sometimes we do, sometimes we don’t.. but we never know.. Mooo hooo hooo haaaa haaa haaa (evil laugh)

16
Q

What are conditions for chi squared?

A

indep, rand, <10%, 5 or more in EXPECTED cells

17
Q

THINK OF Type 2 error?

A

“MISSED OPPORTUNITY” “YOU ARE SICK, BUT WE MISSED IT” “THE PROGRAM WORKED, BUT WE DIDN’T NOTICE”

18
Q

What is effect size?

A

difference between null and true parameter.

something we don’t know

19
Q

What are the conditions that have to be met in order to use a normal model for the distribution of sample proportions? (sampling distribution of proportions).. (the distribution of p-hats)..

A
  1. Randomization (this helps with the assumption of independence)
  2. SMALL ENOUGH SAMPLE … 10% condition (this is the upper limit of our sample size. above this, the sampling distribution starts looking leptokurtic (thinner and taller), not normal)
  3. LARGE ENOUGH SAMPLE.. success/failure: np and nq > 10. this is the lower limit of our sample size. It is when the sampling distribution starts looking normal.
20
Q

Where did the s.d. of differences of proportions that is on the formula sheet come from?

A

From the square root of the added variances of the sampling distributions of the 2 proportions.
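A short Python sketch of "add the variances, then take the square root". The proportions and sample sizes are hypothetical, and the function name is made up.

```python
from math import sqrt


def sd_diff_proportions(p1, n1, p2, n2):
    """Variances add, then take the square root."""
    var1 = p1 * (1 - p1) / n1  # variance of the first pile of p-hats
    var2 = p2 * (1 - p2) / n2  # variance of the second pile of p-hats
    return sqrt(var1 + var2)


# Hypothetical values: p1 = 0.6, n1 = 100, p2 = 0.5, n2 = 100
print(sd_diff_proportions(0.6, 100, 0.5, 100))
```

Note that the standard deviations themselves are never added, only the variances.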

21
Q

How do you calculate SAMPLE SIZE with means?

A

n = (t*s/ME)^2

Often, z is fine.. but

First calculation, use Z crit.. Then go through and calculate n.. Use that n for a t crit and do it again.

22
Q

what is difference between assumptions and conditions?

A

Assumptions must be made in order to perform inference. We need to assume independent sample values and a large enough sample (but not too large).

We check conditions to help support our assumptions.

23
Q

When do you know it is GOF test?

A

When you have ONE ROW or ONE COLUMN.

then it gives you a ratio , like 1:2:5

or it gives you expected percents.

24
Q

Describe the distribution of a sample

A

It will look like the population. The distribution of a sample is a histogram made from the sample, which will look kind of like the population. If the population is bimodal, then the distribution of the sample is bimodal. The SAMPLING distribution of a bunch of means, however, will look normalish.

25
Q

With regression computer output, how do you find the p-value for hypothesis test with null: slope=0

A

p value is given at the end of the row that the slope is in!

The t stat in that row is just SLOPE/SE (because the t is (slope - 0) / SE), and the p value is the tail area for that t.

26
Q

THINK OF Power?

A

ability to detect change, or to detect what test was designed to detect.

27
Q

What is the Fundamental Theorem of Statistics?

A

The CLT!!

The Central Limit Theorem!

pile of stats surrounds the parameter and is normalish with large enough sample size!!

28
Q

For the following output for the association between test score and amount of time studied, interpret the r-squared in context.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

77.5% of the variability in test score can be explained by the model with hours studied.

29
Q

How else can you explain power?

A

The likelihood you correctly reject a false null

30
Q

When do you use 1 prop z test instead of one prop t test?

A

There is no 1 prop t test. You use Z for props. T for means.

31
Q

What is a margin of error?

A

critical * s.d..

It is how far you reach out in a confidence interval..

You reach up AND YOU REACH DOWN one of these,

so the interval is actually 2 margins of error wide.

32
Q

Your confidence interval is (.25, .35). What is your point estimate?

A

0.3

(UB+LB) / 2,

avg of the numbers,

in the middle

33
Q

POWER + BETA =

A

1

34
Q

How can you increase power?

A

Increase alpha

increase sample size..

35
Q

Can you decrease alpha while increasing power (even though they move together?)..

A

Yes.. They move together with constant sample size. If you increase the sample size, you can decrease alpha and increase the power.

36
Q

Are models what really happens?

A

No. A model train is not a real train. We use models to say what kind of happens.

37
Q

If you are testing to see if more students use tobacco now, and you find that there was enough evidence to say that more do, but actually, there was not an increase, what type of error did you make?

A

Type 1 error

38
Q

How are power and alpha related?

A

they go up and down together

39
Q

If the null is true, what is the only error you could make?

A

Type 1

40
Q

How do you write conclusion if you fail to reject?

A

With a p-value this high. I fail to reject the null. (I retain the null). There is not enough evidence to say that more students like eggs now.

41
Q

Do alpha and beta work with means?

A

Yep, alpha, beta, power, type 1, type 2 all go along with means and proportions AND regression AND CHI SQUARED!!

42
Q

What are the mean and standard deviation of a sampling distribution for a proportion?

A

mean is p and standard deviation is root (pq/n)

(look at formula sheet)

N(p, root (pq/n) )

43
Q

When we are looking at differences of proportions, what is the sampling distribution a distribution of?

A

You have to imagine taking a pair of samples, say.. Of girls and boys, subtracting phat girl-phat boy, and then writing that difference down. Do this over and over again, and you will have a list of differences. Now make a histogram of that list of differences, and that is your sampling distribution. It is an imagined distribution of an infinite number of differences (of sample pairs)..

44
Q

What is a 2 sample t interval?

A

Well.. Suppose you were trying to find the difference between the IQ of math teachers and IQ of English teachers. You sample 50 of each and find math xbar= 125 and English xbar=115. So, the difference of the samples is 10 points. That is your statistic. 10 points. You now have to add on a margin of error, let’s say.. 4. So, you’ll say something like “I’m 90% confident that math teachers score between 6 and 14 points higher on IQ tests.”

45
Q

For the following output for the association between test score and amount of time studied, what is the equation of the LSRL?

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

Score HAT = 45.3 + 5.82 (hours studied)

46
Q

What did Bill Gosset do?

A

He sat on the Normal model and drank some tea. (t model looks like someone sat on the normal model)

47
Q

What does CLT say about the distribution of the population?

A

Not much

just that it doesn’t matter what it is..

With large samples..

The SAMPLING dist will be approx normal

(dist of stats.. NOT DATA)

48
Q

Do you pool with means (t test)?

A

No, you don’t have to do it. Only pool with hyp tests for props. Pooling with means is a nasty process; if you think the populations have similar variances, then have your calculator pool.

49
Q

what are the conditions for chi squared?

A

counts, five or more in each expected, independent (random), <10%

50
Q

THINK OF Type 1 error?

A

“BUT I THOUGHT THINGS CHANGED” or “BUT I THOUGHT IT WORKED” or “BUT I THOUGHT YOU WERE SICK”

51
Q

Do you use p-hat or p-null when you calculate your standard deviation?

A

use p-null..

52
Q

when do you need crits?

A

in confidence intervals

(and old fashioned hyp tests)

53
Q

What are the conditions that have to be met in order to use a t-model for the distribution of sample means? (sampling distribution of means).. (the distribution of x-bars)..

A
  1. Random and independence
  2. Not too large, less than 10% of population, so 10n < N
  3. NOT TOO SMALL: n > 30 (if n < 30, the data must be normalish)

54
Q

how do you find expected count if n=25 for a 1:3:1 ratio? What test is it?

A

GOODNESS OF FIT

find total: 1+3+1 = 5 divide all by five and that gives expected percents

1/5 : 3/5 : 1/5 .20 : .60 :.20

now multiply each by n and get expected counts.

Almost always not a whole number. 25(.20) : 25(.60) :25(.20)
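The same steps as a tiny Python sketch (the function name is made up for illustration). In this particular example the expected counts happen to come out whole.

```python
def expected_counts(ratio, n):
    """Turn a ratio like 1:3:1 into expected counts for a sample of size n."""
    total = sum(ratio)  # 1 + 3 + 1 = 5
    return [n * part / total for part in ratio]


counts = expected_counts([1, 3, 1], 25)
print(counts)  # [5.0, 15.0, 5.0]
```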

55
Q

How to find expected cell count on a matrix?

A

ROW TOTAL* COLUMN TOTAL/ OVERALL TOTAL
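A minimal Python sketch that applies this rule to every cell. The 2x2 table of observed counts is hypothetical, and the function name is made up.

```python
def expected_table(observed):
    """Expected count for each cell = row total * column total / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table of observed counts
observed = [[10, 20],
            [30, 40]]
expected = expected_table(observed)
print(expected)
```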

56
Q

How do you make a confidence interval with computer output?

A

STAT +/- CRIT SE

The STAT and SE are given side by side

the t crit is still INVT(area 1 tail, n-2),

Just put the +/- t crit between the actual slope and the given std. error. The calculation is simple

57
Q

How do statistics from big samples compare to small?

A

Larger sample statistics have less variability, so statistics from them are closer to the parameter and each other (sampling distribution has smaller standard error). Statistics from smaller samples are more likely to be far away from the true parameter.

This is why increasing sample size increases power, it narrows the piles of statistics.

58
Q

how do you find deg freedom?

A

n-1 for one sample

2 samples you must use calculator..

REGRESSION IS n-2

GOF is cells - 1

indep and homog is (r-1)(c-1)
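The rules that don't need a calculator, written as small Python helpers. The function names and the example sizes are made up for illustration.

```python
def df_one_sample(n):
    return n - 1


def df_regression(n):
    return n - 2


def df_goodness_of_fit(cells):
    return cells - 1


def df_two_way_table(rows, columns):
    """Independence and homogeneity: (r-1)(c-1)."""
    return (rows - 1) * (columns - 1)


# Hypothetical sizes: n = 20, regression n = 75, 3 GOF cells, a 3 x 4 table
print(df_one_sample(20), df_regression(75), df_goodness_of_fit(3), df_two_way_table(3, 4))
```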

59
Q

who invented the t model?

A

Bill Gosset, Guinness brewing company.

60
Q

What is the quick sample size calculation for proportions?

A

1/ME^2 (this is n = Z^2 pq / ME^2 with z of about 2 for 95% confidence and p = q = 0.5)

61
Q

For the following output for the association between test score and amount of time studied, Test a hypothesis to see if there is a significant association between time studied and test score.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

Ho: B1 = 0

Ha: B1 not= 0

(imagine a pile of sample slopes centered at 0 with SE of 2.66)

BE SURE THEY ARE BETAS (Greek B’s) small b is a sample slope, Beta is population slope

P value is given: 0.0159, so interpret it !

62
Q

What is diff between homogeneity and test for independence?

A

homogeneity is more than one sample and asking about one variable, independence is just one sample with two variables.

63
Q

Your confidence interval is (.25, .35). What is your margin of error?

A

0.05

64
Q

What is “statistically significant?”

A

When p-value is below the alpha, we say “statistically significant”.. Low p-values are statistically significant. When our sample most likely didn’t happen randomly, that is statistically significant.

65
Q

how are alpha and beta related?

A

as one increases, the other decreases, and vice versa

66
Q

Simple quick way to describe difference between paired T and 2 sample T.

A

Paired T looks at average of differences, 2 sample T looks at the difference of averages.

67
Q

what is df for goodness of fit?

A

cells - 1

68
Q

Can you draw the alpha/beta/power diagram?

A

See page 486. Be able to draw and label this the way we do in class with the box and “RETAIN REJECT” up top and “Ho TRUE, Ho FALSE” on left.

69
Q

How do you write conclusion if you reject?

A

With such a low p-value, I reject the null hypothesis. There is strong evidence that the proportion of students who eat rice has changed.

70
Q

what is a statistic

A

some numerical summary of a sample.. Could be the mean of a sample, the standard deviation of a sample, the proportion of successes in a sample, the slope calculated from a sample, a difference of 2 means from 2 samples, a difference of 2 proportions from 2 samples, a difference of 2 slopes from 2 samples.. you can make sampling distributions for any of these, and they will all be centered around the parameter…

71
Q

What does the CLT say about the distribution of actual sample data?

A

Nothing. The sample will be distributed similar to the population. The CLT only talks about distributions (histograms) of sample statistics, like piles of means.. NOT OF INDIVIDUALS!!!! NOT DATA

72
Q

What is power?

A

The probability that you correctly rejected a false null

73
Q

What does Central Limit Theorem Say?

A

It basically says.. NO MATTER WHAT SHAPE THE POPULATION IS (normal, bimodal, uniform, skewed, crazy..), if you make a histogram of a bunch of means taken from a bunch of samples, that histogram will be unimodal and symmetric WITH LARGE ENOUGH SAMPLES.. Close to normal. So.. A nerdy way to say it is: The sampling distribution of means is approximately normal no matter what the population is shaped like. The larger the sample size, the closer to normal. (the normal curve is just a model.. the sampling distribution is close to it, but not it! we use the model anyway!)
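A simulation sketch of that idea in Python. The population here is a made-up, strongly right-skewed one (exponential with mean 1 and s.d. 1), and the sample size and number of samples are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2)  # fixed seed so the run is repeatable

N = 50          # size of each sample
SAMPLES = 5000  # how many samples (how many x-bars in the pile)

# Exponential population with rate 1: right skewed, mu = 1, sigma = 1
xbars = [sum(random.expovariate(1.0) for _ in range(N)) / N for _ in range(SAMPLES)]

center = mean(xbars)   # should sit near mu = 1
spread = stdev(xbars)  # should sit near sigma / root n
print(round(center, 3), round(spread, 3), round(1 / sqrt(N), 3))
```

The pile of x-bars centers near the population mean with a spread near sigma/root n, even though the individual data values are badly skewed. A histogram of xbars would look roughly normal.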

74
Q

What is a point estimate?

A

Your statistic. You stand at the point estimate and reach up and down to make an interval

75
Q

For the following output for the association between test score and amount of time studied, Where does the “p” column come from?

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

The p column holds the p values from two hypothesis tests: one with the null that the y intercept is zero (first row) and one with the null that the slope is zero (second row). If you put those T-STATS into tcdf, you get the p values from this column.

76
Q

What are the three chi-squared models?

A

goodness of fit, test for homogeneity, test for independence

77
Q

How do you find degrees of freedom for 2 sample mean stuff?

A

MUST USE CALCULATOR

78
Q

WHAT EQUATION HAS INTERVAL WIDTH, Z CRIT and SE IN IT?

A

WIDTH = 2 (Z) (SE)

79
Q

how are beta and power related

A

as one increases, the other decreases, and vice versa? They have to because they BOTH ADD TO ONE!!! Power + Beta = 1

80
Q

Where did the s.d. of differences of means that is on the formula sheet come from?

A

Combining models. From the square root of the added variances of the sampling distributions of the 2 means

81
Q

What is the common misconception about confidence intervals?

A

They know you stand at your stat and reach up and down.. BUT.. People think that their statistic is in the middle of the pile of p hats or x bars.. in reality, they are almost definitely NOT.. they are out on one of the sides, and they are reaching up and down and trying to catch the center!!! YOU ARE NOT IN THE MIDDLE OF THE PILE!!

82
Q

For the following output for the association between test score and amount of time studied, interpret the y intercept in context.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

The model predicts that a person who doesn’t study at all will score about a 45.3 on the test.

83
Q

How do you POOL with MEANS?

A

YOU DON’T

84
Q

Can you prove a null hypothesis true?

A

NO

85
Q

How is a confidence interval made?

A

statistic +- margin of error.. Statistic +- (crit * s.d.).. Stand at the statistic, reach out a margin of error, and hope that you catch the parameter.

86
Q

what is a parameter?

A

some numerical summary of a population. Often called “the parameter of interest.” It is what we are often trying to find.. It doesn’t vary. It is out there and STUCK at some value, it is the truth, and you’ll probably not ever know it! We try to catch them in our confidence intervals, but sometimes we don’t (and we don’t know it!). It could be the mean of a population, the standard deviation of a population, the proportion of successes in a population, the slope calculated from a population, a difference of 2 means from 2 populations, a difference of 2 proportions from 2 populations

87
Q

What are the conditions that have to be met in order to use a normal model for the distribution of sample proportions? (sampling distribution of proportions).. (the distribution of p-hats)..

A
  1. Random and independence
  2. Not too large, less than 10% of population, so 10n < N
  3. Large enough: np and nq > 10
Too large samples have sampling distributions that are leptokurtic (narrower and taller than normal model), too small samples have skewed or other shaped sampling distributions.
88
Q

how do you find z and t crit?

A

for z crit.. INVNORM(area in 1 tail).. for t crit.. INVT(area in 1 tail, deg freedom)

89
Q

One tail or 2 tailed? How do you tell?

A

if it just says “changed” or “different”.. Then it is 2 sided.. DOUBLE THE P VALUE! If it says “more” “less than” “greater” etc.. Then it is just one sided..

90
Q

For the following output for the association between test score and amount of time studied, where does the “t stat” column info come from?

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

T stat is the T score, the test statistic for null=0:

(STAT-NULL)/SE (45.3 - 0) / 4.3 = 10.53

(5.82-0) / 2.66 = 2.189
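The same arithmetic as a short Python sketch (the function name is made up). The slope row comes out at about 2.19 here because the printed coefficient and SE are already rounded.

```python
def t_stat(stat, null, se):
    """(STAT - NULL) / SE"""
    return (stat - null) / se


t_intercept = t_stat(45.3, 0, 4.3)
t_slope = t_stat(5.82, 0, 2.66)
print(round(t_intercept, 2), round(t_slope, 2))
```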

91
Q

how can you decide the right test? What are the 3 questions?

A

1 or 2 samples? Proportions (z) or Means (t)? Test or Interval?

92
Q

For the following output for the association between test score and amount of time studied, what is the correlation coefficient?

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

if r squared is .775, then you have to take sqrt of that.. So r= 0.8803

93
Q

how are t models like Normal models?

A

both are unimodal and symmetric. T models aren’t as high and have more area in tails, that’s why you have to reach out a little further than z for same confidence. Bill sat on the normal model.

94
Q

what are the conditions for inference for slope? (hyp test or confidence interval for slope?)

A

straight enough (check residuals for random scatter), random and independent, and look at the HISTOGRAM OF THE RESIDUALS and make sure they are unimodal and symmetric

95
Q

what is df for chi squared homogeneity or independence?

A

(rows-1)(columns - 1) (remove a row and a column and count the cells that are left)

96
Q

For the following output for the association between test score and amount of time studied, Interpret the slope in context.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

On average, for each hour a student studied more than another student, their test scores were about 5.82 points higher.

97
Q

what is a test statistic?

A

a t or z score (or chi squared) that you use to find a p value

98
Q

What is sampling variability?

A

The natural variation of sample statistics.. NOT DATA.. Samples vary.. so do their statistics.. Parameters do not vary!

99
Q

Where did Bill Gosset work?

A

GUINNESS

100
Q

Is a confidence interval a PROBABILITY?

A

NO

101
Q

xbar and mu in t-test?

A

xbar is your sample mean, mu is your hypothesized population mean

102
Q

How do you find Expected Count?

A

for GOF: Exp %(total).. For indep and homog: ROW*COL/TOTAL

103
Q

With regression computer output, how is the t-ratio and the p-value calculated?

A

T ratio is just SLOPE/ST ERROR and the p value is just TCDF(T ratio, 9999, n-2). That TCDF gives the area in one tail, so double it for a 2 sided test.

104
Q

What is advantage of pooling?

A

Pooling allows you to increase your sample size.. sort of.

105
Q

If you are testing to see if more students use tobacco now, and you find that there was not enough evidence to say that more do, even though more actually do now, what type of error did you make?

A

Type 2 error

106
Q

Your confidence interval is (.25, .35). What is your standard error?

A

it depends on the critical value.
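A sketch in Python of unpacking the interval (.25, .35). The 95% confidence level and the use of a z crit are assumptions made only for the example, since the card does not say which level produced the interval; the function name is made up.

```python
from statistics import NormalDist


def unpack_interval(lower, upper, confidence):
    """Work backwards from an interval to the statistic, margin of error, and SE."""
    estimate = (upper + lower) / 2  # middle of the interval
    margin = (upper - lower) / 2    # half the width
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = margin / z_crit            # ME = crit * SE, so SE = ME / crit
    return estimate, margin, se


# 0.95 is an assumed confidence level, not something the card states
estimate, margin, se = unpack_interval(0.25, 0.35, 0.95)
print(round(estimate, 2), round(margin, 2), round(se, 4))
```

Change the assumed confidence level and the SE changes, while the statistic and the margin of error stay the same.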

107
Q

what are the conditions that have to be met for t procedures with small samples?

A

independent groups, random, <10% of pop and nearly normal.

108
Q

What is a critical value?

A

It is the number of standard deviations (errors) you’ll reach out, depending on your confidence (a t or z). Example.. For 68% crit z = 1.. For 95% crit z = 2 (well, 1.96).. For means.. Use t crits

109
Q

What is a standard error?

A

typical distance a statistic is from the parameter. The average distance to the middle in a sampling distribution. Called standard error because it is the typical error you would expect in a sample.

110
Q

If the null is false, what is the only error you could make?

A

Type 2

111
Q

What are the 3 steps in hypothesis testing AFTER YOU CHECK CONDITIONS?

A
  1. Make your Ho and Ha
  2. Make a Null Model (centered at null, use your Ho as center and in calculations, use your sample size).. This is a sampling distribution for the statistics if the null were true.
  3. CHECK.. Calculate your statistic (p-hat, x-bar, phat1-phat2, xbar1-xbar2)
112
Q

Do you use p-hat or p-null when you check the success/failure condition?

A

use p null

113
Q

Your confidence interval is (.25, .35). What is your statistic?

A

0.3 (UB+LB) / 2, avg of the numbers, in the middle

114
Q

If you fail to reject, what is the only type of error you could make?

A

Type 2

115
Q

What is sampling error?

A

same as sampling variability.. The natural variability between STATISTICS.. NOT DATA!!! . We call it error EVEN THOUGH YOU MADE NO MISTAKES!!!

116
Q

What is beta?

A

It is probability that you’ll make a Type II error.. P(Type II error)

117
Q

If you reject, what is the only type of error you could make?

A

Type 1

118
Q

when is data “paired”

A

when you have 2 measurements on the same subject (or matched subjects). Often Before-After

119
Q

What is a 2 sample t interval?

A

Well.. Suppose you were trying to find the difference between the IQ of math teachers and IQ of English teachers. You sample 50 of each and find math xbar= 125 and English xbar=115. So, the difference of the samples is 10 points. That is your statistic. 10 points. You now have to add on a margin of error, let’s say.. 4. So, you’ll say something like “I’m 90% confident that math teachers score between 6 and 14 points higher on IQ tests.”

120
Q

What are we confident in?

A

our confidence lies in our interval.. if we took another sample.. We’d have a different interval..

121
Q

when do you have to check conditions?

A

In all inference procedures: ANY CONFIDENCE INTERVAL OR HYPOTHESIS TEST (including chi squared and slope stuff)

122
Q

What are confidence intervals for?

A

They are an attempt to say what the true population parameter is.. It is our best guess.. “We think that there will be between 6 and 12 inches of snow.”

123
Q

N ( ?1 , ?2 ) what does this mean?

A

it means NORMAL models centered at ?1 With a standard deviation of ?2

124
Q

what is error?

A

distance from statistic to parameter, how far your sample statistic is off from the truth.

125
Q

What is a p-value

A

It is the probability of getting a sample statistic at least as extreme as yours just by random chance if the null were true. Basically, how surprising would your sample statistic be if it came from the Null Model?

126
Q

Why does the book use ybar instead of xbar?

A

I don’t know

127
Q

Will 95% of other statistics be within my interval?

A

NO!!! You have no idea where your interval is in regards to true parameter

128
Q

Can you make a 100% confidence interval?

A

Sure, I’m 100% confident that it will snow between 0 and 500 feet tomorrow.

129
Q

Can you accept a null hypothesis?

A

Never accept a Ho

130
Q

If you were going to pool with means (t), which you probably won’t have to do, when would you?

A

When you have reason to believe the variances of both populations are equal.

131
Q

What is the difference between the distribution of a sample and a sampling distribution?

A

A distribution of a sample is just a histogram of the DATA in a sample. A sampling distribution is made from a bunch of sample STATISTICS. It is the distribution of the statistic that was calculated from those many many samples.

132
Q

What is difference between population of interest and parameter of interest?

A

Population is the subjects you are interested in.. Parameter is the actual number you want (like % of or AVG)

133
Q

How wide is a confidence interval?

A

It is 2 margins of error wide

134
Q

What is a t-crit?

A

It is the same as z crit. It is the number of sd you reach out in your CI. To find it, do INVT(area in one tail, degrees of freedom)

135
Q

For the following output for the association between test score and amount of time studied, make a 90% confidence interval for the predicted score for someone who studied for 8 hours.

OUTPUT: n=75 dep var: test score
VAR          coeff   se coeff   t stat   p
intercept    45.3    4.3        10.53    0.000
study time   5.82    2.66       2.189    0.0159
r-square: 77.5%   S=7.7

A

To make an interval for prediction, plug 8 into equation..

45.3 + 5.82 ( 8 ) = 91.86

stand there and go up and down CRIT S

Use S because it is the st dev of resid.

STAT +- CRIT (SE)

91.86 +- Tcrit (7.7)

Tcrit is invt(.05, 73)
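A rough Python sketch of the card's shortcut. Two things are assumptions here: the critical value of about 1.666 (the positive t crit for 90% confidence with df = 73, from a t table or invT) and the use of S by itself as the standard error, which is the card's simplification. A full prediction interval would use a somewhat larger SE.

```python
def predict_score(hours):
    """LSRL from the output: score-hat = 45.3 + 5.82 (hours studied)."""
    return 45.3 + 5.82 * hours


predicted = predict_score(8)
s = 7.7         # st dev of the residuals, from the output
t_crit = 1.666  # assumed: about 1.666 for 90% confidence with df = 73
low = predicted - t_crit * s
high = predicted + t_crit * s
print(round(predicted, 2), round(low, 1), round(high, 1))
```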

136
Q

how do you check nearly normal for small samples?

A

Histogram on calculator, or normal prob plot on calculator, OR Boxplot should be symmetrical.

137
Q

What is the normal enough condition?

A

for smaller sample sizes, it must be plausible that the sample may have kind of come from a normalish population.

138
Q

Do parameters vary?

A

NO!!! Statistics do.. they vary from sample to sample.. PARAMETERS DO NOT VARY!

139
Q

What if you want more confidence?

A

get a bigger net.. (wider confidence interval) or increase sample size

140
Q

What is a null model?

A

It is a sampling distribution. It tells us how sample statistics would vary if the null were true. It is centered at the null.

141
Q

Why does increasing n increase power?

A

When you increase n, the pile of statistics gets more narrow, the statistics get closer to the true mean. Imagine your alpha beta power diagram with narrower piles, but the same distance from each other, the power would increase!
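You can put numbers on that picture with a small Python sketch. Everything here is hypothetical: a one-sided z test for a mean with known sigma, null mean 100, true mean 105, sigma 15. The function name is made up.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu_true, sigma, n, alpha):
    """P(reject Ho) when the true mean is mu_true (upper-tail z test)."""
    se = sigma / sqrt(n)                                 # width of each pile
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se  # rejection line on the null pile
    return 1 - NormalDist(mu_true, se).cdf(cutoff)       # area of the true pile past the line


power_small = power_one_sided_z(100, 105, 15, 25, 0.05)   # n = 25
power_big = power_one_sided_z(100, 105, 15, 100, 0.05)    # n = 100, same alpha
power_loose = power_one_sided_z(100, 105, 15, 25, 0.10)   # n = 25, bigger alpha
beta_small = 1 - power_small                              # Power + Beta = 1

print(round(power_small, 3), round(power_big, 3), round(power_loose, 3))
```

Same effect size, narrower piles: going from n = 25 to n = 100 pushes the power up. Raising alpha with n fixed also raises power, which is the "they go up and down together" idea.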