ALL INFERENCE U5, U6 and U7 MIXED Flashcards

Question

With regression computer output, how do you find the p-value for a hypothesis test with null: slope = 0?

Answer 1

The p-value is given at the end of the row that the slope is in! It comes from the t ratio, which is SLOPE/SE (because t is (slope - 0)/SE).

Answer 2

ability to detect change, or to detect what the test was designed to detect.

Answer 3

The CLT!! The Central Limit Theorem! pile of stats surrounds the parameter and is normalish with large enough sample size!!

Answer 4

77.5% of the variability in test score can be explained by the model with hours studied.

Answer 5

The likelihood you correctly reject a false null

Answer 6

There is no 1 prop t test. You use Z for props. T for means.

Answer 7

critical \* s.d.. It is how far you reach out in a confidence interval.. You reach up AND YOU REACH DOWN one of these, so the interval is actually 2 margins of error wide.

Answer 8

0.3 (UB+LB) / 2, avg of the numbers, in the middle

Answer 9

Increase alpha or increase the sample size.

Answer 10

Yes.. They move together with constant sample size. If you increase the sample size, you can decrease alpha and increase the power.

Answer 11

No. A model train is not a real train. We use models to say what kind of happens.

Answer 12

Type 1 error

Answer 13

they go up and down together

Answer 14

With a p-value this high, I fail to reject the null (I retain the null). There is not enough evidence to say that more students like eggs now.

Answer 15

Yep, alpha, beta, power, type 1, type 2 all go along with means and proportions AND regression AND CHI SQUARED!!

Answer 16

mean is p and standard deviation is root(pq/n) (look at formula sheet): N(p, root(pq/n))
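
A minimal sketch of that model in Python. The values p = 0.3 and n = 100 are made up for illustration (they are not from the card), and the function name is hypothetical.

```python
import math

def phat_sampling_model(p, n):
    """Return (mean, sd) of the Normal model for sample proportions."""
    q = 1 - p
    return p, math.sqrt(p * q / n)

# hypothetical p and n, just to see the formula at work
mean, sd = phat_sampling_model(0.3, 100)
print(f"N({mean}, {sd:.4f})")
```

Try a bigger n and watch the sd shrink: the pile of p-hats gets narrower.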

Answer 17

You have to imagine taking a pair of samples, say.. of girls and boys, subtracting phat girl - phat boy, and then writing that difference down. Do this over and over again, and you will have a list of differences. Now make a histogram of that list of differences, and that is your sampling distribution. It is an imagined distribution of an infinite number of differences (of sample pairs)..

Answer 18

Well.. Suppose you were trying to find the difference between the IQ of math teachers and IQ of English teachers. You sample 50 of each and find math xbar = 125 and English xbar = 115. So, the difference of the samples is 10 points. That is your statistic. 10 points. You now have to add on a margin of error, let's say.. 4. So, you'll say something like "I'm 90% confident that math teachers score between 6 and 14 points higher on IQ tests."

Answer 19

Score HAT = 45.3 + 5.82 (hours studied)

Answer 20

He sat on the Normal model and drank some tea. (t model looks like someone sat on the normal model)

Answer 21

Not much just that it doesn't matter what it is.. With large samples.. The SAMPLING dist will be approx normal (dist of stats.. NOT DATA)

Answer 22

No you don't have to do it. Only pool with hyp tests for props. Pooling with means is a nasty process, if you think the populations have similar variances, then have your calculator pool.

Answer 23

counts, five or more in each expected, independent (random), \<10%

Answer 24

"BUT I THOUGHT THINGS CHANGED" or "BUT I THOUGHT IT WORK" or "BUT I THOUGHT YOU WERE SICK"

Answer 25

use p-null..

Answer 26

in confidence intervals (and old fashioned hyp tests)

Answer 27

1. Random and independent

2. Not too large: less than 10% of the population, so 10n < N

3. NOT TOO SMALL: n > 30 (if n < 30, the data must be normalish)

Answer 28

GOODNESS OF FIT: find the total: 1+3+1 = 5. Divide each part by five, and that gives the expected percents: 1/5 : 3/5 : 1/5 = .20 : .60 : .20. Now multiply each by n to get the expected counts (almost always not a whole number): 25(.20) : 25(.60) : 25(.20)
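
The same arithmetic as a short Python sketch, using the card's 1:3:1 ratio and n = 25. The function name is made up for illustration.

```python
def gof_expected_counts(ratio, n):
    """Turn a claimed ratio (e.g. 1:3:1) into expected counts for a sample of size n."""
    total = sum(ratio)
    # each part's share of the total, times the sample size
    return [n * part / total for part in ratio]

expected = gof_expected_counts([1, 3, 1], 25)
print(expected)
```

These particular numbers happen to come out whole; with most ratios and sample sizes they will not, and that is fine.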

Answer 29

ROW TOTAL\* COLUMN TOTAL/ OVERALL TOTAL
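
A small sketch of that formula applied to every cell of a two-way table. The observed counts below are hypothetical, and the function name is invented for this example.

```python
def expected_table(observed):
    """Expected count for each cell: row total * column total / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

observed = [[10, 20], [30, 40]]  # hypothetical counts, 2 rows by 2 columns
expected = expected_table(observed)
print(expected)
```

The expected counts keep the same row and column totals as the observed table, which is a quick way to check your work.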

Answer 30

STAT +/- CRIT (SE). The STAT and SE are given side by side. The t crit is still INVT(area in 1 tail, n-2). Just put the +/- t crit between the actual slope and the given std. error. The calculation is simple.

Answer 31

Statistics from larger samples have less variability, so they are closer to the parameter and to each other (the sampling distribution has a smaller standard error). Statistics from smaller samples are more likely to be far away from the true parameter. This is why increasing sample size increases power: it narrows the piles of statistics.

Answer 32

n-1 for one sample. For 2 samples you must use the calculator.. REGRESSION IS n-2. GOF is cells - 1. Indep and homog is (r-1)(c-1).

Answer 33

Bill Gosset, Guinness brewing company.

Answer 34

Ho: B1 = 0, Ha: B1 not= 0 (imagine a pile of sample slopes centered at 0 with SE of ...). BE SURE THEY ARE BETAS (Greek B's): small b is a sample slope, Beta is the population slope. The p-value is given: 0.0159, so interpret it!

Answer 35

homogeneity is more than one sample and asking about one variable, independence is just one sample with two variables.

Answer 36

When p-value is below the alpha, we say "statistically significant".. Low p-values are statistically significant. When our sample most likely didn't happen randomly, that is statistically significant.

Answer 37

as one increases, the other decreases, and vice versa

Answer 38

Paired T looks at the average of the differences; 2 sample T looks at the difference of the averages.
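
A hedged sketch of why the choice matters, with made-up before/after scores for five subjects. The point estimates come out the same either way; what changes is the standard error.

```python
import math
from statistics import mean, stdev

before = [70, 75, 80, 85, 90]   # hypothetical scores
after = [72, 78, 81, 88, 93]    # same five subjects, measured again

# Paired: average of the differences, SE from the sd of the differences
diffs = [a - b for a, b in zip(after, before)]
paired_stat = mean(diffs)
paired_se = stdev(diffs) / math.sqrt(len(diffs))

# Two-sample: difference of the averages, SE from adding the variances
two_sample_stat = mean(after) - mean(before)
two_sample_se = math.sqrt(
    stdev(after) ** 2 / len(after) + stdev(before) ** 2 / len(before)
)

print(paired_stat, paired_se)
print(two_sample_stat, two_sample_se)
```

With data like these, the paired SE is much smaller because pairing removes the subject-to-subject variation.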

Answer 39

See page 486. Be able to draw and label this the way we do in class with the box and "RETAIN REJECT" up top and "Ho TRUE, Ho FALSE" on left.

Answer 40

With such a low p-value, I reject the null hypothesis. There is strong evidence that the proportion of students who eat rice has changed.

Answer 41

some numerical summary of a sample.. Could be the mean of a sample, the standard deviation of a sample, the proportion of successes in a sample, the slope calculated from a sample, a difference of 2 means from 2 samples, a difference of 2 proportions from 2 samples, a difference of 2 slopes from 2 samples.. you can make sampling distributions for any of these, and they will all be centered around the parameter...

Answer 42

Nothing... The sample will be distributed similarly to the population. The CLT only talks about distributions (histograms) of sample statistics, which are groups of means... NOT OF INDIVIDUALS!!!! NOT DATA

Answer 43

The probability that you correctly rejected a false null

Answer 44

It basically says.. NO MATTER WHAT SHAPE THE POPULATION IS (normal, bimodal, uniform, skewed, crazy..), if you make a histogram of a bunch of means taken from a bunch of samples, that histogram will be unimodal and symmetric WITH LARGE ENOUGH SAMPLES.. Close to normal. So.. A nerdy way to say it is: The sampling distribution of means is approximately normal no matter what the population is shaped like. The larger the sample size, the closer to normal. (The normal curve is just a model.. the sampling distribution is close to it, but not it! We use the model anyway!)
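
A quick simulation sketch of this idea, under stated assumptions: the population here is a right-skewed exponential one with mean 1 and sd 1, chosen only for illustration, and the sample sizes and repetition count are arbitrary.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

def sample_means(n, reps=2000):
    """Means of `reps` samples of size n from a right-skewed (exponential) population."""
    return [mean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

for n in (2, 30):
    means = sample_means(n)
    # center of the pile of means, and its spread (compare with 1 / sqrt(n))
    print(n, round(mean(means), 2), round(stdev(means), 2))
```

The pile of means stays centered near the population mean, and its spread shrinks toward sd / root(n) as n grows. A histogram of `means` for n = 30 should look far more symmetric than the population itself.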

Answer 45

Your statistic. You stand at the point estimate and reach up and down to make an interval

Answer 46

The p is the p value from a hypothesis test assuming the y intercept and the slope are zero. If you put those T-STATS into tcdf, you get the p values from this column.

Answer 47

goodness of fit, test for homogeneity, test for independence

Answer 48

MUST USE CALCULATOR

Answer 49

WIDTH = 2 (Z) (SE)

Answer 50

as one increases, the other decreases, and vice versa... They have to because they BOTH ADD TO ONE!!! Power + Beta = 1

Answer 51

Combining models. From the square root of the added variances of the sampling distributions of the 2 means
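
As a rough sketch, the added-variances idea looks like this in Python. The sample standard deviations (15 and 12) and sample sizes (50 each) are hypothetical, and the function name is invented here.

```python
import math

def se_difference(s1, n1, s2, n2):
    """Standard error for xbar1 - xbar2: square root of the added variances."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# hypothetical sample standard deviations, n = 50 in each group
se = se_difference(15, 50, 12, 50)
print(round(se, 2))
```

Note that the variances add, not the standard deviations, even though the statistics are being subtracted.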

Answer 52

They know you stand at your stat and reach up and down... BUT... People think that their statistic is in the middle of the pile of p hats or x bars... in reality, they are almost definitely NOT... they are out on one of the sides, and they are reaching up and down and trying to catch the center!!! YOU ARE NOT IN THE MIDDLE OF THE PILE!!

Answer 53

The model predicts that a person who doesn't study at all will score about a 45.3 on the test.

Answer 54

statistic +- margin of error... Statistic +- (crit \* s.d.)... Stand at the statistic, reach out a margin of error, and hope that you catch the parameter.

Answer 55

some numerical summary of a population. Often called "the parameter of interest." It is what we are often trying to find.. It doesn't vary. It is out there and STUCK at some value, it is the truth, and you'll probably not ever know it! We try to catch them in our confidence intervals, but sometimes we don't (and we don't know it!). It could be the mean of a population, the standard deviation of a population, the proportion of successes in a population, the slope calculated from a population, a difference of 2 means from 2 populations, a difference of 2 proportions from 2 populations.

Answer 56

1. Random and independent 2. Not too large: less than 10% of the population, so 10n < N 3. Not too small: np and nq both at least 10. Too large samples have sampling distributions that are leptokurtic (narrower and taller than the normal model); too small samples have skewed or other shaped sampling distributions.

Answer 57

for z crit.. INVNORM(area in 1 tail)... for t crit.. INVT(area in 1 tail, deg freedom)
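
For anyone without the calculator handy, here is a sketch of the z crit lookup using Python's standard library (`statistics.NormalDist`, available in Python 3.8+). The helper name `z_crit` is made up. The standard library has no built-in inverse t, so this only covers the z case.

```python
from statistics import NormalDist

def z_crit(confidence):
    """Critical z for a two-sided confidence level, e.g. 0.95."""
    tail = (1 - confidence) / 2          # area in ONE tail
    return -NormalDist().inv_cdf(tail)   # inv_cdf of a lower tail is negative, so flip the sign

print(round(z_crit(0.95), 2))
```

The sign flip mirrors what happens on the calculator: the inverse normal of a small left-tail area comes back negative, and you use its size.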

Answer 58

If it just says "changed" or "different".. then it is 2 sided.. DOUBLE THE P VALUE! If it says "more", "less than", "greater", etc.. then it is just one sided..

Answer 59

T stat is the T score, the test statistic for null = 0: (STAT - NULL)/SE. Intercept: (45.3 - 0) / 4.3 = 10.53. Slope: (5.82 - 0) / 2.66 = 2.189
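
The same calculation as a tiny Python sketch, using the numbers on the card. With these rounded inputs the slope ratio comes out near 2.188, so a last-digit difference from the card's 2.189 may just reflect rounding in the printed output (that is a guess, not something the card states).

```python
def t_stat(stat, null, se):
    """Test statistic: how many standard errors the statistic is from the null."""
    return (stat - null) / se

t_intercept = t_stat(45.3, 0, 4.3)   # intercept row
t_slope = t_stat(5.82, 0, 2.66)      # slope row
print(round(t_intercept, 2), round(t_slope, 3))
```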

Answer 60

1 or 2 samples? Proportions (z) or Means (t)? Test or Interval?

Answer 61

if r squared is .775, then you have to take sqrt of that.. So r= 0.8803

Answer 62

both are unimodal and symmetric. T models aren't as high and have more area in the tails; that's why you have to reach out a little further than z for the same confidence. Bill sat on the normal model.

Answer 63

straight enough (check residuals for random scatter), random and independent, and look at the HISTOGRAM OF THE RESIDUALS and make sure they are unimodal and symmetric

Answer 64

(rows - 1)(columns - 1) (remove a row and a column and count the cells that are left)

Answer 65

On average, for each hour a student studied more than another student, their test scores were about 5.82 points higher.

Answer 66

a t or z score (or chi squared) that you use to find a p value

Answer 67

The natural variation of sample statistics.. NOT DATA.. Samples vary... so do their statistics.. Parameters do not vary!

Answer 68

xbar is your sample mean, mu is your hypothesized population mean

Answer 69

for GOF: Exp %(total).. For indep and homog: ROW\*COL/TOTAL

Answer 70

T ratio is just SLOPE/ST ERROR and the p value is just TCDF(T ratio, 9999, n-2)
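
A sketch of what that TCDF call computes, written in plain Python because the standard library has no t distribution. It integrates the t density numerically with Simpson's rule. The function names are invented, and df = 73 is an assumption borrowed from the invt(.05, 73) that appears on the prediction-interval card.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus the area from 0 to t (Simpson's rule)."""
    h = t / steps  # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - area * h / 3

t_ratio = 5.82 / 2.66                  # SLOPE / ST ERROR from the cards
p_one_tail = upper_tail(t_ratio, 73)   # df = 73 is an assumption, see above
print(round(p_one_tail, 3))
```

This is the one-tail area, like TCDF(T ratio, 9999, n-2). For a two-sided alternative you would double it.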

Answer 71

Pooling allows you to increase your sample size... sort of.

Answer 72

Type 2 error

Answer 73

it depends on the critical value.

Answer 74

independent groups, random, \<10% of pop and nearly normal.

Answer 75

It is the number of standard deviations (errors) you'll reach out, depending on your confidence (a t or z). Example.. for 68%, crit z = 1... For 95%, crit z = 2 (well, 1.96).. For means.. use t crits.

Answer 76

typical distance a statistic is from the parameter. The average distance to the middle in a sampling distribution. Called standard error because it is the typical error you would expect in a sample.

Answer 77

1. Make your Ho and Ha

2. Make a Null Model (centered at null, use your Ho as center and in calculations, use your sample size).. This is a sampling distribution for the statistics if the null were true.

3. CHECK... Calculate your statistic (p-hat, x-bar, phat1-phat2, xbar1-xbar2)

Answer 78

use p null

Answer 79

0.3 (UB+LB) / 2, avg of the numbers, in the middle

Answer 80

same as sampling variability.. The natural variability between STATISTICS.. NOT DATA!!! . We call it error EVEN THOUGH YOU MADE NO MISTAKES!!!

Answer 81

It is the probability that you'll make a Type II error.. P(Type II error)

Answer 82

when you have 2 measurements on the same subject (or matched subjects). Often Before-After

Answer 83

Well.. Suppose you were trying to find the difference between the IQ of math teachers and IQ of English teachers. You sample 50 of each and find math xbar = 125 and English xbar = 115. So, the difference of the samples is 10 points. That is your statistic. 10 points. You now have to add on a margin of error, let's say.. 4. So, you'll say something like "I'm 90% confident that math teachers score between 6 and 14 points higher on IQ tests."

Answer 84

our confidence lies in our interval... if we took another sample.. we'd have a different interval..

Answer 85

In all inference procedures: ANY CONFIDENCE INTERVAL OR HYPOTHESIS TEST (including chi squared and slope stuff)

Answer 86

They are an attempt to say what the true population parameter is.. It is our best guess... "We think that there will be between 6 and 12 inches of snow."

Answer 87

it means NORMAL models centered at ?1 With a standard deviation of ?2

Answer 88

distance from statistic to parameter, how far your sample statistic is off from the truth.

Answer 89

It is the probability of getting your sample randomly if the null were true. Basically, how likely is it that your sample statistic came from the Null Model.

Answer 90

I don't know

Answer 91

NO!!! You have no idea where your interval is in regards to true parameter

Answer 92

Sure, I'm 100% confident that it will snow between 0 and 500 feet tomorrow...

Answer 93

Never accept a Ho

Answer 94

When you have reason to believe the variances of both populations are equal.

Answer 95

A distribution of a sample is just a histogram of the DATA in a sample. A sampling distribution is made from a bunch of sample STATISTICS. It is the distribution of the statistic that was calculated from those many many samples.

Answer 96

Population is the subjects you are interested in... Parameter is the actual number you want (like % of or AVG)

Answer 97

It is 2 margins of error wide

Answer 98

It is the same as z crit. It is the number of sd you reach out in your CI. To find it, do INVT(area in one tail, degrees of freedom)

Answer 99

To make an interval for prediction, plug 8 into the equation: 45.3 + 5.82(8) = 91.86. Stand there and go up and down CRIT \* S. Use S because it is the st dev of the residuals. STAT +/- CRIT (SE): 91.86 +/- Tcrit (7.7), where Tcrit is invt(.05, 73)
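
A sketch of the card's recipe in Python. The fitted line and s = 7.7 come from the cards; the t crit of 1.666 is an approximate table value assumed here for invt(.05, 73), so check it on your calculator. The function names are made up.

```python
def predict_score(hours):
    """Plug hours studied into the fitted line from the cards."""
    return 45.3 + 5.82 * hours

def interval(stat, crit, se):
    """STAT +/- CRIT * SE as a (low, high) pair."""
    margin = crit * se
    return stat - margin, stat + margin

predicted = predict_score(8)
t_crit = 1.666  # ASSUMED approximate size of invt(.05, 73); not from the card
low, high = interval(predicted, t_crit, 7.7)
print(round(predicted, 2), round(low, 2), round(high, 2))
```

The calculator returns invT of a left-tail area as a negative number; the sketch uses its size, since you reach the same distance up and down.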

Answer 100

Histogram on calculator, or normal prob plot on calculator, OR boxplot (should be symmetrical).

Answer 101

for smaller sample sizes, it must be plausible that the sample may have kind of come from a normalish population.

Answer 102

NO!!! Statistics do... they vary from sample to sample... PARAMETERS DO NOT VARY!

Answer 103

get a bigger net.. (wider confidence interval) or increase sample size

Answer 104

It is a sampling distribution. It tells us how sample statistics would vary if the null were true. It is centered at the null.

Answer 105

When you increase n, the pile of statistics gets more narrow, and the statistics get closer to the true mean. Imagine your alpha beta power diagram with narrower piles, but the same distance from each other: the power would increase!
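
A rough sketch of that picture in numbers, assuming a one-sided z test for a mean with a known sigma. Every value below (null mean 100, true mean 105, sigma 15, alpha 0.05) is hypothetical, and the function name is invented.

```python
from statistics import NormalDist

def power_one_sided(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a one-sided (greater-than) z test for a mean."""
    se = sigma / n ** 0.5
    cutoff = NormalDist(mu0, se).inv_cdf(1 - alpha)   # reject when xbar is above this
    return 1 - NormalDist(mu_true, se).cdf(cutoff)    # area of the true pile past the cutoff

for n in (10, 40, 160):
    print(n, round(power_one_sided(100, 105, 15, n), 3))
```

The two piles stay centered at 100 and 105, but both get narrower as n grows, so more of the true pile lands past the cutoff.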

(141 cards)