Intro Statistics Flashcards
central limit theorem
If x_bar is the mean of a random sample X1, X2, …, Xn of size n from a distribution with a finite mean mu and a finite positive variance sigma^2, then the distribution of W = (x_bar - mu)/(sigma/sqrt(n)) approaches N(0,1) as n approaches infinity.
Equivalently, for large n the sample mean x_bar is approximately distributed N(mu, sigma/sqrt(n)).
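A minimal simulation sketch (numpy assumed; the exponential population and the sample sizes are arbitrary choices) showing standardized sample means approaching N(0,1):

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n = 1.0, 1.0, 100                     # exponential(1) has mean 1, stdev 1
    samples = rng.exponential(mu, size=(10_000, n))  # 10,000 samples of size n
    w = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))
    print(w.mean(), w.std())                         # approximately 0 and 1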
binomial distribution
with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question
P(X = k) = C(n,k) * p^k * (1 - p)^(n-k)
C(n,k) = n! / (k! (n - k)!)
mu = n*p, sigma^2 = n*p*(1-p)
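A quick sketch of the pmf, mean, and variance (Python's math.comb; the n and p values are arbitrary):

    from math import comb

    def binom_pmf(k, n, p):
        # P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    n, p = 10, 0.3
    print(binom_pmf(3, n, p))        # P(X = 3)
    print(n * p, n * p * (1 - p))    # mean and variance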
Accuracy
the proportion of true results (both true positives and true negatives) among the total number of cases examined.
accuracy = (tp + tn) / (tp + tn + fp + fn)
Precision
precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances
precision = tp / (tp + fp)
Recall
recall (also known as sensitivity) is the fraction of relevant instances that are retrieved, out of the total number of relevant instances
recall = tp / (tp + fn)
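The three metrics from a confusion matrix, as a sketch (the counts are made up):

    def classification_metrics(tp, tn, fp, fn):
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return accuracy, precision, recall

    print(classification_metrics(tp=40, tn=45, fp=5, fn=10))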
type I error
a type I error is the rejection of a true null hypothesis (also known as a “false positive” finding)
a type I error is to falsely infer the existence of something that is not there
type II error
type II error is retaining a false null hypothesis (also known as a “false negative” finding)
a type II error is to falsely infer the absence of something that is there
Kullback–Leibler divergence
a measure of how one probability distribution diverges from a second, expected probability distribution
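A sketch for discrete distributions (numpy assumed; note D_KL is asymmetric and requires q > 0 wherever p > 0):

    import numpy as np

    def kl_divergence(p, q):
        # D_KL(P || Q) = sum_i p_i * log(p_i / q_i)
        p, q = np.asarray(p, float), np.asarray(q, float)
        mask = p > 0
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

    print(kl_divergence([0.5, 0.5], [0.9, 0.1]))   # > 0; zero only when P == Q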
Kolmogorov–Smirnov test
is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test)
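Both forms via scipy.stats (the simulated data are arbitrary):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    print(stats.kstest(x, "norm"))                   # one-sample K-S against N(0,1)
    print(stats.ks_2samp(x, rng.uniform(size=200)))  # two-sample K-S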
Bootstrap
statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter like a mean, median, proportion, odds ratio, correlation coefficient or regression coefficient.
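A percentile-bootstrap sketch for a mean (numpy assumed; the data and replicate count are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(loc=5.0, scale=2.0, size=50)   # stand-in sample
    boot_means = np.array([
        rng.choice(data, size=data.size, replace=True).mean()
        for _ in range(5_000)
    ])
    print(np.percentile(boot_means, [2.5, 97.5]))    # 95% percentile CI for the mean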
Jackknife
The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset and calculating the estimate and then finding the average of these calculations.
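A leave-one-out sketch for the mean (numpy assumed; the data are made up):

    import numpy as np

    data = np.array([2.1, 3.4, 1.9, 4.2, 2.8])
    # estimate on each leave-one-out subset, then average
    loo = np.array([np.delete(data, i).mean() for i in range(data.size)])
    print(loo.mean())   # jackknife estimate (equals the sample mean for this statistic)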
Permutation test
the distribution of the test statistic under the null hypothesis is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. In other words, the way treatments were allocated to subjects in the design is mirrored in the analysis.
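A Monte Carlo permutation test sketch for a difference of means (numpy assumed; the groups are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    a = np.array([12.1, 14.3, 11.8, 13.9])   # stand-in treatment group
    b = np.array([10.2, 11.1, 12.0, 10.7])   # stand-in control group
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    diffs = []
    for _ in range(10_000):                  # sample label rearrangements
        perm = rng.permutation(pooled)
        diffs.append(perm[:a.size].mean() - perm[a.size:].mean())
    print(np.mean(np.abs(diffs) >= abs(observed)))   # two-sided P-value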
Two tailed test
appropriate if the estimated value may be more than or less than the reference value, for example, whether a test taker may score above or below the historical average
One tailed test
appropriate if the estimated value may depart from the reference value in only one direction, for example, whether a machine produces more than one-percent defective products
Assessing normality
Subtract the mean and divide by the standard deviation, then compare to standard normal values (normal scores)
Box plot
Quantitative variables (often split by a categorical grouping variable); shows the shape of a distribution, its central value, and its variability
Median: black center line
Box top and bottom: first and third quartiles
Whiskers: vertical lines extending up to 1.5 times the IQR from the box
Points beyond the whiskers are plotted individually as potential outliers
IQR
Inter quartile range
Distance between first and third quartiles
Two way table
a two-way table presents categorical data by counting the number of observations that fall into each combination of the two variables' categories
Correlation coefficient
r = 1/(n-1) * Sum( ((x - x_bar)/s_x) * ((y - y_bar)/s_y) )
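The formula translated directly (numpy assumed; checked against np.corrcoef):

    import numpy as np

    def corr(x, y):
        # r = 1/(n-1) * sum(((x - x_bar)/s_x) * ((y - y_bar)/s_y))
        x, y = np.asarray(x, float), np.asarray(y, float)
        zx = (x - x.mean()) / x.std(ddof=1)
        zy = (y - y.mean()) / y.std(ddof=1)
        return float(np.sum(zx * zy) / (x.size - 1))

    x, y = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
    print(corr(x, y), np.corrcoef(x, y)[0, 1])   # should agree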
ANOVA
Analysis of variance is a statistical method used to test differences between two or more means by comparing variation within and between groups
Parameter
parameter is a number describing a population, such as a percentage or proportion.
Example: the true proportion p of defective items in the entire population
Statistic
is a number which may be computed from the data observed in a random sample without requiring the use of any unknown parameters, such as a sample mean.
Example: take a sample of 300 items and observe that 15 are defective; the computed statistic p_hat = 15/300 = 0.05 is an estimate of the parameter p
Biased estimator
if a statistic is systematically skewed away from the true parameter p, it is considered a biased estimator of that parameter
Unbiased estimator
unbiased estimator will have a sampling distribution whose mean is equal to the true value of the parameter.
Variability of statistic
Determined by the spread of its sampling distribution. In general, larger samples will have smaller variability
Probability model
mathematical representation of a random phenomenon. It is defined by its sample space, events within the sample space, and probabilities associated with each event.
Sample space
set of all possible outcomes
Probability
numerical value assigned to a given event A. The probability of an event is written P(A), and describes the long-run relative frequency of the event.
Rule 1: Any probability P(A) is a number between 0 and 1 (0 <= P(A) <= 1).
Rule 2: The probability of the sample space S is equal to 1 (P(S) = 1).
Probability disjoint
If two events have no outcomes in common
Rule 3: If two events A and B are disjoint, then the probability of either event is the sum of the probabilities of the two events:
P(A or B) = P(A) + P(B).
Probability union
chance of any (one or more) of two or more events occurring is called the union of the events. The probability of the union of disjoint events is the sum of their individual probabilities.
If two events A and B are not disjoint, then the probability of their union (the event that A or B occurs) is equal to the sum of their probabilities minus the probability of their intersection.
Probability complement
Rule 4: The probability that any event A does not occur is P(A^c) = 1 - P(A).
Probability independence
If the outcome of the first event has no effect on the probability of the second event, the two events are independent.
Rule 5: If two events A and B are independent, then the probability of both events is the product of the probabilities for each event:
P(A and B) = P(A)P(B)
Probability intersection
chance of all of two or more events occurring
For independent events, the probability of the intersection of two or more events is the product of the probabilities.
Conditional probability
the conditional probability of event B is the probability that the event will occur given the knowledge that event A has already occurred
If events A and B are not independent, then the probability of the intersection of A and B (the probability that both events occur) is defined by
P(A and B) = P(A)P(B|A).
From this definition, the conditional probability P(B|A) is easily obtained by dividing by P(A):
P(B|A)= P(A and B)/P(A)
Random variable
is a variable whose possible values are numerical outcomes of a random phenomenon
Law of large numbers
the law of large numbers states that the sample mean computed from an increasingly large number of observations of a random variable approaches the distribution mean
Properties of random variate means
mu_(a+bX) = a + b*mu_X
mu_(X+Y) = mu_X + mu_Y
Properties of random variate variance
sigma^2_(a+bX) = b^2 * sigma^2_X
sigma^2_(X+Y) = sigma^2_X + sigma^2_Y (for independent X and Y)
Sample mean and variance
mu_(x_bar) = mu, sigma_(x_bar) = sigma/sqrt(n)
The standard deviation of the sample mean gets smaller as n goes up
If distribution of population is normal then distribution of sample mean is normal with mean mu and stdev sigma/sqrt(n)
Tests of Significance for Two Unknown Means and Known Standard Deviations
two-sample z statistic
given independent samples of sizes n1 and n2 from two normal populations with unknown means mu1 and mu2 and known standard deviations sigma1 and sigma2, the test statistic comparing the means is known as the two-sample z statistic
z = ((x1 - x2) - (mu1 - mu2))/ sqrt((sigma1^2/n1) + (sigma2^2/n2))
Tests of Significance for Two Unknown Means and Unknown Standard Deviations
two-sample t-statistic
t = ((x1 - x2) - (mu1 - mu2))/ sqrt((s1^2/n1) + (s2^2/n2))
confidence interval
(x1 - x2) +/- t*(sqrt(s1^2/n1 + s2^2/n2))
conservative P-values may be obtained using the t(k) distribution where k represents the smaller of n1-1 and n2-1
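scipy's Welch test computes this unpooled statistic (the samples are made up; equal_var=False selects the unpooled form):

    from scipy import stats

    a = [23.1, 25.4, 24.8, 22.9, 26.0]   # stand-in sample 1
    b = [21.5, 22.8, 20.9, 23.3, 22.1]   # stand-in sample 2
    print(stats.ttest_ind(a, b, equal_var=False))   # two-sample t without pooling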
Pooled t-statistic
assumes the same variance in both populations
s_p^2 = ((n1 - 1)s1^2 + (n2 - 1)s2^2) / (n1 + n2 - 2)
t = ((x1 - x2) - (mu1 - mu2)) / (s_p * sqrt((1/n1) + (1/n2)))
sample proportion (categorical)
given a simple random sample of size n from a population, the number of “successes” X divided by the sample size n gives us p_hat the sample proportion
The count X follows a binomial distribution; the sample proportion p_hat has mean p and variance p(1-p)/n, and is approximately normal for large n.
An approximate level C confidence interval for p is p_hat +/- z* sqrt((p_hat(1-p_hat))/n), where z* is the upper (1-C)/2 critical value from the standard normal distribution.
Confidence Intervals for Unknown Mean and Known Standard Deviation
For a population with unknown mean mu and known standard deviation sigma, a confidence interval for the population mean, based on a simple random sample (SRS) of size n, is x_bar +/- z*(sigma/sqrt(n)), where z* is the upper (1-C)/2 critical value for the standard normal distribution.
Level C
gives the probability that the interval produced by the method employed includes the true value of the parameter theta
Confidence Intervals for Unknown Mean and Unknown Standard Deviation
For a population with unknown mean mu and unknown standard deviation, a confidence interval for the population mean, based on a simple random sample (SRS) of size n, is x_bar +/- t* (s/sqrt(n)), where t* is the upper (1-C)/2 critical value for the t distribution with n-1 degrees of freedom, t(n-1).
s = sample standard deviation; s/sqrt(n) is the standard error of the mean
Significance Tests for Unknown Mean and Known Standard Deviation
For claims about a population mean from a population with a normal distribution or for any sample with large sample size n (for which the sample mean will follow a normal distribution by the Central Limit Theorem), if the standard deviation sigma is known, the appropriate significance test is known as the z-test, where the test statistic is defined as
z = (x_bar - mu_0)/(sigma/sqrt(n))
Power of a test
the probability that a fixed level significance test will reject the null hypothesis H0 when a particular alternative value of the parameter is true.
Significance Tests for Unknown Mean and Unknown Standard Deviation
For claims about a population mean from a population with a normal distribution, or for any sample with large sample size n (for which the sample mean will follow a normal distribution by the Central Limit Theorem), with unknown standard deviation, the appropriate significance test is known as the t-test, where the test statistic is defined as
t = (x_bar - mu_0)/(s/sqrt(n))
Sign test
To perform a sign test on matched pairs data, take the difference between the two measurements in each pair and count the number of non-zero differences n. Of these, count the number of positive differences X. Determine the probability of observing X positive differences for a B(n,1/2) distribution, and use this probability as a P-value for the null hypothesis.
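A sketch using the binomial distribution directly (scipy assumed; the paired measurements are made up):

    import numpy as np
    from scipy import stats

    before = np.array([140, 135, 150, 142, 139, 148])
    after = np.array([135, 136, 145, 140, 140, 141])
    d = after - before
    d = d[d != 0]                     # drop zero differences
    x = int(np.sum(d > 0))            # number of positive differences
    print(stats.binomtest(x, d.size, p=0.5).pvalue)   # P-value from B(n, 1/2)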
categorical test single proportion
To test the null hypothesis H0: p = p0 against a one- or two-sided alternative hypothesis Ha, replace p with p0 in the test statistic
z = (p_hat - p0)/sqrt((p0*(1-p0))/n)
categorical sample size single proportion
n = (z*/m)^2 * p*(1-p*), where m is the desired margin of error and p* is a guessed value of the proportion.
The required sample size is maximized when p* = 0.5, in which case
n = (z*/(2m))^2.
Comparison of Two Proportions
An approximate level C confidence interval for p1 - p2 is (p1_hat - p2_hat) +/- z* sD, where z* is the upper (1-C)/2 critical value from the standard normal distribution.
sD = sqrt( (p1_hat(1-p1_hat)/n1) + (p2_hat(1-p2_hat)/n2) )
Test two proportions
To test the null hypothesis H0: p1 = p2 against a one- or two-sided alternative hypothesis Ha, first compute a pooled estimate of the common proportion:
p_hat = (X1 + X2)/(n1 + n2), where X1 and X2 represent the number of "successes" in each sample
sP = sqrt(p_hat(1-p_hat)(1/n1 + 1/n2))
z = (p1_hat - p2_hat)/sP follows the standard normal distribution (with mean 0 and standard deviation 1). The test statistic z is used to compute the P-value
chi-squared statistic
chi^2 = Sum( (observed - expected)^2/expected )
chi-squared distribution
a random variable is said to have a chi-square distribution with m degrees of freedom if it is the sum of the squares of m independent standard normal random variables
The distribution of the chi-square test statistic based on k counts is approximately the chi-square distribution with m = k-1 degrees of freedom, denoted chi^2(k-1).
chi-squared fitting
In general, if we estimate d parameters under the null hypothesis with k possible counts the degrees of freedom for the associated chi-square distribution will be k - 1 - d.
chi-squared hypothesis test
use the chi-square test to test the validity of a distribution assumed for a random phenomenon. The test evaluates the null hypotheses H0 (that the data are governed by the assumed distribution) against the alternative (that the data are not drawn from the assumed distribution).
Let p1, p2, …, pk denote the probabilities hypothesized for k possible outcomes. In n independent trials, let Y1, Y2, …, Yk denote the observed counts of each outcome, to be compared with the expected counts np1, np2, …, npk.
The chi-square test statistic is
q_(k-1) = (Y1 - np1)^2/(np1) + (Y2 - np2)^2/(np2) + … + (Yk - npk)^2/(npk)
Reject H0 if this value exceeds the upper alpha critical value of the chi^2(k-1) distribution, where alpha is the desired level of significance.
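A goodness-of-fit sketch (scipy assumed; the die-roll counts are made up):

    import numpy as np
    from scipy import stats

    observed = np.array([22, 21, 22, 27, 22, 36])     # stand-in die rolls, n = 150
    expected = np.full(6, observed.sum() / 6)         # fair-die hypothesis p_i = 1/6
    stat = np.sum((observed - expected) ** 2 / expected)
    print(stat, stats.chisquare(observed, expected))  # compare against chi^2(5)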
Permutations and combinations
Permutations are for lists (order matters)
Combinations are for groups (order doesn’t matter)
Permutation formula
P(n,k)=n!/(n-k)!
You have n items and want to find the number of ways k of those items can be ordered
n pick k
Combination formula
C(n,k) = n!/(k!(n-k)!)
Multinomial coefficient
n!/(k1! * k2! * k3! * … * km!)
as the number of ways of depositing n distinct objects into m distinct bins, with k1 objects in the first bin, k2 objects in the second bin, and so on.
the number of distinct ways to permute a multiset of n elements, and ki are the multiplicities of each of the distinct elements
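All three counts via Python's math module (the example values are arbitrary):

    from math import comb, factorial, perm

    print(perm(5, 2))   # P(5,2) = 5!/3! = 20 ordered pairs
    print(comb(5, 2))   # C(5,2) = 10 unordered pairs

    def multinomial(ks):
        # n! / (k1! * k2! * ... * km!), where n = sum(ks)
        out = factorial(sum(ks))
        for k in ks:
            out //= factorial(k)
        return out

    print(multinomial([2, 1, 1]))   # distinct arrangements of "AABC" = 12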