Summa Week 2 Flashcards
What does statistical inference involve?
statements about probability (e.g. probably, likely…)
What does probability theory let us deduce?
it lets us deduce propositions about the likelihood of various outcomes, if certain conditions are TRUE
How is the probability of event A happening denoted?
italicized p (A)
What is italicized p (A)?
the sum of the probability of all elementary outcomes of event A
How large can probability be?
0 ≤ p ≤ 1 (p is at least 0 and at most 1)
What are three approaches to probability and statistical inference?
classical (analytical) approach
frequentist approach
subjective approach
What is an experiment?
tbd
What is the sample space?
the sample space of an experiment or random trial is the set of all possible outcomes or results of that experiment
What is an event?
an event is a set of outcomes of an experiment (a subset of the sample space) to which a probability is assigned.
What is an elementary outcome or a simple event?
an atomic event, or an outcome of an experiment to which a probability is assigned
What is a sample?
tbd
What is simple random sampling?
a simple random sample is a subset of a statistical population in which each member of the population has an equal probability of being chosen
What are equally probable events?
outcomes of an experiment that each have the SAME chance of being chosen when randomly selected
What is sampling without replacement?
each selected item is NOT returned before the next draw, so the result of each draw changes the likelihood of the remaining outcomes (like selecting tarot cards for a Celtic Cross spread)
What is sampling with replacement?
tbd
What is the classical (analytic) approach to probability theory?
it makes certain assumptions (such as equally likely, independence) about a situation
e.g. if a sampling experiment has N possible outcomes (all equally likely to occur), this method would assign a probability of 1/N to each outcome
rolling a die
rolling dice and summing 2 numbers on top
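The two dice examples can be checked by listing every outcome. This is a minimal Python sketch, assuming two fair six-sided dice; the names `rolls`, `p_each`, and `p_sum` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All N = 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

# Classical approach: each equally likely outcome gets probability 1/N.
p_each = Fraction(1, len(rolls))

# p(sum) = (number of outcomes giving that sum) * 1/N
sum_counts = Counter(a + b for a, b in rolls)
p_sum = {total: count * p_each for total, count in sorted(sum_counts.items())}

for total, p in p_sum.items():
    print(total, p)
```

A single die works the same way with N = 6, giving 1/6 per face.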
What is an easy method to determine probability?
the tree diagram
e.g. tossing a coin twice and recording the outcome of H or T for each toss
Start - Head - Head - HH - p = .250
Start - Head - Tail - HT - p = .250
Start - Tail - Head - TH - p = .250
Start - Tail - Tail - TT - p = .250
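The tree for two coin tosses can also be generated in code. The sketch below assumes a fair coin (p = .5 per branch); `paths` and `p_path` are illustrative names.

```python
from itertools import product

# Each branch of the tree is one toss; a path from Start to a leaf is one outcome.
paths = ["".join(p) for p in product("HT", repeat=2)]

# With a fair coin every branch has probability .5, so a two-branch path has .5 * .5.
p_path = {path: 0.5 ** len(path) for path in paths}

for path, p in p_path.items():
    print(path, p)
```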
True or false: If all possible outcomes are equally likely, the probability of the occurrence of an event is equal to the proportion of all possible outcomes favouring the event.
True
If we define the event as drawing a King from a deck of cards, what is the probability of the event of p(King)?
4/52
4 kings in a deck of 52 cards, therefore 4 out of 52
If we define the event as drawing the Ace of Spades from a deck of cards, what is the probability of the event p(Ace of Spades)?
1/52
If we define the event as drawing a face card from a deck of cards, what is the probability of the event p(Face Card)?
12/52 = 6/26 = 3/13
(there are 3 face cards per suit - Jack, Queen, King - and 4 suits per deck - spades, hearts, diamonds, clubs - therefore 3 x 4 = 12 face cards, or 12/52)
What is the probability of getting AT LEAST one correct answer for three true and false questions (assuming equal chances that the answer to a problem is correct or wrong)?
p(at least one correct answer) = 7/8. Do a tree diagram (T = correct, F = wrong). The 8 outcomes are T-T-T, T-T-F, T-F-T, T-F-F, F-T-T, F-T-F, F-F-T, F-F-F. Only one out of 8 (F-F-F) can be completely wrong, therefore 7/8
If the question was what is the probability of getting at least one answer wrong, then it would be 7/8 as well
If the question was what is the probability of getting at least two correct, then it would be 4/8 T-T-T T-T-F F-T-T T-F-T.
It would be the same amount for the probability of getting at least two wrong T-F-F F-F-F F-F-T F-T-F
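The four counts above can be checked by enumerating all 8 answer patterns. This sketch assumes each answer is equally likely to be correct (T) or wrong (F); the helper `p_event` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 equally likely patterns; T = correct, F = wrong.
outcomes = list(product("TF", repeat=3))

def p_event(condition):
    """Proportion of all outcomes that satisfy the condition."""
    favourable = [o for o in outcomes if condition(o)]
    return Fraction(len(favourable), len(outcomes))

p_at_least_one_correct = p_event(lambda o: o.count("T") >= 1)
p_at_least_one_wrong = p_event(lambda o: o.count("F") >= 1)
p_at_least_two_correct = p_event(lambda o: o.count("T") >= 2)
p_at_least_two_wrong = p_event(lambda o: o.count("F") >= 2)

print(p_at_least_one_correct, p_at_least_one_wrong)
print(p_at_least_two_correct, p_at_least_two_wrong)
```

Note that `Fraction` prints 4/8 in lowest terms, as 1/2.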
What are the three relationships that can exist between two outcomes (probability)?
mutually exclusive (disjoint)
independent
conditional
What is a mutually exclusive or disjoint event (probability)?
two events that cannot logically occur at the same time, i.e. events A and B do not intersect/have a 0% chance of happening together
e.g. a coin is tossed twice (H or T)
Start - HH, TH, TT, HT
Event A (HH) and Event B (TT) are disjoint, so both cannot happen at the same time. By contrast, Event C (a head on the first toss: HH, HT) and Event D (a tail on the second toss: HT, TT) are NOT disjoint, because HT belongs to both
What is the probability rule for mutually exclusive/disjoint events?
If A and B are mutually exclusive/disjoint events, then p(A or B) = p(A) + p(B)
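Counter-example: drawing a heart OR a 3 from a deck of cards is NOT disjoint (the 3 of hearts is both), so the overlap has to be subtracted
= 13 hearts (ace,2,3,4,5,6,7,8,9,10,j,q,k) + 4 3s - 1 (the 3 of hearts, counted twice) = 16/52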
What is also known as the mutually exclusive event?
a disjoint event
What is also known as a disjoint event?
a mutually exclusive event
What is the or/addition rule for probability?
p(A or B) = p(A) + p(B) - p(A and B)
e.g. the prob. of drawing either a heart or a spade from a deck of cards = 26/52
= 13 hearts/52 cards + 13 spades/52 cards - NO heart/spade cards 0/52 cards
= 26 cards/52 cards
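Both card examples (heart or spade, and heart or 3) follow from the same addition rule. The sketch below builds a standard 52-card deck and applies it; the set names and the helper `p` are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "spades", "diamonds", "clubs"]
deck = set(product(ranks, suits))  # 52 (rank, suit) cards

def p(event):
    """Classical probability: cards in the event / cards in the deck."""
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == "hearts"}
spades = {card for card in deck if card[1] == "spades"}
threes = {card for card in deck if card[0] == "3"}

# Disjoint events: the overlap is empty, so p(A and B) = 0.
p_heart_or_spade = p(hearts) + p(spades) - p(hearts & spades)

# Non-disjoint events: the 3 of hearts would otherwise be counted twice.
p_heart_or_three = p(hearts) + p(threes) - p(hearts & threes)

print(p_heart_or_spade, p_heart_or_three)
```

`Fraction` reduces 26/52 to 1/2 and 16/52 to 4/13.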
What is the independent relationship between multiple outcomes in probability?
If the occurrence of Event A does not affect the probability of Event B occurring, then the two events are independent.
e.g. tossing two coins gives EQUAL chances of the second toss being either H or T, regardless of the first toss
= p(A and B) = p(A)*p(B)
What is the independent probability rule?
If A and B are independent, then p(A and B) = p(A) * p(B)
e.g. tossing a H and then a T is 1/4
= p(H) * p(T) = 1/2 * 1/2 = 1/4 = .250
Can a disjoint/mutually exclusive event be independent?
No! If A and B cannot occur together (or are disjoint), then knowing that A occurs DOES change the probability that B occurs
successive draws from a deck of cards are DEPENDENT unless each card is placed back into the deck; coin tosses are independent
What is the prob of obtaining three heads with a three-coin toss?
p(three heads): the tosses are independent, therefore use * [p(A and B and C) = p(A) * p(B) * p(C)]. p(H) * p(H) * p(H) = 1/2 * 1/2 * 1/2 = 1/8. The 8 outcomes are HHH, HHT, HTH, HTT, THH, THT, TTH, TTT; only one out of 8 (HHH) is three heads, therefore 1/8
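The multiplication rule and the count of outcomes should agree. This is a small sketch assuming a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Independent rule: multiply the probability of a head on each toss.
p_heads = Fraction(1, 2)
p_three_heads_rule = p_heads * p_heads * p_heads

# Cross-check by counting: only HHH out of the 8 equally likely outcomes.
outcomes = list(product("HT", repeat=3))
favourable = [o for o in outcomes if o == ("H", "H", "H")]
p_three_heads_count = Fraction(len(favourable), len(outcomes))

print(p_three_heads_rule, p_three_heads_count)
```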
What is the permutation symbol?
! (the factorial sign)
it refers to taking the current number and MULTIPLYING it by each consecutively lower whole number down to 1
e.g. 4! = 4*3*2*1 = 24
What is the permutation/number of possibilities for 4 letters?
Pr^N = N!/(N-r)!, so P4^4 = 4!/(4-4)! = 4!/0! = (4*3*2*1)/1 = 24, e.g. the 24 possibilities are ABCD ABDC ACBD ACDB ADBC ADCB BACD BADC BCAD BCDA BDAC BDCA CABD CADB CBAD CBDA CDAB CDBA DABC DACB DBAC DBCA DCAB DCBA
What does the permutation 0! signify?
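1.
0! is defined as 1 (there is exactly one way to arrange zero objects), which keeps formulas such as N!/(N-r)! working when r = N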
What is the permutations formula in probability?
Pr^N = N!/(N-r)!, e.g. permutations of letters A,B,C,D: N = 4, r = 4, so P4^4 = 4!/(4-4)! = (4*3*2*1)/0! = 24/1 = 24
How many combinations of 2 letters can be chosen from ABCD?
Cr^N = N!/[r!(N-r)!], so C2^4 = 4!/[2!(4-2)!] = 4!/(2!2!) = (4*3*2*1)/(2*1*2*1) = 6
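The permutation and combination counts for ABCD can be confirmed with the Python standard library. This sketch assumes r = 4 for the permutations and r = 2 for the combinations, as in the worked answers; `math.perm` and `math.comb` need Python 3.8 or later.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

letters = "ABCD"

# Permutations: order matters. N!/(N-r)! with N = 4, r = 4 (note 0! = 1).
n_permutations = factorial(4) // factorial(4 - 4)
orderings = ["".join(p) for p in permutations(letters)]

# Combinations: order does not matter. N!/(r!(N-r)!) with N = 4, r = 2.
n_combinations = factorial(4) // (factorial(2) * factorial(4 - 2))
pairs = ["".join(c) for c in combinations(letters, 2)]

print(n_permutations, perm(4, 4), len(orderings))
print(n_combinations, comb(4, 2), pairs)
```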
What is joint probability?
the probability that events A and B both occur. In N independent trials, suppose Na, Nb, and Nab denote the number of times events A, B, and both A and B occur, respectively. According to the frequency interpretation of probability, for large N:
p(A) = Na/N
p(B) = Nb/N
p(A and B) = Nab/N
What is the probability that you will engage in unsafe sex and that your partner will have AIDS?
p(A and B) = Nab/N
What is conditional probability?
the probability of one outcome that is DEPENDENT on the occurrence of the other outcome
e.g. drawing cards and continuing to draw more WITHOUT replacing them
The probability of drawing 2 heart cards from a deck
p(B | A) = p(A and B)/p(A), or p(A and B) = p(A) * p(B | A)
What is the multiplication law of probability?
p(A and B) = p(A) * p(B | A)
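The two-hearts example can be worked through with the multiplication law. This is a minimal sketch, assuming a standard 52-card deck with 13 hearts; the with-replacement line is included only for contrast.

```python
from fractions import Fraction
from math import comb

# A = first card is a heart; B = second card is a heart.
p_a = Fraction(13, 52)
p_b_given_a = Fraction(12, 51)  # one heart and one card are already gone

# Multiplication law: p(A and B) = p(A) * p(B | A)
p_two_hearts = p_a * p_b_given_a

# Cross-check by counting unordered pairs of cards.
p_two_hearts_by_counting = Fraction(comb(13, 2), comb(52, 2))

# With replacement the draws are independent, so p(B | A) = p(B).
p_two_hearts_with_replacement = Fraction(13, 52) * Fraction(13, 52)

print(p_two_hearts, p_two_hearts_with_replacement)
```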
What are some conditions of conditional probability?
The probability of an event A MUST often be modified after information is obtained as to whether or not a related event B has taken place
What is the probability of hypertension if overweight? Given: p(hypertensive and overweight) = .1; p(not hypertensive and overweight) = .15; total p(overweight) = .25. A = hypertension (dependent), B = overweight
p(has hypertension given that overweight) = p(A | B) = p(A and B)/p(B) = .1/.25 = .4
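The hypertension calculation is a direct use of the conditional probability formula. The sketch below uses only the three values given on the card (.1, .15, .25); the variable names are illustrative.

```python
from fractions import Fraction

# Joint probabilities from the card, written as exact fractions.
p_hyper_and_over = Fraction(10, 100)      # hypertensive and overweight
p_not_hyper_and_over = Fraction(15, 100)  # not hypertensive and overweight

# p(overweight) is the total of the two overweight cells.
p_over = p_hyper_and_over + p_not_hyper_and_over

# p(A | B) = p(A and B) / p(B)
p_hyper_given_over = p_hyper_and_over / p_over

print(p_over, p_hyper_given_over, float(p_hyper_given_over))
```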
What is the probability of being overweight given one has hypertension? Given: p(hypertensive and overweight) = .1; p(not hypertensive and overweight) = .15. A = overweight (dependent), B = hypertension
p(A | B) = p(A and B)/p(B) = .1/p(hypertension). The total p(hypertension) is not among the values listed; if it is .2, then .1/.2 = .5 (NEED TO CHECK)
What is the frequentist approach for probability theory?
the probability of an event is the limit of its relative frequency in a large number of trials. However, in any particular finite run of trials, the observed relative frequency can be anything
e.g. drawing M&Ms from a bag, replacing them, and repeating the same experiment gives a proportion closer and closer to the estimated probability, or “the limit”
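The M&M example can be simulated. This is a rough sketch only: the 30% share of red M&Ms (`TRUE_P_RED`) is a made-up assumption for illustration, and the seed is fixed so that repeated runs give the same numbers.

```python
import random

TRUE_P_RED = 0.3         # hypothetical share of red M&Ms in the bag (assumption)
rng = random.Random(0)   # fixed seed for reproducibility

def proportion_red(n_draws):
    """Draw n_draws M&Ms WITH replacement and return the observed proportion of reds."""
    reds = sum(rng.random() < TRUE_P_RED for _ in range(n_draws))
    return reds / n_draws

# Short runs can land far from the limit; long runs settle close to it.
results = {n: proportion_red(n) for n in (10, 100, 1_000, 100_000)}

for n, proportion in results.items():
    print(n, proportion)
```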
What is considered the value of a proportion when used in a frequentist approach?
the limit: the value that the observed proportions tend to move closer and closer to with repeated trials
e.g. the limiting proportion of M&Ms picked out of a bag when each is replaced over repeated samples
What is the subjective approach in probability theory?
probability represents an individual's SUBJECTIVE belief in the likelihood of the occurrence of an event
e.g. I think that tomorrow will be a good day, based on what I know
True or false:
Although the particular definition that you or I prefer may be important to each of us, none of the definitions will lead to essentially the same result in terms of hypothesis testing, the discussion of which runs through the rest of the book. (It should be said that those who favour subjective probabilities often disagree with the general hypothesis-testing orientation.)
False. The definitions lead to essentially the SAME result; they just take different approaches that are more helpful for certain scenarios
What is an a priori probability?
the number of events classified as A over the total number of possible events
i.e. p(A) = # of events A/total # of possible events
What is the a priori probability of a head on one toss of an UNbiased coin?
.5
the number of events classified as heads or 1 divided by the total number of possible events of heads plus tails, or 2
= 1/2
= .5
What is the a posteriori probability?
the number of times A has occurred over the total number of trials (opportunities for A to occur)
= p(A) = # of times A has occurred/total # of trials
If I toss any coin ten times and I get 4 heads, what is the a posteriori probability of heads?
.4
= p(A) = # of times A has occurred (# of heads to have occurred, or 4) divided by the total # of trials (10 tosses)
= 4/10
=.4
What does hypothesis testing compare a posteriori probability with?
a priori probability
Who was concerned with integrating “prior knowledge” into calculations of probability?
Thomas Bayes, who created the Bayes Theorem or Bayesian approach
What is the Bayesian approach?
a probability theory that is concerned with integrating “prior knowledge” into calculations of probability
What formula is this of:?
p(A | B) = [p(B | A)p(A)]/[p(B | A)p(A) + p(B | not A)p(not A)]
the Bayesian approach
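Bayes' theorem combines a prior p(A) with how likely the data B are when A is true and when A is false: p(A | B) = p(B | A)p(A) / [p(B | A)p(A) + p(B | not A)p(not A)]. The sketch below implements that formula; the function name `posterior` and the example numbers (prior .01, p(B | A) = .9, p(B | not A) = .05) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Bayes' theorem: return p(A | B) from the prior and the two likelihoods."""
    p_not_a = 1 - p_a
    numerator = p_b_given_a * p_a
    denominator = numerator + p_b_given_not_a * p_not_a
    return numerator / denominator

# Hypothetical inputs (not from the flashcards).
example = posterior(
    p_b_given_a=Fraction(9, 10),
    p_a=Fraction(1, 100),
    p_b_given_not_a=Fraction(5, 100),
)

print(example, float(example))
```

Even with strong evidence (p(B | A) = .9), a small prior keeps the posterior well below .5 in this made-up case.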
Does this formula show if the hypothesis is true or false?
H = the hypothesis; not H = the hypothesis is false
D = certain given data
p(H | D) = [p(D | H)p(H)]/[p(D | H)p(H) + p(D | not H)p(not H)]
True.
…?
What are discrete probability distributions?
distributions that give the probability of each SPECIFIC outcome
If a variable can take on one of a relatively small number of possible values, what kind of variable is it assumed to be?
a DISCRETE variable
What are five-point scales or socio-economic status (multinomial probability distributions) examples of?
discrete probability distributions (they are exact and specific)
What are continuous probability distributions?
the probability of obtaining a value that falls within a specific interval
What types of distributions are these examples of?
normal probability distribution, student’s t-distribution, chi-square distribution, F-distribution
continuous probability distributions
The rule that says that the probability of a series of outcomes occurring on successive trials is the product of their individual probabilities is the _______ rule
multiplication/joint/independent
The rule that says that the probability of one outcome or the other outcome occurring on a particular trial is the sum of their individual probabilities is the ________ rule
addition/mutually exclusive/disjoint
The and rule is to _________ and the or rule is to __
a) multiplication rule; addition rule
b) addition rule; multiplication rule
c) multiplication rule; multiplication rule
d) addition rule; addition rule.
multiplication;addition
The probability of rolling either a 2 or a 5 on one roll of a standard die is:
a) .34
b) .25
c) .50
d) .03
a (1/6 + 1/6 = 2/6, about .33, so .34 is the closest option)
The probability of rolling a 2 followed by a 6 on a standard die is:
a) .34
b) .25
c) .50
d) .03
d (1/6 * 1/6 = 1/36, about .03)
A(n) ______ scale is a scale in which objects or individuals are broken into categories that have no numerical properties
nominal
A(n) _____ scale is a scale in which the units of measurement between the numbers on the scale are all equal in size
interval
Measures of _____ are numbers intended to characterize an entire distribution
central tendency
The ______ is the middle score in a distribution after the scores have been arranged from highest to lowest or lowest to highest
median
When mean and median are the same, the distribution has to be ______
symmetrical
Measures of ___ are numbers that indicate how dispersed scores are around the mean of the distribution
variation
When we divide the squared deviation scores by N - 1 rather than by N, we are using the ____ of the population standard deviation
unbiased estimator / degrees of freedom
s represents the __ standard deviation and σ (sigma) represents the __ standard deviation
sample; population
A distribution in which the peak is to the left of the centre point and the tail extends towards the right is a __ skewed distribution
positively or right
On average, __ statistic has the same value as the population parameter
an unbiased sample statistic (a POINT ESTIMATE)
Letter grade on a test is to the __ scale of measurement and height is to the __ scale of measurement
a. ordinal, ratio
b. ordinal, nominal
c. nominal, interval
d. interval, ratio
a. ordinal, ratio
Weight is to the ___ scale of measurement and political affiliation is to the __ scale of measurement
a. ratio, ordinal
b. ratio, nominal
c. interval, nominal
d. ordinal, ratio
b. ratio, nominal
Qualitative variable is to quantitative variable as ____ is to _____
a. categorical variable, numerical variable
b. numerical variable, categorical variable
c. bar graph, histogram
d. categorical variable and bar graph; numerical variable and histogram
d.
Inferential statistics allow us to infer something about the ____ based on the ____
a. sample, population
b. population, sample
c. sample, sample
d. population, population
b. population, sample
Which of the following is not true?
a. All scores in the distribution are used in the calculation of the range
b. the average deviation is a more sophisticated measure of variation than the range; however, it may not weight extreme scores adequately
c. The standard deviation is the most sophisticated measure of variation because all scores in the distribution are used and because it weights extreme scores adequately
d. None of the other alternatives is false
a. (the range uses only the highest and lowest scores, not all scores in the distribution)
If the shape of a frequency distribution is lopsided, with a long tail projecting longer to the left than to the right, how would the distribution be skewed?
a. Normally
b. Negatively
c. Positively
d. Average
b. Negatively
Calculate the mean for the following distribution:
1, 1, 2, 2, 4, 5, 8, 9, 10, 11, 11, 11
6.25
Calculate the median for the following distribution:
1, 1, 2, 2, 4, 5, 8, 9, 10, 11, 11, 11
6.5
Calculate the mode for the following distribution:
1, 1, 2, 2, 4, 5, 8, 9, 10, 11, 11, 11
11
Calculate the range for the following distribution: 2, 2, 3, 4, 5, 6, 7, 8, 8.
6
Calculate the standard deviation for the following distribution: 2, 2, 3, 4, 5, 6, 7, 8, 8.
2.4 (sample standard deviation, dividing by N - 1; dividing by N instead gives about 2.26)
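The five calculation cards above can be checked with Python's built-in `statistics` module. In this sketch `stdev` divides by N - 1 (sample) and `pstdev` divides by N (population); the list names `scores_a` and `scores_b` are illustrative.

```python
from statistics import mean, median, mode, pstdev, stdev

scores_a = [1, 1, 2, 2, 4, 5, 8, 9, 10, 11, 11, 11]
scores_b = [2, 2, 3, 4, 5, 6, 7, 8, 8]

mean_a = mean(scores_a)      # sum of scores / number of scores
median_a = median(scores_a)  # average of the two middle scores (even N)
mode_a = mode(scores_a)      # most frequent score

range_b = max(scores_b) - min(scores_b)  # highest minus lowest
sample_sd_b = stdev(scores_b)            # divides by N - 1
population_sd_b = pstdev(scores_b)       # divides by N

print(mean_a, median_a, mode_a)
print(range_b, round(sample_sd_b, 2), round(population_sd_b, 2))
```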
____ is the study of likelihood and uncertainty
Probability
Let’s say Bill has an IQ of 145 and is 52 inches tall
- IQ in the population has a mean of 100 and a standard deviation of 15
Height in the population has a mean of 64 inches with a standard deviation of 4
How many standard deviations is Bill away from the average IQ?
(145 - 100)/15 = 3 sd away from the average IQ
Let’s say Bill has an IQ of 145 and is 52 inches tall
- IQ in the population has a mean of 100 and a standard deviation of 15
Height in the population has a mean of 64 inches with a standard deviation of 4
What is the Z score of Bill’s height? Or what is Bill’s height in Z scores?
(52 - 64)/4 = -3, which is significant. This guy is short!
Consider two sections of statistics:
- Gurnsey’s class has a mean of 80, and s of 5; Marcantoni’s class has a mean of 70 and s of 5
- Student 1 gets 80 in Gurnsey’s class; student 2 gets 75 in Marcantoni’s class
Which student did better?
Student 2 did better than Student 1
Student 1: z = (80 - 80)/5 = 0
Student 2: z = (75 - 70)/5 = 1
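All of the z-score cards above use the same formula, z = (score - mean)/sd. The helper `z_score` in this sketch is a hypothetical name; the numbers come from the cards.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Bill: IQ 145 (mean 100, sd 15) and height 52 inches (mean 64, sd 4).
z_iq = z_score(145, 100, 15)
z_height = z_score(52, 64, 4)

# Two statistics sections, each with s = 5.
z_student_1 = z_score(80, 80, 5)
z_student_2 = z_score(75, 70, 5)

print(z_iq, z_height)
print(z_student_1, z_student_2)
```

Comparing z scores rather than raw marks is what shows Student 2 did better relative to their own class.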
A ___ test is used when σ and μ are known and the sample is 30 or larger.
Z-test
Which of the following is an assumption of the Z test?
a. The data should be ordinal or nominal
b. The population distribution of scores should be normal
c. The population mean (μ) is known, but not the standard deviation (σ)
d. The sample size is typically less than 30
b
What is an interval estimate?
a range of values (rather than a single point) that is likely to contain the population parameter; if the value stated by the null hypothesis falls within this range, we fail to reject the null hypothesis, i.e. we cannot show a reliable effect
If an interval estimate is within the critical value range of the confidence interval of a population, what can we assume?
that the test result is statistically significant and indicative of an effect at the set alpha (need to note whether .05, etc., and whether 1- or 2-tailed)
What size must the z test statistic be in order to conclude a statistically significant result?
over 1.96 or under -1.96 (two-tailed, alpha = .05)
What size must the z test statistic be in order to conclude a statistically INSIGNIFICANT result?
between -1.96 and 1.96 (two-tailed, alpha = .05)
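The 1.96 cutoff is the point of the standard normal distribution that leaves .025 in each tail. This sketch assumes a two-tailed test with alpha = .05 by default; `critical_z` and `is_significant` are hypothetical helper names, while `statistics.NormalDist` is in the standard library from Python 3.8.

```python
from statistics import NormalDist

def critical_z(alpha=0.05):
    """Two-tailed critical value: the z that cuts off alpha/2 in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def is_significant(z, alpha=0.05):
    """True when z falls beyond either critical value (reject the null hypothesis)."""
    return abs(z) > critical_z(alpha)

print(round(critical_z(), 2))
print(is_significant(2.5), is_significant(1.0), is_significant(-2.0))
```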
What do I always have to remember the sum of squared deviations is divided by in order to get an UNBIASED estimate of the population parameter?
N - 1, the degrees of freedom (this gives the unbiased estimator)
A wide interval estimate means we have __ certainty of the mean
LESS
What is the coefficient of variation?
the standard deviation divided by the mean (often expressed as a percentage)
What does a statistically INsignificant result at alpha = .05 suggest?
the value stated by the null hypothesis lies within the 95% confidence interval; across repeated samples, such intervals capture the true population mean 95% of the time
If we reject the null hypothesis, we assume the data shows evidence to suggest that the ___ hypothesis is _____. Otherwise, we assume the ____ hypothesis is ____, unless shown otherwise.
the null hypothesis is false, otherwise we assume it is true