Statistics/Data Set/Counting Flashcards
How do we measure how spread out (dispersed) a set of numbers is?
calculate the range or standard deviation
How do you calculate the range?
What does the range tell you?
Last - First, with the values ordered from smallest to greatest; it tells you the distance between the smallest and the greatest value
How is the range different from the number of integers in a set of consecutive integers?
the number of integers is the range + 1, or Last - First + 1
how to calculate the standard deviation
(1) find the arithmetic mean, (2) find the differences between the mean and each of the n numbers, (3) square each of the differences, (4) find the average of the squared differences, and (5) take the nonnegative square root of this average.
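The five steps above can be sketched in Python. This is a minimal illustration, not an official method: the function name `population_sd` and the sample data are made up for the example, and it computes the population SD (dividing by n), which is what the steps describe.

```python
from math import sqrt

def population_sd(values):
    """Standard deviation following the five flashcard steps."""
    n = len(values)
    mean = sum(values) / n                  # (1) arithmetic mean
    diffs = [v - mean for v in values]      # (2) differences from the mean
    squares = [d * d for d in diffs]        # (3) square each difference
    avg_square = sum(squares) / n           # (4) average of the squared differences
    return sqrt(avg_square)                 # (5) nonnegative square root

print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```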
The greater the standard deviation, the _____
The lower the SD, the ____
the more the data spread away from the mean
the more the data cluster toward their average
If all the elements of a set A are also elements of a set B, then
A is a subset of B
For any two sets A and B, the union of A and B (A ∪ B) is
the set of all elements that are in A or in B or in both.
The intersection of A and B (A ∩ B) is
the set of all elements that are in both A and B.
Two sets that have no elements in common are said to be
disjoint or mutually exclusive
If two sets A and B are NOT disjoint/ mutually exclusive, then |A ∪ B | is
|A|+ |B| - |A ∩ B|
If A and B are disjoint/ mutually exclusive then
|A ∪ B | = | A | + | B |
For experiments in which all the individual outcomes are EQUALLY LIKELY, the probability of an event E is
the number of outcomes in E divided by the total number of possible outcomes
If the event “A and B” is impossible (that is, A ∩ B has no outcomes), then
A and B are said to be mutually exclusive
If A and B are mutually exclusive events, then P (A and B) and P (A or B)
P (A and B ) = 0
P (A or B) = P(A) + P(B)
Two events A and B are said to be independent if
the outcome of one event does not influence or affect the outcome of the other event
If A and B are independent events, then P (A and B) and P (A or B) are
P(A and B) = P(A) x P(B)
P(A or B) = P(A) + P(B) - P(A) x P(B)
T or F. If P (A) + P (B) is greater than 1 then A and B are not mutually exclusive
True. P (A or B) can’t be greater than 1, so P (A and B) = P (A) + P (B) - P (A or B) must be greater than 0
If a set contains consecutive integers (odd/even or evenly spaced), then the mean is ?
mean = median = (first number + last number) /2
What is the sum of n consecutive integers?
(mean of n) x n
When is the Σ of n consecutive integers always divisible by n?
When n is odd
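T or F. When the number n of consecutive integers is even, the average is never an integer
T
For even n, the mean falls halfway between the two middle integers, so it is never an integer. For odd n, the sum is divisible by n, so the mean is an integer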
T or F. If a mean of n consecutive integers is an integer, n is an odd number
T.
What does a set of consecutive multiples mean?
each element in the set is the next multiple of the same number, so neighboring elements differ by that number
Ex: {12,16,20,24} is a set of consecutive multiples of 4
How do you express the nth term of evenly spaced integers (6, 11, 16, 21)?
a(n) = a(1) + d (n-1), where d is the common difference; here a(n) = 6 + 5 (n-1)
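A small sketch of the term formula, using the flashcard's own example list. The function name `nth_term` is only an illustrative choice.

```python
def nth_term(first, step, n):
    """n-th term (1-indexed) of an evenly spaced list: a(n) = a(1) + d(n-1)."""
    return first + step * (n - 1)

# The list 6, 11, 16, 21 has first term 6 and common difference 5.
print([nth_term(6, 5, n) for n in range(1, 5)])  # [6, 11, 16, 21]
```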
The probability that n independent events all occur is
the product of the probabilities of those independent events
If two events are mutually exclusive, then the probability that one or the other occurs is
the sum of the probabilities of those two events
What happens to the SD if we increase/decrease all elements of a set by a constant percentage/factor?
the SD also increases/decreases by the same percentage/factor
T or F. Increasing/decreasing all elements by a constant value will increase/decrease the SD
False
If a new element is added to a set, then
- new SD > old SD if
- new SD = old SD if
- new SD < old SD if
- the new value is far from the mean (roughly more than 1 SD away)
- the new value is roughly 1 SD away from the mean (the break-even distance)
- the new value is close to the mean (roughly less than 1 SD away), including a new value equal to the mean
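A quick numeric check of how one added value moves the SD. This is a sketch under stated assumptions: the data set is invented for illustration, and `statistics.pstdev` computes the population SD, so the break-even distance for a sample SD would differ slightly.

```python
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative set: mean 5, population SD 2

def sd_after_adding(values, new_value):
    """Population SD of the set once new_value has been added."""
    return pstdev(values + [new_value])

old_sd = pstdev(data)
at_mean = sd_after_adding(data, 5)    # new value equal to the mean
near = sd_after_adding(data, 6)       # 0.5 SD above the mean
far = sd_after_adding(data, 11)       # 3 SDs above the mean

print(at_mean < old_sd, near < old_sd, far > old_sd)  # True True True
```

In this example, adding a value equal to the mean lowers the SD rather than leaving it unchanged, because the same total squared deviation is averaged over one more element.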
T or F. There is only one mode in a set of data
F.
A set of data can have more than one mode
What does the frequency of a number in a set of data tell us?
the number of times that number occurs in the set of data.
How many integers are there from 14 to 765, inclusive? - what does inclusive mean?
It indicates that you need to include both numbers at the ends of the range in your count
What does “the probability that exactly one of the two events occurs is 0.5” mean?
It means P (A only) + P (B only) = 0.5, that is, P (A) + P (B) - 2 x P (A and B) = 0.5. By itself it does not give P (A) or P (B)
List S consists of 10 consecutive odd integers, and list T consists of 5 consecutive even integers. If the least integer in S is 7 more than the least integer in T, how much greater is the average (arithmetic mean) of the integers in S than the average of the integers in T?
What tool can you use for the quickest solution?
Pick numbers
How do you calculate the number of outcomes when a coin is flipped n times?
each flip has two outcomes, so 2^n is the total number of outcomes
What is the special property of the outcome counts for coin flips?
The number of outcomes is mirrored between heads & tails
i.e: 5 flips: (5H,0T) + (4H,1T) + (3H,2T) = (2H,3T) + (1H,4T) + (0H,5T)
4 flips: (4H,0T) + (3H,1T) = (1H,3T) + (0H,4T)
T or F: we can’t use a combinatorics rule to count the outcomes for a specific split of heads & tails (i.e: how many outcomes have exactly 2H & 2T in 4 flips?)
F: We can, because we can treat the H’s and T’s like a letters problem with repeated letters: H(1)TH(2)T is the same as H(2)TH(1)T
Hence: 2H & 2T will have 4!/ (2!2!) = 6 outcomes
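The coin-flip counts above can be confirmed by brute-force enumeration. This is a minimal sketch; the variable names are illustrative only.

```python
from itertools import product
from math import comb

flips = 4
outcomes = list(product("HT", repeat=flips))   # every H/T sequence of 4 flips
total = len(outcomes)                          # 2^4 = 16

# Count sequences with exactly 2 heads and compare with 4!/(2!2!).
two_heads = sum(1 for o in outcomes if o.count("H") == 2)

# Outcomes grouped by number of heads show the mirror pattern.
by_heads = [sum(1 for o in outcomes if o.count("H") == k) for k in range(flips + 1)]

print(total, two_heads, comb(4, 2))  # 16 6 6
print(by_heads)                      # [1, 4, 6, 4, 1]
```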
What is your strategy?
An office manager must choose a five-digit lock code for the office door. The first and last digits of the code must be odd, and no repetition of digits is allowed. How many different lock codes are possible?
Always Prioritize the restriction first
- 5 (1st) x4 (last)x8x7x6 = 6,720
- 2nd, 3rd & 4th digits have 8, 7 & 6 choices because two digits are already used & no repeats are allowed
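As a sanity check on the restriction-first count, the lock codes can also be enumerated directly. This sketch assumes the digits are 0 through 9 and that a code is any ordered pick of 5 distinct digits.

```python
from itertools import permutations

# Every 5-digit code with no repeated digit, keeping only those whose
# first and last digits are odd.
count = sum(
    1
    for code in permutations(range(10), 5)
    if code[0] % 2 == 1 and code[-1] % 2 == 1
)

print(count, 5 * 4 * 8 * 7 * 6)  # 6720 6720
```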
When you see that items are picked or drawn at random (in no particular order), you should...
A miniature gumball machine contains 7 blue, 5 green, and 4 red gumballs, which are identical except for their colors. If the machine dispenses three gumballs at random, what is the probability that it dispenses one gumball of each color?
take into account all the orders in which the desired result can come out: 3 colors = 3! orders. Therefore, total probability: (7/16) x (5/15) x (4/14) x 6 = 1/4
i.e: BRG, BGR, RGB, GBR….
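The gumball probability can be checked two ways: the ordered product times the 3! orders, and a plain combinations count. This is a sketch with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

# Ordered view: one specific color order, times the 3! possible orders.
ordered = Fraction(7, 16) * Fraction(5, 15) * Fraction(4, 14) * factorial(3)

# Unordered view: choose 1 of each color out of all 3-gumball selections.
unordered = Fraction(7 * 5 * 4, comb(16, 3))

print(ordered, unordered)  # 1/4 1/4
```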
List K consists of 12 consecutive integers. If -4 is the least integer in list K, what is the range of the positive integers in list K?
1) L - F + 1 = 12 -> L = 12 + F - 1 = 12 - 4 - 1 = 7
2) Range of the positive integers (1 -> 7) = L - F = 7 - 1 = 6
If I add or subtract a constant amount to each value in a set
The SD doesn’t change
What happens to the SD if I multiply each value in a set by the same factor?
The SD changes proportionally (it is multiplied by the absolute value of that factor)
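A short check of both rules on an invented data set. The numbers are assumptions for illustration, and `statistics.pstdev` gives the population SD.

```python
from math import isclose
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]        # illustrative set with population SD 2

shifted = [v + 10 for v in data]       # add a constant to every value
scaled = [v * 3 for v in data]         # multiply every value by a factor

print(pstdev(data), pstdev(shifted), pstdev(scaled))

# Shifting leaves the SD alone; scaling multiplies it by the factor.
same_after_shift = isclose(pstdev(shifted), pstdev(data))
tripled_after_scale = isclose(pstdev(scaled), 3 * pstdev(data))
```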
If there are 5 integers written as terms X(1) -> X(k), what is k equal to?
5 integers = 5 terms, so the index range is 4. Then there are 4 more terms after X(1) to reach X(k), hence:
k = 1 + 4 = 5. To confirm: L - F + 1 = 5 - 1 + 1 = 5
If there are 5 integers written as terms X(1) -> X(k), what is the total number of intervals between terms?
4 intervals (which is pretty much the same thing as the RANGE of the indices):
X(1)..X(2)
X(2)..X(3)
X(3)..X(4)
X(4)..X(5)
Find x(k) via an equation expressed in the # of terms & x(1)
In the sequence x0, x1, x2, …, xn, each term from x(1) to x(k) is 3 greater than the previous term. If x(1) = 3, what is the value of x(2), x(3), x(4)?
Use the interval formula: x(k) = 3 + 3(k-1) = 3k, so x(2) = 6, x(3) = 9, x(4) = 12
T or F: The # of cases for x being even/odd is given by the permutation 4!/ (2!2!)
If x is to be chosen at random from the set {1,2,3,4} and y is to be chosen at random from the set {5, 6, 7}, what is the probability that xy will be even?
False: 4!/ (2!2!) counts the arrangements of all four numbers treated as two odds & two evens, but the problem picks only one value of x.
- Specifically, there are only two odd & two even numbers in set X. Therefore, there are exactly two cases with x odd and two cases with x even
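Listing every (x, y) pair is one way to settle this problem. The sketch below enumerates the 12 equally likely pairs; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

pairs = list(product([1, 2, 3, 4], [5, 6, 7]))        # 4 x 3 = 12 pairs
even_pairs = [(x, y) for x, y in pairs if (x * y) % 2 == 0]

prob_even = Fraction(len(even_pairs), len(pairs))

# Complement view: xy is odd only when both x and y are odd.
prob_via_complement = 1 - Fraction(2, 4) * Fraction(2, 3)

print(prob_even, prob_via_complement)  # 2/3 2/3
```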
Define the mode
the number that appears the most in a set of data
What is your strategy?
For the first month of a three-month period, Judy used 19 gallons of gas and her car averaged (arithmetic mean) 24 miles per gallon. For the second month of the three-month period, she used 31 gallons of gas and her car averaged 26 miles per gallon. At the end of the three-month period, Judy used a total of 72 gallons of gas and her car averaged 27 miles per gallon. How many miles per gallon did Judy average in the third month?
Baseline average technique - focus on the execution
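One way to execute the baseline idea on Judy's problem, next to the direct total-miles calculation. This is a sketch: treating the overall 27 mpg as the baseline is an assumption about how the technique is applied here.

```python
total_gallons, overall_mpg = 72, 27
months = [(19, 24), (31, 26)]            # (gallons, mpg) for months 1 and 2

third_gallons = total_gallons - sum(g for g, _ in months)          # 22

# Direct route: total miles minus the miles of the first two months.
third_miles = total_gallons * overall_mpg - sum(g * m for g, m in months)
direct = third_miles / third_gallons

# Baseline route: months 1 and 2 fall short of the 27 mpg baseline by
# 19*3 + 31*1 = 88 miles, which month 3 must make up over its 22 gallons.
deficit = sum(g * (overall_mpg - m) for g, m in months)
baseline = overall_mpg + deficit / third_gallons

print(direct, baseline)  # 31.0 31.0
```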
What is the right set-up thinking? (Hint: the genre order is fixed, so do NOT multiply by an extra 3! for the genres)
Claudio wants to arrange his book collection on a bookshelf such that all books of the same genre are grouped together and the order of the genres is always fantasy, biographies, and science fiction. He has 3 fantasy novels, 2 biographies, and 4 science fiction novels. How many ways can the books on his bookshelf be arranged?
In order of arrangement (Fantasy -> Bio -> SFiction): (3x2x1) x (2x1) x (4x3x2x1) = 3! x 2! x 4! = 288
What strategies can you use to find an average (arithmetic mean)?
Last year Manfred received 26 paychecks. Each of his first 6 paychecks was $750; each of his remaining paychecks was $30 more than each of his first 6 paychecks. To the nearest dollar, what was the average (arithmetic mean) amount of his paychecks for the year?
1) Baseline average
2) Weighted average
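Both strategies applied to Manfred's paychecks, as a sketch with exact fractions. Choosing $750 as the baseline is an assumption for illustration.

```python
from fractions import Fraction

# 6 paychecks of $750 and 20 paychecks of $780 (26 in total).
baseline_avg = 750 + Fraction(20 * 30, 26)            # baseline 750 plus spread-out extras
weighted_avg = Fraction(6 * 750 + 20 * 780, 26)       # weighted average

print(round(baseline_avg), round(weighted_avg))  # 773 773
```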
T or F: If the difference between two of the numbers in a set is greater than 2, the range of that set must also be greater than 2
T - The range of a set is the difference between the largest and the smallest elements of the set:
- At the minimum, those two numbers are themselves the largest & smallest numbers, so the range is still greater than 2
- In other cases, the largest & smallest numbers extend beyond those two, so the range is even greater
T or F: On a number line between two positive numbers (i.e: 2 & 10), a number that is closer to the greater value (10) than to the smaller (2) will always be greater than the average of those two numbers (2 & 10)
T - the avg of 2 & 10 is the midpoint. Since the number is closer to 10, it must lie beyond (be greater than) the midpoint
The 10 students in a history class recently took an examination. What was the maximum score on the examination?
(1) The mean of the scores was 75.
(2) The standard deviation of the scores was 5.
Even with the mean and the SD together, the max score is not pinned down: it can sit different numbers of SDs above the mean (i.e: 1 or 2 SDs, hence Max = 80 or 85)
Best approach?
What is the median number of employees assigned per project for the projects at Company Z?
(1) 25 percent of the projects at Company Z have 4 or more employees assigned to each project.
(2) 35 percent of the projects at Company Z have 2 or fewer employees assigned to each project
Draw the data map to visualize
Translate this into Algebra
The amount of Jean’s sales in the second half of 1988 averaged $10,000 per month more than in the first half.
1) Rephrase: the monthly average for the 2nd half of the year is $10,000 more than the monthly average for the 1st half
2) If x = monthly average of the first half, then:
- Total 1st half = 6x
- Total 2nd half = 6 (10,000+x)
Inference:
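The sum of the 3 numbers is equal to 3 times one of the numbers
1) That number must be the mean & also the median of the set
2) The other two numbers are equally far from it on either side (e.g. 4, 5, 6 or 1, 5, 9); the set does not have to be three consecutive integers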
The difference between the squares of consecutive integers (a & b) always equals what amount?
Since a^2 - b^2 = (a+b)(a-b) and a - b = 1, the difference is a + b
i.e: 4^2 - 3^2 = 4 + 3 = 7
A certain junior class has 1,000 students and a certain senior class has 800 students. Among these students, there are 60 sibling pairs, each consisting of 1 junior and 1 senior. If 1 student is to be selected at random from each class, what is the probability that the 2 students selected will be a sibling pair?
A. 3/40,000
B. 1/3,600
C. 9/2,000
D. 1/60
E. 1/15
1) 60 sibling pairs = 60 juniors and 60 seniors who each have a sibling in the other class
2) P (junior picked has a sibling) x P (senior picked is that junior’s sibling) = 60/1000 x 1/800 = 3/40,000, or ans A)
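The sibling-pair probability worked with exact fractions, two ways. This is a sketch; the variable names are illustrative.

```python
from fractions import Fraction

# Step view: the junior drawn is one of the 60 with a senior sibling,
# then the senior drawn must be that one specific sibling.
step_view = Fraction(60, 1000) * Fraction(1, 800)

# Counting view: 60 favorable (junior, senior) pairs out of 1000 * 800.
count_view = Fraction(60, 1000 * 800)

print(step_view, count_view)  # 3/40000 3/40000
```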
Practice avg baseline technique
A certain bakery has 6 employees. It pays annual salaries of $14,000 to each of 2 employees, $16,000 to 1 employee, and $17,000 to each of the remaining 3 employees. The average (arithmetic mean) annual salary of these employees is closest to which of the following?
(A) $15,200
(B) $15,500
(C) $15,800
(D) $16,000
(E) $16,400
1) Pick $14k or 14 as the baseline: we have +2x1 (1 employee at 16, 2 units above 14) & +3x3 (3 employees at 17, 3 units above 14), so avg = 14 + 11/6 = 15 + 5/6 = 16 - 1/6, or ans C)
2) Pick 16 as the baseline: we have -2x2 (2 employees at 14, 2 units below 16) & +1x3 (3 employees at 17, 1 unit above 16), so avg = 16 - 1/6
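A sketch of the baseline technique showing that either baseline gives the same mean. Salaries are in thousands of dollars, and the helper name `baseline_mean` is made up for the example.

```python
from fractions import Fraction

salaries = [14, 14, 16, 17, 17, 17]   # in thousands of dollars

def baseline_mean(values, baseline):
    """Mean = baseline + (sum of the differences from the baseline) / n."""
    return baseline + Fraction(sum(v - baseline for v in values), len(values))

from_14 = baseline_mean(salaries, 14)   # 14 + 11/6
from_16 = baseline_mean(salaries, 16)   # 16 - 1/6

print(from_14, from_16)  # 95/6 95/6
```

95/6 is about 15.83, so the closest choice is $15,800.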
Use your visualization to solve
If the average (arithmetic mean) of six numbers is 75, how many of the numbers are equal to 75 ?
(1) None of the six numbers is less than 75.
(2) None of the six numbers is greater than 75.
1) 75 is the minimum in the set, so any member greater than 75 would bring the avg above 75
2) 75 is the maximum in the set, so any member less than 75 would bring the avg below 75
Since we are given that avg = 75, either statement alone is sufficient: every member has to be equal to 75
Don’t FALL into the C trap
What is this problem testing?
If a number between 0 and 1/2 is selected at random, which of the following will the number most likely be between?
A. 0 and 3/20
B. 3/20 and 1/5
C. 1/5 and 1/4
D. 1/4 and 3/10
E. 3/10 and 1/2
A range of possibilities: the widest interval between two numbers has the highest chance of containing the selected number (here E, with width 1/5)
Why is this statement alone sufficient?
If Z1, Z2, Z3, …, Zn is a sequence of consecutive positive integers, is the sum of all the integers in this sequence odd?
(1) (z1+ z2+ z3+…+zn)/n is an odd integer.
1) Because the mean of a sequence of consecutive integers is an integer, the number of members (n) must be odd
2) Sum = Mean x n = odd x odd = ODD - sufficient
What analogy do you see in this problem?
If the probability of rain on any given day in City X is 50 percent, what is the probability that it rains on exactly 3 days in a 5-day period?
1) It’s like a coin-flipping problem: H or T -> Total outcomes = (# of outcomes per turn)^(number of turns)
2) Treat rain (R) or no rain (N) as a letters problem
i.e: In this case, total = 2^5 = 32 and the number of desired outcomes = 5!/ (2!3!) = 10. Hence
P (exactly 3 days) = 10/32 = 5/16
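The same count written as a small calculation, a sketch that relies on the 50 percent chance making every rain/no-rain sequence equally likely.

```python
from fractions import Fraction
from math import comb

days, rainy = 5, 3
p = Fraction(1, 2)                       # chance of rain on any given day

ways = comb(days, rainy)                 # arrangements of RRRNN = 5!/(2!3!)
prob = ways * p**rainy * (1 - p)**(days - rainy)

print(ways, prob)  # 10 5/16
```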
T or F: The smallest element of any set cannot be greater than the average of that set
T
T or F: (1) is sufficient
How many integers n are there such that r<n<s?
(1) s - r = 5
F
- w/ non-integer s & r: s = 11.1, r = 6.1, there are 5 integers (7, 8, 9, 10, 11)
- w/ integer s & r: s = 11, r = 6, there are only 4 integers (7, 8, 9, 10)
- Key lesson: be wary of any problem w/o restrictions; in this case non-integers are allowed
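The two test cases can be reproduced with a small counting helper. The function name `count_strictly_between` is hypothetical, written only for this illustration.

```python
from math import ceil, floor

def count_strictly_between(r, s):
    """Number of integers n with r < n < s."""
    return max(0, ceil(s) - floor(r) - 1)

# Both pairs satisfy s - r = 5, yet the counts differ.
print(count_strictly_between(6.1, 11.1))  # 5
print(count_strictly_between(6, 11))      # 4
```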
How to prove (1) is not sufficient
4, 6, 8, 10, 12, 14, 16, 18, 20, 22
List M (not shown) consists of 8 different integers, each of which is in the list shown. What is the standard deviation of the numbers in list M ?
(1) The average (arithmetic mean) of the numbers in list M is equal to the average of the numbers in the list shown.
1) Given (1): We have Avg M = 13, so ∑ of M = 104 (which can come from multiple combinations)
- ∑ of the list shown = 13 x 10 = 130 -> To get ∑ of M, we remove two numbers that sum to 26 = 130 - 104
2) Via combination:
- We remove 16 & 10 and have ∑ of M = 104
- We remove 22 & 4 and also have ∑ of M = 104
Consequently, both cases have the same mean but different SDs - NOT SUFFICIENT
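The two candidate versions of list M can be compared numerically. This sketch uses the population SD from `statistics.pstdev`; the conclusion that the SDs differ does not depend on that choice.

```python
from statistics import mean, pstdev

shown = [4, 6, 8, 10, 12, 14, 16, 18, 20, 22]

m_first = [v for v in shown if v not in (10, 16)]    # remove 16 & 10
m_second = [v for v in shown if v not in (4, 22)]    # remove 22 & 4

print(mean(m_first), mean(m_second))       # both means equal 13
print(pstdev(m_first), pstdev(m_second))   # the SDs are different
```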
What are some key steps to get the right answer?
Triplets Adam, Bruce, and Charlie enter a triathlon. If there are 9 competitors in the triathlon, and medals are awarded for first, second, and third place, what is the probability that at least two of the triplets will win a medal?
1) P (at least two of triplets) = P (exactly 2 of 3) + P (exactly 3)
2) Draw the anagram grid to consider all combination possibilities:
- In the total number of cases for P (exactly 2 of 3), DON’T DISCOUNT other possible arrangements within non-triplets
-> Total cases = (Total # of cases for exactly 2 of triplets) X (Total # of cases for one of six non-triplet)
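A brute-force check of the triathlon problem, listing every ordered set of medalists. This is a sketch: the six non-triplet names are placeholders, and it assumes every finishing order is equally likely.

```python
from fractions import Fraction
from itertools import permutations

triplets = {"Adam", "Bruce", "Charlie"}
competitors = sorted(triplets) + [f"other{i}" for i in range(1, 7)]  # 9 in total

podiums = list(permutations(competitors, 3))   # ordered 1st, 2nd, 3rd: 504 cases
favorable = sum(1 for p in podiums if len(triplets & set(p)) >= 2)

prob = Fraction(favorable, len(podiums))
print(favorable, len(podiums), prob)  # 114 504 19/84
```

The 114 favorable cases split into 108 with exactly two triplets (3 pairs x 6 non-triplets x 3! orders) and 6 with all three.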
Why is C actually the answer choice and not E?
Four dollar amounts, w, x, y, and z, were invested in a business. Which amount was greatest?
(1) y < z < x
(2) x was 25 percent of the total of the four investments.
From (2), we know that x equals the average of the four amounts, and since x > z > y from (1), there must be an amount greater than x to balance the arithmetic mean, so:
- w > x > z > y
We can also test cases to confirm: x = 25%, z = 24%, y = 23% leaves 28% for w
What is this problem testing?
At a garage sale, all of the prices of the items sold were different. If the price of a radio sold at the garage sale was both the 15th highest price and the 20th lowest price among the prices of the items sold, how many items were sold at the garage sale?
Ranking: Top Down & Bottom Up
Highest Price —> Radio <— Lowest Price
- From top down: Radio ranks 15th (14 items were more expensive than the radio)
- From bottom up: Radio ranks 20th (19 items were less expensive than the radio)
- Total = 14 + 1 (the radio) + 19 = 34 items
Practice the critical thinking w/o setting up equation
Ada and Paul received their scores on three tests. On the first test, Ada’s score was 10 points higher than Paul’s score. On the second test, Ada’s score was 4 points higher than Paul’s score. If Paul’s average (arithmetic mean) score on the three tests was 3 points higher than Ada’s average score on the three tests, then Paul’s score on the third test was how many points higher than Ada’s score?
- Paul’s average score on the three tests was 3 points higher than Ada’s average score, so in total Paul scored 3*3 = 9 points more than Ada.
- On the first two tests, Ada scored 10 + 4 = 14 points more than Paul, thus on the final test Paul must score 14 + 9 = 23 points more than Ada to end up with that 9-point lead
Can you infer whether the sum of consecutive integers is odd/even only from knowing whether the number of integers (n) is odd or even?
NO - it depends on the starting point of the consecutive chain
i.e: 2,3,4 (Σ is odd) vs 3,4,5 (Σ is even)
- Both cases have odd n, but depending on whether the mean is odd/even, the Σ may be odd/even
Why is this (2) sufficient
A certain company consists of three divisions, A, B, and C. Of the employees in the three divisions, the employees in Division C have the greatest average (arithmetic mean) annual salary. Is the average annual salary of the employees in the three divisions combined less than $55,000 ?
(2) The average annual salary of the employees in Division C is $55,000.
The average salary of the combined group will always be somewhere between the lowest and highest averages. For example, if the average salaries of groups are $10,000, $20,000, and $55,000, then the average salary of the combined group will be somewhere between $10,000 and $55,000. Since, according to the stem, the average salaries in divisions A and B are less than that in division C, the overall average when combined will be less than $55,000 because the divisions with lower average salaries will drag the combined average down.
- the number of employees in this case doesn’t matter
What is the most efficient way to solve?
List S consists of the positive integers that are multiples of 9 and are less than 100. What is the median of the integers in S?
1) Set S consists of the multiples of 9 (9n, from 9 -> 99)
with n = 1,….,11, and hence there are 11 multiples of 9 in set S
2) The median integer is 9 times the median of n: 6x9 = 54
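The shortcut can be checked against the full list. A small sketch, with illustrative variable names:

```python
from statistics import median

multiples = list(range(9, 100, 9))    # 9, 18, ..., 99

# Shortcut: the median index of n = 1..11 is 6, so the median is 9 * 6.
shortcut = 9 * 6

print(len(multiples), median(multiples), shortcut)  # 11 54 54
```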
Key lesson
If the sum of the first n positive integers is S, what is the sum of the first n positive even integers, in terms of S ?
Comprehension:
Assume n = 6; then S is the sum of the first 6 positive integers: 1,…,6
Then the sum of the first 6 positive even integers covers: 2, 4, 6, …, 12