Statistics Flashcards
many problem variations use the arithmetic mean formula: you may be given the mean, with a variable placed in the (sum of elements)/(# of elements) part, and be asked to solve for that variable
arithmetic mean?
(sum of the elements in the set) / (# of elements)
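A minimal sketch of the "variable inside the formula" variation: given the other elements and the target mean, solve for the one unknown element. The function name `missing_value` and the sample numbers are made up for illustration.

```python
def missing_value(known_values, target_mean):
    """Return x such that the mean of known_values plus x equals target_mean."""
    n = len(known_values) + 1        # element count once x is included
    required_sum = target_mean * n   # sum = mean * (# of elements)
    return required_sum - sum(known_values)


# The set {4, 8, 10, x} has a mean of 9, so the sum must be 9 * 4 = 36.
print(missing_value([4, 8, 10], 9))  # 14
```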
how do you deal with situations where the total group is broken up into subgroups, and you are given averages for the whole group and for some of the subgroups, and asked for the average of the remaining subgroup?
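The card gives no answer here. One common approach (an assumption, not taken from the card) is to convert every average back into a sum with sum = mean * count, subtract the known subgroup's sum from the total, and divide by the remaining count. The function name and numbers below are hypothetical.

```python
def other_subgroup_mean(total_mean, total_count, known_mean, known_count):
    """Mean of the remaining subgroup, working from sums rather than averages."""
    other_count = total_count - known_count
    total_sum = total_mean * total_count   # sum of the whole group
    known_sum = known_mean * known_count   # sum of the known subgroup
    return (total_sum - known_sum) / other_count


# 30 students average 80; 10 of them average 90; the other 20 average:
print(other_subgroup_mean(80, 30, 90, 10))  # 75.0
```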
what is an evenly spaced set
numbers in the set increase or decrease by the same amount, and thus, share a common difference
how do you count the number of consecutive integers in a set (inclusive of the first and last elements)?
highest number - lowest number +1
ex: how many numbers from 50 to 101 inclusive? 101-50+1=52
how do you count # of consecutive multiples in a set, inclusive?
((highest # divisible by the multiple - lowest # divisible by the multiple)/multiple number)+1
ex: number of multiples of 3 between 1 and 100 inclusive: the lowest multiple is 3 and the highest is 99, so (99-3)/3 + 1 = 33
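A short sketch of the multiples-counting formula. The key step is to find the lowest and highest numbers in the range that are actually divisible by the multiple (3 and 99 for multiples of 3 from 1 to 100), not the ends of the range itself. `count_multiples` is an illustrative name and assumes a positive integer multiple.

```python
def count_multiples(low, high, k):
    """Count the multiples of positive integer k in [low, high], inclusive."""
    first = -(-low // k) * k   # smallest multiple of k that is >= low
    last = (high // k) * k     # largest multiple of k that is <= high
    if first > last:
        return 0               # no multiple of k falls inside the range
    return (last - first) // k + 1


print(count_multiples(1, 100, 3))  # (99 - 3) / 3 + 1 = 33
```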
how do you count the number of consecutive items in a set (including only one of the end points of the set)?
simply subtract the smallest number in the set from the biggest number (biggest - smallest)
how do you count the number of consecutive integers strictly between the first and last numbers in a set (excluding both end points)?
biggest - smallest - 1
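The three counting rules (both end points, one end point, neither end point) side by side, as a small sketch. The function names are made up for illustration.

```python
def count_inclusive(low, high):
    return high - low + 1      # both end points counted


def count_one_endpoint(low, high):
    return high - low          # exactly one end point counted


def count_exclusive(low, high):
    return high - low - 1      # neither end point counted


print(count_inclusive(50, 101))     # 52
print(count_one_endpoint(50, 101))  # 51
print(count_exclusive(50, 101))     # 50
```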
what is the bookend method for finding the arithmetic mean?
if you have a set of evenly spaced numbers, the arithmetic mean is (largest+smallest)/2
what is the balance point method for finding the arithmetic mean?
for an evenly spaced set with an even number of terms: find the two middlemost terms and take their average
for an evenly spaced set with an odd number of terms: the average of the set is the middlemost term
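A quick check that the bookend and balance point methods agree with the ordinary mean. Both shortcuts are only valid for evenly spaced sets, which the sample lists below are. Function names are illustrative.

```python
def bookend_mean(values):
    """(largest + smallest) / 2, valid for evenly spaced sets."""
    return (min(values) + max(values)) / 2


def balance_point_mean(values):
    """Middle term, or average of the two middle terms, of the ordered set."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: middlemost term
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: two middle terms


even_count = [4, 8, 12, 16]    # ordinary mean = 40 / 4 = 10
odd_count = [3, 6, 9, 12, 15]  # ordinary mean = 45 / 5 = 9

print(bookend_mean(even_count), balance_point_mean(even_count))  # 10.0 10.0
print(bookend_mean(odd_count), balance_point_mean(odd_count))    # 9.0 9
```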
how do you calculate the number of multiples of A or B in a set of consecutive integers?
[# multiples of A] + [# multiples of B] - [# multiples of LCM(A,B)]
the multiples of LCM(A,B) must be subtracted once, since those numbers appear in both lists and we want to count them only once
how do you calculate the number of multiples of A or B BUT NOT OF BOTH in a set of consecutive integers?
[# multiples of A] + [# multiples of B] - 2*[# multiples of LCM(A,B)]
similar to calculating the number of multiples of A or B, but every multiple of LCM(A,B) must be removed entirely; each one was counted twice (once in each list), so subtract that count twice
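A sketch of both counts, assuming the set of consecutive integers is 1 through n (so the number of multiples of k is simply n // k). The function names and the sample values n = 100, A = 4, B = 6 are made up for illustration.

```python
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def count_a_or_b(n, a, b):
    """Multiples of a or b among 1..n (numbers divisible by both counted once)."""
    return n // a + n // b - n // lcm(a, b)


def count_a_or_b_not_both(n, a, b):
    """Multiples of a or b among 1..n, excluding numbers divisible by both."""
    return n // a + n // b - 2 * (n // lcm(a, b))


# 1..100 with A = 4, B = 6: 25 multiples of 4, 16 of 6, 8 of LCM = 12
print(count_a_or_b(100, 4, 6))           # 25 + 16 - 8 = 33
print(count_a_or_b_not_both(100, 4, 6))  # 25 + 16 - 16 = 25
```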
weighted average = ?
((data pt 1)(frequency of data pt 1)+(data pt 2)(frequency of data pt 2) + ….)/(total frequency of data points)
what assertions can be made about a weighted average when there are differences in the frequency of a certain data point?
the weighted average of two different data points will be closer to the data point with the greater frequency, number of observations, or weighted percentage
how can you still calculate weighted averages if all you have is the data points and the ratio of the prevalence of the two groups?
given the values of two data points and the ratio of their quantities, use the ratio terms as the frequencies: for values x and y in the ratio a:b, weighted average = (a*x + b*y)/(a+b)
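A small sketch of the weighted average formula, showing that actual frequencies and a bare ratio give the same result, and that the result sits closer to the more heavily weighted data point. The numbers are made up for illustration.

```python
def weighted_average(values, weights):
    """Sum of (value * weight) divided by the total weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# 30 people scored 70 and 10 people scored 90 ...
print(weighted_average([70, 90], [30, 10]))  # 75.0
# ... which is the same as using only the 3:1 ratio as the weights.
print(weighted_average([70, 90], [3, 1]))    # 75.0
```

The result, 75, is closer to 70 than to 90 because 70 carries three times the weight.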
what is the median of a set of numbers?
if set has an odd number of items: it is the number that is exactly in the middle after the set has been ordered
if set has an even number of items: order the set of numbers, then take the average of the two middle numbers - that is the median
how do you quickly determine the POSITION of the median of a large ordered set with an odd number of items?
for a set with n items (where n is odd), the median is in the (n+1)/2 position of the ordered set
how do you quickly determine the POSITION of the median of a large ordered set with an even number of items?
for a set with n items (where n is even), the median is the average of the values at the n/2 and (n+2)/2 positions of the ordered set
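The position rules for odd and even n, sketched as code. Positions are 1-based, matching the cards, so the code subtracts 1 when indexing a Python list. Function names are illustrative.

```python
def median_positions(n):
    """1-based position(s) of the median in an ordered set of n items."""
    if n % 2 == 1:
        return [(n + 1) // 2]        # odd n: a single middle position
    return [n // 2, (n + 2) // 2]    # even n: the two middle positions


def median(values):
    ordered = sorted(values)
    positions = median_positions(len(ordered))
    # Convert 1-based positions to 0-based indexes, then average the values.
    return sum(ordered[p - 1] for p in positions) / len(positions)


print(median_positions(7))   # [4]
print(median_positions(8))   # [4, 5]
print(median([5, 1, 9]))     # 5.0
print(median([4, 1, 3, 2]))  # 2.5
```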
when are the mean and median of a set of numbers the same?
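if the set of numbers is evenly spaced (this guarantees mean = median; a set that is symmetric around its middle value also has equal mean and median even if it is not evenly spaced)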
what is the mode?
the number that occurs most frequently in a set
can a set have more than one mode?
yes, if two or more numbers are tied for the highest frequency in a dataset, they are all modes
can a dataset have no mode?
yes, if all numbers occur with the same frequency
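A sketch that covers the three mode cards: one mode, several modes, and no mode. It follows the cards' convention that a dataset in which every distinct value occurs equally often has no mode; the function name `modes` is made up.

```python
from collections import Counter


def modes(values):
    """Return the list of modes, or [] when no value stands out."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    # Several distinct values, all equally frequent: no mode (per the cards).
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([1, 1, 2]))        # [1]
print(modes([1, 2, 2, 3, 3]))  # [2, 3]
print(modes([1, 2, 3]))        # []
```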
What is the range of a dataset?
[highest value in set] - [lowest value in set]
what does standard deviation measure?
how spread out the values in a data set are around the arithmetic mean of that data set
establishing ranges based on the mean and how many standard deviations from the mean you want to be:
high value = mean + [x # of st dev] * [st dev]
low value = mean - [x # of st dev] * [st dev]
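The range formulas as a tiny sketch. The function name `sd_range` and the numbers (mean 50, st dev 4, 2 standard deviations) are made up for illustration.

```python
def sd_range(mean, st_dev, k):
    """Return (low, high) for values within k standard deviations of the mean."""
    low = mean - k * st_dev
    high = mean + k * st_dev
    return low, high


print(sd_range(50, 4, 2))  # (42, 58)
```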
does the standard deviation change between two datasets if one dataset is simply the original dataset plus some constant value added to each term?
no, in this case the mean changes but the standard deviation does not, because the center of the distribution simply shifts while the spread between points stays the same
does the standard deviation change between two datasets if one of the datasets is simply the original dataset multiplied or divided by some constant value?
yes, the st dev is multiplied or divided by that constant value (by its absolute value if the constant is negative, since st dev is never negative)
fact:
if we multiply or divide every element of a dataset by a constant, the standard deviation is multiplied or divided by the absolute value of that constant
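A numeric check of the shift and scale rules. The cards do not say whether st dev means the population or the sample version; this sketch assumes the population standard deviation (`statistics.pstdev`), and the sample data is made up.

```python
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]    # mean 5, population st dev 2
shifted = [x + 10 for x in data]   # add a constant to every term
tripled = [x * 3 for x in data]    # multiply every term by 3
halved = [x / 2 for x in data]     # divide every term by 2

print(pstdev(data))     # 2.0
print(pstdev(shifted))  # 2.0 (unchanged by the shift)
print(pstdev(tripled))  # 6.0 (3 times the original)
print(pstdev(halved))   # 1.0 (half the original)
```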
what is the minimum possible standard deviation of a set?
zero
what is one thing you can do to a set of numbers with a positive (non zero) standard deviation to decrease the standard deviation of the set?
add elements to the set that are equal to the mean of the set
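A quick check that adding copies of the mean shrinks the standard deviation: the new elements contribute nothing to the total deviation, but the count goes up. This sketch assumes the population standard deviation (`statistics.pstdev`); the helper name and data are made up.

```python
from statistics import pstdev


def add_mean_copies(values, copies):
    """Return a new list with `copies` extra elements equal to the mean."""
    avg = sum(values) / len(values)
    return list(values) + [avg] * copies


data = [2, 4, 4, 4, 5, 5, 7, 9]      # mean 5, population st dev 2
bigger = add_mean_copies(data, 2)    # same mean, two extra 5s

print(pstdev(data) > pstdev(bigger))  # True
```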
How to compare standard deviations of data sets with an equal number of data points?
Step 1: Determine the mean of each set
Step 2: For each set, determine the absolute difference between the mean of that set and each individual point in the set
Step 3: Sum the differences in each set
Step 4: As a shortcut, the set with the greatest sum of differences generally has the greatest st dev (the exact st dev uses squared differences, so this is an estimate rather than a guarantee)
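The four steps above, sketched next to the actual standard deviation so the shortcut can be checked. This assumes the population standard deviation (`statistics.pstdev`); the function name and sample sets are made up.

```python
from statistics import pstdev


def sum_abs_differences(values):
    """Steps 1 to 3: sum of |point - mean| over the set."""
    avg = sum(values) / len(values)             # Step 1: mean of the set
    return sum(abs(v - avg) for v in values)    # Steps 2 and 3


set_a = [1, 5, 9]
set_b = [4, 5, 6]

# Step 4: compare the sums, then confirm against the real st dev.
print(sum_abs_differences(set_a), sum_abs_differences(set_b))  # 8.0 2.0
print(pstdev(set_a) > pstdev(set_b))                           # True
```

Since the real st dev squares each difference, a single far-out point counts for more than the sum of absolute differences suggests. When the sums are close, it is worth computing the st dev directly.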
when will the standard deviation of a set be zero?
when all the points in the set are the same
if the smallest and largest values of a set are the same, what does that say about the range and st dev of the set?
range = 0 -> all values in set are the same -> st dev = 0
what would it mean if the largest or smallest value of a dataset is equal to the mean of the data set?
this means that all the points in the dataset are the same, and thus range = 0 and st dev = 0. proof by example: a set of 4 numbers has a mean of 10 and the largest value of the set is 10. since ave = sum/#, sum = ave * # = 10 * 4 = 40. so the four values sum to 40 -> 10+a+b+c = 40, so a+b+c = 30. since no number can be greater than 10, all of a, b, and c must equal 10 for the equality to hold
when can you be certain that the st dev of a data set is not zero (and that it is some value greater than zero)?
when not all the values of the set are equal
what are three criteria such that, if any one of them is true, the st dev is greater than zero?
1) the range is not zero
2) the smallest and largest numbers in the set are not equal
3) the smallest or largest number in the set is not equal to the mean of the set
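A sketch that evaluates the three criteria on a dataset. They always agree with each other, since each one is another way of saying that not all values are equal. The function name is made up, and integer data is used so the mean comparison is exact.

```python
from statistics import mean


def nonzero_sd_checks(values):
    """Evaluate the three criteria; any True means st dev > 0."""
    lo, hi, avg = min(values), max(values), mean(values)
    return {
        "range is not zero": hi - lo != 0,
        "smallest != largest": lo != hi,
        "smallest or largest != mean": lo != avg or hi != avg,
    }


print(nonzero_sd_checks([10, 10, 10, 10]))  # every check is False -> st dev = 0
print(nonzero_sd_checks([7, 10, 13]))       # every check is True  -> st dev > 0
```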