Probability Flashcards
If you can't find a quadratic directly, but they gave you extra information, what do you do?
Multiply the two expressions and equate the result to the other one in the probability tree, as the probabilities sum to 1
Then you can solve, and the given information will rule out one of the options
Pick the easiest route, as the two are related
Remember the conditional probability formula
P(A | B) = P(A and B) / P(B)
Remember to use the complement of a probability to make it easier
P(A) = 1 - P(not A)
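A quick sketch of the conditional probability formula and the complement trick. The numbers and the two-dice example are made up for illustration, not taken from the notes:

```python
from fractions import Fraction

# Hypothetical values, chosen only for illustration.
p_a_and_b = Fraction(1, 6)   # P(A and B)
p_b = Fraction(1, 2)         # P(B)

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)           # 1/3

# Complement trick: P(at least one six in two rolls) = 1 - P(no six at all)
p_no_six = Fraction(5, 6) ** 2
p_at_least_one_six = 1 - p_no_six
print(p_at_least_one_six)    # 11/36
```

Working with the complement avoids adding up every "at least one" case separately.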
What is a discrete random variable
A variable for which a list of possible numerical values can be made
A list of all outcomes can be made
Write down their probabilities
The probabilities sum to 1
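A minimal sketch of that checklist, assuming a fair six-sided die as the example (the die and the helper name `is_valid_distribution` are illustrative, not from the notes):

```python
from fractions import Fraction

# Hypothetical example: X = score on one fair six-sided die.
# Every outcome is listed together with its probability.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

def is_valid_distribution(dist):
    """Check each probability lies in [0, 1] and that they all sum to 1."""
    return all(0 <= p <= 1 for p in dist.values()) and sum(dist.values()) == 1

print(is_valid_distribution(distribution))  # True
```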
Categorical vs numerical
Discrete vs continuous
1) Categorical is data with no numerical meaning; even if the categories are labelled with numbers, those numbers have no numerical meaning
Numerical has meaning and can be manipulated
2) Discrete is when all the data values can be listed
Continuous is when there are infinitely many possible values within a range, like heights
Frequency tables
Make sure that the class boundaries are defined
And that the boundaries aren't too far apart, because then you lose the feel of the data
Stem and leaf
- keeps raw data
- it's ordered, so you can find the median and quantiles
- you can also compare two sets back to back
- the length of the lines of leaves gives the shape of each distribution
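A rough sketch of how a stem and leaf diagram could be built, assuming non-negative whole numbers with the tens digit as the stem (the data and the function name `stem_and_leaf` are made up for illustration):

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} using tens as stems and units as leaves."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in order
        plot[v // 10].append(v % 10)
    return dict(plot)

data = [23, 31, 25, 47, 38, 31, 29, 42]  # made-up raw data
plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
# 2 | 3 5 9
# 3 | 1 1 8
# 4 | 2 7
```

Every raw value can be read back off the diagram, and the row lengths show the shape.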
Bar chart vs pie chart
Bar chart keeps the original frequencies
Pie chart is scaled to proportions, which makes it easy to compare two sets of data, but it loses the original data values. It is easy to compare because two bar charts might have different amounts of data
Both for categorical
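A small sketch of why scaling to proportions helps when two data sets have different sizes. The categories, counts and the helper `pie_angles` are invented for illustration:

```python
def pie_angles(freqs):
    """Scale frequencies to pie chart angles (degrees out of 360)."""
    total = sum(freqs.values())
    return {category: 360 * f / total for category, f in freqs.items()}

# Two made-up classes of different sizes (20 and 60 pupils).
class_a = {"bus": 10, "walk": 5, "car": 5}
class_b = {"bus": 30, "walk": 15, "car": 15}

print(pie_angles(class_a))  # {'bus': 180.0, 'walk': 90.0, 'car': 90.0}
print(pie_angles(class_b))  # {'bus': 180.0, 'walk': 90.0, 'car': 90.0}
```

The bar heights would differ by a factor of 3, but the pie charts come out identical, and the original counts can no longer be recovered from the angles alone.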
Vertical line chart removes the misconception that the width of a bar in a bar chart actually means anything
Dot plot allows for a quick understanding of the data; it does the same job as a vertical line chart
How to plot continuous data
- frequency against class width?
- histograms
Frequency against class width gives a distorted image when the class widths AREN'T equal
- here the area seems to be representative of the total value in the range
- that's why we need the area to be exactly equal to the frequency, so we plot frequency density (frequency / class width) against class width, so area = frequency
Now you can compare the different classes through area
In general the area is PROPORTIONAL to frequency; there might be a constant factor too
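A minimal sketch of the frequency density calculation, assuming frequency density = frequency / class width and using made-up classes with unequal widths:

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 5), (10, 20, 12), (20, 50, 9)]

def frequency_densities(classes):
    """Frequency density = frequency / class width, for each class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]

densities = frequency_densities(classes)
print(densities)

# Area of each histogram bar = density * width, which recovers the frequency.
areas = [d * (upper - lower) for d, (lower, upper, _) in zip(densities, classes)]
print(areas)
```

The wide 20 to 50 class gets a short bar, so it no longer looks bigger than it really is.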
Positive and negative skew
If more of the data is at the positive (higher) end = negative skew
If more of the data is at the negative (lower) end = positive skew
If the data is balanced around the mean, then it's symmetrical
Range vs midrange
Range is the difference between highest and lowest
Midrange is (HIGHEST + LOWEST)/2
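A quick worked example with made-up data. Note that the brackets matter: the sum is halved, not just the lowest value:

```python
data = [3, 7, 8, 12, 20]  # made-up data

data_range = max(data) - min(data)       # highest - lowest
midrange = (max(data) + min(data)) / 2   # (highest + lowest) / 2

print(data_range)  # 17
print(midrange)    # 11.5
```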
Mean vs median vs mode vs midrange
Mean takes all of the data into consideration, so it is easily affected by outliers
Median is better if there are outliers, as it's the middle value of the ordered data. If the mean is a good representation of the average then use it; if not, use the median
Mode just gives the most frequent value; only useful if values repeat
Midrange is also susceptible to outliers and assumes the data is symmetrical; easy to find though
Need to decide how many values sit close to the median, and whether we should take the outliers into consideration
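A short sketch of how one outlier moves each average, using Python's standard `statistics` module on made-up data (the `midrange` helper is illustrative):

```python
from statistics import mean, median, mode

data = [4, 5, 5, 6, 7]        # made-up data
with_outlier = data + [60]    # same data plus one outlier

def midrange(values):
    return (max(values) + min(values)) / 2

print(mean(data), mean(with_outlier))          # 5.4 14.5
print(median(data), median(with_outlier))      # 5 5.5
print(mode(data), mode(with_outlier))          # 5 5
print(midrange(data), midrange(with_outlier))  # 5.5 32.0
```

The mean and midrange jump, while the median and mode barely move.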
How to find the median from a frequency table, grouped vs non-grouped
NON GROUPED = add 1 to the total frequency and divide by 2, so 20 gives 10.5; find the 10.5th value (halfway between the 10th and 11th)
Grouped = divide the total frequency by 2, and linearly interpolate
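A sketch of the non-grouped case, assuming the table is a list of (value, frequency) pairs. The table and the function name are made up for illustration:

```python
import math

def median_from_frequency_table(table):
    """table: list of (value, frequency) pairs for ungrouped data."""
    # Write the data out in order, repeating each value by its frequency.
    ordered = [value for value, freq in sorted(table) for _ in range(freq)]
    n = len(ordered)
    position = (n + 1) / 2                     # e.g. n = 20 gives 10.5
    lower = ordered[math.floor(position) - 1]  # 10th value when position is 10.5
    upper = ordered[math.ceil(position) - 1]   # 11th value when position is 10.5
    return (lower + upper) / 2

table = [(1, 4), (2, 6), (3, 7), (4, 3)]       # total frequency 20
print(median_from_frequency_table(table))      # 2.5
```

Here the 10th value is 2 and the 11th is 3, so the 10.5th value is 2.5.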
Intervals for estimating means
If the classes are 50-59 and 60-69
Then it wouldn't be 50 till 60 and 60 till 70, because anything that rounds above 59 CAN'T BE IN THAT CLASS
The values were rounded, so the 50-59 class holds anything less than 59.5; every boundary moves down by 0.5 (49.5 till 59.5, 59.5 till 69.5)
If it's ages, then you are 29 until the day before you turn 30, so 20-29 would be 20 to 30, with the middle as 25
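A sketch of the two midpoint rules and the estimated mean. It assumes the 50-59 style classes hold values rounded to the nearest whole number, and that ages are recorded in completed years; the frequencies and function names are made up:

```python
def midpoint_rounded(low, high):
    """Values rounded to the nearest whole number: class runs low - 0.5 to high + 0.5."""
    return ((low - 0.5) + (high + 0.5)) / 2

def midpoint_age(low, high):
    """Ages in completed years: class runs from low up to high + 1."""
    return (low + (high + 1)) / 2

def estimated_mean(midpoints, freqs):
    """Estimated mean = sum of (midpoint * frequency) / total frequency."""
    return sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)

print(midpoint_rounded(50, 59))  # 54.5
print(midpoint_age(20, 29))      # 25.0

mids = [midpoint_rounded(50, 59), midpoint_rounded(60, 69)]
print(estimated_mean(mids, [4, 6]))  # 60.5
```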
Grouped data median
Divide the total frequency by 2
And interpolate
This is because we are using cumulative frequency
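A minimal sketch of linear interpolation for a grouped median, assuming each class is given as (lower boundary, upper boundary, frequency) in order. The classes and the function name are made up for illustration:

```python
def grouped_median(classes):
    """Estimate the median of grouped data by linear interpolation."""
    n = sum(freq for _, _, freq in classes)
    target = n / 2                  # n/2 for grouped data, not (n + 1)/2
    cumulative = 0                  # cumulative frequency before this class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Go the right fraction of the way through the median class.
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("classes must contain at least one observation")

classes = [(0, 10, 4), (10, 20, 12), (20, 30, 4)]  # total frequency 20
print(grouped_median(classes))  # 15.0
```

The 10th value is 6 of the way into a class of 12, so it sits halfway through the 10 to 20 class.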