Chance and data Flashcards
Bar graphs
(Always support statements with statistical data from graphs)
- Shape
(Eg. Both graphs have a similar shape because both dot plots are unimodal)
- Symmetry
(Eg. Both dot plots are reasonably symmetrical, but both have a few older competitors, which skews the distributions slightly to the right)
- Shift
(Eg. The peak on the athletics graph is located higher up the age scale than that for swimmers)
- Overlap
(Eg. The ages of the middle 50% of the competitors are much the same)
- Centre
(Eg. The median age of swimmers is lower than the median age of athletics competitors)
- Spread
(Eg. The age range for athletics is larger than that for swimmers)
Writing probabilities
- Probabilities can be written as fractions, decimals or percentages
- Probabilities cannot be less than 0 or greater than 1
Converting probabilities
• fraction > decimal (Divide numerator by denominator)
• decimal > percentage (Multiply decimal by 100)
• percentage > fraction (Write percentage as a fraction over 100 and simplify)
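The three conversions can be sketched in Python as below. The function names are made up for illustration, and the standard `fractions` module is used only because it simplifies fractions automatically.

```python
from fractions import Fraction

def fraction_to_decimal(numerator, denominator):
    """fraction > decimal: divide the numerator by the denominator."""
    return numerator / denominator

def decimal_to_percentage(decimal):
    """decimal > percentage: multiply the decimal by 100."""
    return decimal * 100

def percentage_to_fraction(percentage):
    """percentage > fraction: write the percentage over 100 and simplify."""
    # Going through str() lets values such as 12.5 convert exactly.
    return Fraction(str(percentage)) / 100

print(fraction_to_decimal(3, 4))     # 0.75
print(decimal_to_percentage(0.75))   # 75.0
print(percentage_to_fraction(75))    # 3/4
```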
Probability equation
Theoretical probability
Probability (event) = Number of favorable outcomes / Total possible number of outcomes
(Number of favorable outcomes is how many of the equally likely possible outcomes give the result)
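As a small worked case of the theoretical formula, the sketch below assumes a fair six-sided die (not part of the original card) and finds the probability of rolling an even number. The function name is illustrative only.

```python
from fractions import Fraction

def theoretical_probability(favourable_outcomes, possible_outcomes):
    """Number of favourable outcomes / total possible number of outcomes.

    Assumes every possible outcome is equally likely.
    """
    return Fraction(len(favourable_outcomes), len(possible_outcomes))

die = [1, 2, 3, 4, 5, 6]                          # assumed: a fair die
evens = [face for face in die if face % 2 == 0]   # favourable outcomes: 2, 4, 6

p_even = theoretical_probability(evens, die)
print(p_even)  # 1/2
```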
Probability equation
Experimental probability
Probability (event) = Number of times the event occurred / Total number of trials
(Number of times the event occurred is counted from the trials actually carried out)
• This is used when the probability of an event is difficult or impossible to calculate. Many trials are done and the number of times the event occurs is recorded. The true value of the probability will not be known, but the greater the number of trials, the closer the estimated probability tends to be to the actual probability.
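A rough simulation can show the estimate settling down as the number of trials grows. The sketch below assumes a fair die, chosen only because its true probability (0.5 for an even number) is known and the estimates can be compared against it. The function names and the seed are arbitrary.

```python
import random

def experimental_probability(event_occurred, trials, rng):
    """Number of times the event occurred / total number of trials."""
    occurrences = sum(1 for _ in range(trials) if event_occurred(rng))
    return occurrences / trials

def rolled_even(rng):
    """One trial: roll a simulated fair die and report whether it is even."""
    return rng.randint(1, 6) % 2 == 0

rng = random.Random(1)  # fixed seed only so that runs are repeatable
for trials in (10, 100, 10_000):
    estimate = experimental_probability(rolled_even, trials, rng)
    print(trials, "trials: estimated probability", estimate)
```

With only 10 trials the estimate can be well away from 0.5. With 10,000 trials it is normally very close.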
Expected number of outcomes equation
Expected number of outcomes = Probability (event) x Number of trials
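For instance, assuming a fair die is rolled 120 times (an invented example), the expected number of sixes follows directly from the equation:

```python
from fractions import Fraction

def expected_number(probability, trials):
    """Expected number of outcomes = Probability (event) x Number of trials."""
    return probability * trials

p_six = Fraction(1, 6)               # assumed: fair six-sided die
print(expected_number(p_six, 120))   # 20
```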
Combining probabilities
(Probability tree)
For calculating probabilities where several events occur
- Make a probability tree by deciding what the events are, and in what order they occur. Write the events at ends of the branches.
- Write the probability of each event on the middle of its branch, and check that the probabilities on the branches leaving the same point add to 1.
- Calculate the probability at each end by multiplying the probabilities along the branches leading to it, and write the result at the end of the branch.
- To find the overall probability for a question…
If one event occurs OR another event occurs > Add the probabilities (of the branch ends that fit).
If one event occurs AND another event occurs > Multiply the probabilities (along the branches).
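The tree steps can be sketched in code. The example is hypothetical: a bag with 3 red and 2 blue counters, with one counter drawn, replaced, and a second drawn. It multiplies along each path (AND) and then adds the ends that fit the question (OR).

```python
from fractions import Fraction
from itertools import product

# Probabilities on the branches for each draw (assumed bag: 3 red, 2 blue).
first_draw = {"red": Fraction(3, 5), "blue": Fraction(2, 5)}
second_draw = first_draw  # the counter is replaced, so the branches repeat

# Branches leaving the same point must add to 1.
assert sum(first_draw.values()) == 1

# AND: multiply along each path to get the probability at the end of the branch.
branch_ends = {
    (first, second): first_draw[first] * second_draw[second]
    for first, second in product(first_draw, second_draw)
}

# OR: add the branch ends that fit "one counter of each colour".
p_one_of_each = branch_ends[("red", "blue")] + branch_ends[("blue", "red")]
print(p_one_of_each)  # 12/25
```

All the branch ends together also add to 1, which is a useful check on the tree.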
Combining probabilities
(Two-way frequency table)
For calculating probabilities where several events occur
• When given a table of probabilities, first convert them to actual numbers by multiplying each probability by the total population given, then fill these numbers into the table.
• To find the overall probability for a question…
If one event occurs OR another event occurs > Add the probabilities
If one event occurs AND another event occurs > Multiply the probabilities.
(To help, tick the boxes to which the events apply)
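A minimal sketch of the table method, using invented figures (a population of 200 split by sex and handedness). Each probability is first turned into an actual number. "Ticking the boxes" then becomes summing the cells that fit the question, so a cell that fits on both counts is still counted only once.

```python
from fractions import Fraction

population = 200  # assumed total given in the question

# Hypothetical table of probabilities: (row label, column label) -> probability.
proportions = {
    ("male", "left"): Fraction(5, 100),
    ("male", "right"): Fraction(45, 100),
    ("female", "left"): Fraction(4, 100),
    ("female", "right"): Fraction(46, 100),
}

# Convert each probability to an actual number of people.
counts = {cell: int(p * population) for cell, p in proportions.items()}

def probability(condition):
    """Add up the ticked cells and divide by the whole population."""
    ticked = sum(n for cell, n in counts.items() if condition(*cell))
    return Fraction(ticked, population)

p_left = probability(lambda sex, hand: hand == "left")
p_female_or_left = probability(lambda sex, hand: sex == "female" or hand == "left")
print(p_left)            # 9/100
print(p_female_or_left)  # 11/20
```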
Data handling
When collecting data we take a sample from a population
• A sample of 30 is considered to be sufficient for most purposes. A larger sample means you can have more confidence in findings.
• Bias occurs when some members of the population are more likely than others to be selected for the sample so that it does not accurately represent the population.
(Eg. ‘self-selected’ samples)
• To avoid bias, every member of the population must have an equal chance of being sampled
(Eg. ‘random’ samples)
‘Self-selected’ samples
‘Self-selected’ samples occur when members of the population decide for themselves whether they will be in the sample.
Eg. Ringing a radio station, filling in a form, going to a website to give feedback, completing a survey
(People may choose not to respond, and only those with an interest in the topic of the survey will be in the sample)
Eg. Surveying in a particular location
(Only those who go to that location and have time to stop and answer will be in the sample)
‘Random’ samples
‘Random’ samples occur if every member of the population has an equal chance of being selected.
Eg. Writing names on equal-sized pieces of paper and drawing them from a hat, giving every member of the population a number and using random numbers to decide who is selected, using random numbers to decide who will be selected from the electoral roll, or selecting every (5th) person as the (____).
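Giving every member a number and using random numbers to pick the sample might look like the sketch below. The population of 200 numbered members and the seed are invented for illustration. `random.sample` from the Python standard library draws without replacement.

```python
import random

# Hypothetical population: 200 members, each given a number.
population = [f"member_{number}" for number in range(1, 201)]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Draw 30 members without replacement; no member is favoured over another.
sample = rng.sample(population, 30)
print(sample[:5])
```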
Measures of centre
Measures of centre give a measure of where the middle of a distribution lies.
Mean, median, mode
• The median is the middle data value when the data are in order. It is often the best measure of centre, as it is not distorted by very large or small values and can always be calculated for a set of data. The mean is the sum of all data values / the total number of data values. It represents the average data value and is therefore distorted by very large or small values. The mode is the data value which occurs most frequently. It is unreliable as a measure of centre, as there are often two modes or no mode (if there are more than 2 modes, there is no mode).
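The effect of one extreme value on each measure can be checked with Python's standard `statistics` module. The ages below are invented for illustration.

```python
from statistics import mean, median, multimode

ages = [21, 22, 22, 23, 24, 25]     # hypothetical competitor ages
ages_with_outlier = ages + [58]     # the same data plus one much older competitor

for label, data in (("without outlier", ages), ("with outlier", ages_with_outlier)):
    print(
        label,
        "| mean:", round(mean(data), 2),
        "| median:", median(data),
        "| mode(s):", multimode(data),
    )
```

Adding the single value 58 moves the mean from about 22.83 to about 27.86, while the median only moves from 22.5 to 23 and the mode stays at 22.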
Measures of spread
Measures of spread give a measure of how widely spread the data is.
Upper quartile, lower quartile, inter-quartile range, range
• The inter-quartile range is the difference between the upper and lower quartiles (IQR = UQ - LQ). It is often the best measure of spread, as it is not distorted by very large or small values. The range is the difference between the maximum and minimum values and is therefore distorted by a very large maximum or a very small minimum.
Upper quartile and lower quartile
(UQ) The upper quartile is the middle data value of the top half of the data
(LQ) The lower quartile is the middle data value of the bottom half of the data
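The quartile definitions above can be followed literally in code. One assumption is labelled here because the cards do not state it: when there is an odd number of values, the median itself is left out of both halves. Other conventions exist and can give slightly different quartiles. The ages are invented.

```python
from statistics import median

def quartiles(data):
    """Return (LQ, UQ) as the medians of the bottom and top halves.

    Assumption: with an odd number of values, the middle value belongs to
    neither half. Needs at least two data values.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    bottom_half = ordered[:half]
    top_half = ordered[len(ordered) - half:]
    return median(bottom_half), median(top_half)

ages = [21, 22, 22, 23, 24, 25, 58]   # hypothetical ages with one extreme value
lq, uq = quartiles(ages)
iqr = uq - lq                          # IQR = UQ - LQ
data_range = max(ages) - min(ages)     # range = maximum - minimum
print("LQ:", lq, "UQ:", uq, "IQR:", iqr, "range:", data_range)
```

Here the extreme value 58 stretches the range to 37 but leaves the inter-quartile range at 3.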
Displaying data
- Dot plots
A visual representation of each data point
- Box plots
A visual representation of each 25% of the data
(Useful for representing data and comparing sets of data, but it does not show the distribution of all the data points and it is affected by very large or small values)