Applied maths Flashcards
Definition of population
Whole set of items that are of interest e.g. items manufactured in a factory or people living in a town
Definition of raw data
Information obtained from a population, before it has been processed
Definition of census
Observes or measures every member of a population
Definition of parameter
Measurable characteristic of a population e.g. mean or standard deviation
Definition of sample
Selection of observations taken from a subset of the population, used to find out information about the population as a whole
Definition of statistic
Single measure of some attribute of a sample e.g. mean value
Advantage of census vs sample
Complete and more accurate result
Disadvantages of census vs sample
Time consuming
Expensive
Hard to process large quantities of data
Cannot be used when testing process destroys item
Advantages of sample vs census
Less time consuming
Cheaper
Easier to process
Fewer people have to respond
Disadvantages of sample vs census
Less reliable
Could be biased
Different types of sampling methods
Simple random, systematic, stratified, cluster, opportunity, quota, self-selected
Simple random sampling
Any sampling method in which every member of the population has an equal chance of being selected
Examples of simple random sampling
Numbering the population and using a random number generator
Selecting names from a hat
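The "number the population and use a random number generator" approach can be sketched in Python as below. This is a minimal illustration, not a prescribed method: the function name, the population of 100 numbered members and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)          # seed is only here to make the example repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: members numbered 1 to 100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
print(srs)
```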
Systematic sampling
For a population of size N, to take a sample of size n, first set k = N/n. Choose a random member from the first k members, then take every kth member after that
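A rough Python sketch of systematic sampling, assuming the population is held in an ordered list and that k is taken as the whole-number part of N/n. The function name and the example population are hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take a random start within the first k members, then every kth member."""
    rng = random.Random(seed)
    k = len(population) // n       # sampling interval k = N/n (rounded down)
    start = rng.randrange(k)       # random position among the first k members
    return population[start::k][:n]

# Hypothetical population of N = 100 numbered members, sample size n = 10, so k = 10.
population = list(range(1, 101))
systematic = systematic_sample(population, 10, seed=3)
print(systematic)
```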
Stratified sampling
This is when a population is divided into subgroups (called strata). A sample is then taken from each group of a size proportional to the group size
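The proportional allocation can be sketched as follows. The strata names and sizes are made up for illustration, and rounding each stratum's share is an assumption of this sketch (in general the rounded sizes may not add up exactly to n).

```python
import random

def stratified_sample(strata, n, seed=None):
    """Sample from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)   # stratum share of the sample
        sample[name] = rng.sample(members, size) # simple random sample within the stratum
    return sample

# Hypothetical strata: 120 students in Year 12 and 80 in Year 13, total sample of 20.
strata = {"Year 12": list(range(120)), "Year 13": list(range(80))}
stratified = stratified_sample(strata, 20, seed=0)
print({name: len(chosen) for name, chosen in stratified.items()})
```

Here Year 12 contributes 20 * 120/200 = 12 members and Year 13 contributes 20 * 80/200 = 8.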
Cluster sampling
Used when a population can be divided into subgroups which are each reasonably representative of the whole population. Then we take a sample from just a few of those subgroups
Example of cluster sampling
A researcher wants to survey the academic performance of students. The population could be divided by city, a few cities chosen at random, and simple random or systematic sampling performed within those cities
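One way to picture cluster sampling in Python: choose a few clusters at random, then take a simple random sample inside each chosen cluster. The city names, cluster sizes and function name are all hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, per_cluster, seed=None):
    """Pick a few clusters at random, then sample members within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)   # random choice of cluster names
    return {name: rng.sample(clusters[name], per_cluster) for name in chosen}

# Hypothetical clusters: student ID numbers grouped by city.
clusters = {
    "City A": list(range(0, 50)),
    "City B": list(range(50, 90)),
    "City C": list(range(90, 150)),
    "City D": list(range(150, 200)),
}
clustered = cluster_sample(clusters, num_clusters=2, per_cluster=5, seed=2)
print(clustered)
```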
Opportunity sampling
This is used when you are unable to list a population. Members of the population are chosen for the sample as you have access to them
Example of opportunity sampling
Asking members of the public you see first
Quota sampling
This is used if you are unable to list a population, but you want to represent distinct groups within the sample. Use opportunity sampling until you have the specified size of sample for each group (or stratum)
Example of quota sampling
Interviewers meet and assess people before allocating them to the appropriate quota. This continues until all quotas have been filled. If someone refuses or their quota is full, you move on to the next person
Self-selected sampling
This is where the individuals in a population choose to be in a sample
Frequency density
Frequency / class width
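A small worked example of the frequency density formula, assuming each class is given as (lower boundary, upper boundary, frequency). The class data are invented for illustration.

```python
def frequency_densities(classes):
    """Return frequency / class width for each (lower, upper, frequency) class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]

# Hypothetical grouped data with unequal class widths.
classes = [(0, 10, 5), (10, 30, 12), (30, 40, 8)]
densities = frequency_densities(classes)
print(densities)  # [0.5, 0.6, 0.8]
```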
When is a distribution roughly symmetrical
Q2 - Q1 = Q3 - Q2
When is a distribution positively skewed
Q2 - Q1 < Q3 - Q2
When is a distribution negatively skewed
Q2 - Q1 > Q3 - Q2
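The quartile comparisons for symmetry and skew can be collected into one small function. This is a sketch using exact equality for the symmetrical case, whereas in practice "roughly" equal gaps would be judged by eye. The function name and example quartiles are assumptions.

```python
def quartile_skew(q1, q2, q3):
    """Classify skew by comparing the gaps either side of the median."""
    lower_gap = q2 - q1
    upper_gap = q3 - q2
    if lower_gap < upper_gap:
        return "positive skew"
    if lower_gap > upper_gap:
        return "negative skew"
    return "roughly symmetrical"

print(quartile_skew(10, 14, 22))  # positive skew
print(quartile_skew(10, 18, 22))  # negative skew
print(quartile_skew(10, 16, 22))  # roughly symmetrical
```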
Outliers
Marked on a box plot with an asterisk. Any value smaller than Q1 - 1.5 * IQR or larger than Q3 + 1.5 * IQR
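The outlier rule can be checked with a short function. In this sketch the quartiles are passed in by the caller, since different courses compute quartiles in slightly different ways. The data set and quartile values are made up.

```python
def find_outliers(data, q1, q3):
    """Return values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical data with Q1 = 16 and Q3 = 24, so IQR = 8 and the fences are 4 and 36.
data = [2, 15, 17, 18, 20, 21, 23, 24, 45]
outliers = find_outliers(data, q1=16, q3=24)
print(outliers)  # [2, 45]
```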
Cumulative frequency diagrams
Points are plotted at the upper class boundaries of the intervals and joined by straight lines
Linear interpolation
To find the median: lower class boundary + ((position of median - cumulative frequency before the median class) / frequency of the median class) * class width
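A sketch of the interpolation for a grouped frequency table, assuming classes are given as (lower boundary, upper boundary, frequency) and that the median position is taken as n/2, a common convention for grouped continuous data. The table is invented for illustration.

```python
def interpolated_median(classes):
    """Estimate the median of grouped data by linear interpolation."""
    total = sum(freq for _, _, freq in classes)
    if total == 0:
        raise ValueError("no observations")
    target = total / 2                 # position of the median (assumed n/2)
    cumulative = 0                     # cumulative frequency before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            # fraction of the way through the median class, scaled by class width
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq

# Hypothetical table: n = 20, so the median is the 10th value, which lies in 10-20.
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
median = interpolated_median(classes)
print(median)  # 10 + (10 - 4) / 10 * 10, which is about 16
```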
Sample space
Set of all possible outcomes
Example - in a test with 70 questions, the sample space for correct answers is {0, 1, 2, …, 70}
Event
Collection of some of the outcomes from an experiment
Example - scoring more than 40 on the test
Relative frequency
Number of times the event occurs / number of times the experiment is repeated
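As a quick illustration, relative frequency can be computed from a list of recorded outcomes. The die rolls below are invented data and the function name is hypothetical.

```python
def relative_frequency(outcomes, event):
    """Share of recorded outcomes for which the event occurred."""
    occurrences = sum(1 for outcome in outcomes if event(outcome))
    return occurrences / len(outcomes)

# Hypothetical record of 10 die rolls; the event is "rolling a six".
rolls = [3, 6, 1, 6, 2, 5, 6, 4, 6, 2]
rel_freq = relative_frequency(rolls, lambda roll: roll == 6)
print(rel_freq)  # 0.4
```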
Mutually exclusive
If two events can’t occur at the same time
Independent events
If the occurrence of one has no effect on the probability of the other occurring
Conditional probability
Probability of event A happening given that event B has happened
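For equally likely outcomes, P(A given B) = P(A and B) / P(B), which reduces to counting the outcomes in B that are also in A. A minimal sketch, assuming a fair six-sided die as the example experiment:

```python
from fractions import Fraction

def conditional_probability(sample_space, a, b):
    """P(A | B) for equally likely outcomes: count of (A and B) / count of B."""
    b_outcomes = [o for o in sample_space if b(o)]
    both = [o for o in b_outcomes if a(o)]
    return Fraction(len(both), len(b_outcomes))

# Hypothetical experiment: one roll of a fair die.
die = [1, 2, 3, 4, 5, 6]
p_even_given_big = conditional_probability(
    die,
    a=lambda x: x % 2 == 0,   # A: the roll is even
    b=lambda x: x > 3,        # B: the roll is greater than 3
)
print(p_even_given_big)  # 2/3
```

Of the three outcomes greater than 3 (4, 5 and 6), two are even, giving 2/3.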
Correlation
Measure of the strength and direction of the linear relationship between two variables
Variables in scatter graph
Independent (explanatory) variable is on the horizontal axis. Dependent (response) variable is on the vertical axis
Correlation coefficient
-1 is perfect negative correlation
0 is no correlation
1 is perfect positive correlation
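To see the extreme values in practice, the sketch below computes a correlation coefficient by hand. It assumes the coefficient meant here is the product moment (Pearson) correlation coefficient, which the card does not state, and it uses made-up data.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient of paired data (assumed definition)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])     # points on a rising straight line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a falling straight line
print(r_up, r_down)
```

Points lying exactly on a rising line give a value of 1 and points on a falling line give -1.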
Discrete data
Can take any one of a finite set of categories or values, but nothing in between those values. Often the values are different categories
Continuous data
Always numerical and can take any value between two points on a number line
Probability distribution
For a random experiment, shows how the total probability of 1 is distributed between all the possible outcomes
Discrete distribution
Shown in a bar chart - height of each bar represents probability
Total height of all bars = 1
Conditions for binomial distribution
Two possible outcomes in each trial
Fixed number of trials (n)
Independent trials
The probability of a success (p) is constant
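When these four conditions hold, the probability of exactly k successes is given by the standard binomial formula, (n choose k) * p^k * (1 - p)^(n - k). A minimal sketch with an invented example of n = 4 trials and p = 0.5:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 4 fair coin tosses, probability of exactly 2 heads.
p_two_heads = binomial_pmf(2, n=4, p=0.5)
print(p_two_heads)  # 0.375

# The probabilities over every possible k add up to 1.
total_probability = sum(binomial_pmf(k, n=10, p=0.3) for k in range(11))
print(total_probability)
```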