QM LM7 Estimation and inference Flashcards
What is a sample?
- A subset of a population, used to obtain information about the population’s parameters (mu and sigma)
- The parameters are estimated through sample statistics (XBar and S)
What is probability sampling?
- Where every member of a population has an equal chance of being selected
- Therefore samples are more likely to be representative of the population than non-probability samples
What is simple random sampling?
- A form of probability sampling
- Where a subset of a larger population is created such that each element has an equal probability of being selected
- E.g. if population size N = 500
- A random number generator selects 50 distinct numbers between 1 and 500
- The members with those numbers form a sample of 50
- This method is useful when data are homogeneous
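The N = 500, sample of 50 example can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample` and the seed are assumptions, and members are assumed to be numbered 1 to N.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct member numbers from 1..population_size."""
    rng = random.Random(seed)  # the seed only makes the draw reproducible
    # random.sample draws without replacement, so each member number
    # has the same chance of ending up in the sample.
    return rng.sample(range(1, population_size + 1), sample_size)

sample_ids = simple_random_sample(500, 50, seed=42)
print(len(sample_ids))
```

Each run with a different seed gives a different set of 50 member numbers.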
What is systematic sampling?
- A form of probability sampling used when we cannot code (number), or even identify, every member of the population
- Select every kth element until the desired sample size is reached
- E.g., an auditor auditing a company’s accounts might examine every 10th account receivable, because there are too many to examine them all
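A minimal sketch of taking every kth element, using the auditor example with k = 10. The list of receivable IDs is hypothetical, and starting at a random position within the first k items is an assumption added here to keep the selection random.

```python
import random

def systematic_sample(items, k, seed=None):
    """Select every k-th item, starting at a random position in the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start index in 0..k-1
    return items[start::k]    # then every k-th item after that

receivables = list(range(1, 1001))  # hypothetical account IDs 1..1000
audited = systematic_sample(receivables, k=10, seed=1)
print(len(audited))
```

With 1000 receivables and k = 10, the auditor examines 100 accounts instead of all 1000.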
What is stratified random sampling?
- A form of probability sampling used where the population is sub-divided based on one or more classifications
- E.g., if surveying a large group of people we might subdivide by sex, age, and income level
- The size of each sub sample is proportionate to the size of its sub population
- This guarantees that every population subdivision is represented in the sample, making the estimates more precise (lower variance) than simple random sampling
- A simple random sample is drawn from each sub population, and these samples are then pooled to form the main sample
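The procedure (split into sub populations, draw a proportionate simple random sample from each, pool the results) might look like the sketch below. The names `stratified_sample` and `stratum_of`, the 10% sampling fraction, and the made-up population of 600 "F" and 400 "M" members are all assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, fraction, seed=None):
    """Proportionate stratified sampling: same sampling fraction in each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)  # group by classification
    pooled = []
    for group in strata.values():
        n = round(len(group) * fraction)       # sub sample size proportional to stratum size
        pooled.extend(rng.sample(group, n))    # simple random sample within the stratum
    return pooled

# Hypothetical population: 600 members in stratum "F", 400 in stratum "M"
people = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(people, stratum_of=lambda p: p[0], fraction=0.1, seed=7)
print(len(sample))
```

Because the same fraction is applied to every stratum, the pooled sample keeps the 60/40 split of the population.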
What is sampling error?
- The difference between the observed value of a sample statistic and the population parameter it estimates (e.g. XBar minus mu)
- It arises because we use just a subset of the population
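As a rough illustration, the sampling error of the mean can be computed directly when the whole population is known. The simulated population (normal, mean 100, standard deviation 15) and the sample size of 50 are assumptions chosen only for this sketch.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 values
population = [rng.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)      # population parameter

sample = rng.sample(population, 50)   # one simple random sample
x_bar = statistics.mean(sample)       # sample statistic

sampling_error = x_bar - mu           # statistic minus parameter
print(round(sampling_error, 3))
```

In practice mu is unknown, so the sampling error cannot be observed directly; it can only be characterised through the sampling distribution.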
How do we find sampling distribution of the sample means?
- Take many samples from a population
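- Find their means. The sample mean differs from sample to sample, so it is itself a random variable
- Put their means together; when the sample size is large they will form an approximately normal distribution (central limit theorem)
- Find the standard deviation of this distribution (this is the standard error of the sample mean)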
- Done!
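These steps can be simulated as in the sketch below. The skewed (exponential) population, the sample size of 50, and the 2,000 repeated samples are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical, deliberately skewed population
population = [rng.expovariate(1.0) for _ in range(20_000)]
sample_size = 50

# Take many samples and record each sample's mean
sample_means = [
    statistics.mean(rng.sample(population, sample_size))
    for _ in range(2_000)
]

mean_of_means = statistics.mean(sample_means)  # estimate of the population mean
sd_of_means = statistics.stdev(sample_means)   # spread of the sampling distribution

# For comparison: population standard deviation divided by sqrt(n)
theoretical = statistics.pstdev(population) / sample_size ** 0.5
print(round(mean_of_means, 3), round(sd_of_means, 3), round(theoretical, 3))
```

The mean of the sample means should land close to the population mean, and the standard deviation of the sample means should be close to sigma divided by the square root of n, even though the population itself is not normal.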
What is standard error?
- The standard deviation of the sampling distribution of the sample mean
- Estimated by dividing the sample standard deviation S by the square root of the sample size n
- It measures the precision we can attach to our estimate created by sampling the population
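A small worked sketch of the calculation. The eight observations are made-up numbers and `standard_error` is a hypothetical helper name.

```python
import math
import statistics

def standard_error(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(len(sample))

observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
se = standard_error(observations)
print(round(se, 4))
```

Here the squared deviations from the mean of 5 sum to 32, so S is the square root of 32/7 and the standard error is that value divided by the square root of 8.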
What is cluster sampling?
- Where the population is divided into clusters, each of which is a mini representation of the whole population
- Certain clusters are then selected as a whole using simple random sampling. This is called “one stage cluster sampling”
- If we also sample WITHIN each selected cluster, this is called “two stage cluster sampling”
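The difference between one stage and two stage cluster sampling can be sketched as follows. The function name `cluster_sample`, its `within` parameter, and the ten hypothetical branches of 20 accounts each are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, within=None, seed=None):
    """Pick whole clusters at random; optionally subsample inside each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: choose clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if within is None:
            sample.extend(members)                      # one stage: keep every member
        else:
            sample.extend(rng.sample(members, within))  # two stage: sample within cluster
    return chosen, sample

# Hypothetical population: 10 branches, 20 accounts each
clusters = {
    f"branch_{b}": [f"branch_{b}_acct_{i}" for i in range(20)]
    for b in range(10)
}
_, one_stage = cluster_sample(clusters, 3, seed=3)
_, two_stage = cluster_sample(clusters, 3, within=5, seed=3)
print(len(one_stage), len(two_stage))
```

One stage sampling keeps all 60 accounts from the 3 chosen branches, while two stage sampling keeps only 5 accounts from each of them.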
What are the drawbacks of cluster sampling?
- Usually gives the lowest precision of the probability sampling methods, since a selected cluster may not be representative of the population
- However, it is time-efficient and cost-effective
What is non-probability sampling?
- Depends on factors such as judgement or convenience (in terms of access to data)
- Runs the risk that samples may be non-representative
What is convenience sampling?
- A form of non-probability sampling
- Observations are selected that are easy to obtain or are accessible
- Not necessarily representative, but low cost
What is judgemental sampling?
- A form of non-probability sampling
- Observations are selected based on the researcher’s experience and knowledge
- Useful when there is a time constraint and/or the specialty of the researcher would result in better representation
- E.g., during an audit, an auditor may look at specific accounts or kinds of transactions, knowing that if these are okay the rest usually are too
How do we estimate population mean based on samples?
- Take the mean of the sample means
What happens when we take many samples from the population?
- As the sample size n increases, the distribution of sample means (when plotted on a histogram) gets narrower: the tails shrink and the peak gets taller
- When the sample size is something like 1000, the sampling distribution of the sample means is so tightly concentrated around the population mean that the histogram looks almost like a single spike up the centre, so the sample mean is a very accurate estimate
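The narrowing can be checked with a small simulation. The normal population (mean 50, standard deviation 10), the three sample sizes, and the 300 repetitions per size are assumptions chosen for this sketch.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 20,000 values
population = [rng.gauss(50, 10) for _ in range(20_000)]

def spread_of_sample_means(n, n_samples=300):
    """Standard deviation of the means of n_samples samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(n_samples)]
    return statistics.stdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}
for n, spread in spreads.items():
    print(n, round(spread, 3))
```

The spread falls roughly in proportion to one over the square root of n, which is why the histogram of sample means gets taller and thinner as the sample size grows.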