Lecture Six Flashcards
What is the purpose of statistical inference
to obtain information about a population from information contained in a sample
What is a population
the set of ALL elements of interest
What is a sample
a subset of the population
What do the sample results provide
only ESTIMATES of the values of the population characteristics
What can proper sampling methods do
sample results can provide GOOD estimates of the population characteristics
why do we work with samples
in statistical inference it is usually too costly in terms of time and money to work with the whole population
-working with a sample is easier and more efficient
-working with the population is costly and sometimes not possible
-e.g. a census takes a huge amount of money and time to check every single person in the population
=working with a sample reduces time and effort
what is the true value
the population value (the population parameter)
What is the idea when working with samples
-the idea is to get the inference as close as possible to the population (true) value
-this is important for the decisions that are made based on it
Why do we work with random sampling
-to avoid bias
-it allows us to use probability to make inferences about unknown population parameters (e.g. mean/variance)
-this only works if the sample is random; otherwise there is no basis for using probability to make the inference
What is the reason for the probability when working with random variables
-to answer questions such as: what is the probability that the sample estimate (e.g. the sample variance) is close to the population value
What shape is the population data graph
a bell shape
-e.g. selecting the first 1000 for the sample, then the next 1000, etc. (the first, second and last parts)
what happens with the sample graph shape if it is accurate
when a sample is taken properly, its shape resembles the population shape
-if the shapes are similar, the sample values are close to the population values
what happens with the sample graph shape if it is not accurate
-when there are mistakes, the sample doesn't represent the entire population
-it is not shaped similarly to the population
-a different shape means the values are not the same as the population's
what is meant by arbitrary sampling
a non-random sampling method where units are selected in a haphazard manner, with little or no planning.
Why is arbitrary sampling bad
Arbitrary sampling is biased, and the results are speculative
What would an arbitrary sample graph look like
different - not similar to the population = inaccurate representation
Finite population defined by? and examples
often defined by lists:
-organisation membership roster
-credit card account numbers
what is a simple random sample of size n from a finite population of size N
a sample selected such that each possible sample of size n has the same probability of being selected
What is sampling with replacement
replacing each sampled element before selecting subsequent elements
-sampling WITHOUT replacement is the procedure used most often
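A minimal sketch of the difference between sampling with and without replacement. The population of elements 1 to 10, the sample size and the seed are made up for illustration.

```python
import random

rng = random.Random(0)              # fixed seed, only so the run is repeatable
population = list(range(1, 11))     # hypothetical population: elements 1..10
n = 5

# With replacement: each element goes back before the next draw, so repeats are possible.
with_replacement = rng.choices(population, k=n)

# Without replacement: an element is removed once drawn, so there are no repeats.
without_replacement = rng.sample(population, n)

print(with_replacement)
print(without_replacement)
```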
is finite population bias
yes
What are infinite populations defined by
an ongoing process where the elements of the population consist of items generated as though the process would operate indefinitely
what are the conditions when selecting a simple random sample from an infinite population
-each element comes from the SAME population
-each element is selected independently
is it easy to get all of the elements in the population for an infinite population
no, it is impossible to obtain a list of ALL elements in the population
-e.g. the human population: people die and people are born, so the numbers are constantly changing
=impossible to track the values
can the random number selection procedure be used for an infinite population
no.
how is simple random sampling chosen
chosen by a process that selects a sample of n objects from a population in such a way that each member of a population has the same probability of being selected
find the simple random sample if population size is N=5, and we want a sample size of n=2
The total number of possible pairs is C(5,2) = 5!/(2! x 3!) = 10
-therefore each pair has a probability of 1/10
-can number the pairs 1 to 10 and use a discrete Uniform distribution: Unif(1,10), where 1 and 10 are the lower and upper limits
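The N=5, n=2 card can be checked with a short sketch. The element labels 1 to 5 and the seed are assumptions; the idea is to list every pair and then pick one with a uniform draw on 1..10.

```python
from itertools import combinations
import random

population = [1, 2, 3, 4, 5]        # N = 5, labels are hypothetical
n = 2

# Every possible sample of size n (order does not matter).
all_samples = list(combinations(population, n))
num_samples = len(all_samples)      # C(5, 2)
prob_each = 1 / num_samples         # same probability for every pair

# Draw an index uniformly from 1..num_samples (both ends included) and take that pair.
rng = random.Random(0)              # fixed seed, only so the run is repeatable
index = rng.randint(1, num_samples)
chosen = all_samples[index - 1]

print(num_samples, prob_each, chosen)
```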
what does systematic sampling do
provide a convenient way to choose a random sample as every kth member is selected from a list/other ordering
what is the systematic sample when population N=55000 (names listed in alphabetical order) and we want a random sample of n = 250 names
systematic sampling would select every kth member of the population, where k = N/n = 55000/250 = 220
-there are 220 different possible samples, depending on the first name chosen (a random start among the first 220), and each is equally likely
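A sketch of the systematic sampling card, assuming the names are numbered 1 to 55000 in list order and the start is drawn at random from the first k positions. The seed is arbitrary.

```python
import random

N = 55_000                          # population size
n = 250                             # desired sample size
k = N // n                          # sampling interval: every k-th name

rng = random.Random(0)              # fixed seed, only so the run is repeatable
start = rng.randint(1, k)           # random start among the first k positions

# Positions of the selected names: start, start + k, start + 2k, ...
positions = list(range(start, N + 1, k))

print(k, start, len(positions))
```

Each of the k possible starting positions gives a different sample, which is why there are k possible samples.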
an example of when businesses may use systematic sampling
when businesses are auditing - they need to choose companies to audit at random
an example of stratified sampling
to obtain a stratified random sample according to age - small age groups can be formed
-first group (0-5)
-second group (6-10)
-third group (11-15)
-fourth group (16-20) etc.
once the groups are identified - a sample is drawn from each group using the simple random sample approach
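A rough sketch of the age-group example. The population of 1000 people, their ages and the 10 people drawn per group are all made-up assumptions; only the age bands come from the card.

```python
import random

rng = random.Random(0)              # fixed seed, only so the run is repeatable

# Hypothetical population: (person_id, age) pairs with ages cycling through 0..20.
population = [(i, i % 21) for i in range(1000)]

strata_bounds = [(0, 5), (6, 10), (11, 15), (16, 20)]   # age groups from the card
per_stratum = 10                    # assumed sample size per group

stratified_sample = {}
for low, high in strata_bounds:
    # Step 1: form the group (stratum) by age.
    stratum = [person for person in population if low <= person[1] <= high]
    # Step 2: take a simple random sample inside the group.
    stratified_sample[(low, high)] = rng.sample(stratum, per_stratum)

for bounds, members in stratified_sample.items():
    print(bounds, [age for _, age in members])
```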
when is stratified sampling used
when a condition is unevenly distributed in a population with respect to age, gender or some other variable
where is stratified sampling commonly used
portfolio managers, hedge funds
=e.g. very risky stocks represent one group / liquid stocks another
-having liquid stock – need a premium to hold the stock to get a larger return (
what is meant by point estimation
using data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter
what do we refer to as the x with the line on top
the point estimator of the population mean μ (mu)
What is Sx in the point estimator context
Sx = point estimator of population standard deviation
what is p with the line on top
point estimator of population proportion p
when is the point estimator unbiased
when the expected value of a point estimator is equal to the population parameter
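A small check of unbiasedness for the sample mean, using a made-up population of five values. Averaging the sample mean over every equally likely sample of size 2 gives its expected value, which can then be compared with the population mean.

```python
from itertools import combinations
from statistics import fmean

population = [2, 4, 6, 8, 10]       # hypothetical population values, N = 5
mu = fmean(population)              # population mean

# Sample mean of every possible sample of size n = 2.
sample_means = [fmean(s) for s in combinations(population, 2)]

# Each sample is equally likely, so the expected value is the plain average.
expected_x_bar = fmean(sample_means)

print(mu, expected_x_bar)
```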
What are the two sources of errors that can occur when sampling randomly from a population
sampling error and non sampling error
what is the sampling error
the inevitable result of basing an inference on a random sample rather than on the entire population
when does the non response bias occur
when a portion of the sample fails to respond to the survey
when does measurement error occur
when the responses to the question do not reflect what the investigator has in mind
-when the variables considered are estimates rather than true values
what is the sampling error for the sample mean (equation)
x (with line on top) − μ
what is the sampling error for the sample standard deviation
Sx − σ
what is the sampling error for the sample proportion
p (with line on top) − p
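A sketch of the three sampling errors on simulated data. The population (10,000 normal values with mean 50 and standard deviation 10), the sample size of 100 and the "greater than 60" proportion are all assumptions for illustration.

```python
import random
import statistics

rng = random.Random(1)              # fixed seed, only so the run is repeatable

# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)                       # population mean
sigma = statistics.pstdev(population)                   # population standard deviation
p = sum(x > 60 for x in population) / len(population)   # population proportion above 60

# Simple random sample of 100 values and its point estimates.
sample = rng.sample(population, 100)
x_bar = statistics.fmean(sample)                        # point estimate of mu
s_x = statistics.stdev(sample)                          # point estimate of sigma
p_bar = sum(x > 60 for x in sample) / len(sample)       # point estimate of p

# Sampling error = point estimate minus the population parameter.
errors = {"mean": x_bar - mu, "std": s_x - sigma, "proportion": p_bar - p}
print(errors)
```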
the sample mean of random variables is
the sum of all observations divided by n (i.e. multiplied by 1/n)
what type of distribution can x have
normal, uniform or any other type
what should the expected value of the sample mean be equal to
the population mean
what happens to the variance of the sample mean if n increases
the variance (σ²/n) shrinks - closer to 0
what is true data
population data
how does sample data become closer to the true value
-it needs a huge amount of data to approach the true value
-more data = closer to the true value
what is the standard error
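the standard deviation of the sample mean: σ/sqrt(n), estimated by sqrt(sample variance/n)
what changes the value of the standard error
the sample size n: more values in the sample data give a smaller standard error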
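A sketch of how the estimated standard error, sqrt(sample variance/n), changes with n. The uniform population and the sample sizes are made-up assumptions.

```python
import math
import random
import statistics

rng = random.Random(2)              # fixed seed, only so the run is repeatable

# Hypothetical population: 50,000 values spread uniformly between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(50_000)]

def standard_error(sample):
    """Estimated standard error of the sample mean: sqrt(sample variance / n)."""
    return math.sqrt(statistics.variance(sample) / len(sample))

# Larger samples should give smaller standard errors.
se_by_n = {}
for n in (25, 100, 400, 1600):
    sample = rng.sample(population, n)
    se_by_n[n] = standard_error(sample)
    print(n, round(se_by_n[n], 3))
```

Because n sits under the square root, quadrupling the sample size roughly halves the standard error.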
how to know if sample members are distributed independently of one another
they can be treated as independent if the sample size n is a small fraction of the population size N; if n is not a small fraction of N, they are not independent
what is the variance of the sample mean if observations are not selected independently
(σ²/n) times (N-n)/(N-1)
what is the finite population correction factor
(N-n)/(N-1)
when is a finite population treated as an infinite population
if n/N is smaller than or equal to 0.05
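A sketch that combines the correction factor with the n/N rule. The function names and the numbers in the usage lines are hypothetical.

```python
def fpc(N, n):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (N - n) / (N - 1)

def variance_of_sample_mean(sigma2, n, N=None):
    """Variance of the sample mean for population variance sigma2.

    N=None means the population is treated as infinite. A finite population
    is also treated as infinite when n/N <= 0.05.
    """
    base = sigma2 / n
    if N is None or n / N <= 0.05:
        return base
    return base * fpc(N, n)

# n/N = 0.10, so the correction applies and the variance is reduced.
print(variance_of_sample_mean(25, 10, N=100))
# n/N = 0.01, so the population is treated as infinite.
print(variance_of_sample_mean(25, 10, N=1000))
```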
what must you assume about the population (infinite or finite) when working out the variance
always infinite