Foundation of Data Analytics Flashcards
Parameter Def.
Value of a variable that characterises a population
Statistic Def.
Value of a variable that characterises a sample
Population Parameter Examples
Mean (μ, "mu") and standard deviation (σ)
Sample Statistic Examples
Mean (x̄, "x bar"), sample standard deviation (s) and standard error (s/√n)
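A minimal sketch of how these sample statistics could be computed, using only Python's standard library. The data values are made up for illustration, and the standard error here is the usual estimate for the mean (sample standard deviation divided by the square root of n).

```python
import math
import statistics

# Hypothetical sample of 8 measurements (made-up values for illustration).
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0, 11.0]

n = len(sample)
x_bar = statistics.mean(sample)       # sample mean (x bar)
s = statistics.stdev(sample)          # sample standard deviation (n - 1 denominator)
standard_error = s / math.sqrt(n)     # estimated standard error of the mean

print(f"mean={x_bar:.2f}  sd={s:.2f}  se={standard_error:.2f}")
```

The standard error is smaller than the standard deviation whenever n is greater than 1, because it describes the spread of sample means rather than of individual values.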
Probability Sampling Def.
Every member of the population has a known, non-zero chance of being selected. Tends to give a more representative sample
Sampling Bias
Some members of a population are systematically more or less likely to be chosen than others, which produces a biased sample
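One way to see sampling bias is to simulate it. The sketch below assumes a hypothetical population of the values 1 to 100 and a biased scheme where larger values are proportionally more likely to be picked; both choices are illustrative only.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: the values 1..100 (true mean 50.5).
population = list(range(1, 101))
pop_mean = statistics.mean(population)

# Simple random sampling (with replacement): every member equally likely.
random_sample = rng.choices(population, k=5000)

# Biased sampling: larger values are proportionally more likely to be picked.
biased_sample = rng.choices(population, weights=population, k=5000)

random_mean = statistics.mean(random_sample)
biased_mean = statistics.mean(biased_sample)

print(f"population mean:    {pop_mean:.1f}")
print(f"random sample mean: {random_mean:.1f}")
print(f"biased sample mean: {biased_mean:.1f}")
```

The random sample's mean should land near the population mean, while the biased sample's mean is pulled well above it, and taking a bigger biased sample does not fix that.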
Uniform Occurrence
All outcomes are equally likely to occur (every value has the same chance of being chosen)
Sampling Distribution Def.
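The distribution of a statistic (here, the mean) across many samples of the same size drawn from one population. The mean of the sampling distribution should be close to the population mean. Approximately normal when the sample size is large enough (exactly normal only if the population itself is normal). Gives support for inferences from sample to population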
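A sampling distribution can be built by simulation. The sketch below assumes a hypothetical, strongly skewed population (exponential values) and a sample size of 50, both chosen only for illustration, and collects the means of many samples.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical, strongly right-skewed population (exponential, mean near 1).
population = [rng.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.mean(population)

# Draw many samples of the same size and record each sample's mean.
sample_size = 50
sample_means = [
    statistics.mean(rng.sample(population, sample_size))
    for _ in range(2000)
]

mean_of_sample_means = statistics.mean(sample_means)
spread_of_sample_means = statistics.stdev(sample_means)

print(f"population mean:      {pop_mean:.3f}")
print(f"mean of sample means: {mean_of_sample_means:.3f}")
print(f"sd of sample means:   {spread_of_sample_means:.3f}")
```

Even though the population is skewed, the sample means cluster tightly around the population mean, and a histogram of `sample_means` would look roughly bell-shaped.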
Statistical Dispersion Def.
Fluctuation of sample means across samples from the same population. Driven by sample size (bigger n = less fluctuation) and by variation/outliers in the original population (smaller sd = smaller fluctuation)
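The effect of sample size on the fluctuation of sample means can be checked with a small simulation. This sketch assumes a hypothetical population with mean 100 and sd 15, and compares the observed spread of sample means with sd/√n for n = 10 and n = 100.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: roughly normal, mean 100 and sd 15.
population = [rng.gauss(100, 15) for _ in range(20_000)]
sigma = statistics.pstdev(population)  # population standard deviation


def spread_of_sample_means(n, repeats=1000):
    """Standard deviation of the means of `repeats` random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


spread_small_n = spread_of_sample_means(10)
spread_large_n = spread_of_sample_means(100)

print(f"n=10:  observed {spread_small_n:.2f}, sd/sqrt(n) {sigma / math.sqrt(10):.2f}")
print(f"n=100: observed {spread_large_n:.2f}, sd/sqrt(n) {sigma / math.sqrt(100):.2f}")
```

Multiplying the sample size by 10 should shrink the fluctuation by a factor of about √10, roughly 3.2, not by a factor of 10.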
CLT States
For large enough samples, the sample mean should be close to the pop mean, the standard error is smaller than the pop standard deviation (SE = sd/√n), and the sample means are approximately normally distributed whatever the shape of the population (and the mean of the sampling distribution equals the pop mean)
Standard Error and How Accurately a Sample Represents the Pop
Large standard error = sample mean is likely to be an imprecise estimate of the true pop parameter. Small standard error = sample mean is likely to be close to it
Random Variation
How unexplainable variation can be accounted for and modelled. May arise in observations through random sampling
What the sampling distribution says about sampling error
Its spread (the standard error) estimates the random sampling error
Interval Estimates
Selecting a range of values that could plausibly contain the pop mean (e.g. a confidence interval)
Point Estimate
Selecting a single value (e.g. the sample mean) as the estimate of the pop mean
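The last two cards can be put side by side in a short sketch. It assumes a hypothetical sample of 40 simulated values and uses a normal-approximation 95% confidence interval (z of about 1.96); for small samples a t-based interval would usually be preferred.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical sample of 40 observations (simulated for illustration).
sample = [rng.gauss(50, 10) for _ in range(40)]
n = len(sample)

# Point estimate: one value proposed for the population mean.
point_estimate = statistics.mean(sample)

# Interval estimate: a range around the point estimate.
standard_error = statistics.stdev(sample) / math.sqrt(n)
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for 95% coverage
lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error

print(f"point estimate:    {point_estimate:.2f}")
print(f"interval estimate: ({lower:.2f}, {upper:.2f})")
```

A larger standard error widens the interval, which is another way of saying the point estimate is less precise.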