Statistics Flashcards
Define population
The whole set of items that are of interest
Define sample
A subset of the whole population that is intended to represent the whole population
Define sampling unit
Each individual item within the population that can be sampled
Define sampling frame
A list that is formed when the individual sampling units are named or numbered
Define qualitative data
Non-numerical data values, e.g. the colour or model of a car
Define quantitative data
Numerical data, which can be discrete or continuous
What is the difference between continuous and discrete data?
Continuous data is measured and can take any value in a given range, including decimal values, whereas discrete data can only take certain separate values and is typically counted
Give an example of discrete data
Number of people or shoe size
Give an example of continuous data
Height
Describe the process of random sampling
The items in the population are numbered to form a sampling frame, and then a random number generator or another random process is used to select each sampling unit
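The process above can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 50 numbered units, the function name and the seed are assumptions made for the example.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Select n distinct sampling units, each equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: 50 sampling units numbered 1 to 50
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```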
What are the advantages of random sampling?
Cheap and easy for small populations, bias-free, each sampling unit has an equal chance of being selected
What are the disadvantages of random sampling?
Not suitable for large data sets, requires a sampling frame
Describe the process of systematic sampling
Items are chosen at regular intervals from an ordered sampling frame. The interval is k, where k = Population Size (N) / Sample Size (n), and the first item is chosen at random from positions 1 to k.
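A minimal sketch of this procedure, assuming for simplicity that N is an exact multiple of n. The frame of 100 numbered units and the function name are hypothetical.

```python
import random

def systematic_sample(sampling_frame, n, seed=None):
    """Take every k-th unit, starting from a random position between 1 and k."""
    N = len(sampling_frame)
    k = N // n  # interval; assumes N is a multiple of n
    rng = random.Random(seed)  # the seed only makes the example repeatable
    start = rng.randint(1, k)  # random start between 1 and k inclusive
    # Positions in the frame are 1-based, list indices are 0-based
    return [sampling_frame[start - 1 + i * k] for i in range(n)]

# Hypothetical sampling frame: 100 sampling units numbered 1 to 100
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```

With N = 100 and n = 10 the interval is k = 10, so consecutive selected units are always 10 positions apart.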
What are the advantages of systematic sampling?
Simple, quick, suitable for large populations
What are the disadvantages of systematic sampling?
Requires sampling frame, bias can be introduced if sampling frame is not ordered randomly
Describe the process of stratified sampling
The population is divided into mutually exclusive groups (strata) that represent different characteristics within the population, and a simple random sample is taken from each stratum. The number sampled from each stratum is (Number in stratum / Population Size (N)) × Sample Size (n), so every stratum is sampled in the same proportion n / N.
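As a worked example of the proportional allocation, the sketch below computes how many units to take from each stratum. The strata names and sizes are made up for illustration, and rounding to the nearest whole number is an assumption, so the totals may occasionally differ from n by one.

```python
def stratum_sample_sizes(strata_sizes, n):
    """Number to sample from each stratum: (stratum size / N) * n, rounded."""
    N = sum(strata_sizes.values())
    return {name: round(size * n / N) for name, size in strata_sizes.items()}

# Hypothetical population of 200 split into two strata, sample size 40
strata = {"Year 12": 120, "Year 13": 80}
sizes = stratum_sample_sizes(strata, 40)
print(sizes)  # {'Year 12': 24, 'Year 13': 16}
```

A simple random sample of the stated size would then be taken within each stratum.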
What are the advantages of stratified sampling?
Reflects the population structure and guarantees proportional representation of all groups within the population
What are the disadvantages of stratified sampling?
Requires a sampling frame, population must be clearly divisible into distinct strata, selection within each stratum has the same disadvantages as simple random sampling
Describe the process of quota sampling
The population is divided into groups according to a characteristic, a quota of items is set for each group, and the interviewer chooses sampling units until each quota is filled
What are the advantages of quota sampling?
Allows a small sample that is still representative of the whole population, quick, easy, inexpensive, no sampling frame required
What are the disadvantages of quota sampling?
Bias, population must clearly divide into groups, increasing scope means increasing groups, non-responses are not recorded
Describe the process of opportunity sampling
A sample is taken from any sampling unit that is available at the time of the study and meets the criteria
What are the advantages of opportunity sampling?
Easy and inexpensive
What are the disadvantages of opportunity sampling?
Bias, dependent on researcher, will not provide equal representation for the whole population
What equation is used for the median of a set?
n/2, which gives the position of the median in the ordered data rather than its value
What is interpolation?
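A method used to estimate a value, such as the median or a quartile, from its position within a class of a grouped frequency table, assuming the data is evenly spread within the class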
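A sketch of linear interpolation for the median of grouped data, using the position n/2 and assuming values are evenly spread within each class. The frequency table and function name are hypothetical.

```python
def interpolated_median(classes):
    """classes: ordered list of (lower_boundary, upper_boundary, frequency)."""
    n = sum(f for _, _, f in classes)
    target = n / 2  # position of the median
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            # Fraction of the way through the class, scaled by the class width
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("the table contains no data")

# Hypothetical grouped frequency table: 0-10 (5), 10-20 (10), 20-30 (5)
table = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]
median = interpolated_median(table)
print(median)  # 15.0
```

Here n = 20, so the median is at position 10, which is 5 values into the 10-20 class of frequency 10, i.e. halfway through it.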
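What should you do if the position of the median (n/2) is not a whole number?
Round up to the next whole position; if n/2 is a whole number, take the midpoint of that value and the next one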
What does the standard deviation represent?
A measure of how far the data values typically lie from the mean; it is the square root of the variance
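A short worked calculation may make this concrete. The sketch below uses the population form (dividing by n) on a made-up data set; the variable names are illustrative only.

```python
import math

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
# Variance: the mean of the squared distances from the mean
variance = sum((x - mean) ** 2 for x in data) / len(data)
# Standard deviation: the square root of the variance
std_dev = math.sqrt(variance)
print(mean, variance, std_dev)  # 5.0 4.0 2.0
```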
What is linear coding?
The process of applying a linear transformation to a set of data, of the form y = (x - a) / b, so that the mean or the variance can be calculated more easily
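As an illustration, the sketch below codes a made-up data set using y = (x - a) / b, works with the smaller coded values, then converts back using mean of x = b × (mean of y) + a and standard deviation of x = b × (standard deviation of y). The values of a and b are arbitrary choices for the example.

```python
import math

# Hypothetical data and coding constants
x = [1010, 1020, 1030, 1040, 1050]
a, b = 1000, 10

# Coded values: y = (x - a) / b gives 1, 2, 3, 4, 5
y = [(xi - a) / b for xi in x]

mean_y = sum(y) / len(y)
sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / len(y))

# Undo the coding: subtracting a shifts the mean only, dividing by b scales both
mean_x = b * mean_y + a
sd_x = b * sd_y
print(mean_x, sd_x)
```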
Define outlier
A measurement that does not fit the overall pattern of the data
Which bound is plotted in a cumulative frequency diagram?
The upper bound
Define extrapolation
Predicting a value outside the range of the given data
What does skew show?
The shape of the distribution, i.e. how asymmetrically the data is spread within the range
What does positive skew suggest?
There is a higher proportion of data towards the bottom of the range
What does negative skew suggest?
There is a higher proportion of data towards the top of the range
What two things are necessary for comparing data?
A measurement of spread and a measurement of location
Give an example of a measurement of spread
Range, interquartile range (IQR), variance or standard deviation
Give an example of a measurement of location
Median, mean or mode
Define interpolation
Predicting a value within the range of the given data
What is data with two variables called?
Bivariate data
Define experiment
A repeatable process that leads to a number of outcomes
Define event
An outcome or a set of outcomes from one instance of an experiment
Define sample space
The whole set of possible outcomes
What does it mean when two events are mutually exclusive?
They have no outcomes in common
What does it mean when two events are independent?
The occurrence of one has no effect on the probability of the other
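Both ideas can be checked on a small example. The sketch below assumes a fair six-sided die and three made-up events, testing mutual exclusivity as "no outcomes in common" and independence with the standard condition P(A and B) = P(A) × P(B).

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # roll is even
B = {1, 2}     # roll is at most 2
C = {1, 3}     # roll is 1 or 3

# Mutually exclusive: the events share no outcomes
print(A.isdisjoint(C))                      # True
# Independent: P(A and B) equals P(A) * P(B)
print(prob(A & B) == prob(A) * prob(B))     # True
# A and C are mutually exclusive but not independent
print(prob(A & C) == prob(A) * prob(C))     # False
```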
What does ‘X’ represent?
A random variable, whose value depends on the outcome of a single experiment or trial
What does P(X = x) mean?
The probability that X takes the value x
What does ‘x’ represent?
A particular value that the random variable X can take
Define critical region
A region of the probability distribution such that, if the test statistic falls within it, the null hypothesis is rejected