Statistics Flashcards
Define population.
The whole set of items that are of interest.
Define a census.
A study that observes or measures every member of a population.
Define a sample.
A selection of observations taken from a subset of the whole population which is used to find out information about the population as a whole.
Define sampling units.
Individual units of a population.
Define sampling frame.
An ordered list of sampling units (e.g. a list of people).
What are the two general types of sampling?
- Random sampling
- Non-random sampling
What are the three types of random sampling?
- Simple random sampling
- Systematic sampling
- Stratified sampling
What is simple random sampling?
- Sampling frame is made
- A random selection of sampling units is made
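The two steps above can be sketched in Python. This is only an illustration: the function name, the seed argument and the numbered frame of 100 people are assumptions, not part of the cards.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct units; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return rng.sample(sampling_frame, sample_size)

# Hypothetical sampling frame: 100 numbered people.
frame = [f"person_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
```

The sample is drawn without replacement, so no unit can be picked twice.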
What is systematic sampling?
- Sampling frame is made
- The required number of elements is taken at regular intervals
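A minimal sketch of taking elements at regular intervals. The random starting point within the first interval is a common convention that is assumed here, not something the card states, and the names are hypothetical.

```python
import random

def systematic_sample(sampling_frame, sample_size, seed=None):
    """Take every k-th unit, starting from a random position in the first interval."""
    k = len(sampling_frame) // sample_size  # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)                # assumed: random start in 0..k-1
    return [sampling_frame[start + i * k] for i in range(sample_size)]

# Hypothetical sampling frame: units numbered 1 to 100, so k = 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=1)
```

The sketch assumes the frame holds at least as many units as the required sample size.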
What is stratified sampling?
- Population is divided into mutually exclusive strata
- A random sample is taken within each stratum
- The number of samples taken in each stratum should be proportional to its representation in the general population
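The proportional allocation can be sketched as follows: the number taken from a stratum is (stratum size / population size) x sample size. The strata names and sizes are made up for illustration.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Take a simple random sample from each stratum, sized in proportion to it."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation, rounded to a whole number of units.
        n = round(len(units) / population_size * sample_size)
        sample[name] = rng.sample(units, n)
    return sample

# Hypothetical population: 60 Year 12 students and 40 Year 13 students.
strata = {
    "year_12": [f"y12_{i}" for i in range(60)],
    "year_13": [f"y13_{i}" for i in range(40)],
}
sample = stratified_sample(strata, 10, seed=1)
```

Here 6 units come from year_12 and 4 from year_13. With less convenient numbers, rounding can make the total differ slightly from the requested sample size.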
What are the two types of non-random sampling?
- Quota sampling
- Opportunity sampling
What is quota sampling?
- Population is divided into groups according to a given characteristic
- The size of each group determines the proportion of the sample that should have that characteristic
- As you meet people, they are assessed and allocated into the appropriate quota
- This is done until all the quotas have been filled
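A rough sketch of filling quotas as people are met. Nothing here is random: people are processed in the order they turn up. The names, groups and quota sizes are hypothetical.

```python
def quota_sample(people_in_order_met, quotas, group_of):
    """Allocate each person to their group's quota until every quota is full."""
    filled = {group: [] for group in quotas}
    for person in people_in_order_met:
        group = group_of(person)
        # Accept the person only if their group's quota still has room.
        if group in filled and len(filled[group]) < quotas[group]:
            filled[group].append(person)
        # Stop as soon as all quotas have been filled.
        if all(len(filled[g]) == quotas[g] for g in quotas):
            break
    return filled

# Hypothetical passers-by, listed in the order they are met.
people = [("Ann", "adult"), ("Bo", "child"), ("Cy", "adult"),
          ("Di", "adult"), ("Ed", "child")]
sample = quota_sample(people, {"adult": 2, "child": 1}, group_of=lambda p: p[1])
```

Ann, Bo and Cy fill the quotas, so Di and Ed are never assessed.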
What is opportunity sampling?
- Taking the sample from the first people who are available at the time of the study and who fit the criteria
- This is done until enough samples are taken
What is the difference between stratified sampling and quota sampling?
- Stratified sampling -> Random -> People within each stratum are selected at random
- Quota sampling -> Not random -> You do not know your sampling frame and people are not chosen at random
What is another name for opportunity sampling?
Convenience sampling
Define continuous data.
Data that can take any value within a given range.
Define discrete data.
Data that can only take on certain specific values within a range.
What dates does the large data set contain data from?
- May to October 1987
- May to October 2015
What weather stations are used in the large data set?
- Leuchars
- Leeming
- Heathrow
- Hurn
- Camborne
- Jacksonville
- Beijing
- Perth
Name the UK weather stations in the large data set, starting from the north and going clockwise.
- Leuchars
- Leeming
- Heathrow
- Hurn
- Camborne
Name the only weather station in the large data set that is in the Southern hemisphere.
Perth
In the large data set, what is the daily mean temperature and what are the units?
- The average of the hourly temperatures during a 24-hour period
- °C
In the large data set, what is the daily total rainfall and what are the units?
- The total precipitation including solid precipitation, like snow and hail
- Amounts less than 0.05 mm are recorded as trace
- mm
In the large data set, what is the daily total sunshine and what are the units?
- The total sunshine time
- Recorded to the nearest tenth of an hour
- hrs
In the large data set, what is the daily mean wind direction and windspeed and what are the units?
- The average wind speed over 24 hours
- Knots (kn)
- Direction is given as a bearing and as a compass direction
- Windspeed is also categorised according to the Beaufort scale
On what scale is wind speed measured?
Beaufort scale
In the large data set, what is the daily maximum gust and what are the units?
- The highest instantaneous windspeed recorded
- The direction it is blowing from is also recorded
- Knots (kn)
In the large data set, what is the daily maximum relative humidity and what are the units?
- The air saturation with water vapour
- Given as a percentage
In the large data set, what is the daily mean cloud cover and what are the units?
- The mean cover of the sky with clouds
- Oktas (eighths of the sky covered)
In the large data set, what is the daily mean visibility and what are the units?
- The greatest horizontal distance at which an object can be seen in daylight
- Decametres (Dm)
In the large data set, what is the daily mean pressure and what are the units?
- The average air pressure for that day
- Hectopascals (hPa)
When comparing data sets, what can you comment on?
- Measure of location
- Measure of spread
Use the mean with the standard deviation OR the median with the IQR, but not any other combination.
When a data set contains extreme values and it needs to be compared, is it more appropriate to use the mean and standard deviation or median and IQR?
Median and IQR
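A small sketch of why this is so, using made-up data in which one value is replaced by an extreme one. It assumes Python's statistics module, whose default quartile method may differ slightly from the one used in a textbook.

```python
import statistics

def summary(data):
    """Return both pairs of measures: mean with sd, and median with IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles, default method
    return {
        "mean": statistics.mean(data),
        "sd": statistics.pstdev(data),           # population standard deviation
        "median": statistics.median(data),
        "iqr": q3 - q1,
    }

# Hypothetical data: identical except for one extreme value.
typical = [10, 11, 12, 13, 14, 15, 16]
with_outlier = [10, 11, 12, 13, 14, 15, 160]

typical_summary = summary(typical)
outlier_summary = summary(with_outlier)
```

The median and IQR are the same for both lists, while the mean and standard deviation of the second list are pulled upwards by the single extreme value.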
What are some other names for the independent and dependent variables?
- Independent -> Explanatory variable
- Dependent -> Response variable
What is bivariate data?
Data which has pairs of values for two variables.
What is a sample space?
The set of all outcomes in an experiment or the set of all values that a random variable can take on.
What is the term for when all of the probabilities in a sample space for a variable are the same?
Discrete uniform distribution
In binomial distributions, what is n sometimes called?
The index
In binomial distributions, what is p sometimes called?
The parameter
What is the way of writing binomial distributions, and what does each letter stand for?
X~B(n,p)
P(X=r)
Where:
- X = Variable
- B shows it is binomial
- n = Number of trials
- p = Probability of success
- r = Number of successful trials
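For X~B(n,p), the standard formula is P(X=r) = nCr x p^r x (1-p)^(n-r). A minimal sketch of that calculation, with a hypothetical function name and example values:

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Hypothetical example: exactly 2 heads in 4 tosses of a fair coin.
prob = binomial_pmf(2, 4, 0.5)  # 6 * 0.25 * 0.25 = 0.375
```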
What is a cumulative probability function?
One that tells you the sum of all the individual probabilities up to and including the given value of x in the calculation.
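As an illustration, the sketch below adds up the individual probabilities for a binomial variable. The choice of a binomial distribution and the example values are assumptions made for the example.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p): sum of P(X = r) for r = 0, 1, ..., x."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(x + 1))

# Hypothetical example: at most 2 heads in 4 tosses of a fair coin.
cumulative = binomial_cdf(2, 4, 0.5)  # 0.0625 + 0.25 + 0.375 = 0.6875
```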