Chapter 7 Flashcards
What are the two types of data that can be relied on?
Primary data
Secondary data
What are features of primary data?
- Data collected by an investigator
- With a specific project or task in mind
- ‘new data’
- Tailored to requirements but time-consuming to collect
What are features of secondary data?
- Data collected by someone else for some other purpose
- E.g. using census data in a research project
What is a population?
A population is all members of a defined group, e.g. all males in the UK
What is a sample?
A sample is a subset of a population
• A sample should be representative and thus sufficiently large to reflect the population
What are types of sample?
Sample types
o Random – each member of the population has an equal chance of being selected
o Non-random – a selection criterion is used
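The two sample types can be sketched in Python as follows. This is only an illustration: the population of numbered members, the helper names and the "even-numbered" criterion are all assumptions made up for the example.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k members without replacement; every member is equally likely to be picked."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, k)

def non_random_sample(population, criterion):
    """Keep only the members that satisfy a chosen selection criterion."""
    return [member for member in population if criterion(member)]

# Hypothetical population: members numbered 1 to 100.
population = list(range(1, 101))

random_sample = simple_random_sample(population, 10, seed=0)
even_sample = non_random_sample(population, lambda member: member % 2 == 0)

print(random_sample)
print(len(even_sample))
```

The random sample changes with the seed, while the non-random sample always contains the same 50 even-numbered members.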
What are types of data and their features?
Discrete data
• Can only take certain values e.g. the result of rolling two dice
• Discrete data is counted
Continuous data
• Can take any value within an interval of the ‘real line’ (which runs from minus infinity to plus infinity), e.g. time taken in a race
• Continuous data is measured
What are measures of central tendency?
Arithmetic mean
o Simple average (add all the values together and then divide by the number of values)
Median
o Middle item when the values are arranged in order
Mode
o The most frequently occurring value
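A quick way to check the three measures is Python's standard `statistics` module. The scores below are made-up values chosen only for illustration.

```python
import statistics

# Hypothetical data set, already in ascending order for readability.
scores = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(scores)      # (2 + 3 + 3 + 5 + 7 + 10) / 6
median_value = statistics.median(scores)  # even count: average of the two middle items, 3 and 5
mode_value = statistics.mode(scores)      # 3 appears twice, more than any other value

print(mean_value, median_value, mode_value)
```

For this data the mean is 5, the median is 4 and the mode is 3, so the three measures need not agree.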
What is the range?
Range = the difference between the highest and lowest values in a data set
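As a small worked example, the range can be computed directly from the highest and lowest values. The dice totals here are invented for illustration.

```python
# Hypothetical totals from rolling two dice six times.
dice_totals = [4, 7, 9, 2, 12, 7]

# Range = highest value minus lowest value.
data_range = max(dice_totals) - min(dice_totals)

print(data_range)  # 12 - 2 = 10
```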
What does the interquartile range represent?
Represents the ‘middle 50%’ of a data set
Features of the first quartile
o First quartile (‘25th percentile’)
25% of observations lie below this
Middle item between the lowest value and the median (i.e. the median of the lower half of the ordered data)
Features of the second quartile
o Second quartile
50% of observations lie below this
The median
Features of the third quartile
o Third quartile (‘75th percentile’)
75% of observations lie below this
Middle item between the median and the highest value (i.e. the median of the upper half of the ordered data)
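The quartiles and the interquartile range can be sketched as below. This assumes the "median of each half" convention, which is only one of several in use, so statistical packages may report slightly different quartiles for the same data. The function name and the data are illustrative.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For an odd count, the median itself is left out of both halves.
    upper_half = ordered[half + n % 2:]
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )

# Hypothetical data set.
data = [1, 3, 4, 6, 7, 9, 10, 12]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # spread of the 'middle 50%' of the data

print(q1, q2, q3, iqr)
```

Here Q1 is 3.5, Q2 (the median) is 6.5 and Q3 is 9.5, giving an interquartile range of 6.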
What is skew?
Skewness is a measure of the degree of asymmetry of a distribution.
In a skewed distribution, what happens to the mode, mean and median?
In a skewed distribution the mode does not move and stays anchored to the highest point. The median typically lies between the mode and the mean. The mean is the measure most affected, as it is pulled furthest toward the tail.
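The effect can be seen with a small made-up data set that has a long right-hand tail. Adding one extreme value leaves the mode unchanged, nudges the median slightly and moves the mean a long way. The numbers are assumptions chosen purely for illustration.

```python
import statistics

# Hypothetical right-skewed data: mostly small values with a tail to the right.
values = [1, 2, 2, 2, 3, 4, 5, 9, 20]

mode_before = statistics.mode(values)
median_before = statistics.median(values)
mean_before = statistics.mean(values)

# Stretch the tail further by adding one extreme value.
stretched = values + [100]

mode_after = statistics.mode(stretched)
median_after = statistics.median(stretched)
mean_after = statistics.mean(stretched)

print("mode:  ", mode_before, "->", mode_after)
print("median:", median_before, "->", median_after)
print("mean:  ", round(mean_before, 2), "->", round(mean_after, 2))
```

In both versions of the data the order is mode, then median, then mean, which is the pattern described for a distribution skewed to the right.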