Collecting, presenting and summarising data Flashcards
Random variables (definition)
The quantities measured in a study
Data (definition)
A collection of such observations
Observation (definition)
A particular outcome
Population (definition)
The collection of all possible outcomes
Example
Study on the height of students of A&F course at Newcastle
What would be the random variable?
What would Joe Bloggs's measured height be called?
What would be our data?
This would be a _______ from the ____________ which consists of all students registered on A&F degrees
- Our random variable is “the height of students on A&F courses at Newcastle”.
- If Joe Bloggs is an A&F student, and we measured his height, then that value would be a single observation.
- If we measured the height of every first-year A&F student, we would have a collection of such observations, which would be our data.
- This would be a sample from the population, which consists of all students registered on A&F degrees.
Ideally, to get a true idea of what is going on, we’d like to observe the whole population (take a _______). However, this can be difficult:
Why would it be difficult?
A census
- If the population is huge, then this would take ages!
- And it would be very costly!
- In reality, we usually observe a subset of the population… but how do we choose who to observe?
Quantitative variables (2 types and explanation)
Discrete random variables
- can only take a sequence of distinct values (usually integers);
- are usually countable - e.g. the number of people attending a tutorial group;
- can be ordinal - where the outcomes are ordered.
Continuous random variables
- can take any value over some continuous scale - e.g. height or weight.
- can be measured to a very high degree of accuracy (often recorded as decimals), provided we have the equipment to do so;
- however, we can never say precisely how much someone weighs, for example;
- might be measured to the nearest whole number, and so could “look” discrete - be careful!
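The last point can be seen in a small Python sketch. The weights below are made-up values used only for illustration; rounding them to the nearest whole number gives integers, even though weight itself is continuous.

```python
# Hypothetical weights in kg (made-up values), recorded to two decimal places.
weights = [70.42, 65.18, 80.95, 72.31]

# Recording to the nearest whole number makes the data look discrete,
# but the underlying variable is still continuous.
recorded = [round(w) for w in weights]
print(recorded)  # [70, 65, 81, 72]
```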
Sampling
What is a sample?
What is the difficulty?
What is a biased sample?
- Subset of the whole population
- Obtaining a representative sample
- Unrepresentative and unfair
What are the general forms of sampling techniques?
- Random sampling - where the members of the sample are chosen by some random (i.e. unpredictable) mechanism.
- Quasi-random sampling - where the mechanism for choosing the sample is only partly random.
- Non-random sampling - where the sample is specifically selected rather than randomly selected.
Simple Random Sampling disadvantages
- We don’t have a complete list of the population
- Not all elements of the population are equally accessible
- By chance, you could pick an unrepresentative sample
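A minimal Python sketch of simple random sampling, assuming a complete list of the population is available (which, as noted above, is often not the case). The population of heights and the function name `simple_random_sample` are hypothetical, invented for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: heights in cm of 200 students (made-up values).
rng = random.Random(1)
population = [round(rng.gauss(170, 10), 1) for _ in range(200)]

sample = simple_random_sample(population, 20, seed=42)
population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

# The two means will usually differ a little; by chance they can differ a lot.
print(f"population mean: {population_mean:.1f}, sample mean: {sample_mean:.1f}")
```

Changing the `seed` gives a different sample, which is one way to see how much the sample mean can vary purely by chance.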
Stratified sampling
What is it?
What is its main idea?
- Form of random sample where clearly defined groups or strata exist within the population
- If we know the overall proportion of the population that falls into each of these groups, we can take a simple random sample from each of the groups and then adjust the results according to the known proportions.
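One common way to "adjust the results according to the known proportions" is to weight each group's sample mean by that group's share of the population. The sketch below assumes this weighting; the strata, the values and the name `stratified_estimate` are hypothetical.

```python
import random

def stratified_estimate(strata, proportions, n_per_stratum, seed=None):
    """Estimate the population mean from a simple random sample in each stratum.

    strata: dict mapping stratum name -> list of values in that stratum
    proportions: dict mapping stratum name -> known share of the population
    """
    rng = random.Random(seed)
    estimate = 0.0
    for name, values in strata.items():
        sample = rng.sample(values, n_per_stratum)
        stratum_mean = sum(sample) / len(sample)
        # Weight each stratum mean by its known population proportion.
        estimate += proportions[name] * stratum_mean
    return estimate

# Hypothetical strata (made-up values): two groups of different sizes.
strata = {
    "group_a": [160.0] * 300,  # 75% of the population
    "group_b": [180.0] * 100,  # 25% of the population
}
proportions = {"group_a": 0.75, "group_b": 0.25}

print(stratified_estimate(strata, proportions, n_per_stratum=10, seed=0))  # 165.0
```

Here the same number of observations is taken from each group, so simply pooling the two samples would give 170.0; the weighting brings the estimate back to the true population mean of 165.0.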
Systematic sampling
What is it a form of?
Example?
Disadvantage?
- Form of quasi-random sampling
- For example, picking every 10th item to come off the production line
- Not entirely random and can be biased
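A minimal sketch of the "every 10th item" example in Python. The random starting point is an assumption (it is the "partly random" element of the scheme), and the production line and the name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(items, step, seed=None):
    """Pick every `step`-th item, starting from a randomly chosen offset."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # the only random part of the scheme
    return items[start::step]

# Hypothetical production line of 100 items, numbered 0..99.
line = list(range(100))
sample = systematic_sample(line, step=10, seed=3)
print(sample)

# Bias risk: if a fault recurs with the same period as the step
# (here, every 10th item is faulty), the sample contains either
# only faulty items or none at all, depending on the starting point.
faulty = [i % 10 == 0 for i in line]
picked = systematic_sample(faulty, step=10, seed=3)
print(all(picked) or not any(picked))  # True
```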
Multi–stage Sampling
What is it a form of?
When is it common?
How does it work?
Example?
Advantage?
Disadvantage?
- This is another form of quasi–random sampling.
- These types of sampling schemes are common where the population is spread over a wide geographic area which might be difficult or expensive to sample from.
- Multi–stage sampling works, for example, by dividing the area into geographically distinct smaller areas, randomly selecting one (or more) of these areas and then sampling within these areas, whether by random, stratified or systematic sampling schemes.
- For example, if we were interested in sampling school children, we might take a random (or stratified) sample of education authorities, then, within each selected authority, a random (or stratified) sample of schools, then, within each selected school, a random (or stratified) sample of pupils.
- This is likely to save time and cost less than sampling from the whole population.
- The sample can be biased if the stages are not carefully selected. Indeed, the whole scheme needs to be carefully thought through and designed to be truly representative.
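The school children example can be sketched in Python as three nested random draws. This assumes simple random sampling at every stage (the text also allows stratified sampling); the nested dictionary of authorities, schools and pupils and the name `multi_stage_sample` are hypothetical.

```python
import random

def multi_stage_sample(authorities, n_authorities, n_schools, n_pupils, seed=None):
    """Three-stage sample: authorities, then schools, then pupils.

    authorities: dict of authority -> dict of school -> list of pupils
    """
    rng = random.Random(seed)
    chosen = []
    # Stage 1: random sample of education authorities.
    for authority in rng.sample(sorted(authorities), n_authorities):
        schools = authorities[authority]
        # Stage 2: random sample of schools within each selected authority.
        for school in rng.sample(sorted(schools), n_schools):
            # Stage 3: random sample of pupils within each selected school.
            chosen.extend(rng.sample(schools[school], n_pupils))
    return chosen

# Hypothetical population: 4 authorities x 5 schools x 30 pupils.
authorities = {
    f"authority_{a}": {
        f"school_{a}_{s}": [f"pupil_{a}_{s}_{p}" for p in range(30)]
        for s in range(5)
    }
    for a in range(4)
}

sample = multi_stage_sample(authorities, n_authorities=2, n_schools=2, n_pupils=5, seed=7)
print(len(sample))  # 20
```

Only 4 of the 20 schools are ever visited, which is where the saving in time and cost comes from, and also why a poor choice at an early stage can bias the whole sample.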
Cluster Sampling
What is it?
What does it differ from?
Advantage?
Disadvantage?
Example?
- This is a method of non–random sampling. For example, a geographic area is sub–divided into clusters and all the members of a particular cluster are then surveyed.
- This differs from multi–stage sampling covered in Section 3.2.4, where the members of the cluster were sampled randomly. Here, no random sampling occurs.
- The advantage of this method is that, because the sampling takes place in a concentrated area, it is relatively inexpensive to perform.
- The very fact that small clusters are picked to allow an entire cluster to be surveyed introduces the strong possibility of bias within the sample.
- If you were interested in the take-up of organic foods and were sampling via the cluster method, you could easily get biased results: if, for example, you picked an economically deprived area, the proportion of those surveyed that ate organically might be very low, while if you picked a middle-class suburb the proportion is likely to be higher than in the overall population.
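The organic food example can be illustrated with a short Python sketch. All figures are made up purely to show the effect, and the area names and the function `survey_cluster` are hypothetical.

```python
def survey_cluster(clusters, name):
    """Survey every member of one chosen cluster; return the proportion answering yes."""
    answers = clusters[name]
    return sum(answers) / len(answers)

# Hypothetical areas (made-up data): True means the household eats organic food.
clusters = {
    "deprived_area": [True] * 5 + [False] * 95,
    "middle_class_suburb": [True] * 40 + [False] * 60,
    "town_centre": [True] * 15 + [False] * 85,
}

# Proportion across the whole population, for comparison.
everyone = [a for answers in clusters.values() for a in answers]
overall = sum(everyone) / len(everyone)

print(f"overall: {overall:.2f}")
for name in clusters:
    print(f"{name}: {survey_cluster(clusters, name):.2f}")
```

With these made-up figures the overall proportion is 0.20, but surveying only the deprived area gives 0.05 and surveying only the suburb gives 0.40, so the result depends entirely on which cluster was picked.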
Judgemental sampling
What is it?
Advantage?
Example?
Disadvantage?
- Here, the person interested in obtaining the data decides whom they are going to ask.
- This can provide a coherent and focused sample by choosing people with experience and relevant knowledge to provide their opinions.
- For example, the head of a service department might suggest particular clients to survey based on his judgement. They might be people he believes will be honest or have strong opinions.
- This methodology is non–random and relies on the judgement of the person making the choice. Hence, it cannot be guaranteed to be representative. It is prone to bias.