Collection Of Data (1) Flashcards
What is quantitative data
Numerical data
What is qualitative data
Non-numerical data
What is continuous data
Can take any value on a continuous scale (e.g. mass and length, which can both be decimals)
What is discrete data
Takes only particular values (e.g. whole numbers) on a numerical scale
What is raw data
Data that is just as collected - not grouped or rounded
What is categorical data
Data which can be sorted into non-overlapping categories
What is ordinal data
Data that can be written in order or given a numerical rating scale
What is bivariate data
Involving pairs of related data
What is multivariate data
Involves sets of three or more related data values
What are class intervals
Groups that do not overlap, into which data can be sorted, e.g.
1-10
11-20
21-30
Intervals do not need to be of equal width. Use narrower intervals where the data is close together and wider intervals where the data is spread out
Why do you need to be careful choosing intervals
Poorly chosen intervals can obscure trends in the data
What are the class intervals like in continuous data
The class intervals must not have gaps
If the intervals were 0-5 and 6-10, where would 5.5 go?
For continuous data use < and <=
E.g.
0 < t <= 10
10 < t <= 20
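A minimal Python sketch of how a continuous value can be placed in a gap-free class interval of the form lower < t <= upper. The function name class_interval and the fixed width of 10 are assumptions made for illustration only.

```python
import math

def class_interval(t, width=10):
    """Return (lower, upper) such that lower < t <= upper.

    Assumes equal-width intervals that start at 0 (an illustrative choice).
    """
    if t <= 0:
        raise ValueError("t must be greater than 0 for intervals of the form 0 < t <= width")
    # Rounding up puts a boundary value such as 10 in the lower interval (0 < t <= 10).
    upper = math.ceil(t / width) * width
    return upper - width, upper

print(class_interval(5.5))   # falls in 0 < t <= 10
print(class_interval(10))    # boundary value stays in 0 < t <= 10
print(class_interval(10.5))  # falls in 10 < t <= 20
```

Because every value has exactly one interval, there is no question of where a value such as 5.5 should go.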
When given a rounded number, between which values could the original number lie
E.g. 230 (rounded to the nearest whole number)
229.5 <= x < 230.5
A measurement given correct to the nearest whole unit can be inaccurate by up to 0.5 of a unit
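The bounds of a rounded value can be sketched in Python as below. The function name rounding_bounds and its unit parameter are hypothetical, chosen for this example.

```python
def rounding_bounds(rounded, unit=1):
    """Return (lower, upper) where lower <= x < upper for a value rounded to the nearest `unit`."""
    half = unit / 2
    return rounded - half, rounded + half

lower, upper = rounding_bounds(230)
print(lower, upper)  # 229.5 230.5
```

The lower bound is included and the upper bound is not, matching 229.5 <= x < 230.5.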
What is primary data
Data collected by or for the person who is going to use it
What is secondary data
Data collected by someone else
What is the population
Everything / everybody in a group that you are investigating
What is a census
A survey / investigation with data taken from every member of a population
(The national census collects data from every member of the UK population)
What is a sample
Information about part of a population - to avoid bias it should represent the characteristics of the population
Advantages and disadvantages of primary data
Advantages
The collection method is known
Accuracy is known
Can find answers to very specific questions
Disadvantages
Time-consuming to collect
Expensive to collect
Advantages and disadvantages of secondary data
Advantages
Easy to obtain
Cheap to obtain
Data from some organisations (e.g. the Office for National Statistics) is more reliable
Disadvantages
Method of data collection unknown
Data might be out of date
May contain mistakes
May come from an unreliable source
May be difficult to find answers to specific questions
Advantages and disadvantages of a Census
Advantages
Unbiased
Accurate
Takes whole population into account
Disadvantages
Time-consuming
Expensive
Difficult to ensure the whole population is used
Lots of data to handle
Advantages and disadvantages of a sample
Advantages
Cheaper
Less time consuming
Less data to be considered
Disadvantages
Not completely representative
May be biased
What are sampling units
The people or items that will be sampled
What is the sampling frame
A list of all sampling units
What is the Petersen capture recapture formula
N = Mn/m
N = the estimated size of the population
M = the number of members of the population tagged / marked in the first capture
n = the size of the new capture (after time waited)
m = the number in the new capture that are marked
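A short Python sketch of the formula N = Mn/m. The function name and the numbers in the example call are made up for illustration and do not come from a real survey.

```python
def petersen_estimate(marked_first, second_capture, marked_in_second):
    """Estimate population size N = M * n / m (Petersen capture-recapture)."""
    if marked_in_second == 0:
        raise ValueError("No marked individuals were recaptured, so N cannot be estimated")
    return marked_first * second_capture / marked_in_second

# Hypothetical figures: 50 marked at first, 40 caught later, 8 of those were marked.
print(petersen_estimate(50, 40, 8))  # 250.0
```

The result is only an estimate, and it relies on the population staying the same between the two captures.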
What is the Petersen capture recapture formula used for
Estimating the size of a population
E.g small insect populations that would be impossible to count
What assumptions do we make when using the capture recapture formula
The population hasn't changed (no members have joined or left, and none have been born or died)
The probability of being caught is equal for all individuals
The marks/tags are not lost and are easily recognisable
The sample size is large enough to represent the population
What is a random sample
A method of sampling where every member of a population has an equal chance of being included
It is unbiased
How can we take a random sample
Number each piece of data then select the numbered items by:
Using a random number table
Using a random number generator/calculator
Using a computer or app to generate random numbers
Putting the numbers in a hat and taking them out
Rolling a set of fair 10-sided dice
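The "number each item, then pick numbers at random" idea can be sketched with Python's standard random module. The function name, the population size of 1000 and the sample size of 60 are assumptions for this example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct item numbers from 1..population_size, each equally likely."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    return rng.sample(range(1, population_size + 1), sample_size)

chosen = simple_random_sample(1000, 60, seed=1)
print(sorted(chosen))
```

No number can be chosen twice, so every numbered item appears in the sample at most once.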
Advantages of random sampling
More likely to represent the population (if large enough)
Choice of members of sample is unbiased
Disadvantages of random sampling
Needs a full list of the whole population
Needs a large sample size
What is judgement sampling
Use your judgement to select a sample representative of the population
What is opportunity sampling
Using people or objects available at the time
What is cluster sampling
Use when data naturally splits into groups
(E.g geographical areas)
The list of clusters is the sampling frame and some clusters are randomly selected from the list to make the sample
What is systematic sampling
Choose a starting point in your sampling frame at random
Then choose items at regular intervals, e.g. every 10th person after your starting point (a random number generator can be used to choose the starting position)
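A minimal Python sketch of systematic sampling. It assumes the random starting point is taken from the first `step` items of the sampling frame, which is one common choice. The function name and the frame of 100 numbered people are hypothetical.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Pick a random start within the first `step` items, then take every `step`-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random index from 0 to step - 1
    return frame[start::step]

frame = list(range(1, 101))  # hypothetical sampling frame: people numbered 1 to 100
print(systematic_sample(frame, 10, seed=3))
```

Only the starting point is random. After that, the selection is fixed by the interval.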
What is quota sampling
Group the population by characteristics
Interview a quota (number) from each group
What is stratified sampling
The population is divided into strata (groups), and the number sampled from each stratum is in proportion to the size of that stratum
The members are selected randomly within each stratum
E.g
Year      7    8    9    10   11
Students  250  250  200  150  150
Total students = 1000
Divide the size of each stratum by the total, then multiply by the sample size you want (e.g. 60)
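The calculation for the year-group example above can be sketched in Python. The function name is hypothetical, and rounding to the nearest whole student is an assumption. In other examples the rounded numbers may not add up exactly to the sample size, so they may need adjusting by hand.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Number to sample from each stratum: stratum size / total * sample size."""
    total = sum(strata_sizes.values())
    return {name: round(size * sample_size / total)
            for name, size in strata_sizes.items()}

year_groups = {7: 250, 8: 250, 9: 200, 10: 150, 11: 150}
print(stratified_allocation(year_groups, 60))
# {7: 15, 8: 15, 9: 12, 10: 9, 11: 9}
```

Here 15 + 15 + 12 + 9 + 9 = 60, so the allocation matches the sample size wanted.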
What is the explanatory / independent variable
The variable that isn't changed by other variables
It is the variable you change
What is the response / dependent variable
The variable that you are measuring
What are extraneous / control variables
Variables other than the explanatory variable that could affect the results, so you control them (keep them the same) in an experiment
What type of experiments are used to collect data
Laboratory experiments - conducted in a controlled environment (not necessarily a laboratory).
Field experiments - Carried out in the test subject's everyday environment. The researcher sets up the situation and controls some of the variables
Natural experiments - Carried out in the test subject's everyday environment. The researcher has no control over the variables
What are the advantages and disadvantages of laboratory experiments
A:
Easy to replicate
You can control extraneous variables
D:
Test subjects may behave differently (in test conditions) to how they would normally
What are the advantages and disadvantages of field experiments
A:
More likely to reflect real life behaviour
D:
Can't control all extraneous variables
What are the advantages and disadvantages of natural experiments
A:
Most likely to reflect real life behaviour
D:
Can't control any variables
Hard to replicate
How can you show your data is valid / reliable
By repeating the experiment or survey and getting similar data values
How can simulation be used
It is used to model random real life events, to help predict what could actually happen
It is easier and cheaper than collecting / analysing real data.
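A small Python sketch of a simulation, using a made-up example: estimating the chance of rolling at least one six in four rolls of a fair dice. The function name, the number of trials and the seed are all assumptions for illustration.

```python
import random

def simulate_at_least_one_six(trials=10_000, rolls=4, seed=0):
    """Estimate P(at least one six in `rolls` rolls) by repeating the experiment many times."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    hits = 0
    for _ in range(trials):
        # One simulated experiment: roll the dice `rolls` times and look for a six.
        if any(rng.randint(1, 6) == 6 for _ in range(rolls)):
            hits += 1
    return hits / trials

print(simulate_at_least_one_six())
```

The estimate should come out close to the theoretical value 1 - (5/6)^4, which is roughly 0.518, and it gets closer as the number of trials increases.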
What is a questionnaire
A set of questions designed to obtain data
What is a respondent
The person completing a questionnaire
What is an open question
A question with no suggested answers
What is a closed question
A question with a set of answers to choose from
What is one problem with open questions
Every respondent could give a different answer making data analysis harder
How should you structure a questionnaire
Keep questions short and use simple language
Avoid biased or leading questions (questions that suggest an answer)
Give intervals that do not overlap (e.g. 1-10, 11-20)
Make sure options cover all possibilities (e.g. never / I don't know / 0)
Include a time frame in the question
Avoid questions that people are unlikely to answer honestly
Advantages and disadvantages of using interviews to collect primary data
A:
Interviewers can explain questions.
Interviewers can put people at ease when answering personal questions.
Respondents can explain answers.
High response rate.
D:
May be less honest / less likely to answer personal questions
Can take a long time / more expensive
Smaller sample size
Interviewers may have bias
Respondent may try to impress the interviewer (choose the ‘right’ answers)
Advantages and disadvantages of using anonymous questionnaires to collect primary data
A:
Respondents more likely to answer truthfully / answer personal questions.
Questionnaires are quick / cheap (all respondents can complete them at the same time)
Easy to send questionnaires to a large representative sample
No interviewer bias
D:
Respondents may not understand questions
Researchers may not understand answers
Lower response rate - some people may never answer
How can you reduce bias in answers to sensitive questions
By using a random response method
E.g. flipping a coin
If heads, answer yes
If tails, answer truthfully
The survey results can then be used to estimate the proportion who truthfully answer yes (those whose coin landed on tails)
Estimate the proportion of people who have shoplifted
Flip a coin - heads answer yes
Tails - answer truthfully
820 answered yes
730 answered No
Find the total number of respondents
820 + 730 = 1550
Estimate the number of heads (50% chance for heads or tails)
0.5 × 1550 = 775
Estimate for truthful yes answers:
Subtract estimated heads from number who said yes
820 - 775 = 45
Estimate proportion of people who have shoplifted
(Divide truthful yes answers by the estimated number who answered truthfully, i.e. tails: 1550 - 775 = 775)
45/775 = 0.0581 (to 3 significant figures)
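The steps of the shoplifting example can be sketched in Python as below. The function name and parameter names are hypothetical. The probability of 0.5 assumes a fair coin.

```python
def randomised_response_estimate(yes_count, no_count, p_forced_yes=0.5):
    """Estimate the proportion of truthful 'yes' answers in a coin-flip random response survey."""
    total = yes_count + no_count          # everyone who responded
    forced_yes = p_forced_yes * total     # estimated heads: told to answer yes
    truthful_total = total - forced_yes   # estimated tails: answered truthfully
    truthful_yes = yes_count - forced_yes # yes answers not explained by heads
    return truthful_yes / truthful_total

proportion = randomised_response_estimate(820, 730)
print(round(proportion, 4))  # 0.0581
```

So roughly 5.8% of respondents are estimated to have shoplifted, without anyone's individual answer revealing it.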
What is a control group
A group that receives no treatment or change, used as a baseline for comparison
E.g give one group treatment and see how they improve, give another group no treatment and see how they improve
What is a matched pair
Two people from two different groups in your test.
These people have everything in common except the factor being studied
What is a hypothesis
An idea you test by collecting information and analysing data
What 8 factors do you need to consider when planning an investigation
Time - how long will the investigation last / take to set up
Cost - how much will it cost to set up / carry out the investigation
Ethical issues - no participant should be harmed / respect people's rights
Convenience - can you easily get the data locally
How to select your population / sample - identify the population you are interested in
How to deal with non-response - how many responses do you need? (Send out more questionnaires than you need back)
How to deal with unexpected results - what do you do about anomalies
Advantages / disadvantages of judgement sampling
A:
It's cheap and convenient
Requires little planning
D:
May not be representative
Can lead to skewed data
Heavily biased
Advantages / disadvantages of quota sampling
A:
Cheaper as fewer respondents are required
Diverse data from multiple groups of a population (e.g different ages)
D:
There can be bias in the selection process
Advantages / disadvantages of stratified sampling
A: allows for more accurate, unbiased data (members are selected randomly, not chosen by you)
Allows you to collect more diverse data (from every group within the population)
D:
The selection of appropriate strata may be complicated
Requires more planning and effort to set up compared to others
Advantages / disadvantages of cluster sampling
A: significantly easier and more time efficient than other methods
D: members of an individual cluster (group) tend to have similar ideas, which may cause your results to be inaccurate or biased
Advantages / disadvantages of opportunity sampling
A: it is easy and convenient, requiring almost no planning time
Due to its simplicity data can be collected quickly
D:
As you are choosing who to interview there will be a large amount of bias
The sample might not be representative of the population due to the people you are choosing
Advantages / disadvantages of systematic sampling
A:
Simple and quick
Samples are evenly distributed
Less opportunity for manipulated data
D:
Possibility of unequal selection
Risk of bias
Requires the whole population size
What is a pilot study
A small scale study, conducted to evaluate the cost-effectiveness, duration and feasibility of your study before starting your full scale research project
What are the advantages and disadvantages of a pilot study
A:
Helps you resolve any problems in your study, e.g. some people may not understand a question
D:
Time consuming, and can be expensive (reduces the amount you can spend on your actual study)