Official Statistics and Big Data Flashcards
What is secondary analysis?
The analysis of data gathered by someone else, for their own purposes, by researchers who were not involved in its collection
What are the advantages of secondary analysis?
- Saves money and time
- Offers high quality data
- Gives opportunity for longitudinal analysis
- Allows subgroup/subset analysis
- Opportunity for cross-cultural studies
- More time for data analysis
- Enables application of recent theory to old data
- Gets more value from original data
What are the disadvantages of secondary analysis?
- Need to become familiar with how the data were collected, coded and managed
- Data can be very large and complex
- Data quality should not be taken for granted
- Variables important for analysis may be missing
What advantages do official statistics have over other forms of quantitative data, such as survey data?
- Data has already been collected, so it saves time and expense
- The people who are the source of the data aren’t being asked questions as part of a research project, so it’s unobtrusive and there are fewer issues with reactivity
- There is the prospect of analysing the data both cross-sectionally and longitudinally
Are official statistics neutral?
No. The data does not just sit there waiting to be collected in a particular way. Instead, conscious social actors decide what data is needed, what data to collect and how to collect it
Why has the use and analysis of official statistics been so controversial?
Because official statistics lack objectivity and are socially constructed
- No stats can be considered ‘objective facts’, as they are all paid for and collected by an organisation with a purpose
What are the issues with crime and deviance statistics?
- Crimes have to go through a rigorous process before they can be recorded by the police
- Where police officers are deployed affects which crimes are detected
- Reliability of official statistics is often jeopardised because definitions and policies regarding the phenomena to be counted vary over time
- Reliability (and validity) can be compromised by ‘fiddling’ of stats by police officers
What is counted under the crime recording rules?
- Serious not minor crimes
- Real issue with classification - if something has occurred how do we know what category it is in?
- How we count certain crimes - per victim, per offender, per crime?
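The counting question above can be made concrete with a small sketch. The incidents and rule names below are made up for illustration (they are not real recording rules or real data); the point is only that the same events give different offence totals depending on the unit counted.

```python
# Hypothetical incidents, invented purely for illustration.
incidents = [
    {"id": "A", "victims": 3, "offenders": 2},  # one event, several people involved
    {"id": "B", "victims": 1, "offenders": 1},
]


def count_offences(incidents, rule):
    """Return the offence total under a given (hypothetical) counting rule."""
    if rule == "per_crime":
        return len(incidents)
    if rule == "per_victim":
        return sum(i["victims"] for i in incidents)
    if rule == "per_offender":
        return sum(i["offenders"] for i in incidents)
    raise ValueError(f"unknown counting rule: {rule}")


rules = ("per_crime", "per_victim", "per_offender")
totals = {rule: count_offences(incidents, rule) for rule in rules}
print(totals)
```

Nothing about the behaviour in the two incidents changes between the three totals, which is why a change in counting rules alone can move the recorded figure.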
Why do changes in offence totals not necessarily represent changes in criminal behaviour?
- Policing priorities
- Police recording - notifiable offence list
- Rules on recording and counting
- Crime reporting behaviour of the public - campaigns to encourage reporting
What are the political and ethical issues with crime and deviance statistics?
The police collect the data themselves. This is difficult and time-consuming, and they are under pressure to ‘cook the books’.
What type of statistic are police stats?
In 2014 they were downgraded from a ‘national statistic’ to an ‘official statistic’. These stats are increasingly seen as ‘unfit for purpose’
What is big data?
Usually refers to extremely large sources of data that are not immediately amenable to conventional ways of handling them. It often focuses on social media but also covers consumer behaviour.
- Considered non-reactive, with potential that is not yet fully utilised
- There are data, process and management challenges
What are the issues with data designs and processes being hidden from data users? (particularly in official forms of data)
We must ask who or what is excluded from data sources. National surveys often have incomplete population coverage due to the systematic exclusion of certain ‘non-households’
- This is important as the excluded groups are often the people who are already marginalised within society (by policy design)
- Some countries may choose to purposefully exclude marginalised groups for political reasons
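A rough worked example of why this exclusion matters, using invented numbers (the population sizes and case counts below are assumptions for illustration, not real figures): if the excluded group experiences an outcome far more often than households do, a survey that covers households only will understate the overall rate.

```python
# All figures are hypothetical and chosen only to show the arithmetic.
household_pop = 9_800        # people living in surveyed households
non_household_pop = 200      # people excluded from the sampling frame
household_cases = 980        # people in households with the outcome
non_household_cases = 120    # excluded people with the outcome

# What a households-only survey would estimate (ignoring sampling error).
survey_rate = household_cases / household_pop

# The rate in the whole population, including the excluded group.
true_rate = (household_cases + non_household_cases) / (
    household_pop + non_household_pop
)

print(f"households-only estimate: {survey_rate:.1%}")
print(f"whole-population rate:    {true_rate:.1%}")
```

With these made-up inputs the excluded group is only 2% of the population, yet leaving it out shifts the estimate by a full percentage point.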
How can ‘non-household’ groups be included in data?
- Bespoke surveying, which implies high costs and the need for gatekeeper access
- ‘Piggybacking’ onto existing data collection infrastructure by adding new questions
- Combining multiple data sources
- Using action-based sampling, in which the focus is on surveying people when they engage with key services
- Retrospective data collection, where people who have previously been classified as a non-household are surveyed as if they were still part of that non-household population
When we have created data, how can we ensure we analyse it responsibly?
The overall aim is to be true to the original data by using appropriate analysis techniques - not manipulating the data or misrepresenting people’s views