Strand 2: Data Quality and Consistency Issues Flashcards
All data sets have what problems?
What are the quality issues with attribute data, e.g. the Census?
- Comprehensive but only every 10 years
- Under-enumeration
- Misreporting of information
- Methods used to avoid risk of disclosure
Why is it important that researchers are aware of the quality issues?
Dealing with quality issues takes up a lot of time, especially in temporal comparative analysis
What are the 7 quality issues with geographical data?
- Accuracy of the boundaries
- GPS not giving accurate readings
- When no metadata is available
- When images are scanned or digitized
- When digital data is transformed, e.g. between vector and raster
- Datasets held by different organizations that don’t match
- Errors in data
What are consistency issues with geographical and attribute data?
- Consistency over time when comparing data from the same source
- Consistency between different databases measuring the same variables
- Temporal inconsistencies due to: a) changing spatial units b) changing definitions of variables c) changing classification systems
Migration as a means to look at data inconsistencies.
Define Migration.
The movement of people from one place in the world to another for the purpose of taking up permanent or semi-permanent residence, usually across a political boundary.
What are the components of population change?
Births, deaths and migration
What year did Accession Countries (A8) join the EU?
2004
Is there an all-inclusive system in place to measure movement of population into or out of the UK?
What did the ONS produce to get the best estimation of migration?
Population Estimate Unit (PEU) - produces annual international migration estimates
What 4 alternative administrative data sources can be used to estimate migration?
- National insurance number registrations
- GP registrations
- Higher Education Statistics Agency (HESA)
- Worker Registration Scheme (WRS)
What are the pros and cons of using NINo as a migration measure?
Pros:
1. Comprehensive geographical coverage
2. Residential postcode captured
3. Compulsory process (for those who work) and continuous data capture
4. Country of origin and age recorded
Cons:
1. No length of stay info
2. No info on outflows (no de-registrations)
3. There may be a delay between arrival and registration
4. Excludes students, dependants and those not working
5. Data reflects migrant’s location at registration, not the stock of migrants nationally or where they may settle
What are the pros and cons of using GP Registrations as a migration measure?
Pros:
1. Full geographical data detail
2. Data on age and gender
Cons:
1. No info on outflows
2. No info on delay between arrival and registration
3. Some migrants may not register
4. Internal migration not recorded
5. Only age and gender data recorded
What are the pros and cons of using HESA as a migration measure?
Pros:
1. Comprehensive coverage of international students
2. Includes all HE establishments
3. Length of stay provided
4. Continuous data capture
Cons:
1. Students only
2. Institution based, not residence based
3. Some institutions have split sites
What are the pros and cons of using the Worker Registration Scheme as a migration measure?
Pros:
1. Continuous data capture
2. Full geographical detail
3. Data on occupation, origin and age
Cons:
1. A8 migrants only and voluntary registration
2. Excludes self-employed
3. Registered by employer location, not employee residence
4. Temporary scheme?
5. No length of stay info
What are the key characteristics of administrative data?
- Narrow data capture
- Good geographical coverage
- Available from different sources
What are the temporal consistency issues with geographic and attribute data?
- Changing spatial units over time
- Changing definitions of variables between Censuses (for example)
- Changing classification systems
- Boundaries change between censuses, so like-for-like comparisons are hard to make
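One common way to work around changing spatial units is to re-allocate counts from the old units to the new ones using a lookup table of weights. The sketch below is illustrative only: the unit names, counts and weights are invented, and the function name reallocate is hypothetical. It assumes each weight is the share of an old unit's population that falls inside a new unit, which in practice has to be estimated and is itself a source of error.

```python
# Counts recorded on the OLD boundaries (invented figures for illustration).
old_counts = {"A": 1000, "B": 500}

# Hypothetical lookup table: share of each old unit falling in each new unit.
# Old unit A is split between new units X and Y; old unit B sits wholly in Y.
weights = {
    "A": {"X": 0.75, "Y": 0.25},
    "B": {"Y": 1.0},
}


def reallocate(counts, weights):
    """Apportion counts from old spatial units to new ones using the weights."""
    new_counts = {}
    for old_unit, count in counts.items():
        shares = weights[old_unit]
        # Weights for each old unit should sum to 1, so nobody is lost
        # or double counted when moving to the new boundaries.
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"weights for {old_unit} do not sum to 1")
        for new_unit, share in shares.items():
            new_counts[new_unit] = new_counts.get(new_unit, 0.0) + count * share
    return new_counts


estimates = reallocate(old_counts, weights)
print(estimates)  # {'X': 750.0, 'Y': 750.0}
```

The total population is preserved (1500 before and after), but the figures for X and Y are estimates rather than observed counts, so comparisons over time remain approximate.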