Data Linkage and Thematic Mapping Flashcards
Name 4 sources of geographic data
− Some is free with GIS software or on the internet
− Data provision companies (expensive!)
− National services for higher education and research
− UKBORDERS: a service for extracting and downloading boundary datasets
About the census
− Cross-sectional snapshot of population on single date
− Source of secondary data (Official Statistics)
− Can be examined at several geographical levels
− Total (nearly!) enumeration (count) of national population
− Completion is compulsory, but still relies on ‘willing co-operation’
− Coverage is consistent (all households asked same questions)
− Coverage is comprehensive (enumerators try very hard to ensure completion/response)
− Decennial census since 1801 (2011 Census = 21st, and the last one at the time of writing; a 2021 Census has since been held), no Census in 1941 (1966: mid-term sample Census)
− Confidentiality is protected, criminal offence to disclose information about identified individuals
− Released for small areas: ‘output areas’ of roughly 300 people, which are grouped into larger ‘super output areas’.
− Costs a lot (£480m)
Name 5 uses of the census
− Info about population, housing, employment, education
− Re-calibration of mid-year population estimates
− Informs more than £50 billion of local authority expenditure per annum
− Businesses can assess markets and locate new offices
− Useful source for student dissertations/projects
How can you access the census?
− CASWEB: digital boundary data with attribute data for 1991 and 2001
− InFuse: select a particular topic and then filter out what is required (2001 and 2011)
What is metadata, and what 4 things does it describe?
Descriptive information about data (Content, Quality, Condition, Origin)
Why do we need metadata?
It is important for the integrity of any analysis
What is Mapping?
Linking of geographic and attribute data
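As a rough illustration of that linkage, the sketch below joins an attribute table to boundary records through a shared area code. The codes, names, field names and the link() helper are all made up for the example; a real GIS would hold full polygon geometry rather than a centroid.

```python
# Hypothetical boundary records keyed by area code
# (geometry simplified to a centroid for the example).
boundaries = {
    "E001": {"name": "Ward A", "centroid": (0.5, 1.0)},
    "E002": {"name": "Ward B", "centroid": (1.5, 1.0)},
    "E003": {"name": "Ward C", "centroid": (2.5, 1.0)},
}

# Hypothetical attribute table, e.g. census counts for the same areas.
attributes = [
    {"code": "E001", "population": 300, "unemployed": 30},
    {"code": "E002", "population": 280, "unemployed": 14},
]


def link(boundaries, attributes, key="code"):
    """Attach attribute rows to boundary records that share the same area code."""
    by_code = {row[key]: row for row in attributes}
    linked, unmatched = {}, []
    for code, geo in boundaries.items():
        if code in by_code:
            # Copy every attribute except the join key itself.
            attrs = {k: v for k, v in by_code[code].items() if k != key}
            linked[code] = {**geo, **attrs}
        else:
            # Areas with no attribute row would be mapped as 'no data'.
            unmatched.append(code)
    return linked, unmatched


linked, unmatched = link(boundaries, attributes)
print(linked["E001"])
print("No attribute data for:", unmatched)
```

Keeping the unmatched codes separate makes gaps in the attribute data visible before anything is mapped.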
Name the 3 types of Thematic Mapping
Choropleth, Proportional Symbols, Dot maps
Objective of Choropleth Maps
Symbolise magnitude of statistics for areas
Key points about choropleth maps
− Use darker shading for higher values and light shading for lower values
− Sliding scale (dark > light)
− Avoid shades that are too similar
− Bright colours are often interpreted as higher values
− Avoid solid white shading, which is read as ‘no data’
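The shading rules above can be sketched as a simple grey ramp: light for the lowest class, dark for the highest, with pure white kept back for areas with no data. The function names and the 0.85 and 0.15 end points are arbitrary choices for the illustration, not values from any particular GIS package.

```python
def grey_ramp(n_classes, lightest=0.85, darkest=0.15):
    """Grey levels (0 = black, 1 = white): light for low classes, dark for high."""
    if n_classes < 2:
        return [darkest]
    step = (lightest - darkest) / (n_classes - 1)
    return [round(lightest - i * step, 3) for i in range(n_classes)]


def to_hex(level):
    """Convert a grey level between 0 and 1 to a hex colour string."""
    v = round(level * 255)
    return "#{0:02x}{0:02x}{0:02x}".format(v)


NO_DATA = "#ffffff"          # solid white is reserved for 'no data'
levels = grey_ramp(4)        # 4 classes is a common choice
shades = [to_hex(g) for g in levels]
print(shades)
```

Spacing the levels evenly between the two end points keeps neighbouring shades from looking too similar, and stopping short of 1.0 means no class is ever drawn in white.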
When deciding on class intervals… (4 methods)
− Need to specify number of classes and class interval used (5 usually maximum, 4 is common)
− Equal count: the same number of records/areas/zones in each class
− Natural break: sometimes data is clustered into distinct groups, so the GIS will assign classes based on these clusters
− Standard deviation: the middle break is at the mean of the data, and each further break is one SD above or below it
− Quartiles/Quintiles: class breaks are made at each 25% (quartiles) or 20% (quintiles) of the data, e.g. of the unemployment values
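A minimal sketch of two of these methods, using only Python's standard statistics module. The unemployment rates are invented, the function names are illustrative, and natural breaks are left out because they need a clustering routine (such as Jenks optimisation) that is too long for a flashcard.

```python
import statistics

# Hypothetical unemployment rates (%) for eight zones.
rates = [2, 3, 4, 5, 6, 7, 8, 9]


def equal_count_breaks(values, n_classes):
    """Break points that put roughly the same number of values in each class."""
    return statistics.quantiles(values, n=n_classes, method="inclusive")


def std_dev_breaks(values, n_sd=1):
    """Break points at the mean and at whole standard deviations either side."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [mean + k * sd for k in range(-n_sd, n_sd + 1)]


def classify(value, breaks):
    """Class index for a value: the number of break points it exceeds."""
    return sum(value > b for b in breaks)


quartile_breaks = equal_count_breaks(rates, 4)   # 4 classes -> 3 break points
sd_breaks = std_dev_breaks(rates)                # mean - 1 SD, mean, mean + 1 SD

counts = [0, 0, 0, 0]
for r in rates:
    counts[classify(r, quartile_breaks)] += 1
print(quartile_breaks, counts)
print(sd_breaks)
```

With n_classes=4 the equal-count breaks are the quartiles, and with n_classes=5 they are the quintiles, so the fourth method is a special case of the first.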
Problems of Choropleth Maps
− Some zones (e.g. wards and output areas) are not natural areas for representing socioeconomic conditions, but we have limited choice.
− Choice of areal units is arbitrary and modifiable
− Information is lost when data are aggregated
− The modifiable areal unit problem (MAUP) and the ecological fallacy
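A small worked example can make the aggregation problem concrete. The six areas below are invented, laid out on a grid of 3 rows by 2 columns, and grouped into zones in two different ways; the aggregate() helper is purely illustrative.

```python
# Hypothetical small areas as (population, unemployed), stored row by row
# on a 3-row x 2-column grid: left column is low, right column is high.
areas = [
    (100, 5), (100, 25),
    (100, 5), (100, 25),
    (100, 5), (100, 25),
]


def aggregate(areas, zoning):
    """Unemployment rate (%) for each zone, where a zone is a tuple of area indices."""
    rates = []
    for zone in zoning:
        population = sum(areas[i][0] for i in zone)
        unemployed = sum(areas[i][1] for i in zone)
        rates.append(100 * unemployed / population)
    return rates


zoning_by_row = [(0, 1), (2, 3), (4, 5)]      # three zones, one per row
zoning_by_column = [(0, 2, 4), (1, 3, 5)]     # two zones, one per column

print(aggregate(areas, zoning_by_row))
print(aggregate(areas, zoning_by_column))
```

Zoning by row gives 15% everywhere, hiding the contrast between areas, while zoning by column gives 5% and 25% from exactly the same counts. Reading the 15% zone rate as though it applied to every area (or person) inside the zone would be an instance of the ecological fallacy.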