Chapter 3 Flashcards
Define statistics
The science of gathering, analysing and interpreting data in order to obtain the maximum
quantity of useful information.
• Concerned with decision making under uncertainty
Two difficulties in making correct inferences about the sampled population:
- how to ensure a representative sample; and
- how to extract valid conclusions from that representative sample
Sample vs population of interest
Sample - collection of units especially selected to represent a larger population
Population of interest - group about which researchers try to draw conclusions; subset of the general population
Sampling methods:
- Simple random sampling - every unit has an equal and non-zero probability of being selected
- Stratified random sampling (e.g. by income group / minority group) - target population is separated into mutually exclusive, homogeneous groups and a simple random sample is then chosen from each group
- Choice-based sampling - units selected on the basis of a previous choice they made (e.g. mode choice); may not be random, thus risk of bias
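Simple random and stratified random sampling can be sketched in Python as below. This is only an illustration using the standard library: the household records, the `income` field and both function names are hypothetical, and the stratified version assumes the same sampling fraction in every stratum, which is one possible allocation among several.

```python
import random
from collections import defaultdict

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit is equally likely to be picked."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def stratified_random_sample(units, stratum_of, fraction, seed=None):
    """Split units into mutually exclusive strata, then draw a simple
    random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 1000 households, 10% of them high income.
households = [{"id": i, "income": "high" if i % 10 == 0 else "low"} for i in range(1000)]

srs = simple_random_sample(households, 50, seed=1)
strat = stratified_random_sample(households, lambda h: h["income"], 0.05, seed=1)
```

With simple random sampling the number of high-income households in the sample varies from draw to draw; the stratified draw fixes it in proportion to the stratum size.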
List 5 issues within SA regarding data collection:
- Digital literacy and access
- Language barriers
- Technology (tablets to collect data, drones / videos)
- Actual addresses / non-registered population
- Private vehicles / Minibus taxi operators (illegal / non‐registered)
- Gated communities
Differentiate between sampling error and bias:
Sampling error
• Function of sample size
• Arises because only a sample, not the whole population, is observed. Always present. The variability (deviation) in the sample can differ from that of the population, which influences confidence.
Sampling bias
• Due to an incorrect definition of the population (e.g. youth and acceptance of pay-as-you-drive insurance) or an unsuitable sampling method (e.g. random, but ignores gated estates)
• Distorts actual parameters
• Can be alleviated by better planning of data collection techniques and defining your sample
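The difference between the two can be illustrated with a small simulation. This is a sketch under invented assumptions: the trip rates, the 20% share of gated-estate households and the function name are all hypothetical. Repeated samples from the full population scatter around the true mean, and the scatter shrinks as the sample grows (sampling error). Samples from a frame that excludes gated estates stay off target however large they are (sampling bias).

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: daily trips per household. Households in gated
# estates (assumed to be 20% of the total) make more trips on average.
open_area = [rng.gauss(4.0, 1.5) for _ in range(8000)]
gated = [rng.gauss(8.0, 1.5) for _ in range(2000)]
population = open_area + gated
true_mean = statistics.mean(population)

def summarise_estimates(frame, n, repeats=200):
    """Draw `repeats` simple random samples of size n from the sampling frame
    and return (average of the sample means, their standard deviation)."""
    means = [statistics.mean(rng.sample(frame, n)) for _ in range(repeats)]
    return statistics.mean(means), statistics.stdev(means)

results = {}
for n in (25, 400):
    results[n] = {
        "full_frame": summarise_estimates(population, n),
        "gated_excluded": summarise_estimates(open_area, n),
    }
    avg, spread = results[n]["full_frame"]
    avg_b, spread_b = results[n]["gated_excluded"]
    print(f"n={n}: full frame {avg:.2f} (spread {spread:.2f}), "
          f"gated excluded {avg_b:.2f} (spread {spread_b:.2f}), "
          f"true mean {true_mean:.2f}")
```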
Why is sample size a trade-off?
Trade-off between too large and too small a sample: a larger sample improves accuracy but costs more money and time; a smaller sample is cheaper and quicker but less accurate
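One common way to put numbers on the accuracy side of this trade-off is the textbook sample size formula for estimating a mean, n = (z * s / e)^2, with an optional finite population correction. The sketch below assumes simple random sampling and an approximately normal sample mean; the standard deviation of 2.0 trips, the margins of error and the cost per interview are hypothetical values.

```python
import math
from statistics import NormalDist

def required_sample_size(std_dev, margin_of_error, confidence=0.95, population_size=None):
    """Sample size needed to estimate a mean to within +/- margin_of_error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95% confidence
    n = (z * std_dev / margin_of_error) ** 2
    if population_size is not None:
        # Finite population correction: small populations need fewer interviews.
        n = n / (1 + (n - 1) / population_size)
    return math.ceil(n)

COST_PER_INTERVIEW = 150  # hypothetical cost per completed interview

for margin in (0.5, 0.25, 0.1):
    n = required_sample_size(std_dev=2.0, margin_of_error=margin)
    print(f"+/-{margin} trips: n = {n}, cost = {n * COST_PER_INTERVIEW}")
```

Halving the margin of error roughly quadruples the required sample size, and with it the survey cost.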
List some practical constraints in data collection:
• Length of study (e.g. in transport, the time of year makes a very big difference)
• Study horizon (type of planning: tactical (short term), or strategic planning with a long study horizon)
• Limits of study area (e.g. Stellenbosch municipality or the broader Stellenbosch area: what will the border be in 20 years' time?)
• Study resources
• Issues with map reading in ZA
What is a data collection plan?
Very important, costly and time-consuming exercise
• Contains information on recruiting, training, questionnaire design, supervision and quality control
What are some ideal dataset characteristics?
- Inclusion of all modes of travel, including non‐motorised trips.
- Trip purposes at disaggregate level
- Coverage of the broadest possible time period
- Data from all members of the household.
- High‐quality information robust enough to be used even at a disaggregate level
List and explain some types of surveys
- Household surveys - Main input to classic four‐step modelling approach; Provides personal / individual characteristics of individuals and households; Provides travel information (travel time, travel cost, O‐D); Perceptions of transport
- Intercept surveys / external cordon surveys - People crossing a border (PT boardings / alightings)
- Intercept surveys / internal cordon / screenline surveys - People crossing area / road / railway within the study area
- Traffic and person counts: low cost; used for calibration, validation and as a further check on other surveys.
- Travel time surveys: for calibration and validation. Private & public transport (use GPS tracking)
The level of detail of network and zoning systems is a trade-off between accuracy and cost; the details depend on:
- the schemes to be tested,
- the type of behavioural variables to be included,
- the treatment of time, etc.
List some zoning criteria
- Zone size - if zones are too big, the assumption that all activities / population are concentrated at the centroid causes too large an aggregation error
- Compatible with other administrative divisions, particularly with census
- As homogeneous as possible in their land use and/or population composition; census zones with clear differences in this respect (e.g. residential sectors with vastly different income levels) should not be aggregated, even if they are very small.
- Zone boundaries must be compatible with cordons and screenlines
- The shape of the zones should allow an easy determination of their centroid connectors
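The zone size criterion can be illustrated by computing how far the population sits from the zone centroid on average. This is only a sketch: the coordinates (in km), the population weights and the function names are hypothetical, and a population-weighted centroid is assumed, which is one of several ways to place a centroid.

```python
import math

def weighted_centroid(points):
    """points: list of (x, y, weight). Returns the weight-averaged (x, y)."""
    total = sum(w for _, _, w in points)
    cx = sum(x * w for x, _, w in points) / total
    cy = sum(y * w for _, y, w in points) / total
    return cx, cy

def mean_distance_to_centroid(points):
    """Weighted average straight-line distance from each point to the centroid."""
    cx, cy = weighted_centroid(points)
    total = sum(w for _, _, w in points)
    return sum(w * math.hypot(x - cx, y - cy) for x, y, w in points) / total

# Hypothetical zones: four population clusters of 100 people at the corners
# of a 1 km square and of a 5 km square.
small_zone = [(0.0, 0.0, 100), (1.0, 0.0, 100), (0.0, 1.0, 100), (1.0, 1.0, 100)]
large_zone = [(0.0, 0.0, 100), (5.0, 0.0, 100), (0.0, 5.0, 100), (5.0, 5.0, 100)]

for name, zone in (("small", small_zone), ("large", large_zone)):
    print(name, weighted_centroid(zone), round(mean_distance_to_centroid(zone), 2))
```

The larger the zone, the further the average resident is from the point where the model assumes all trips start and end.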