Data Analysis Flashcards

1
Q

Where data is compiled from

A

Data used in transport modelling is compiled from samples of the population

2
Q

Sampling Methods

A
  • Simple Random Sampling
  • Stratified Random Sampling

3
Q

Simple Random Sampling

A

Involves assigning an identifier (number) to each unit in the population, then selecting numbers at random to obtain the sample
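A minimal sketch of the two steps above, assuming the population is held in a Python list and the list index serves as the identifier. The function name and the household data are made up for illustration.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)
    # Step 1: associate an identifier (here, the list index) with each unit.
    identifiers = range(len(population))
    # Step 2: select identifiers at random; every unit is equally likely.
    chosen = rng.sample(identifiers, sample_size)
    return [population[i] for i in chosen]

# Hypothetical population of 100 households.
households = [f"household_{i}" for i in range(1, 101)]
sample = simple_random_sample(households, 10, seed=42)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking; a real survey would normally leave it unset.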

4
Q

Stratified Random Sampling

A

Population subdivided into homogeneous strata and then random samples taken from each of these groups
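A rough sketch of the idea, assuming each unit carries a label saying which stratum it belongs to and that the same sampling fraction is used in every stratum. The strata, the fraction, and the function name are illustrative assumptions, not part of the notes.

```python
import random
from collections import defaultdict

def stratified_random_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then take a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        # At least one unit per stratum, so small groups are never missed.
        k = max(1, round(fraction * len(members)))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 90 car owners and 10 people without a car.
population = [("car owner", i) for i in range(90)] + [("no car", i) for i in range(10)]
sample = stratified_random_sample(population, lambda u: u[0], 0.2, seed=1)
for stratum, members in sample.items():
    print(stratum, len(members))
```

Because each stratum is sampled separately, the minority group is guaranteed to appear in the sample.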

5
Q

Problem with simple random sampling

A

A very large sample would be required to ensure sufficient data is collected on minority groups

6
Q

Types of errors that can be introduced in sampling

A
  • Sampling Error
  • Sampling Bias

7
Q

Sampling Error

A

Error that arises because the sample is only a proportion of the population

8
Q

Sampling Bias

A

Caused by mistakes made either

  1. when defining the population of interest
  2. when selecting the sampling method
9
Q

Equations

A

In lecture slide 6

10
Q

Types of errors

A
  • Errors in modelling and forecasting
  • measurement errors
  • sampling errors
  • specification errors
  • transfer errors
  • aggregation errors
11
Q

Errors in modelling and forecasting

A

The ideal is to find the combination of model complexity and data accuracy that best fits the required forecasting precision and the study budget

12
Q

measurement errors

A

caused by survey questions being misinterpreted or answered badly, coding errors, etc.

13
Q

sampling errors

A

due to representation of population by finite data sets

equation in lecture 6

14
Q

specification errors

A

arise where phenomenon being modeled is not well understood, eg. irrelevant variable included in model or relevant variable is omitted

15
Q

transfer errors

A

arise if a model is transferred from one area to another

16
Q

aggregation errors

A

arise because forecasting in models is typically done for groups of individuals, while the data is compiled from the responses of individuals

17
Q

type of info required by surveys

A
  • infrastructure eg. road network, public transport network
  • land use inventory eg. residential zones
  • O-D travel surveys eg. traffic counts
  • Socio-economic info eg. income, car ownership
18
Q

questionnaire design

A
  • keep questions simple + direct
  • divide into several sections

19
Q

roadside interviews

A

a better method of estimating trip matrices than home interviews, because larger samples are available

20
Q

cordon surveys

A

provide useful info about external-external and external-internal trips

21
Q

screen-line surveys

A

divide area into large natural zones eg. on either side of a river or motorway

22
Q

travel diary surveys

A
  • require similar information to an O-D survey, but in more detail
  • diaries distributed to each member of a household (HH), who is asked to record all travel during the day

23
Q

stated preference surveys

A

where travelers evaluate and rank set of hypothetical options

24
Q

longitudinal/time series collection methods

A
  • repeated cross sectional survey
  • similar measurements conducted on samples at diff times
  • individuals may be included in more than one survey
25
Q

panel survey

A

similar measurements made on same sample at diff times

26
Q

cohort survey

A

some individuals are included for only a proportion of the survey

27
Q

problems with longitudinal surveys

A
  • panel surveys become unrepresentative as individuals age
  • may omit phenomena eg. children leaving home
  • typically higher rate of non-response
28
Q

Accuracy

A

Overall estimate of the errors present in measurements, including systematic effects. A set of observations is considered accurate if the mean of the observations is close to the true value

29
Q

Precision

A

represents repeatability of a measurement + is concerned only with random errors. Good precision is obtained from a set of observations closely grouped together with small deviations from the mean of the observations. A set of observations spread out widely has poor precision
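A small illustration of the difference between accuracy and precision, using made-up observations of a quantity whose true value is assumed to be 100. The helper names `bias` and `spread` are hypothetical.

```python
import statistics

true_value = 100.0  # assumed known only for the sake of the example

# Closely grouped but offset from the truth: good precision, poor accuracy.
precise_but_biased = [104.9, 105.1, 105.0, 104.8, 105.2]
# Mean close to the truth but widely scattered: good accuracy, poor precision.
accurate_but_imprecise = [92.0, 108.0, 97.0, 104.0, 99.0]

def bias(observations, truth):
    """Distance of the mean from the true value (relates to accuracy)."""
    return statistics.mean(observations) - truth

def spread(observations):
    """Sample standard deviation (relates to precision)."""
    return statistics.stdev(observations)

for name, obs in [("precise but biased", precise_but_biased),
                  ("accurate but imprecise", accurate_but_imprecise)]:
    print(name, round(bias(obs, true_value), 2), round(spread(obs), 2))
```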

30
Q

Mean

A

Sum of all data points divided by number of data points

31
Q

Standard deviation

A

measure of spread or dispersion of set of measurements. If small, measurements have good precision.

32
Q

Standard deviation equation

A

lecture slides 4&5
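
The slides are not reproduced here. As an assumption, the usual textbook form of the mean and the sample standard deviation for n observations x_1, ..., x_n is given below; the notation on the slides may differ.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

Dividing by n instead of n - 1 gives the population standard deviation, used when the data covers the whole population rather than a sample.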

33
Q

Standard error of mean (SEM)

A

Standard deviation of the sample mean.

Estimates variability between samples whereas standard deviation measures variability within a single sample

34
Q

Differences between standard deviation and standard error of mean

A
  • SD quantifies scatter: how much values vary from one another
  • SEM quantifies how precisely you know true mean of population
  • SEM is always smaller than SD for any sample of more than one observation, since SEM = SD/√n
  • SEM gets smaller as samples get larger, as mean of large sample is likely to be closer to true population mean than mean of small sample
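A short sketch of the relationship SEM = SD / sqrt(n), using made-up journey times. The data and the function name are assumptions for illustration only.

```python
import math
import statistics

def standard_error_of_mean(values):
    """SEM = sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

journey_times = [22, 25, 19, 30, 24, 27, 21, 26]  # minutes, hypothetical

sd = statistics.stdev(journey_times)
sem = standard_error_of_mean(journey_times)

# Repeating the same values four times keeps the scatter similar
# but quadruples n, so the SEM shrinks.
sem_large = standard_error_of_mean(journey_times * 4)

print(round(sd, 2), round(sem, 2), round(sem_large, 2))
```

The SD stays roughly the same as more data is added, while the SEM keeps falling with the sample size.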
35
Q

Range

A

difference between lowest and highest values in dataset

36
Q

Quartiles

A

the three values that divide an ordered dataset into four equal segments
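
A quick sketch using Python's standard library, with made-up hourly traffic counts. Several conventions exist for interpolating quartiles; the "inclusive" method is used here as one reasonable choice, so other tools may give slightly different values.

```python
import statistics

# Hypothetical hourly vehicle counts.
flows = [310, 420, 455, 480, 510, 530, 560, 610, 640, 720, 905]

# n=4 returns the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(flows, n=4, method="inclusive")

data_range = max(flows) - min(flows)  # range: highest minus lowest value
iqr = q3 - q1                         # interquartile range: middle 50% of the data

print(q1, q2, q3, data_range, iqr)
```

The second quartile is the median of the dataset.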

37
Q

outliers

A

data point that is much larger or smaller than the other data points in the data set

38
Q

what outliers can do

A
  • skew mean, standard deviation, standard error
  • can provide incorrect result
  • can indicate incorrect data and point to a problem in data collection process
39
Q

methods for checking for outliers

A
  • plot data
  • descriptive analysis (average, range, standard error, quartiles)
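
One possible descriptive check, sketched with made-up speed readings: flag values lying more than 1.5 interquartile ranges outside the quartiles. The 1.5 multiplier is a common rule of thumb and an assumption here, not something stated in the notes.

```python
import statistics

def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical spot speeds in km/h; 140 is a deliberately suspicious reading.
speeds = [48, 51, 50, 47, 52, 49, 50, 53, 46, 140]

suspects = flag_outliers(speeds)
mean_all = statistics.mean(speeds)
mean_clean = statistics.mean(v for v in speeds if v not in suspects)

print(suspects, mean_all, round(mean_clean, 2))
```

A flagged value is only a candidate: it should be checked against the data collection process before it is corrected or dropped.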

40
Q

importance of transport planning

A
  • crucial in planning sustainable developments + ensuring accessibility for all individuals
  • design phase of all major public amenities requires significant transport planning
  • important at the planning stage of amenities such as: sporting venues (stadiums), retail parks, shopping centers, residential areas, industrial parks/commercial centers
41
Q

Transport Planning

A
  • justify funding
  • obtain planning permission
  • environmental considerations
42
Q

justify funding

A

a detailed study of how the road/service will affect the population is needed to justify expenditure on a new road/public transport service

43
Q

obtain planning permissions

A

a traffic impact assessment and transportation plan for the new site are important when a large development is being planned. These plans are included in the application for planning permission

44
Q

environmental considerations

A

environmental considerations should be taken into account

45
Q

Sustainable development

A

a socio-ecological process characterized by fulfillment of human needs while maintaining quality of natural environment indefinitely

46
Q

key element in sustainable transport planning

A

minimize the distance individuals have to travel and, where longer-distance travel is necessary, provide good public transport links

47
Q

CO2 emissions statistics

A
  • Road transport accounts for 21% of Irish CO2 emissions
  • Road traffic rising 2% per year
  • Global aviation growing at 5% per year
48
Q

methods of transport planning

A
  • transport impact assessment (TIA)
  • traffic forecasting

49
Q

transport impact analysis/assessment

A

study which assesses effects a particular development’s traffic will have on transportation network in community

50
Q

traffic impact studies help communities to

A
  • forecast additional traffic associated w/ new development
  • determine improvements necessary to accommodate new dev
  • assist in land use decision making
  • assist allocating scarce resources to areas which need improvements
  • identify potential problems w/ proposed development which may influence developer’s decision to pursue it
  • allow community to assess impacts proposed development may have
51
Q

why traffic forecasting is important

A
  • plan future transport needs
  • plan for congestion
  • measure maintenance needed on road network
  • plan for new large developments
52
Q

what is traffic forecasting estimated on?

A
  • population + job forecasts
  • car ownership forecasts
  • travel demand forecasts
  • goods vehicle forecasts
53
Q

capacity of a road

A

max flow of vehicles, per hour or per day, for a road

54
Q

types of data

A
  • large scale data
  • in-depth behaviour data

55
Q

large scale data

A

lots of observations, but little info for each

eg. census travel to work/education, Irish Rail Census data

56
Q

in-depth behaviour data

A

fewer observations, more detail for each

eg. trips Trinity students make during a college week

57
Q

transport survey constraints

A
  • can’t collect all the data in all the detail you want
  • travel behavior tends to be complex
  • data costs money, more in-depth data costs more money
  • privacy and data protection issues
58
Q

3 types of transport surveys

A

travel diaries
detection apps
survey

59
Q

travel diary considerations

A
  • sample considerations, who should take part
  • can’t get everyone in college to take part, unrealistic, need diff students to get a good reflection of all students
  • should try to be as representative as possible of overall population
  • need a lot of info over prolonged timescale
  • need easy way to capture, store, analyse data
60
Q

travel diary

A

diary where people record what trips they took, how they traveled, how long it took, why they traveled, what mode they took, etc

61
Q

travel diary advantages

A
  • tend to be simple + easy to interpret
  • don’t require large amounts of digital literacy

62
Q

travel diary disadvantages

A
  • participants may forget to input info or put it in later
  • estimates may not be accurate (travel time, distance, etc)
  • not able to gain more complex data (routes taken, modes available, etc)
63
Q

detection apps

A

smartphone applications that automatically record trips

64
Q

gps apps advantages

A
  • huge data collection potential
  • automatic detection
  • graphic + route specific outputs (maps etc)
65
Q

gps apps disadvantages

A
  • not everyone has a smartphone (65+ etc)
  • issues such as battery use + urban canyon effects (GPS signal blocked by tall buildings)

66
Q

transport surveys

A

widely used to gain info about how people act/will act

67
Q

transport surveys advantages

A
  • can get large no. of responses
  • can present hypothetical scenarios
  • relatively cheap to do
  • can ask large no. of questions + get a large amount of info
68
Q

transport surveys disadvantages

A
  • non-representative samples can bias results
  • have to assume respondents are reading all questions + answering honestly
  • have to make sure they understand what you are asking them