LDS Flashcards
(17 cards)
how many missing or incorrect values are there for engine size?
One missing or incorrect value (0)
Year registered
All registered in 3rd-9th June 2002 or 6th-12th June 2016
how many missing or incorrect values are there for mass?
92 incorrect or missing values (0)
how many missing or incorrect values are there for CO2?
2 missing or incorrect values (0)
how many missing or incorrect values are there for CO?
13 missing values (blank)
how many missing or incorrect values are there for NOX?
74 missing values (blank)
how many missing or incorrect values are there for parts (particulate emissions)?
3105 missing values (blank)
how many missing or incorrect values are there for hc (hydrocarbon emissions)?
1422 missing values (blank)
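The counts on these cards treat a value as missing or incorrect when it is recorded as 0 or left blank. A minimal sketch of how such a tally could be made in Python, assuming the data has been loaded as a list of dictionaries. The function name, field names and sample rows are hypothetical and are not taken from the large data set:

```python
def count_missing(rows, field):
    """Count rows where `field` is absent, blank, or recorded as 0."""
    missing = 0
    for row in rows:
        value = row.get(field)
        if value is None or str(value).strip() == "":
            missing += 1  # blank cell
        elif float(value) == 0:
            missing += 1  # 0 used as a placeholder
    return missing


# Hypothetical rows, for illustration only
rows = [
    {"mass": "1150", "co": "0.32"},
    {"mass": "0", "co": ""},
    {"mass": "1420", "co": ""},
]

mass_missing = count_missing(rows, "mass")
co_missing = count_missing(rows, "co")
print(mass_missing, co_missing)
```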
How can we avoid using missing/incorrect data?
Clean the data before sampling
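One way to picture "clean before sampling" is to drop every record with a blank or 0 in the fields of interest, and only then draw the sample from what remains. The sketch below assumes the records are held as a list of dictionaries. The helper name `clean`, the field names and the rows are all made up for illustration:

```python
import random


def clean(rows, fields):
    """Keep only rows where every listed field is present, non-blank and non-zero."""
    def usable(value):
        if value is None or str(value).strip() == "":
            return False  # blank or absent
        return float(value) != 0  # 0 is treated as a placeholder

    return [row for row in rows if all(usable(row.get(f)) for f in fields)]


# Hypothetical records, for illustration only
rows = [
    {"mass": "1150", "co2": "139"},
    {"mass": "0", "co2": "154"},     # mass recorded as 0
    {"mass": "1420", "co2": ""},     # CO2 left blank
    {"mass": "1310", "co2": "120"},
    {"mass": "1275", "co2": "165"},
]

cleaned = clean(rows, ["mass", "co2"])  # step 1: remove unusable records
random.seed(1)                          # fixed seed so the sketch is repeatable
sample = random.sample(cleaned, 2)      # step 2: sample from the cleaned data only
```

Sampling first and discarding bad records afterwards would leave a sample smaller than intended, which is why the cleaning step comes first here.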
Summary (4 points)
- only one electric and one petrol vehicle in the whole database
- emissions data only known for around 80% of the whole database
- particulate (part) emissions are only applicable to diesel cars
- sample sizes different for each year
makes of cars in the large data set
- BMW
- Ford
- Toyota
- Vauxhall
- Volkswagen
regions included in the large data set
- London
- North West
- South West
propulsion types in the large data set
- petrol
- diesel
- electric
- gas/petrol
- electric/petrol
keeper title IDs
- male
- female
- not used (this identity hasn’t been used in the large data set)
- unknown (Rev, Dr etc)
- company
units used in the large data set
- emissions g/km
- mass kg
- engine size cm³
what does the mass include in the large data set?
- vehicle
- 75kg person
which propulsion types are there only one of in the whole data set?
- petrol
- electric