Lecture 6 – Modelling Data Flashcards
In what scenarios do we encounter or need to deal with temporal data?
- Data indexed with time or dates
- Data about change, transformation and occurrences
- Time series data
Temporal phrases
* Era: AD 2020, 20 Jan 2020 CE
* Calendar: Lunar, Hebrew, Chinese, etc.
* Time zone: 1pm AEST, 1pm UTC+10:00
* Submultiples: 13:00.001
* Years do not have the same number of days
* Months have different numbers of days
* It can be difficult to identify the day of the week, the day of the month and the week of the year for a given date
* Years and months start on different days of the week
* Even specific time phrases can be very complicated to parse!
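A minimal sketch of these difficulties, using only Python's standard library. The example date (20 Jan 2020) and the time 1pm UTC+10:00 come from the phrases above; the variable names, the ISO-style input string and the fixed-offset stand-in for AEST (which ignores daylight saving) are assumptions made for illustration.

```python
import calendar
from datetime import date, datetime, timedelta, timezone

# Years do not have the same number of days.
days_in_2020 = 366 if calendar.isleap(2020) else 365
days_in_2021 = 366 if calendar.isleap(2021) else 365

# Months have different numbers of days (and February changes with the year).
feb_2020 = calendar.monthrange(2020, 2)[1]
feb_2021 = calendar.monthrange(2021, 2)[1]

# Day of the week, day of the month and ISO week of the year for one date.
d = date(2020, 1, 20)
weekday_index = d.weekday()        # 0 = Monday ... 6 = Sunday
day_of_month = d.day
iso_week = d.isocalendar()[1]

# Parsing a phrase only works if the exact format is known in advance.
parsed = datetime.strptime("20 Jan 2020", "%d %b %Y")

# An ISO 8601 string carrying milliseconds and a UTC offset.
stamp = datetime.fromisoformat("2020-01-20T13:00:00.001+10:00")

# The same wall-clock time in two zones is not the same instant.
utc_plus_10 = timezone(timedelta(hours=10))  # assumed fixed offset for AEST
one_pm_aest = datetime(2020, 1, 20, 13, 0, tzinfo=utc_plus_10)
one_pm_utc = datetime(2020, 1, 20, 13, 0, tzinfo=timezone.utc)
gap = one_pm_utc - one_pm_aest
```

Here `gap` is ten hours, even though both values read "1pm", which is why a time zone has to travel with the timestamp.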
Counting time:
* Time is not decimal
* Months and years have different numbers of days
* Be careful how you compare time elements
* Socially, not all time periods are the same
➡ weekends
➡ holidays
➡ pay periods
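The points about counting time can be sketched as follows. This is an illustration under stated assumptions: `count_weekdays` is a hypothetical helper, and the holiday set (1 Jan 2020 only) is an assumed example rather than a real holiday calendar.

```python
from datetime import date, timedelta

# Time is not decimal: 1.5 hours is 1 h 30 min, not 1 h 50 min.
duration = timedelta(hours=1.5)
minutes = duration.total_seconds() / 60

# Compare whole dates, not single elements: day 1 is "less than" day 31,
# yet 1 Feb comes after 31 Jan.
earlier, later = date(2020, 1, 31), date(2020, 2, 1)
element_comparison = later.day > earlier.day   # misleading
whole_comparison = later > earlier             # correct

# "One month" is not a fixed number of days.
jan_length = (date(2020, 2, 1) - date(2020, 1, 1)).days
feb_length = (date(2020, 3, 1) - date(2020, 2, 1)).days


def count_weekdays(start, end, holidays=frozenset()):
    """Count Monday-Friday dates in [start, end) that are not holidays."""
    count = 0
    current = start
    while current < end:
        if current.weekday() < 5 and current not in holidays:
            count += 1
        current += timedelta(days=1)
    return count


# Socially, January 2020 is shorter than its 31 calendar days.
assumed_holidays = frozenset({date(2020, 1, 1)})  # assumption for illustration
working_days = count_weekdays(date(2020, 1, 1), date(2020, 2, 1), assumed_holidays)
```

Working with `date` and `timedelta` objects rather than raw numbers leaves the calendar arithmetic to the library.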
Statistical Modelling
- Models represent aspects of a scenario to help us understand it.
- Statistical models represent the relationships between variables
➡ Independent variable(s)
➡ Dependent variable
- A model can be used to make predictions about the dependent variable, given information about the independent variable(s)
- Rather than trying to use all the data about the scenario, the model reduces the data set to a low-dimensional summary.
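As a hedged illustration, a straight line fitted by ordinary least squares is one common kind of statistical model. The data below are made up for this sketch, and `fit_line`, `hours_studied` and `exam_score` are hypothetical names.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: independent variable x, dependent variable y.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 67, 72]

# Five data points are reduced to a two-number summary.
slope, intercept = fit_line(hours_studied, exam_score)

# Predict the dependent variable for an unseen value of the independent one.
predicted_score = intercept + slope * 6
```

The two numbers `slope` and `intercept` are the low-dimensional summary, and the last line uses them to predict the dependent variable.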
Causation & Correlation
Causation:
Causation indicates that one event is the result of the occurrence of another event –> cause and effect (e.g. it rains –> the street is wet)
Correlation:
Correlation is a statistical measure that describes the size and direction of a relationship between two or more variables
–> a correlation does not automatically mean that a change in one variable causes the change in the values of the other variable.
!!!
Causation (normally) implies correlation, BUT correlation does not imply causation
!!!
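A small sketch of the distinction, using the Pearson correlation coefficient as one common measure of the size and direction of a relationship. All numbers are invented for illustration. The scenario is also an assumption: temperature is taken to drive both ice cream sales and sunburn cases, so the two are strongly correlated without either causing the other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a value between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data: both series rise with temperature (the assumed common cause).
temperature = [18, 21, 24, 27, 30, 33]
ice_cream_sales = [40, 47, 52, 60, 65, 72]
sunburn_cases = [3, 6, 8, 12, 14, 18]

# Strong positive correlation, although ice cream does not cause sunburn.
r_sales_sunburn = pearson_r(ice_cream_sales, sunburn_cases)

# Each series is also strongly correlated with the common cause.
r_temp_sales = pearson_r(temperature, ice_cream_sales)
r_temp_sunburn = pearson_r(temperature, sunburn_cases)
```

The sign of `r` gives the direction of the relationship and its magnitude gives the size. Neither says which variable, if any, is the cause.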