Lecture 1: Understanding Big Data Flashcards
[..] is the process of examining data to find facts, relationships, patterns, insights and/or trends
Data Analysis is the process of examining data to find facts, relationships, patterns, insights and/or trends.
[..] is a discipline that includes the management of the complete data lifecycle, which encompasses collecting, cleansing, organising, storing, analysing and governing data using highly scalable technologies.
Data Analytics is a discipline that includes the management of the complete data lifecycle, which encompasses collecting, cleansing, organising, storing, analysing and governing data using highly scalable technologies.
[..] are carried out to answer questions about events that have already occurred.
- [.] complexity
- data collected through [.]
**Descriptive Analytics are carried out to answer questions about events that have already occurred.
- low complexity
- data collected through observations
Ex. What is the average mark of students in the class?
- answering the What? question
(** is important knowledge)
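A minimal sketch of the "What?" question in code, assuming a small made-up list of marks (the `marks` values are illustrative, not from the lecture):

```python
from statistics import mean

# Hypothetical marks for a small class (illustrative data only).
marks = [55, 60, 65, 70, 50]

# Descriptive analytics: summarise what has already happened.
average_mark = mean(marks)
print(f"Average mark: {average_mark}")
```

The result only describes the past; it says nothing about why the average has that value.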
[.] Analytics aim to determine the [.] of a phenomenon that occurred in the past using questions that focus on the [.] behind the event.
- answer the [.]? question
**Diagnostic Analytics aim to determine the CAUSE of a phenomenon that occurred in the past using questions that focus on the REASON behind the event.
- answer the Why? question
Ex. Why is the average mark of students 60? Is it because some marks are 80 and others are 40, or because most students’ marks are 60?
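One way to picture the "Why?" question: two classes can share the same average while having very different spreads. The sketch below uses two made-up mark lists (`polarised` and `clustered` are hypothetical names and values) and compares their standard deviations:

```python
from statistics import mean, pstdev

# Two hypothetical classes with the same average (illustrative data only).
polarised = [80, 80, 80, 40, 40, 40]   # some students at 80, others at 40
clustered = [60, 61, 59, 60, 62, 58]   # most students close to 60

for name, marks in [("polarised", polarised), ("clustered", clustered)]:
    # Same mean, but the spread reveals what is behind it.
    print(f"{name}: mean={mean(marks)}, spread={pstdev(marks):.2f}")
```

A large spread points to the 80/40 explanation; a small spread points to most marks sitting near 60.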
[.] Analytics are carried out in an attempt to determine the outcome of an event that might occur in the future.
**Predictive Analytics are carried out in an attempt to determine the outcome of an event that might occur in the future.
Ex. Will the average mark stay at 60% next year, or will it rise or fall?
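As a rough illustration only, a prediction could be as simple as extending a straight-line trend through past averages. The yearly figures and the `fit_line` helper below are assumptions made for this sketch; real predictive analytics would typically use richer models and data:

```python
# Hypothetical yearly class averages (illustrative data only).
years = [2020, 2021, 2022, 2023]
averages = [57.0, 58.0, 59.0, 60.0]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Extrapolate the past trend one year ahead.
slope, intercept = fit_line(years, averages)
predicted_2024 = slope * 2024 + intercept
print(f"Predicted average for 2024: {predicted_2024:.1f}")
```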
[.] Analytics build upon the results of predictive analytics by prescribing actions that should be taken. This kind of analytics can be used to gain an advantage or mitigate a risk.
**Prescriptive Analytics build upon the results of predictive analytics by prescribing actions that should be taken to maintain the prediction. This kind of analytics can be used to gain an advantage or mitigate a risk.
Ex: If the prediction says average marks will be 70% next year with no unforeseen circumstances taking place, what can we do to ensure that average marks will be 70% next year?
- this requires the application of advanced technology such as AI.
[..] can be used to improve business applications, consolidate data in data warehouses and analyse queries via a dashboard.
- [..]: e.g. when stock drops to the safety margin, the supermarket will automatically order more.
- [..] can be cloud or servers, etc.
- [.]: data visualisation
> […] dashboard: the key indicators must be aligned with the company’s objectives.
BI (Business Intelligence, part of Business Analytics) can be used to improve business applications, consolidate data in data warehouses and analyse queries via a dashboard.
- business application:
e.g. when checking out at the supermarket, the item will be scanned at the cash register. This will reduce the stock by 1 and if the stock drops to the safety margin, the supermarket will automatically order more.
- data warehouse: can be cloud or servers, etc.
- dashboard: data visualisation
> KPI (Key Performance Indicator) dashboard: the key indicators must be aligned with the company’s objectives.
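The supermarket example can be sketched as a small function. Everything here (`scan_item`, `pending_orders`, the reorder quantity of 50) is a hypothetical illustration of the rule, not how any particular system implements it:

```python
def scan_item(stock, item, safety_margin, pending_orders, reorder_quantity=50):
    """Record one sale and place a reorder if stock has hit the safety margin."""
    stock[item] -= 1  # scanning at the cash register reduces stock by 1
    if stock[item] <= safety_margin and item not in pending_orders:
        # Stock has dropped to the safety margin: order more automatically.
        pending_orders[item] = reorder_quantity
    return stock[item]

stock = {"milk": 12}
pending_orders = {}
for _ in range(2):
    scan_item(stock, "milk", safety_margin=10, pending_orders=pending_orders)

print(stock, pending_orders)
```

The `item not in pending_orders` check is a design choice in this sketch so that repeated scans do not place duplicate orders.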
[.] data conforms to a data model or schema and is often in tabular form
e.g. Excel
- Structured data conforms to a data model or schema and is often in tabular form
e.g. Excel
- [.] data does not conform to a data model or data schema. It is estimated to make up 80% of the data within any given enterprise
e.g. video, audio, photos.
- Unstructured data does not conform to a data model or data schema. It is estimated to make up 80% of the data within any given enterprise
e.g. video, audio, photos.
- [..] data has a defined level of structure and consistency, but is not relational in nature
e.g. XML data, JSON data, sensor data (e.g. camera data within a certain radius)
- Semi-structured data has a defined level of structure and consistency, but is not relational in nature
e.g. XML data, JSON data, sensor data (e.g. camera data within a certain radius)
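To make "structured but not relational" concrete, here is a small sketch with made-up JSON sensor readings (all field names and values are hypothetical). Each record has named fields, but the records do not share one fixed set of columns and can nest:

```python
import json

# Hypothetical sensor readings (illustrative data only).
raw = """
[
  {"sensor_id": "cam-01", "timestamp": "2024-01-01T09:00:00", "objects_detected": 3},
  {"sensor_id": "thermo-07", "timestamp": "2024-01-01T09:00:05",
   "temperature_c": 21.5, "location": {"floor": 2}}
]
"""

readings = json.loads(raw)

for reading in readings:
    # Fields are self-describing, but differ from record to record.
    print(reading["sensor_id"], sorted(reading.keys()))
```

Fitting these records into a single table would require empty cells or flattening the nested `location` field, which is why such data is not treated as relational.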
5 V’s of Big Data?
5 V’s of Big Data:
Volume
Velocity (companies need information to flow quickly – as close to real-time as possible)
Variety
Veracity (Quality)
Value (ability to transform a tsunami of data into business)