Module 17 - Data Analytics Flashcards
Four Vs of big data
Volume, velocity, veracity and variety
Data tagging
Attaching labels to data (e.g. a timestamp or location) so that it is easier to find later
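A minimal sketch of tagging, assuming records are plain dictionaries. The helper names `tag_record` and `find_by_tag` and the sample invoices are hypothetical, chosen only for illustration:

```python
from datetime import datetime, timezone

def tag_record(payload, location, when=None):
    """Wrap a piece of data with hypothetical timestamp and location tags."""
    when = when or datetime.now(timezone.utc)
    return {
        "payload": payload,
        "tags": {"timestamp": when.isoformat(), "location": location},
    }

def find_by_tag(records, key, value):
    """Return every record whose tag `key` equals `value`."""
    return [r for r in records if r["tags"].get(key) == value]

# Illustrative data only
records = [
    tag_record("invoice 001", "London", datetime(2024, 1, 5, tzinfo=timezone.utc)),
    tag_record("invoice 002", "Leeds", datetime(2024, 1, 6, tzinfo=timezone.utc)),
    tag_record("invoice 003", "London", datetime(2024, 1, 7, tzinfo=timezone.utc)),
]

london = find_by_tag(records, "location", "London")
print([r["payload"] for r in london])
```

Because the tags travel with the data, a later search can filter on them without inspecting the payload itself.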
What must the data retention policy of a company clarify
How long data is to be stored for, how it is stored and the security associated with it
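The three elements of a retention policy could be captured in a small structure like the one below. This is only a sketch: the class `RetentionPolicy`, its field names and the 365-day period are assumptions, not taken from any real policy:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class RetentionPolicy:
    retention_days: int   # how long data is stored for
    storage: str          # how (where) it is stored
    encrypted: bool       # the security associated with it

def is_due_for_deletion(created, policy, today):
    """True once the record is older than the retention period."""
    return today > created + timedelta(days=policy.retention_days)

# Hypothetical policy values for illustration
policy = RetentionPolicy(retention_days=365, storage="archive-server", encrypted=True)

due = is_due_for_deletion(date(2023, 1, 1), policy, date(2024, 6, 1))
not_due = is_due_for_deletion(date(2024, 1, 1), policy, date(2024, 6, 1))
print(due, not_due)
```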
Three levels of modelling
conceptual
logical
physical
Conceptual modelling
Shows the high-level mapping between pieces of information (what is held and how it relates). Closely linked to data normalisation, which determines how data is organised in a database
Logical modelling
describes the actual tables and columns to be used in the system
Physical modelling
Describes how the data is physically stored. This is the most detailed level and deals with the technical details
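The three levels can be illustrated with Python's built-in `sqlite3` module. This is a rough sketch under assumptions: the customer/order example, the table names and the index name are all hypothetical, and an index is only one of many physical-level decisions:

```python
import sqlite3

# Conceptual level: which things exist and how they relate (no tables yet)
conceptual = {
    "entities": ["Customer", "Order"],
    "relationships": [("Customer", "places", "Order")],
}

conn = sqlite3.connect(":memory:")

# Logical level: the actual tables and columns
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE customer_order ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
    " total REAL NOT NULL)"
)

# Physical level: storage and access details, here an index for faster lookups
conn.execute("CREATE INDEX idx_order_customer ON customer_order (customer_id)")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")
conn.executemany(
    "INSERT INTO customer_order VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 100.0)],
)

rows = conn.execute(
    "SELECT c.name, SUM(o.total) FROM customer c"
    " JOIN customer_order o ON o.customer_id = c.customer_id"
    " GROUP BY c.name"
).fetchall()
print(rows)
```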
DBMSs
Database management systems: software designed for the purpose of storing and managing data
Storage area network
A group of data storage devices networked together so that the physical location of the data does not matter
Data Mining
A way of identifying patterns or trends in a large data set. It uses algorithms to predict likely outcomes based on historical information
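As a very small illustration of predicting an outcome from historical information, the sketch below fits a straight-line trend by ordinary least squares. The monthly sales figures are made up, and a linear trend is just one simple technique among many used in data mining:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical data: sales for months 1 to 5
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predicted sales for month 6
print(slope, intercept, forecast)
```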
6 steps of KDD (knowledge discovery in databases)
- business understanding
- data understanding
- data preparation
- data mining/modelling
- evaluation
- deployment
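The six steps above can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the customer data, the spending threshold and the function names are hypothetical, and a real project would involve far more at each stage:

```python
# Illustrative raw data with one missing value
raw = [
    {"customer": "A", "spend": "120"},
    {"customer": "B", "spend": None},
    {"customer": "C", "spend": "80"},
    {"customer": "D", "spend": "300"},
]

# 1. Business understanding: state the question being answered
QUESTION = "Which customers are high spenders?"
THRESHOLD = 100.0  # assumed cut-off for "high"

# 2. Data understanding: profile what is available
def profile(rows):
    return {"rows": len(rows), "missing_spend": sum(r["spend"] is None for r in rows)}

# 3. Data preparation: drop incomplete rows and convert types
def prepare(rows):
    return [
        {"customer": r["customer"], "spend": float(r["spend"])}
        for r in rows
        if r["spend"] is not None
    ]

# 4. Data mining/modelling: a trivial rule standing in for a model
def model(rows):
    return {r["customer"]: r["spend"] > THRESHOLD for r in rows}

# 5. Evaluation: share of predictions that match known labels
def evaluate(predictions, known):
    hits = sum(predictions[c] == label for c, label in known.items())
    return hits / len(known)

# 6. Deployment: hand the result to the business in a usable form
def deploy(predictions):
    return sorted(c for c, high in predictions.items() if high)

summary = profile(raw)
predictions = model(prepare(raw))
accuracy = evaluate(predictions, {"A": True, "C": False, "D": True})
high_spenders = deploy(predictions)
print(summary, accuracy, high_spenders)
```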