Data Platforms Flashcards
Data-Driven Innovation
Refers to the use of analytics to drive innovation and business value from data
Analytics
In this context, we mean the different types of business intelligence initiatives
Advanced Analytics
Semi-autonomous examination of data to get deeper insights (Machine Learning)
Augmented Analytics
Uses AI to augment how people explore and analyze data
Database
Structured and persistent collection of information with efficient retrieval and modification (e.g., relational databases)
Data Warehouse
Subject-oriented collection of data that supports decision-making processes
OLTP
Constant queries and updates, short-term data retention (e.g., accounting database, online retail transactions)
OLAP
Periodic large updates, complex queries for reporting/decision support
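The contrast between the OLTP and OLAP cards can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical `sales` table; the helper names `record_sale` and `revenue_by_region` are illustrative only, and a real OLAP system would typically be a separate warehouse rather than the same database.

```python
import sqlite3

# Hypothetical in-memory database with a single illustrative table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

def record_sale(region, amount):
    """OLTP-style workload: a small, short-lived write touching one row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)",
            (region, amount),
        )

def revenue_by_region():
    """OLAP-style workload: one query aggregating many rows for reporting."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)

# Many small transactions arrive continuously...
for region, amount in [("north", 10.0), ("south", 25.0), ("north", 5.0)]:
    record_sale(region, amount)

# ...while reporting queries run occasionally over the accumulated data.
report = revenue_by_region()
print(report)
```

The writes each touch a single row inside their own transaction, whereas the report scans and groups the whole table, which is the access pattern OLAP systems are built for.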
Data Lake
Central repository where data is kept in its original formats (unstructured, semi-structured, and structured) and queried only when needed.
Supports storage, processing and analysis
What kind of users use Data Warehouses vs Data Lakes?
Data Warehouses: business analysts
Data Lakes: data scientists, data developers, and business analysts
Data Platform
Meets end-to-end data needs such as acquisition, storage, preparation, delivery, governance, and security, so that users can focus ONLY on functional aspects
How do we prevent a Data Platform (DP) from becoming a swamp?
We MUST govern data transformations and leverage metadata and maintenance to keep control over data
What are the 5 areas of data management? (PCPED: "Plankton Chokes Patrick Every Day")
- Data provenance
- Compression
- Data profiling
- Entity resolution
- Data versioning
Data Provenance
Descriptions of the origins of data and the process by which it arrives
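As a rough illustration of the Data Provenance card, the sketch below records where a dataset came from and each processing step applied to it. The `ProvenanceRecord` class, the `describe` helper, and the file name `crm_export.csv` are all hypothetical, chosen only for the example; real platforms usually capture this in a metadata catalog.

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceRecord:
    """Hypothetical record of a dataset's origin and processing history."""
    source: str                                  # where the data came from
    steps: list = field(default_factory=list)    # transformations, in order

    def add_step(self, description):
        """Append one processing step and return self to allow chaining."""
        self.steps.append(description)
        return self

def describe(record):
    """Render the lineage as a readable chain from origin to current state."""
    return " -> ".join([record.source, *record.steps])

record = ProvenanceRecord(source="crm_export.csv")
record.add_step("drop rows with missing email").add_step("aggregate by country")

lineage = describe(record)
print(lineage)
```

Keeping the origin and the ordered steps together is what lets someone later answer both halves of the definition: where the data started and how it arrived in its current form.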