Cloud Understanding / Terms / CDPs Flashcards
BigQuery (BQ)
CDW
-definition included
Cloud Data Warehouses
A cloud data warehouse is a database delivered as a managed service in a public cloud and optimized for scalable Business Intelligence and analytics. Cloud data warehouses sit one layer above the underlying cloud infrastructure. For example, Snowflake doesn’t own any of that infrastructure and runs on top of public clouds, which makes it a pure CDW. AWS owns the infrastructure and also offers its own CDW (Amazon Redshift).
Why do clients choose to use CDWs?
Support for Business Intelligence tools, real-time analytics and faster data movement. A single ecosystem covers most business needs, which requires less data movement and makes them more secure. They also offer increased scale for storage and compute.
Increased flexibility as they’re accessible from anywhere.
Machine Learning and AI capabilities
AWS
Information included
Amazon Web Services
The largest and most widely used cloud infrastructure provider in the world
70-75% of all activity on their servers
Name of AWS’s CDW
Amazon Redshift
GCP
Google Cloud Platform
Data Lakes
Snowflake
GSIs
Global Systems Integrators
SIs
Systems Integrators
Querying data in cloud
The process of accessing and manipulating data stored in cloud-based data storage services, using query languages like SQL (Structured Query Language).
Cloud infrastructure offers far more computing power than is possible on a local device. This enables users to run wide-ranging queries across an entire enterprise’s ecosystem in real time, so large enterprises can move faster with vast data sets.
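Example (a minimal sketch, not a real warehouse connection): Python's built-in sqlite3 module is used here as a local stand-in, and the orders table and its values are hypothetical. A real CDW would be reached through its own client library or driver, but the SQL itself looks much the same.

```python
import sqlite3

# sqlite3 is only a local stand-in for a cloud data warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up for illustration.
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0)],
)

# Access and aggregate the stored data with SQL.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)
```

The query sums order amounts per region. In a cloud warehouse the same statement would run against far larger tables using the provider's compute.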
Data Federation
It’s a technology that allows partners to collaborate across preferred data environments while minimizing the movement and copying of data.
Data federation proves that the future of data collaboration can serve a multiparty ecosystem that enjoys collaboration across clouds, channels and screens.
For example, let’s say a brand wants to connect its conversion data with a publisher’s exposure data in order to better understand media effectiveness and ROI, but each uses a different cloud for data storage. Historically, both partners would have needed to copy and upload their data into one centralized environment for measurement and analysis. Through federated collaboration, they can connect their data sets and unify identities, all while keeping their data “at home.”
- Think opt-outs: someone unsubscribing from your database
- Legally, you’re required to remove them from all of your datasets, which is extremely challenging unless you have a single source of truth
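Example (a rough analogy only, under stated assumptions): the brand and publisher scenario above can be sketched with two separate in-memory SQLite databases standing in for two different clouds. The table names, user IDs and values are hypothetical, and real federation products work very differently under the hood. The point is that one query joins both data sets while each table stays in its own database.

```python
import sqlite3

# "main" stands in for the brand's environment; a second, separate
# in-memory database stands in for the publisher's cloud (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS publisher")

# Hypothetical tables: each data set stays in its own database.
conn.execute("CREATE TABLE main.conversions (user_id TEXT, revenue REAL)")
conn.execute("CREATE TABLE publisher.exposures (user_id TEXT, campaign TEXT)")
conn.executemany(
    "INSERT INTO main.conversions VALUES (?, ?)",
    [("u1", 30.0), ("u3", 45.0)],
)
conn.executemany(
    "INSERT INTO publisher.exposures VALUES (?, ?)",
    [("u1", "spring"), ("u2", "spring"), ("u3", "summer")],
)

# One query joins exposure and conversion data on a shared identity,
# without copying either table into the other database.
rows = conn.execute(
    """
    SELECT e.campaign, COUNT(*) AS converters, SUM(c.revenue) AS revenue
    FROM publisher.exposures AS e
    JOIN main.conversions AS c ON c.user_id = e.user_id
    GROUP BY e.campaign
    ORDER BY e.campaign
    """
).fetchall()
print(rows)
```

The join key (user_id) plays the role of the unified identity. Exposed users who never converted, such as u2, drop out of the result.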
CDPs + Definition
Customer Data Platform
Top CDPs
Twilio
Salesforce
Treasure Data
Bloomreach
Adobe Real-Time CDP
Lytics
+ more
CDP - expand here: