Tools Flashcards
Enterprise data warehouse for big data that processes data in parallel across up to 60 compute nodes
Azure SQL Data Warehouse
Cloud service for ETL of big data. It stores pipeline-run data for 45 days.
Azure Data Factory
Platform based on Apache Spark clusters that lets you create big data workflows
Azure Databricks
Can visualise data using Python, Scala, R, and SQL
Azure Databricks
Stores data in multiple formats for consumption, up to petabyte scale
Azure Data Lake
Converts different incoming formats to a normalised relational format
Azure PolyBase
Large-scale data warehousing solution used for large-scale queries and big data analytics
Azure Synapse Analytics
This is a big data store that supports data of any type and size. It supports receiving data in batches
Azure Data Lake
Data can be kept in a year- and month-based hierarchical structure
Azure Data Lake Storage Gen2
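Example for the card above: a minimal sketch of what a year- and month-based folder layout might look like. It is plain string formatting, not an Azure API call, and the names `partition_path`, `raw/sales`, and `orders.csv` are hypothetical and chosen only for illustration.

```python
from datetime import date


def partition_path(root: str, day: date, filename: str) -> str:
    """Build a year/month folder path for a file that arrived on `day`.

    Hypothetical helper: it only formats a path string and does not
    talk to any storage service.
    """
    # Zero-pad the month so folders sort correctly as text (01, 02, ... 12).
    return f"{root}/{day.year:04d}/{day.month:02d}/{filename}"


# A file that arrived on 14 July 2023 lands under the 2023/07 folder.
print(partition_path("raw/sales", date(2023, 7, 14), "orders.csv"))
# prints "raw/sales/2023/07/orders.csv"
```

Grouping files this way means a query for a single month only needs to read one folder.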
Creates a dedicated link between on-premises networks and Azure data centers. This improves performance when copying data to Azure
ExpressRoute
Supports real-time data ingestion for real-time processing
Azure Event Hubs