Definitions Flashcards
Describe Azure Storage Accounts
- When you need a low cost, high throughput data store.
- When you need to store NoSQL data.
- When you do not need to query the data directly. No ad hoc query support.
- Suits the storage of archive or relatively static data.
- Suits acting as a HDInsight Hadoop data store.
Describe Data Lake Store
- When you need a low cost, high throughput data store.
- Unlimited storage for NoSQL data.
- When you do not need to query data directly. No ad hoc query support.
- Suits the storage of archive or relatively static data.
- Suits acting as a Databricks, HDInsight, and IoT data store
Describe Azure Databricks.
- Eases the deployment of a Spark based cluster.
- Enables the fastest processing of ML solutions.
- Enables collaboration between data engineers and data scientists.
- Provides tight enterprise security integration with Azure Active Directory.
- Integration with other Azure Services and Power BI.
Describe Azure Cosmos DB (Premium)
- Provides global distribution for both structured and unstructured data stores.
- Millisecond query response time.
- 99.999% availability of data.
- Worldwide elastic scale of both the storage and throughput.
- Multiple consistency levels to control data integrity with concurrency.
Describe Azure SQL Database.
- When you require a relational data store.
- When you need to manage transactional workloads.
- When you need to manage a high volume of inserts and reads.
- When you need a service that requires high concurrency.
- When you require a solution that can scale elastically.
Describe Azure Data Warehouse.
- When you require a relational data store.
- When you need to manage analytical workloads.
- When you need low cost storage.
- When you require the ability to pause and restart the compute.
- When you require a solution that can scale elastically.
Describe Azure Stream Analytics.
- When you require a fully managed event processing engine (utilizes Azure Event Hubs).
- When you require temporal analysis of streaming data.
- Support for analyzing IoT streaming data.
- Support for analyzing application data through Event Hubs.
- Ease of use with Stream Analytics Query Language.
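Temporal analysis of a stream usually means grouping events into time windows. The following is a minimal Python sketch of a tumbling (fixed, non-overlapping) window count. It only illustrates the idea and is not the Stream Analytics Query Language; the event tuples, device names, and the `tumbling_window_counts` helper are hypothetical.

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, device) in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_in_seconds, device_id) tuples.
    """
    counts = Counter()
    for timestamp, device_id in events:
        # Every timestamp falls into exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, device_id)] += 1
    return dict(counts)

# Hypothetical IoT readings: (seconds since start, device id)
sample_events = [(1, "sensor-a"), (4, "sensor-a"), (7, "sensor-b"), (12, "sensor-a")]
window_counts = tumbling_window_counts(sample_events, window_seconds=10)
```

Here the first three readings land in the window starting at 0 and the last one in the window starting at 10.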
Describe Azure Data Factory.
- When you want to orchestrate the batch movement of data (pipelines).
- When you want to connect to a wide range of data platforms.
- When you want to transform or enrich the data in movement.
- When you want to integrate with SSIS packages.
- Enables verbose logging of data processing activities.
Describe Azure HDInsight (Hadoop, Open)
- When you need a low cost, high throughput data store.
- When you need to store NoSQL data.
- Suits acting as a Hadoop, HBase, LLAP, or Kafka data store.
- Eases the deployment and management of clusters.
Describe Azure Data Catalog.
- When you require documentation of your data stores.
- When you require a multi user approach to documentation.
- When you need to annotate data sources with descriptive metadata.
- A fully managed cloud service whose users can discover the data sources.
- When you require a solution that can help business users understand their data.
Describe Azure Queue Storage.
- Azure Queue Storage is a service for storing large numbers of messages.
- You access messages from anywhere in the world via authenticated calls using HTTP or HTTPS.
- A queue message can be up to 64 KB in size.
- A queue may contain millions of messages, up to the total capacity limit of a storage account.
- Queues are commonly used to create a backlog of work to process asynchronously.
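The backlog pattern and the per-message size limit can be sketched locally in Python. This is an illustration only: it uses the standard-library `queue.Queue` as a stand-in and does not call Azure Queue Storage. The `enqueue` and `drain` helpers are hypothetical, and treating 64 KB as 64 * 1024 bytes is an assumption.

```python
import queue

MAX_MESSAGE_BYTES = 64 * 1024  # assumed byte value of the 64 KB per-message limit

def enqueue(backlog, text):
    """Add a message to the backlog, rejecting anything over the size limit."""
    size = len(text.encode("utf-8"))
    if size > MAX_MESSAGE_BYTES:
        raise ValueError(f"message is {size} bytes; limit is {MAX_MESSAGE_BYTES}")
    backlog.put(text)

def drain(backlog, handler):
    """Process every queued message in arrival order and return the results."""
    results = []
    while not backlog.empty():
        results.append(handler(backlog.get()))
    return results

backlog = queue.Queue()  # local stand-in, not Azure Queue Storage
enqueue(backlog, "resize image 1")
enqueue(backlog, "resize image 2")
processed = drain(backlog, str.upper)  # the worker runs later, decoupled from the producer
```

The producer only enqueues small work descriptions; a separate worker drains the backlog at its own pace.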
Describe Azure Tables.
- NoSQL key-value Storage
- Items are referred to as rows, fields are known as columns
- All rows in a table must have a key.
- No Concept of relationships
- Data will usually be denormalized
- Used for logging and performance monitoring
- Storing TBs of structured data, capable of serving web scale apps
- Datasets that do not require complex joins, foreign keys, or stored procedures
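A key-value table with denormalized rows can be pictured with a plain Python dictionary. This is a sketch of the data model only, not the Azure Tables SDK. In Azure Table storage the key is the pair of partition key and row key; the helper names, column names, and log values below are hypothetical.

```python
def upsert_row(table, partition_key, row_key, **columns):
    """Store a row under its (partition key, row key) pair, replacing any old row."""
    table[(partition_key, row_key)] = dict(columns)

def get_row(table, partition_key, row_key):
    """Point lookup by key; returns None when the row does not exist."""
    return table.get((partition_key, row_key))

log_table = {}

# Denormalized: the host's region is repeated on every log row
# instead of being joined in from a separate hosts table.
upsert_row(log_table, "web-01", "0001", level="INFO", message="started", host_region="westeurope")
upsert_row(log_table, "web-01", "0002", level="WARN", message="slow request", host_region="westeurope")

row = get_row(log_table, "web-01", "0002")
```

Lookups go straight to a key, which is why this model suits logging and monitoring data that never needs joins.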
Describe Azure File Storage.
- Lets you create file shares in the cloud (policy documents, etc.)
- Accessible from Windows, Linux, and macOS
- Accessible via the Server Message Block (SMB) protocol or the Network File System (NFS) protocol
- Data is encrypted at rest, and the SMB protocol ensures data is encrypted in transit.
Describe Azure Disk Storage.
- VM uses disks to store OS, apps, data
- One VM can have one OS disk and multiple data disks, but one data disk can only be linked with one VM
- Both OS disk and data disk are virtual hard disks (VHDs)
- Unmanaged disk: create storage account, specify it when we create the disk. Not recommended.
- Managed disk: Azure creates and manages the storage accounts (scalable, resilient)
- Standard HDD/SSD, Premium SSD, Ultra SSD
Describe Cosmos DB.
- Serverless architecture, DaaS, no OPEX, no schema or index management, 5x 9s availability
- Multi-model (JSON, table, graph, columnar), multi-language (Java, .NET, Python, Node.js, JavaScript).
- Globally distributed, multi-model database for mission-critical applications
Describe Azure Data Lake Gen2.
- Very big container to store data.
- No limit to Data Lake storage.
- Stores structured, unstructured, batches, sensor data.
- Takes advantage of Blob storage and Hadoop together.
- Optimized for big data analytics.
- Supports multiple Azure integrations.
Describe Azure Blob Storage.
- Large object storage in the cloud.
- Optimized for storing mass amounts of unstructured data
- General purpose, cost efficient
Cosmos DB SQL (Core) API
- New projects being created from scratch.
- JSON Documents.
- Supports server side programming model.
Cosmos DB MongoDB API
- BSON Documents
- Fully compatible with MongoDB application code
- Migrate existing MongoDB applications to Cosmos DB without much change of logic.
Cosmos DB Table API
- NoSQL database
- Premium offering for Azure Table Storage
- A row cannot store an object
Cosmos DB Cassandra API
- Wide-column NoSQL database
- The name and format of columns can vary from row to row.
- Migrate a Cassandra application to the Cosmos DB Cassandra API by changing the connection string.
Cosmos DB Gremlin API
- Graph data model!
- Real world data connected with each other
- Graph database can persist relationships in the storage layer
- No schema, no dependencies, relationships exist naturally, demonstrate how real-world objects are related
- Geospatial, Social networks, Recommendation engines, IoTs
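The recommendation-engine use case can be sketched with a tiny in-memory graph in Python. This is not Gremlin and does not talk to Cosmos DB; it only shows how stored relationships (edges) are traversed. The vertex names, edge labels, and the `add_edge`, `out`, and `recommend` helpers are hypothetical.

```python
from collections import defaultdict

# Edges are stored directly, keyed by (source vertex, edge label).
edges = defaultdict(set)

def add_edge(source, label, target):
    """Persist a labeled relationship from source to target."""
    edges[(source, label)].add(target)

def out(vertex, label):
    """Follow outgoing edges with the given label from a vertex."""
    return edges.get((vertex, label), set())

def recommend(person):
    """Suggest items bought by people this person knows but has not bought."""
    owned = out(person, "bought")
    suggestions = set()
    for friend in out(person, "knows"):
        suggestions |= out(friend, "bought") - owned
    return sorted(suggestions)

add_edge("alice", "knows", "bob")
add_edge("alice", "bought", "laptop")
add_edge("bob", "bought", "laptop")
add_edge("bob", "bought", "headphones")

recommendations = recommend("alice")
```

The traversal hops from a person to the people they know and then to what those people bought, with no join tables or fixed schema involved.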
Define Partition Key.
It is the value by which Azure organizes your data into logical divisions.
Define Logical Partitions.
They are formed based on the value of a partition key that is associated with each item in a container. (digital separation)
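As a rough illustration, grouping items by their partition key value yields the logical partitions. The Python sketch below runs locally and does not use the Cosmos DB SDK; the item fields, the choice of `city` as the partition key, and the `group_into_logical_partitions` helper are hypothetical.

```python
from collections import defaultdict

def group_into_logical_partitions(items, partition_key):
    """Group items so that all items sharing a partition key value sit together."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item[partition_key]].append(item)
    return dict(partitions)

# Hypothetical container items; "city" is the assumed partition key.
items = [
    {"id": "1", "city": "Oslo", "name": "Ada"},
    {"id": "2", "city": "Lima", "name": "Ben"},
    {"id": "3", "city": "Oslo", "name": "Cy"},
]

partitions = group_into_logical_partitions(items, "city")
```

Items 1 and 3 share the value `Oslo`, so they fall into the same logical partition, while item 2 forms its own.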