Roles and services Flashcards
Database Administrator
responsible for the design, implementation, maintenance, and operational aspects of on-premises and cloud-based database systems.
Data Engineer
A data engineer collaborates with stakeholders to design and implement data-related workloads, including data ingestion pipelines, cleansing and transformation activities, and data stores for analytical workloads.
They use a wide range of data platform technologies, including relational and non-relational databases, file stores, and data streams.
Data Analyst
A data analyst enables businesses to maximize the value of their data assets.
They’re responsible for exploring data to identify trends and relationships, designing and building analytical models, and enabling advanced analytics capabilities through reports and visualizations.
A data analyst processes raw data into relevant insights based on identified business requirements.
Azure SQL Database
a fully managed platform-as-a-service (PaaS) database hosted in Azure
Azure SQL Managed Instance
a hosted instance of SQL Server with automated maintenance, which allows more flexible configuration than Azure SQL Database but with more administrative responsibility for the owner.
Azure SQL VM
a virtual machine with an installation of SQL Server, allowing maximum configurability with full management responsibility.
Azure Database for MySQL
a simple-to-use open-source database management system that is commonly used in Linux, Apache, MySQL, and PHP (LAMP) stack apps.
Azure Database for MariaDB
a newer database management system, created by the original developers of MySQL. The database engine has since been rewritten and optimized to improve performance. MariaDB offers compatibility with Oracle Database (another popular commercial database management system).
Azure Database for PostgreSQL
an object-relational database (a hybrid of the relational and object models). You can store data in relational tables, but a PostgreSQL database also enables you to store custom data types, with their own non-relational properties.
Azure Cosmos DB
is a global-scale non-relational (NoSQL) database system that supports multiple application programming interfaces (APIs), enabling you to store and manage data as JSON documents, key-value pairs, column-families, and graphs.
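To make the four data shapes concrete, here is a minimal sketch in plain Python showing one record expressed as a JSON document, a key-value pair, a column-family row, and a small graph. It does not use the Azure Cosmos DB SDK; the customer record and every field name are assumptions made up for illustration.

```python
import json

# Hypothetical customer record, invented for this illustration.
customer_id = "c-001"

# JSON document: one self-describing, nested object per entity.
document = {
    "id": customer_id,
    "name": "Ana",
    "orders": [{"sku": "A1", "qty": 2}],
}

# Key-value pair: the store sees only a key and an opaque value.
key_value = {customer_id: json.dumps(document)}

# Column-family: a row key maps to named families of related columns.
column_family = {
    customer_id: {
        "profile": {"name": "Ana"},
        "orders": {"A1": 2},
    }
}

# Graph: entities become vertices, relationships become edges.
vertices = [
    {"id": customer_id, "label": "customer"},
    {"id": "A1", "label": "product"},
]
edges = [{"from": customer_id, "to": "A1", "label": "ordered", "qty": 2}]
```

The same facts appear in each shape; what changes is how an application would look them up (by document ID, by key, by row and column, or by traversing an edge).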
Azure Storage
Blob containers - scalable, cost-effective storage for binary files.
File shares - network file shares such as you typically find in corporate networks.
Tables - key-value storage for applications that need to read and write data values quickly.
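The key-value idea behind Tables can be sketched with an in-memory stand-in. This is plain Python, not the Azure SDK; the class name InMemoryTable and its methods are hypothetical. It assumes the common pattern in which each entity is identified by a partition key plus a row key.

```python
from typing import Dict, Optional, Tuple


class InMemoryTable:
    """Hypothetical stand-in for a key-value table; not an Azure API."""

    def __init__(self) -> None:
        # Each entity is addressed by the pair (partition_key, row_key).
        self._rows: Dict[Tuple[str, str], dict] = {}

    def upsert(self, partition_key: str, row_key: str, values: dict) -> None:
        # Insert a new entity or overwrite an existing one.
        self._rows[(partition_key, row_key)] = dict(values)

    def get(self, partition_key: str, row_key: str) -> Optional[dict]:
        # A point lookup by full key; returns None when the key is absent.
        return self._rows.get((partition_key, row_key))


table = InMemoryTable()
table.upsert("sensors-eu", "device-17", {"temperature": 21.5})
reading = table.get("sensors-eu", "device-17")
```

Reads and writes are fast in this model because every operation addresses exactly one entity by its key rather than scanning or joining.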
Azure Data Factory
An Azure service that enables you to define and schedule data pipelines, which you can use to build extract, transform, and load (ETL) solutions that populate analytical data stores.
You can integrate your pipelines with other Azure services.
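As a rough illustration of what an ETL solution does, the sketch below runs the three stages with only the Python standard library. It is not Azure Data Factory code: the sample CSV, the orders table, and the function names are all assumptions, and an in-memory SQLite database stands in for the analytical data store.

```python
import csv
import io
import sqlite3

# Hypothetical raw input, with stray whitespace and a missing amount.
RAW_CSV = """order_id,customer,amount
1, Ana ,19.99
2,Ben,
3,ana,5.01
"""


def extract(text):
    # Extract: read the raw rows from the source as dictionaries.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transform: drop rows without an amount, normalize names and types.
    cleaned = []
    for row in rows:
        amount = (row["amount"] or "").strip()
        if not amount:
            continue
        customer = row["customer"].strip().title()
        cleaned.append((int(row["order_id"]), customer, float(amount)))
    return cleaned


def load(rows, conn):
    # Load: write the cleaned rows into the target store.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    rows = transform(extract(RAW_CSV))
    load(rows, conn)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = run_pipeline(conn)
totals = conn.execute(
    "SELECT customer, ROUND(SUM(amount), 2) FROM orders GROUP BY customer"
).fetchall()
```

In a managed pipeline service the same stages would be defined as activities and run on a schedule, rather than called as functions in one script.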
Azure Synapse Analytics
Azure Synapse Analytics is a comprehensive, unified data analytics solution that provides a single service interface for multiple analytical capabilities, including:
Pipelines - based on the same technology as Azure Data Factory.
SQL - a highly scalable SQL database engine, optimized for data warehouse workloads.
Apache Spark - an open-source distributed data processing system that supports multiple programming languages and APIs, including Java, Scala, Python, and SQL.
Azure Synapse Data Explorer - a high-performance data analytics solution that is optimized for real-time querying of log and telemetry data using Kusto Query Language (KQL).
Azure Databricks
is an Azure-integrated version of the popular Databricks platform, which combines the Apache Spark data processing platform with SQL database semantics and an integrated management interface to enable large-scale data analytics.
Azure HDInsight
is an Azure service that provides Azure-hosted clusters for popular Apache open-source big data processing technologies, including:
Apache Spark
Apache Hadoop
Apache HBase
Apache Kafka
Apache Storm