1A. Discover data analysis Flashcards
What are descriptive analytics?
Help answer questions about what has happened based on historical data. These techniques summarize large semantic models to describe outcomes to stakeholders.
Examples include KPIs, metrics such as ROI, and reports that provide a view of an organization’s sales and financial data.
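A minimal sketch of the kind of summary that descriptive analytics produces. The sales and campaign figures are invented for illustration, and `roi` is a hypothetical helper that uses the common formula (gain - cost) / cost.

```python
# Hypothetical monthly figures, invented for illustration only.
monthly_sales = {"Jan": 12000.0, "Feb": 15000.0, "Mar": 13500.0}
campaign_cost = 8000.0
campaign_gain = 10000.0


def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction: (gain - cost) / cost."""
    return (gain - cost) / cost


# Simple KPIs that describe what has already happened.
total_sales = sum(monthly_sales.values())
average_sales = total_sales / len(monthly_sales)
best_month = max(monthly_sales, key=monthly_sales.get)

print(f"Total sales: {total_sales:.2f}")
print(f"Average monthly sales: {average_sales:.2f}")
print(f"Best month: {best_month}")
print(f"Campaign ROI: {roi(campaign_gain, campaign_cost):.1%}")
```

Each value only summarizes historical data; nothing here explains why a month performed well or predicts the next one.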
What are diagnostic analytics?
Help answer questions about why events happened. Generally, this process occurs in three steps:
- Identify anomalies in the data. These anomalies might be unexpected changes in a metric or a particular market.
- Collect data that’s related to these anomalies.
- Use statistical techniques to discover relationships and trends that explain these anomalies.
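The three steps above can be sketched roughly as follows. This is only one possible approach: the data is invented, the z-score threshold of 2.0 is an arbitrary assumption, and `find_anomalies` and `pearson` are hypothetical helper names.

```python
from statistics import mean, pstdev

# Hypothetical daily order counts and site-outage minutes (invented data).
orders = [100, 102, 98, 101, 99, 60, 100, 103]
outage_minutes = [0, 0, 5, 0, 0, 120, 0, 0]


def find_anomalies(values, threshold=2.0):
    """Step 1: return indexes whose z-score magnitude exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    return [
        i for i, v in enumerate(values)
        if sigma and abs(v - mu) / sigma > threshold
    ]


def pearson(xs, ys):
    """Step 3: Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


anomalous_days = find_anomalies(orders)

# Step 2: collect the related data for the flagged days.
related = {day: outage_minutes[day] for day in anomalous_days}

# Step 3: measure how strongly the two series move together.
r = pearson(orders, outage_minutes)

print("Anomalous days:", anomalous_days)
print("Outage minutes on those days:", related)
print(f"Correlation between orders and outages: {r:.2f}")
```

A strong negative correlation here would suggest, but not prove, that outages explain the drop in orders.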
What are predictive analytics?
Help answer questions about what will happen in the future. These techniques use historical data to identify trends and determine if they’re likely to recur. They include a variety of statistical and machine learning methods such as neural networks, decision trees, and regression.
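As a small illustration of regression, one of the techniques named above, the sketch below fits a straight line to historical values with ordinary least squares and extrapolates one period ahead. The sales figures are invented and `fit_line` is a hypothetical helper; real forecasting would need far more data and validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: month number and sales in thousands.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)

# Use the fitted trend to estimate the next, unseen month.
forecast_month_6 = slope * 6 + intercept
print(f"Trend: {slope:.2f} per month, forecast for month 6: {forecast_month_6:.2f}")
```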
What are prescriptive analytics?
Help answer questions about which actions should be taken to achieve a goal or target. Techniques rely on machine learning as one of the strategies to find patterns in large semantic models. By analyzing past decisions and events, organizations can estimate the likelihood of different outcomes.
What are cognitive analytics?
Attempt to draw inferences from existing data and patterns, derive conclusions based on existing knowledge bases, and then add these findings back into the knowledge base for future inferences, creating a self-learning feedback loop. Help you learn what might happen if circumstances change and determine how you might handle these situations.
Inferences aren’t structured queries based on a rules database; rather, they’re unstructured hypotheses that are gathered from several sources and expressed with varying degrees of confidence. Effective analytics of this kind depend on machine learning algorithms, and will use several natural language processing concepts to make sense of previously untapped data sources, such as call center conversation logs and product reviews.
List different kinds of analytics
Descriptive
Diagnostic
Predictive
Prescriptive
Cognitive
List data roles
Business analyst
Data analyst
Data engineer
Data scientist
Database administrator
What is a business analyst?
Closer to the business than a data analyst, and a specialist in interpreting the data that comes from the visualization. Often, the business analyst and data analyst roles are the responsibility of a single person.
What is a data analyst?
Enables businesses to maximize the value of their data assets through visualization and reporting tools such as Microsoft Power BI. Responsible for profiling, cleaning, and transforming data. Responsibilities also include designing and building scalable and effective semantic models, and enabling and implementing advanced analytics capabilities in reports for analysis. Works with the pertinent stakeholders to identify appropriate and necessary data and reporting requirements, and is then tasked with turning raw data into relevant and meaningful insights.
Also responsible for the management of Power BI assets, including reports, dashboards, workspaces, and the underlying semantic models that are used in the reports. They are tasked with implementing and configuring proper security procedures, in conjunction with stakeholder requirements, to ensure the safekeeping of all Power BI assets and their data.
Works with data engineers to determine and locate appropriate data sources that meet stakeholder requirements. Works with the data engineer and database administrator to ensure proper access to the needed data sources. Also works with the data engineer to identify new processes or improve existing processes for collecting data for analysis.
What is a data engineer?
Provisions and sets up data platform technologies that are on-premises and in the cloud. Data engineers manage and secure the flow of structured and unstructured data from multiple sources. The data platforms that they use can include relational databases, nonrelational databases, data streams, and file stores. Data engineers also ensure that data services securely and seamlessly integrate across data platforms.
Primary responsibilities include the use of on-premises and cloud data services and tools to ingest, egress, and transform data from multiple sources. They collaborate with business stakeholders to identify and meet data requirements. They design and implement solutions.
While some alignment might exist in their tasks with those of a database administrator, their scope of work goes well beyond looking after a database and the server where it’s hosted and likely doesn’t include the overall operational data management.
As a data analyst, you would work closely with data engineers to make sure that you can access the variety of structured and unstructured data sources. They also support you in optimizing semantic models, which are typically served from a modern data warehouse or data lake.
What is a data scientist?
Performs advanced analytics to extract value from data. The work can vary from descriptive analytics to predictive analytics. Descriptive analytics evaluate data through a process known as exploratory data analysis (EDA). Predictive analytics are used in machine learning to apply modeling techniques that can detect anomalies or patterns. These analytics are important parts of forecast models. Some data scientists work in the realm of deep learning, performing iterative experiments to solve a complex data problem by using customized algorithms.
What is a database administrator?
Implements and manages the operational aspects of cloud-native and hybrid data platform solutions that are built on Microsoft Azure data services and Microsoft SQL Server. They’re responsible for the overall availability and consistent performance and optimizations of the database solutions. They work with stakeholders to identify and implement the policies, tools, and processes for data backup and recovery plans.
This role is different from the role of a data engineer. This role monitors and manages the overall health of a database and the hardware that it resides on, whereas a data engineer is involved in the process of data wrangling, in other words, ingesting, transforming, validating, and cleaning data to meet business needs and requirements.
Also responsible for managing the overall security of the data, granting and restricting user access and privileges to the data as determined by business needs and requirements.
What are the five key areas of work for a data analyst?
Prepare
Model
Visualize
Analyze
Manage
Describe data preparation
The process of profiling, cleaning, and transforming your data to get it ready to model and visualize. It involves, among other things, ensuring the integrity of the data, correcting wrong or inaccurate data, identifying missing data, converting data from one structure to another or from one type to another, or even a task as simple as making data more readable.
It also involves understanding how you’re going to get and connect to the data, and the performance implications of those decisions. When connecting to data, you need to make decisions to ensure that models and reports meet, and perform to, acknowledged requirements and expectations.
Privacy and security assurances are also important. These assurances can include anonymizing data to avoid oversharing or preventing people from seeing personally identifiable information when it isn’t needed. Alternatively, helping to ensure privacy and security can involve removing that data completely if it doesn’t fit in with the story that you’re trying to shape.
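A rough sketch of a few preparation tasks from this card: tidying text, converting a type, flagging missing or invalid values, and hiding personal data. The rows and field names are invented, and `pseudonymize` and `prepare` are hypothetical helpers. Note that hashing an email is pseudonymization, which is a weaker guarantee than true anonymization.

```python
import hashlib

# Hypothetical raw rows as they might arrive from a source system.
raw_rows = [
    {"customer": " Alice ", "email": "alice@example.com", "amount": "120.50"},
    {"customer": "bob", "email": "bob@example.com", "amount": ""},
    {"customer": "Carol", "email": "carol@example.com", "amount": "abc"},
]


def pseudonymize(value: str) -> str:
    """Replace an identifier with a short, stable hash token."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def prepare(row):
    """Clean one row: tidy text, convert the amount, and hide the email."""
    try:
        amount = float(row["amount"].strip())
    except ValueError:
        amount = None  # missing or invalid; flagged rather than guessed
    return {
        "customer": row["customer"].strip().title(),
        "customer_key": pseudonymize(row["email"].lower()),
        "amount": amount,
    }


prepared = [prepare(r) for r in raw_rows]

# Identify rows that still need attention before modeling.
missing = [r["customer"] for r in prepared if r["amount"] is None]
print("Rows with missing or invalid amounts:", missing)
```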
Describe data modeling
The process of determining how your tables are related to each other. This process is done by defining and creating relationships between the tables.
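To make the idea of a relationship concrete, here is a minimal sketch in plain Python rather than Power BI itself. It assumes a one-to-many relationship between a hypothetical `products` table and a hypothetical `sales` table that share a `product_id` key; all names and values are invented.

```python
# Hypothetical lookup table: one row per product, keyed by product_id.
products = {
    1: {"name": "Widget"},
    2: {"name": "Gadget"},
}

# Hypothetical transaction table: many sales rows can point to one product.
sales = [
    {"product_id": 1, "amount": 10.0},
    {"product_id": 2, "amount": 25.0},
    {"product_id": 1, "amount": 15.0},
]


def total_by_product(sales_rows, product_table):
    """Follow the product_id relationship to total sales per product name."""
    totals = {}
    for row in sales_rows:
        name = product_table[row["product_id"]]["name"]
        totals[name] = totals.get(name, 0.0) + row["amount"]
    return totals


totals = total_by_product(sales, products)
print(totals)
```

Because the relationship is defined once through the shared key, any summary of the sales rows can be grouped by product attributes without copying those attributes into every row.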