Keywords Flashcards
Data Warehouse
A centralized repository for storing and managing large volumes of data from various sources, used for querying, reporting, and data analysis to support business intelligence activities.
Multidimensional Data
Data organized into multiple dimensions, allowing for analysis from different perspectives, such as time, geography, and product categories.
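As a minimal sketch of what "analysis from different perspectives" can mean, the snippet below sums one measure (a sales amount) along whichever dimensions are chosen. The sample facts, the `DIMENSIONS` mapping, and the `roll_up` helper are illustrative assumptions, not part of any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, region, product category, amount)
facts = [
    (2023, "North", "Laptops", 1200),
    (2023, "South", "Laptops", 800),
    (2023, "North", "Phones", 500),
    (2024, "North", "Laptops", 1500),
    (2024, "South", "Phones", 700),
]

# Position of each dimension inside a fact tuple (assumed layout)
DIMENSIONS = {"year": 0, "region": 1, "category": 2}

def roll_up(facts, *dims):
    """Sum the amount measure, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)

# The same facts viewed from two perspectives
by_year = roll_up(facts, "year")
by_region_category = roll_up(facts, "region", "category")
```

Calling `roll_up` with a different set of dimension names gives a different view of the same data without changing the underlying facts.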
OLAP
Online analytical processing (OLAP) is a database technology, typically provided by an OLAP server, that lets businesses analyze data from multiple sources from different perspectives. It is optimized for querying and reporting rather than for processing transactions.
What is a server?
In computing, a server is a computer that provides services, data, or resources to other devices, called clients, over a network.
What is data mining?
The process of searching and analyzing a large batch of raw data in order to identify patterns and extract useful information.
What is Operational BI?
Operational Business Intelligence provides near real-time to short-term insights focused on daily operations, enabling immediate decision-making rather than long-term strategic planning.
What is analytics?
The methodology of transforming data into insight for making better decisions. It answers questions like why is this happening, what if these trends continue, what will happen next (i.e., predict), and what is the best that can happen.
What is web analytics?
The measurement, collection, analysis, and reporting of web data.
Explain the data analytics process
Identify business needs (choose KPIs)
Collect the data (from the subject matter experts)
Review and clean the data
Model the data
Analyze the data
Interpret the results
Predict and optimize
Communicate
What is the Gartner Magic Quadrant for BI?
An annual report that evaluates and ranks leading business intelligence (BI) vendors based on their ability to execute and their completeness of vision, helping organizations choose the best BI tools for their needs.
What is structured data?
It is highly organized data stored in rows and columns, as in databases and spreadsheets. It is ready to be loaded into a database or a structured file format such as XML. By a commonly cited estimate, only about 20% of available data is structured.
What is unstructured data?
Unstructured data is raw and unorganized. It does not have a predefined data model. It has no identifiable internal structure.
What are three components of a data warehouse system?
Acquisition Component: Interfaces with source systems to import data into the data warehouse. ETL Tools (Extract, Transform, Load): Examples include Apache NiFi, Talend, and Microsoft SQL Server Integration Services (SSIS), which help import data from various sources into the data warehouse.
Storage Component: A large physical database used to store the imported data. Databases: Examples include Amazon Redshift, Google BigQuery, and Snowflake, which store large volumes of data in a structured format.
Access Component: Enables accessing and analyzing the data in the data warehouse. BI Tools: Examples include Tableau, Power BI, and Looker, which allow users to access, query, and analyze data stored in the data warehouse.
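The three components can be sketched end to end with Python's built-in `sqlite3` module. This is only an illustration: an in-memory SQLite database stands in for a real warehouse, and the `source_rows` data, the `transform` helper, and the `orders` table are assumptions made for the example.

```python
import sqlite3

# Acquisition: hypothetical rows extracted from a source system
source_rows = [
    {"order_id": "1", "amount": " 19.99 ", "country": "us"},
    {"order_id": "2", "amount": "5.00", "country": "DE"},
]

def transform(row):
    """Clean one extracted row: cast types, trim spaces, normalize case."""
    return (
        int(row["order_id"]),
        float(row["amount"].strip()),
        row["country"].upper(),
    )

# Storage: an in-memory SQLite database stands in for the warehouse
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [transform(r) for r in source_rows],
)
conn.commit()

# Access: the kind of query a BI tool might issue against the stored data
totals = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()
```

In a real system each stage would be handled by a dedicated tool such as those named above; the sketch only shows how the stages hand data to one another.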
What are some of the benefits of metadata?
- It is the source of information about the operations that were applied to the imported data
- It documents relationships between data structures
- It provides useful mapping information
- It can be used to review how business definitions and calculations changed over time, and it provides a history of extracts and changes in the data
What is business metadata?
Provides information about the data, its sources, definitions, etc. in business terminology.
What is technical metadata?
It defines the objects and processes in the data warehouse.
What is process metadata?
It documents the data warehouse operations.
What is access metadata?
Access metadata provides the dynamic link between a data warehouse and its associated applications.
What is data conversion?
The process of converting data from one format into another, made necessary by differences in storage types and data structures and by variations in data encoding across computer systems.
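A small sketch of a conversion that changes both the encoding and the format: Latin-1 encoded CSV bytes are turned into UTF-8 encoded JSON. The sample data and variable names are assumptions chosen for illustration.

```python
import csv
import io
import json

# Hypothetical export from a legacy system: CSV text encoded as Latin-1
legacy_bytes = "id,city\n1,München\n2,Málaga\n".encode("latin-1")

# Step 1: decode with the source system's encoding
text = legacy_bytes.decode("latin-1")

# Step 2: parse the source format (CSV) into neutral Python records
records = list(csv.DictReader(io.StringIO(text)))

# Step 3: serialize into the target format (JSON) and encoding (UTF-8)
json_bytes = json.dumps(records, ensure_ascii=False).encode("utf-8")
```

Decoding with the wrong source encoding is a common cause of garbled characters, which is why the source encoding is stated explicitly here.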
What is data integration?
Data Integration: Imagine you work with data from multiple sources, like sales records, customer information, and inventory lists. Data integration is the process of combining this data into a single, unified view. It’s like creating a master spreadsheet where all this information is brought together, making it easier to analyze and share with your team or partners.
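The "master spreadsheet" idea can be sketched by joining three hypothetical sources on their shared keys. The `customers`, `inventory`, and `sales` data and the `integrate` helper are assumptions for illustration only.

```python
# Three hypothetical sources, each keyed differently
customers = {101: "Acme Corp", 102: "Globex"}
inventory = {"SKU-1": 40, "SKU-2": 0}
sales = [
    {"customer_id": 101, "sku": "SKU-1", "qty": 3},
    {"customer_id": 102, "sku": "SKU-2", "qty": 1},
    {"customer_id": 103, "sku": "SKU-1", "qty": 2},  # customer missing from source
]

def integrate(sales, customers, inventory):
    """Build one unified record per sale by looking up the other sources."""
    unified = []
    for sale in sales:
        unified.append({
            **sale,
            # None marks a customer that could not be matched
            "customer_name": customers.get(sale["customer_id"]),
            # Unknown SKUs are treated as having no stock
            "in_stock": inventory.get(sale["sku"], 0),
        })
    return unified

unified_view = integrate(sales, customers, inventory)
```

Unmatched keys, such as customer 103 above, are typical of what integration work has to surface and resolve.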
What is data migration?
Data Migration: Data migration is when you move data from one system to another. For example, if your company is switching from an old database to a new one, you would transfer all the data from the old system to the new one. Once the data is successfully moved, the old system is no longer needed and can be retired.
What is data quality?
How accurate, complete, reliable, and relevant data is for its intended use, ensuring that data is consistent, free of errors, and useful for making decisions and analysis.
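Two of these properties, completeness and consistency of identifiers, can be measured directly. The sketch below assumes a small list of hypothetical records; the `completeness` and `duplicate_ids` helpers are illustrative names, not a standard API.

```python
# Hypothetical records with a missing email, a missing age, and a repeated id
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": None, "age": 29},
    {"id": 2, "email": "b@example.com", "age": None},
    {"id": 4, "email": "c@example.com", "age": 41},
]

def completeness(records, field):
    """Fraction of records in which the field is filled in."""
    filled = sum(1 for r in records if r.get(field) is not None)
    return filled / len(records)

def duplicate_ids(records):
    """Set of id values that appear more than once."""
    seen, dupes = set(), set()
    for r in records:
        if r["id"] in seen:
            dupes.add(r["id"])
        else:
            seen.add(r["id"])
    return dupes

email_completeness = completeness(records, "email")
repeated = duplicate_ids(records)
```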
What is Master Data Management?
Master Data Management (MDM): MDM focuses on creating a single, consistent, and accurate view of key business data entities, such as customers, products, and suppliers. It involves processes and tools for integrating, cleansing, and maintaining this master data across different systems and departments to ensure consistency and reliability.
What are some data cleansing tool categories?
Data error discovery tools and data correction tools.
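The two categories can be illustrated with a pair of small functions: one that only reports problems and one that only fixes them. The rules used here (trimming whitespace, lowercasing emails, a deliberately simple email pattern) are assumptions for the example, not how any specific tool works.

```python
import re

# Deliberately simple pattern, for illustration only
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def discover_errors(rows):
    """Error discovery: return (row index, field, problem) without changing data."""
    errors = []
    for i, row in enumerate(rows):
        if row["name"] != row["name"].strip():
            errors.append((i, "name", "surrounding whitespace"))
        if not EMAIL_RE.match(row["email"].strip().lower()):
            errors.append((i, "email", "invalid format"))
    return errors

def correct(rows):
    """Error correction: return cleaned copies of the rows."""
    return [
        {"name": r["name"].strip(), "email": r["email"].strip().lower()}
        for r in rows
    ]

rows = [
    {"name": " Ada ", "email": "ADA@Example.com"},
    {"name": "Bob", "email": "bob-at-example.com"},
]
errors = discover_errors(rows)
cleaned = correct(rows)
```

Note that the second email cannot be corrected automatically; discovery flags it so that a person or a more specialized rule can decide what to do.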