Information Architecture Flashcards

Question 1

Q

Benefits of employing an architecture

Answer

A

Baseline for requirements
Easy development of new applications
Reuse architectural assets, products
Platform for selecting new products (tools, apps)
Fewer decisions, hence speed
Set of architectural standards
Defines the business context for sustainable BI
Forces the business to think about the big picture
Enables analytics across a range of processes
Avoids premature rush to selecting products
- Restrains the IT function, business power users

Question 2

Q

The four BI architecture categories

Answer

A

Each of these categories will have sub layers
Requirements flow downward
Implementation flows upward

Question 3

Q

Of the four types of architecture, where is BPM used?

Answer

A

information architecture

Question 4

Q

Questions that need answers when designing architectures

Question 5

Q

define information architecture

Answer

A

the structural design of shared information environments;

the art and science of organizing and labelling websites, intranets, online communities and software to support usability and findability;

Question 6

Q

Basic Process of Information Architecture

Answer

A

to gather data from inside and outside the enterprise
transform it into information that the business uses to operate its business today and to plan for the future.

Question 7

Q

purpose of modelling

Answer

A

Data modelling is about defining the target data structures.

standardise the process
reproduce the process
increase efficiency
measure the process

Question 8

Q

define data integration

Answer

A

combines data from different data sources
provides users with a unified view of the data

Examples:

commercial: two similar companies need to merge their databases
scientific: combining research results from different bioinformatics repositories

Data integration appears with increasing frequency as the volume and the need to share existing data explodes.
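
The commercial example above can be sketched in a few lines of Python. This is only an illustration under assumed data: the two company record layouts, the field names and the `unified_view` helper are all hypothetical, not taken from any particular tool.

```python
# Hypothetical records from two companies' customer databases.
company_a = [
    {"cust_id": 1, "name": "Ann Lee", "city": "Leeds"},
    {"cust_id": 2, "name": "Bob Ray", "city": "York"},
]
company_b = [
    {"customer_no": "B-7", "full_name": "Cara Diaz", "town": "Hull"},
]

def unified_view(a_rows, b_rows):
    """Map both sources onto one shared schema and return a single list."""
    view = []
    for row in a_rows:
        view.append({"source": "A", "id": str(row["cust_id"]),
                     "name": row["name"], "city": row["city"]})
    for row in b_rows:
        view.append({"source": "B", "id": row["customer_no"],
                     "name": row["full_name"], "city": row["town"]})
    return view

customers = unified_view(company_a, company_b)
```

Users of `customers` see one consistent set of fields and do not need to know which source a record came from.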

Question 9

Q

Data Integration Framework Building Blocks

Answer

A

This is a lot more than an Extract, Transform and Load (ETL) tool; ETL tools are only one element in a DIF.
Beware of magic bullets, panaceas and people who tell you their latest tool will fix all your DI problems.
As in a lot of computing, we have the triangle of People, Process and Technology. Architecture and Standards are no less important.

Question 10

Q

Describe Data Integration Frameworks (DIF)

Answer

A

A combination of architecture, processes, standards, people and tools used to transform enterprise data into information for tactical reporting and strategic analysis

Question 11

Q

Data integration framework (DIF) information architecture

6 step process
its purpose

Answer

A

Take data from systems of record,
integrate it
put it in the EDW,
extract from the EDW
put into data marts or OLAP cubes
apply BI and analytics

The objective of the architecture is to gather data from inside and outside the enterprise and transform it into information that the business uses to operate its business today and to plan for the future

Data is gathered, transformed using business rules and technical conversions, and stored in databases to be made available to business users for reporting and analysis

Question 12

Q

2 stages in Data Integration

Answer

A

Data Preparation (collect)
Data Franchising (distribute)

Question 13

Q

Architecture Components

Answer

A

Data Preparation
Data Franchising
Business Intelligence and Analytics
Data Management
Metadata Management

Question 14

Q

Architecture Component:

Data Preparation (6 steps)

Answer

A

Gather
Reformat
Consolidate
Transform
Clean
Store

Question 15

Q

Architecture Component:

Data Franchising

Answer

A

Create information for reporting and analysis with BI tools.
Data further filtered, reorganised, transformed, summarised and/or aggregated, and stored
Copied from DW to business area data marts or cubes

Question 16

Q

Architecture Component:

Business Intelligence and Analytics

Answer

A

Deliver data to business users using BI applications

Reports, spreadsheets, alerts, graphics, analytic applications

Question 17

Q

Architecture Component:

Data Management

Answer

A

Processes and standards used to define, govern and manage a company’s enterprise information assets

Question 18

Q

Architecture Component:

Metadata Management

Answer

A

Processes, procedures and policies that define and manage the metadata used by the DIF

Question 19

Q

Define a Data Mart

Answer

A

The access layer of the data warehouse environment that is used to get data out to the users.

The data mart is usually oriented to a specific business line or team. Whereas data warehouses have an enterprise-wide depth, the information in data marts usually pertains to a single department or business area.

Question 20

Q

Data Preparation Step 1: Gather Data

Answer

A

Part of data integration:

gather data from various internal and external sources
- usually a mix of custom, packaged and cloud applications
transform it according to business and technical rules
stage it for later steps where it becomes information used by business consumers.
- Staging may not be in permanent physical files in every step of the process.

Question 21

Q

Data Profiling

Answer

A

Data profiling is about understanding the data in the source system, before going through the data preparation phase.

Examine the structure, content of data sources
Perform source system analysis
Find anomalies, understand data quality
Feed into design of the data integration workflow
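
A minimal profiling sketch in Python, assuming the source rows have already been read into dictionaries. The sample rows, the `profile_column` helper and the 0 to 120 age rule are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical rows extracted from a source system.
rows = [
    {"id": 1, "country": "UK", "age": 34},
    {"id": 2, "country": "uk", "age": None},
    {"id": 3, "country": "FR", "age": 212},
    {"id": 4, "country": None, "age": 41},
]

def profile_column(rows, column):
    """Return basic structure and content statistics for one column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "types": sorted({type(v).__name__ for v in present}),
        "top_values": Counter(present).most_common(3),
    }

country_profile = profile_column(rows, "country")
age_profile = profile_column(rows, "age")

# Simple anomaly rule (an assumption): ages outside 0-120 are suspect.
suspect_ages = [r["id"] for r in rows
                if r["age"] is not None and not 0 <= r["age"] <= 120]
```

Here the profile would show that `country` mixes "UK" and "uk", and that one age is implausible. Findings like these feed into the design of the later reformat and cleanse steps.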

Question 22

Q

Data Preparation Step 2: Reformat Data

Answer

A

Convert the data to a common format and schema
- To be fed into a Data Warehouse
- Straightforward if there are schema, column definitions for the source data
If not, you may need to discover them (use SME)
All governed by master data in the Reference or Dimension tables
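
As a rough sketch of reformatting, the snippet below renames source columns to a common schema and converts dates to one format. The source names (`crm`, `webshop`), the column maps and the date formats are hypothetical.

```python
from datetime import datetime

# Hypothetical mapping from each source's column names to the common schema.
COLUMN_MAPS = {
    "crm": {"CustName": "customer_name", "SignupDt": "signup_date"},
    "webshop": {"name": "customer_name", "registered": "signup_date"},
}
# Hypothetical date format used by each source.
DATE_FORMATS = {"crm": "%d/%m/%Y", "webshop": "%Y-%m-%d"}

def reformat(source, record):
    """Rename columns and convert the date to an ISO 8601 string."""
    out = {COLUMN_MAPS[source][key]: value for key, value in record.items()}
    parsed = datetime.strptime(out["signup_date"], DATE_FORMATS[source])
    out["signup_date"] = parsed.date().isoformat()
    return out

a = reformat("crm", {"CustName": "Ann Lee", "SignupDt": "03/02/2021"})
b = reformat("webshop", {"name": "Bob Ray", "registered": "2021-02-03"})
```

After this step both records share the same column names and the same date representation, whatever their origin.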

Question 23

Q

Database Schemas

Answer

A

Schema is the structure of the database that defines the objects in the database
In a relational database, the schema defines:
- database’s tables, fields, relationships, indexes, database links, directories, XML schemas, and other elements.
- Set of integrity constraints imposed on a database
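
A small illustration using Python's built-in `sqlite3` module. The `region` and `branch` tables are made-up examples. Note that SQLite only enforces foreign keys when the `foreign_keys` pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# The schema: tables, fields, a relationship, an index and constraints.
conn.executescript("""
CREATE TABLE region (
    region_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE branch (
    branch_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    region_id INTEGER NOT NULL REFERENCES region(region_id)
);
CREATE INDEX idx_branch_region ON branch(region_id);
""")

conn.execute("INSERT INTO region VALUES (1, 'North')")
conn.execute("INSERT INTO branch VALUES (10, 'Leeds', 1)")

# A branch pointing at a non-existent region violates the integrity constraint.
try:
    conn.execute("INSERT INTO branch VALUES (11, 'Nowhere', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The schema objects themselves can be listed from the catalogue table.
objects = [row[0] for row in conn.execute("SELECT name FROM sqlite_master")]
```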

Question 24

Q

Data Preparation Step 3: Consolidate, Standardise, Validate Data

Answer

A

Provide a single, consistent definition for business users
Validate by checking dimensions or reference tables
- To see if it conforms to specific business rules
- Reference files are metadata you build up to describe the eventual Data Warehouse
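
A hedged sketch of validating against a reference table. The country codes, the synonym cross-map and both helper functions are assumptions chosen for the example.

```python
# Hypothetical reference (dimension) data: the agreed country codes.
COUNTRY_REFERENCE = {"GB": "United Kingdom", "FR": "France"}
# Hypothetical cross-map from source spellings to the standard code.
COUNTRY_SYNONYMS = {"UK": "GB", "U.K.": "GB", "GREAT BRITAIN": "GB", "FRANCE": "FR"}

def standardise_country(raw):
    """Return the standard code, or None if the value cannot be mapped."""
    key = raw.strip().upper()
    code = COUNTRY_SYNONYMS.get(key, key)
    return code if code in COUNTRY_REFERENCE else None

def validate(records):
    """Split records into those that conform to the reference table and the rest."""
    valid, rejected = [], []
    for rec in records:
        code = standardise_country(rec["country"])
        if code is None:
            rejected.append(rec)
        else:
            valid.append({**rec, "country": code})
    return valid, rejected

valid, rejected = validate([
    {"id": 1, "country": "uk"},
    {"id": 2, "country": "France"},
    {"id": 3, "country": "Atlantis"},
])
```

Records that fail the check are kept in `rejected` rather than dropped, so they can be reviewed or corrected.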

Question 25

Q

Data Preparation Step 4: Transform Data (+examples)

Answer

A

Business transformations turn data into business information
Apply business rules, algorithms, filters to put data into a business context
- May also associate a business transaction in a dimensional context, such as the region, business division or product hierarchy it is associated with

Examples

Create summary tables
- By week, by month, by year – historical data
- By organisation - branch, area, region, country…
Apply calculations
- Interest, Net Present Value, averages, etc.
Data Warehouse holds historical data
- analysis of trends, etc.
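
Two of the examples above, a monthly summary table and a Net Present Value calculation, might look like this in Python. The sales data is invented, and the NPV convention used (first cash flow occurs now, later ones at the end of each period) is an assumption stated in the code.

```python
from collections import defaultdict

# Hypothetical sales transactions: (ISO date, branch, amount).
sales = [
    ("2023-01-05", "Leeds", 100.0),
    ("2023-01-20", "Leeds", 50.0),
    ("2023-02-02", "Leeds", 70.0),
    ("2023-01-11", "York", 30.0),
]

def summarise_by_month(transactions):
    """Build a summary table keyed by (year-month, branch)."""
    totals = defaultdict(float)
    for date, branch, amount in transactions:
        totals[(date[:7], branch)] += amount
    return dict(totals)

def net_present_value(rate, cash_flows):
    """NPV where cash_flows[0] occurs now and cash_flows[t] after t periods."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

monthly = summarise_by_month(sales)
npv = net_present_value(0.10, [-100.0, 60.0, 60.0])
```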

Question 26

Q

Data Preparation Step 5: Cleanse Data

Answer

A

Goal is to establish data consistency
Cleansing involves a more sophisticated analysis
- e.g. name and address cleansing, customer householding
- simpler data quality checking has already been done
Output is: good and cleaned records
IT people often send cleaned data back to source
- Avoids need to clean it again
- Reduces problems with dirty data
Can buy special purpose data cleansing tools
Particularly for customer data, e.g. name and address
- John Doe, Mr J Doe, Mr John Doe
- 12 Main Street, 12 Main St, 12 Main Street, Suite 135
- Easily distinguished visually; harder for ETL tools
Customer Householding
- Another aspect of data cleansing
- Link family members’ personal and business accounts or purchases
  - For customer convenience
    - no multiple brochures through post
  - For their own convenience in promoting their products
- Done by retailers and financial services firms
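
A deliberately simple sketch of name and address cleansing followed by householding. Real cleansing tools are far more sophisticated. The title list, the abbreviation list, the choice to ignore anything after the first comma, and the "same surname at the same address" rule are all assumptions made for this example.

```python
from collections import defaultdict

TITLES = {"mr", "mrs", "ms", "dr"}                # assumed titles to ignore
ABBREVIATIONS = {"st": "street", "rd": "road"}    # assumed abbreviations

def normalise_name(name):
    """Lower-case the name and drop titles and trailing dots."""
    tokens = [t.strip(".").lower() for t in name.split()]
    return " ".join(t for t in tokens if t not in TITLES)

def normalise_address(address):
    """Keep the part before the first comma and expand abbreviations."""
    street_part = address.split(",")[0]
    tokens = [t.strip(".").lower() for t in street_part.split()]
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def household_key(name, address):
    """Householding rule assumed here: same surname at the same address."""
    surname = normalise_name(name).split()[-1]
    return (surname, normalise_address(address))

customers = [
    ("John Doe", "12 Main Street"),
    ("Mr. John Doe", "12 Main St"),
    ("Mrs Jane Doe", "12 Main Street, Suite 135"),
    ("Ann Lee", "4 Mill Rd"),
]

households = defaultdict(list)
for name, address in customers:
    households[household_key(name, address)].append(name)
```

An initial such as "Mr J Doe" normalises to "j doe", which does not equal "john doe". Matching those two needs fuzzier rules, which is one reason special-purpose tools exist.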

Question 27

Q

Data Preparation Step 6: Store Data

Answer

A

Store transformed, cleansed data in DW
Make it available for further processing
- Either directly from DW or through data franchising

Question 28

Q

Data Franchising

Answer

A

Next set of processes after Data Preparation

Takes data from the Enterprise Data Warehouse
Transforms it to information used by BI tools
Stores in data marts, OLAP cubes etc.
- Making convenient for business analysts

Question 29

Q

Reasons for Data Franchising

Answer

A

Give each business group the subset of data relevant to it
Apply rules, filters, transformations, aggregations that are specific to the business group or process
Makes the data more understandable to that business group
A key point here is that franchising takes from the EDW only the data needed by particular business teams
- So business people can understand the data
Improves business and IT productivity
Enables self-service BI (no need for the IT department)
Aggregations – may take many records and create aggregates
In particular, an aggregate is a unit for data manipulation and management of consistency
Many NoSQL databases use an aggregate data model

Question 30

Q

Data Franchising Step 1: Gather, Filter & Subset data from the DW

Answer

A

Assumption – data preparation has already happened
- so it’s consistent, conformed, clean, current
Filter it by rows, columns to get just what you need
Put it in a staging area – temporary store
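
Filtering by rows and columns can be sketched as below. The warehouse rows and the `subset` helper are hypothetical, and the staging area is just an in-memory list here.

```python
# Hypothetical rows already prepared and stored in the data warehouse.
warehouse_sales = [
    {"sale_id": 1, "region": "North", "product": "A", "amount": 100.0, "cost": 60.0},
    {"sale_id": 2, "region": "South", "product": "B", "amount": 80.0, "cost": 50.0},
    {"sale_id": 3, "region": "North", "product": "B", "amount": 40.0, "cost": 30.0},
]

def subset(rows, row_filter, columns):
    """Keep only the rows that pass row_filter and only the named columns."""
    return [{col: row[col] for col in columns} for row in rows if row_filter(row)]

# Staging area (temporary store) for the North sales team.
staging = subset(
    warehouse_sales,
    row_filter=lambda row: row["region"] == "North",
    columns=["sale_id", "product", "amount"],
)
```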

Question 31

Q

Data Franchising Step 2: Restructure or Denormalise Data

Answer

A

Target schemas will likely be different from source
- Particularly if non-relational databases are used
Define source-to-target mappings

Denormalise means:

Adding redundant copies of the data
Or grouping data
Done for performance reasons
Designer adds constraints to keep copies in sync

In computing, denormalization is the process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data. It is often motivated by performance or scalability in relational database software needing to carry out very large numbers of read operations.
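
To make the trade-off concrete, here is a small sketch that copies customer attributes onto every order row. The tables and the `denormalise` helper are invented for illustration.

```python
# Normalised source: customer attributes are stored once.
customers = {
    1: {"name": "Ann Lee", "region": "North"},
    2: {"name": "Bob Ray", "region": "South"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 1, "amount": 40.0},
    {"order_id": 102, "customer_id": 2, "amount": 15.0},
]

def denormalise(order_rows, customer_table):
    """Copy customer attributes onto every order row (redundant on purpose)."""
    wide = []
    for order in order_rows:
        customer = customer_table[order["customer_id"]]
        wide.append({**order,
                     "customer_name": customer["name"],
                     "customer_region": customer["region"]})
    return wide

wide_orders = denormalise(orders, customers)

# Reads no longer need a lookup (join) against the customer table.
north_total = sum(o["amount"] for o in wide_orders if o["customer_region"] == "North")
```

The cost is on the write side: if a customer moves region, every copied `customer_region` value has to be updated to keep the copies in sync.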

Question 32

Q

Data Franchising Step 3: Transformations and Calculations

Answer

A

Perform the business transformations and metric calculations used by the specific business processes whose marts or cubes you are building

Question 33

Q

Data Franchising Step 4: Aggregate or Summarise

Answer

A

Certain BI tools may need you to summarize or aggregate
Usually to improve response time
- Especially with drill-down dashboards
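
One way to picture pre-aggregation for a drill-down dashboard: totals are computed once at each level, so the dashboard reads small summary tables instead of scanning detail rows. The data, the two levels (region, then branch) and the helper names are assumptions.

```python
from collections import defaultdict

# Hypothetical detail rows in a data mart: (region, branch, amount).
mart_rows = [
    ("North", "Leeds", 100.0),
    ("North", "York", 30.0),
    ("South", "Bath", 45.0),
    ("North", "Leeds", 20.0),
]

def build_aggregates(rows):
    """Pre-compute totals at region level and at (region, branch) level."""
    by_region = defaultdict(float)
    by_branch = defaultdict(float)
    for region, branch, amount in rows:
        by_region[region] += amount
        by_branch[(region, branch)] += amount
    return dict(by_region), dict(by_branch)

by_region, by_branch = build_aggregates(mart_rows)

def drill_down(region):
    """Return the branch totals that make up one region's total."""
    return {branch: total for (reg, branch), total in by_branch.items()
            if reg == region}

north_detail = drill_down("North")
```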

Question 34

Q

Data Franchising Step 5: Store Data

Answer

A

store data in data mart or OLAP cube

marts, cubes use persistent storage
because they’ll be around for a long time

Question 35

Q

Reference or Dimension tables

Answer

A

Used to control the preparation and franchising

Referential integrity
Lookups and cross-maps
Business transformation
Business metric calculation
Query selection criteria
Aggregations
Report value bands
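
As one example from the list, report value bands can be driven by a small reference table. The band limits and labels below are invented. `bisect_right` from the standard library finds which band an amount falls into.

```python
from bisect import bisect_right

# Hypothetical reference data: lower bounds of the 'medium' and 'large' bands.
BAND_LIMITS = [100, 1000]
BAND_LABELS = ["small", "medium", "large"]

def value_band(amount):
    """Look up the report band for an order amount."""
    return BAND_LABELS[bisect_right(BAND_LIMITS, amount)]

bands = [value_band(a) for a in (50, 100, 999, 1000)]
```

Changing the limits in the reference data changes every report that uses them, without touching the report logic.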

Question 36

Q

list some BI applications

Answer

A

Lots of ways to consume the business intelligence…

Question 37

Q

Data Management Processes

Answer

A

Processes and standards used to define, govern and manage a company’s enterprise information assets.

Question 38

Q

Metadata

Answer

A

Description of the data as it is created, transformed, stored, accessed, consumed
Essential for data management

Question 39

Q

Technical Metadata

Answer

A

Description of the data as it is processed
Databases: define columns, tables, indexes

Metadata is used by the software tools to understand and process the data:

ETL tools: fields, source-to-target transformations, workflows
BI tools: fields, reports

Question 40

Q

Business Metadata

Answer

A

Description of the data from business perspective
- e.g. inventory turns, weekly sales, budget variances
Most of the data relevant to the business is not used by software tools

Question 41

Q

Operational BI vs. Analytical BI

Answer

A

Analytical BI – business decisions
Operational BI – operational decisions

Question 42

Q

Benefits of Operational BI

Answer

A

Essential for the day-to-day running of the business
Capture, monitor, report on business transactions
- Operational BI often comes with business applications
Real-time data access and alerting of problems
Dashboards are often used

Question 43

Q

Data mining

Answer

A

Process of discovering patterns in large data sets
Uses combination of AI, machine learning, statistics
- automatic or semi-automatic analysis
Finds patterns, not data itself

The term “mining” is confusing, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself.

Question 44

Q

Data integration workflow