Information Architecture Flashcards

1
Q

Benefits of employing an architecture

A
  • Baseline for requirements
  • Easy development of new applications
  • Reuse architectural assets, products
  • Platform for selecting new products (tools, apps)
  • Fewer decisions, hence speed
  • Set of architectural standards
  • Defines the business context for sustainable BI
  • Forces the business to think about the big picture
  • Enables analytics across a range of processes
  • Avoids premature rush to selecting products
    • Restrains the IT function, business power users
2
Q

The four BI architecture categories

A
  • Each of these categories will have sub-layers
  • Requirements flow downward
  • Implementation flows upward
3
Q

Of the four types of architecture, where is BPM used?

A

information architecture

4
Q

Questions that need answers when designing architectures

A
5
Q

Define information architecture

A

the structural design of shared information environments;

the art and science of organizing and labelling websites, intranets, online communities and software to support usability and findability;

6
Q

Basic Process of Information Architecture

A
  • to gather data from inside and outside the enterprise
  • transform it into information that the business uses to operate today and to plan for the future.
7
Q

purpose of modelling

A

Data modelling is about defining the target data structures.

  • standardise the process
  • reproduce the process
  • increase efficiency
  • measure the process
8
Q

Define data integration

A
  • combines data from different data sources
  • provides users with a unified view of the data

Examples:

  • commercial: two similar companies need to merge their databases
  • scientific: combining research results from different bioinformatics repositories

Data integration appears with increasing frequency as the volume of data and the need to share existing data explode.
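The "unified view" idea can be sketched in a few lines of Python. This is a minimal, invented illustration (the source systems, field names and records are all hypothetical), not a real integration tool:

```python
# Two hypothetical source systems describe the same customers under
# different field names; integration maps both into one unified view.
crm_rows = [{"cust_id": 1, "full_name": "John Doe"}]              # e.g. a CRM export
billing_rows = [{"customer": 1, "name": "John Doe", "balance": 250.0}]  # e.g. billing

def unify(crm, billing):
    """Merge the two sources on customer id into one record per customer."""
    merged = {}
    for row in crm:
        merged[row["cust_id"]] = {"id": row["cust_id"],
                                  "name": row["full_name"],
                                  "balance": None}
    for row in billing:
        rec = merged.setdefault(row["customer"],
                                {"id": row["customer"], "name": row["name"],
                                 "balance": None})
        rec["balance"] = row["balance"]   # enrich with the billing source's field
    return list(merged.values())

unified = unify(crm_rows, billing_rows)
```

Real integration adds schema mapping, conflict resolution and data quality handling on top of this basic merge step.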

9
Q

Data Integration Framework Building Blocks

A
  • This is a lot more than an Extract, Transform and Load (ETL) tool; ETL tools are only one element in a DIF.
  • Beware of magic bullets, panaceas and of people who tell you their latest tool will fix all your DI problems.
  • As in much of computing, we have the triangle of People, Process and Technology; Architecture and Standards are no less important.
10
Q

Describe Data Integration Frameworks (DIF)

A

A combination of architecture, processes, standards, people and tools used to transform enterprise data into information for tactical reporting and strategic analysis

11
Q

Data integration framework (DIF) information architecture

  • six-step process
  • its purpose
A
  1. Take data from systems of record,
  2. integrate it
  3. put it in the EDW,
  4. extract from the EDW
  5. put into data marts or OLAP cubes
  6. apply BI and analytics

The objective of the architecture is to gather data from inside and outside the enterprise and transform it into information that the business uses to operate today and to plan for the future.

Data is gathered, transformed using business rules and technical conversions, and stored in databases to be made available to business users for reporting and analysis.

12
Q

2 stages in Data Integration

A
  • Data Preparation (collect)
  • Data Franchising (distribute)
13
Q

Architecture Components

A
  • Data Preparation
  • Data Franchising
  • Business Intelligence and Analytics
  • Data Management
  • Metadata Management
14
Q

Architecture Component:

Data Preparation (6 steps)

A
  1. Gather
  2. Reformat
  3. Consolidate
  4. Transform
  5. Clean
  6. Store
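The six steps above can be sketched as one toy pipeline. All records, field names and rules here are invented for illustration; a real pipeline would use an ETL tool and a proper warehouse:

```python
# Toy data-preparation pipeline: gather -> reformat -> consolidate ->
# transform -> cleanse -> store. Input rows are invented sample data.
raw_sources = [
    [{"id": "1", "amount": "10.50"}],   # source system 1
    [{"id": "2", "amount": "bad"}],     # source system 2, with a dirty record
]

def prepare(sources):
    gathered = [row for src in sources for row in src]          # 1. gather
    reformatted = []
    for row in gathered:                                        # 2. reformat to common types
        try:
            reformatted.append({"id": int(row["id"]),
                                "amount": float(row["amount"])})
        except ValueError:
            continue                                            # 5. cleanse: drop bad records
    # 3. consolidate to one record per id; 4. transform: add a derived field
    consolidated = {r["id"]: {**r, "amount_cents": round(r["amount"] * 100)}
                    for r in reformatted}
    return list(consolidated.values())                          # 6. store (here, a list)

warehouse = prepare(raw_sources)
```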
15
Q

Architecture Component:

Data Franchising

A
  • Create information for reporting and analysis with BI tools.
  • Data further filtered, reorganised, transformed, summarised and/or aggregated, and stored
  • Copied from DW to business area data marts or cubes
16
Q

Architecture Component:

Business Intelligence and Analytics

A

Deliver data to business users using BI applications

  • Reports, spreadsheets, alerts, graphics, analytic applications
17
Q

Architecture Component:

Data Management

A

Processes and standards used to define, govern and manage a company’s enterprise information assets

18
Q

Architecture Component:

Metadata Management

A

Processes, procedures and policies that define and manage the metadata used by the DIF

19
Q

Define a Data Mart

A

The access layer of the data warehouse environment that is used to get data out to the users.

The data mart is usually oriented to a specific business line or team. Whereas data warehouses have an enterprise-wide depth, the information in data marts usually pertains to a single department or business area.

20
Q

Data Preparation Step 1: Gather Data

A

Part of data integration:

  • gather data from various internal and external sources
    • usually a mix of custom, packaged and cloud applications
  • transform it according to business and technical rules
  • stage it for later steps where it becomes information used by business consumers.
    • Staging may not be in permanent physical files in every step of the process.
21
Q

Data Profiling

A

Data profiling is about understanding the data in the source system, before going through the data preparation phase.

  • Examine the structure, content of data sources
  • Perform source system analysis
  • Find anomalies, understand data quality
  • Feed into design of the data integration workflow
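A simple column profile can be computed directly; the sample rows and metrics below are an invented sketch of what a profiling tool reports:

```python
def profile(rows, column):
    """Summarise one column of a source: counts, nulls, distinct values, range."""
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),      # anomaly signal: missing data
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }

# Invented sample source rows with one missing value.
sample = [{"age": 34}, {"age": 41}, {"age": None}, {"age": 34}]
age_profile = profile(sample, "age")
```

Results like these feed into the design of the data integration workflow (e.g. which columns need cleansing rules).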
22
Q

Data Preparation Step 2: Reformat Data

A
  • Convert the data to a common format and schema
    • To be fed into a Data Warehouse
    • Straightforward if there are schema, column definitions for the source data
  • If not, you may need to discover them (with subject-matter experts, SMEs)
  • All governed by master data in the Reference or Dimension tables
23
Q

Database Schemas

A
  • The schema is the structure of the database that defines the objects it contains
  • In a relational database, the schema defines:
    • database’s tables, fields, relationships, indexes, database links, directories, XML schemas, and other elements.
    • Set of integrity constraints imposed on a database
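A relational schema with tables, a relationship, an index and integrity constraints can be shown with Python's built-in sqlite3; the table and column names are invented:

```python
import sqlite3

# In-memory database; schema objects below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE region (
        region_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE               -- integrity constraint
    )""")
conn.execute("""
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY,
        region_id INTEGER NOT NULL
            REFERENCES region(region_id),       -- relationship (foreign key)
        amount REAL NOT NULL CHECK (amount >= 0)  -- integrity constraint
    )""")
# An index on the foreign-key column, another schema object.
conn.execute("CREATE INDEX idx_sale_region ON sale(region_id)")

# The schema itself is queryable metadata.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```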
24
Q

Data Preparation Step 3: Consolidate, Standardise, Validate Data

A
  • Provide a single, consistent definition for business users
  • Validate by checking dimensions or reference tables
    • To see if it conforms to specific business rules
    • Reference files are metadata you build up to describe the eventual Data Warehouse
25
Q

Data Preparation Step 4: Transform Data (+examples)

A
  • Business transformations turn data into business information
  • Apply business rules, algorithms, filters to put data into a business context
    • May also place a business transaction in a dimensional context, such as the region, business division or product hierarchy it belongs to

Examples

  • Create summary tables
    • By week, by month, by year – historical data
    • By organisation - branch, area, region, country…
  • Apply calculations
    • Interest, Net Present Value, averages, etc.
  • Data Warehouse holds historical data
    • analysis of trends, etc.
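A summary table plus an applied calculation, per the examples above, might look like this; the transactions and period keys are invented:

```python
from collections import defaultdict

# Hypothetical historical transactions.
transactions = [
    {"month": "2024-01", "region": "EMEA", "amount": 100.0},
    {"month": "2024-01", "region": "EMEA", "amount": 50.0},
    {"month": "2024-02", "region": "EMEA", "amount": 80.0},
]

def summarise_by_month(rows):
    """Build a summary table: total amount per month (a business transformation)."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["month"]] += r["amount"]
    return dict(totals)

monthly = summarise_by_month(transactions)
average = sum(monthly.values()) / len(monthly)   # an applied calculation
```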
26
Q

Data Preparation Step 5: Cleanse Data

A
  • Goal is to establish data consistency
  • Cleansing involves a more sophisticated analysis
    • e.g. name and address cleansing, customer householding
    • simpler data quality checking has already been done
  • Output: good, cleaned records
  • IT people often send cleaned data back to source
    • Avoids need to clean it again
    • Reduces problems with dirty data
  • Can buy special purpose data cleansing tools
  • Particularly for customer data, e.g. name and address
    • John Doe, MrJ Doe, MrJohn Doe
    • 12 Main Street, 12 Main St, 12 Main Street, Suite 135
    • Easily distinguished visually; harder for ETL tools
  • Customer Householding
    • Another aspect of data cleansing
    • Link family members’ personal and business accounts or purchases
      • For customer convenience
        • no multiple brochures through post
      • For their own convenience in promoting their products
    • Done by retailers and financial services firms
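The address example above ("12 Main Street" vs "12 Main St") can be handled with a toy standardisation rule. Real cleansing tools use far richer rules and reference data; this regex is only an invented sketch:

```python
import re

def clean_address(raw):
    """Standardise a street address: collapse whitespace, expand 'St' to 'Street'."""
    s = re.sub(r"\s+", " ", raw.strip())
    # Expand the abbreviation only when 'St' is a whole word at a boundary,
    # so 'Street' itself is left untouched.
    s = re.sub(r"\bSt\.?(?=\s|$)", "Street", s)
    return s

variants = ["12 Main Street", "12 Main St"]
cleaned = {clean_address(v) for v in variants}   # both map to one standard form
```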
27
Q

Data Preparation Step 6: Store Data

A
  • Store transformed, cleansed data in DW
  • Make it available for further processing
    • Either directly from DW or through data franchising
28
Q

Data Franchising

A

Next set of processes after Data Preparation

  • Takes data from the Enterprise Data Warehouse
  • Transforms it to information used by BI tools
  • Stores in data marts, OLAP cubes etc.
    • Making convenient for business analysts
29
Q

Reasons for Data Franchising

A
  • Give them the subset of data relevant to them
  • Apply rules, filters, transformations, aggregations that are specific to the business group or process
  • Makes the data more understandable to that business group
  • A key point here is that franchising takes from the EDW only the data needed by particular business teams
    • So business people can understand the data
  • Improves business and IT productivity
  • Enables self-service BI (no need of IT dep.)
  • Aggregations – may take many records and create aggregates
  • In particular, an aggregate is a unit for data manipulation and management of consistency
  • Many NoSQL databases use an aggregate data model
30
Q

Data Franchising Step 1: Gather, Filter & Subset data from the DW

A
  • Assumption – data preparation has already happened
    • so it’s consistent, conformed, clean, current
  • Filter it by rows, columns to get just what you need
  • Put it in a staging area – temporary store
31
Q

Data Franchising Step 2: Restructure or Denormalise Data

A

  • Target schemas will likely be different from source
    • Particularly if non-relational databases used
  • Define source-to-target mappings

Denormalise means:

  • Adding redundant copies of the data
  • Or grouping data
  • Done for performance reasons
  • Designer adds constraints to keep copies in sync

In computing, denormalization is the process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data. It is often motivated by performance or scalability in relational database software needing to carry out very large numbers of read operations.
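Denormalisation as "adding redundant copies" can be shown in miniature; the tables and names below are invented:

```python
# Normalised form: customer names live only in the customers table,
# so reading an order with its customer name requires a join/lookup.
customers = {1: "Acme Ltd", 2: "Globex"}
orders = [{"order_id": 10, "customer_id": 1, "total": 99.0}]

def denormalise(orders, customers):
    """Copy the customer name into each order row: redundant, but a single
    read now answers 'order with customer name' without a join.
    The designer must keep these copies in sync when a name changes."""
    return [{**o, "customer_name": customers[o["customer_id"]]} for o in orders]

wide_orders = denormalise(orders, customers)
```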

32
Q

Data Franchising Step 3: Transformations and Calculations

A

Perform business transformation and metrics

  • calculations used by the specific business processes whose marts or cubes you are building
33
Q

Data Franchising Step 4: Aggregate or Summarise

A
  • Certain BI tools may need you to summarise or aggregate
  • Usually to improve response time
    • Especially with drill-down dashboards
34
Q

Data Franchising Step 5: Store Data

A

Store data in a data mart or OLAP cube

  • marts, cubes use persistent storage
  • because they’ll be around for a long time
35
Q

Reference or Dimension tables

A

Used to control the preparation and franchising

  • Referential integrity
  • Lookups and cross-maps
  • Business transformation
  • Business metric calculation
  • Query selection criteria
  • Aggregations
  • Report value bands
36
Q

List some BI applications

A

Lots of ways to consume the business intelligence…

37
Q

Data Management Processes

A

Processes and standards used to define, govern and manage a company’s enterprise information assets.

38
Q

Metadata

A
  • Description of the data as it is created, transformed, stored, accessed, consumed
  • Essential for data management
39
Q

Technical Metadata

A
  • Description of the data as it is processed
  • Databases: define columns, tables, indexes

Metadata is used by the software tools to understand and process the data:

  • ETL tools: fields, source-to-target transformations, workflows
  • BI tools: fields, reports
40
Q

Business Metadata

A
  • Description of the data from business perspective
    • e.g. inventory turns, weekly sales, budget variances
  • Most of the data relevant to the business is not used by software tools
41
Q

Operational BI and Analytical BI

A
  • Analytical BI – business decisions
  • Operational BI – operational decisions
42
Q

Benefits of Operational BI

A
  • Essential for day-to-day running of the business
  • Capture, monitor, report on business transactions
    • Operational BI often comes with business applications
  • Real-time data access and alerting of problems
  • Dashboards are often used
43
Q

Data mining

A
  • Process of discovering patterns in large data sets
  • Uses combination of AI, machine learning, statistics
    • automatic or semi-automatic analysis
  • Finds patterns, not data itself

The term “mining” is confusing, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself.
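A toy illustration of pattern discovery is counting which items co-occur in transactions (the idea behind association-rule mining); the baskets are invented sample data, not a real mining algorithm:

```python
from collections import Counter
from itertools import combinations

# Invented shopping baskets; the "pattern" is a frequently co-occurring pair.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

# Count every item pair that appears together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most common pair is the discovered pattern, not the data itself.
top_pair, top_count = pair_counts.most_common(1)[0]
```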

44
Q

Data integration workflow

A