Master and Reference Data Flashcards

1
Q

Two types of master data platforms

A
  1. single domain
  2. multi domain master data tool
2
Q

Single domain master data platform

A

focussed on one area; comes with a powerful data model and lots of functionality tied to specific tasks/systems.

3
Q

Multi domain master data platform

A

informed by your data model; you configure the tool based on your needs. Not cutting edge, more general.

4
Q

What is event / transaction data?

A

Large volume data that identifies a transaction that took place. Data that describes/measures a verb. Identifies the nouns that were involved in the event e.g., person, item, location, date

5
Q

What are the four types of data?

A

metadata
reference data
master data
transaction data

6
Q

What is master and reference data?

A

defines and describes the nouns (things) of the business - contextual information about events/transactions.

7
Q

What is master data management?

A

the ongoing reconciliation and maintenance of master data

8
Q

Rate of change of reference vs master data

A

reference - low
master - frequent

9
Q

Number of values of reference vs master data

A

reference - low and fixed
Master - medium/high and variable

10
Q

Source of reference vs master data

A

reference - external
master - internal
(mostly)

11
Q

Ownership of reference vs master data

A

reference - none
master - split between the business (federated out)

12
Q

Ease of governance of reference vs master data

A

reference - easy
master - harder (higher numbers of stakeholders)

13
Q

Tool complexity/ cost of reference vs master data

A

reference - low
master - high

14
Q

Business Drivers of master data management (MDM)

A
  • consistency & confidence of data (organisation data requirements)
  • Managing data quality
  • managing the cost of data integration (integrating new data sources)
  • reducing risk
15
Q

Managing data quality is an…

A

ongoing exercise

16
Q

MDM standard architecture types

A
  1. Repository
  2. Registry
  3. hybrid
  4. virtualised
17
Q

Immutable minimum (identifiers)

A

the minimum fields which HAVE to be populated in a DB

18
Q

Core fields

A

the fields that are used the most for important business processes

19
Q

Repository architecture

A

All the fields in the central hub

20
Q

Hybrid architecture

A

Core fields in the central hub

21
Q

Registry architecture

A

identifiers (immutable minimum) in the central hub

22
Q

Virtualised architecture

A

none of the fields are stored in the central hub
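
The four architecture types differ only in which groups of fields the central hub stores. A minimal Python sketch of that idea follows. The field names and groupings are hypothetical, and treating the hybrid hub as holding identifiers plus core fields is an assumption, not something the cards state.

```python
# Hypothetical field groups for a customer record (names are illustrative).
FIELD_GROUPS = {
    "identifiers": ["customer_id"],      # the immutable minimum
    "core": ["name", "email"],           # most used by key business processes
    "other": ["marketing_preference"],   # everything else
}

# Which field groups each architecture keeps in the central hub.
# Assumption: a hybrid hub stores identifiers as well as core fields.
HUB_GROUPS = {
    "repository": ["identifiers", "core", "other"],
    "hybrid": ["identifiers", "core"],
    "registry": ["identifiers"],
    "virtualised": [],
}


def hub_fields(architecture: str) -> list[str]:
    """Return the fields stored in the central hub for an architecture."""
    return [
        field
        for group in HUB_GROUPS[architecture]
        for field in FIELD_GROUPS[group]
    ]
```

Calling hub_fields("registry") returns only the identifier fields, while hub_fields("virtualised") returns an empty list.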

23
Q

System of origin

A

a contributing system, one that GIVES data to the MDM

24
Q

Subscriber

A

A consuming system, one that TAKES data from the MDM
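
The two roles (a system of origin that gives data, a subscriber that takes it) can be sketched with a toy hub. The MdmHub class and its methods are hypothetical names for illustration only; a real MDM product would add matching, survivorship, and security on top.

```python
class MdmHub:
    """Toy hub: systems of origin contribute, subscribers are notified."""

    def __init__(self):
        self.records = {}      # master key -> combined field values
        self.origins = {}      # master key -> set of contributing systems
        self.subscribers = []  # callbacks registered by consuming systems

    def subscribe(self, callback):
        """Register a consuming system that TAKES data from the hub."""
        self.subscribers.append(callback)

    def contribute(self, source, key, data):
        """Accept data that a system of origin GIVES to the hub."""
        self.records.setdefault(key, {}).update(data)
        self.origins.setdefault(key, set()).add(source)
        # Push a copy of the updated record to every subscriber.
        for callback in self.subscribers:
            callback(key, dict(self.records[key]))
```

A subscriber here is just a function taking (key, record); each call to contribute() triggers one notification per subscriber.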

25
Q

Golden Records

A

The best attempt at storing a record from all the contributing systems
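
As an illustration of how a golden record might be assembled, here is a minimal Python sketch. The survivorship rule used (the most recently updated non-empty value wins) is an assumption chosen for simplicity, and the record layout is hypothetical.

```python
from datetime import date


def build_golden_record(records):
    """Combine source records; the newest non-empty value per field survives."""
    golden = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for field, value in rec["data"].items():
            if value not in (None, ""):
                golden[field] = value
    return golden


# Hypothetical records for one customer from two contributing systems.
crm = {
    "updated": date(2023, 1, 10),
    "data": {"name": "J. Smith", "email": "js@example.com", "phone": ""},
}
billing = {
    "updated": date(2024, 3, 5),
    "data": {"name": "John Smith", "email": None, "phone": "01234 567890"},
}
golden = build_golden_record([billing, crm])
```

Here the name and phone come from the newer billing record, while the email survives from the CRM record because billing has no value for it.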

26
Q

System of Record

A

The system that stores the golden records/ master data (after being processed by the MDM).

27
Q

All master data is a golden record

A

true

28
Q

Typical MDM components

A
  • business rules, survivorship, conflict resolution
  • governance (someone with the authority to make changes, configure security etc)
  • caching & synchronisation (making sure data available across all systems)
  • data modelling
  • data quality
  • sourcing
  • access (security)
  • distribution
  • supersession chaining (combining entities to create a new entity)
  • transformation (between different systems)
29
Q

Curation zone

A

where standardising etc takes place

30
Q

Implementation styles

A

Registry
Consolidation
Co-existence
Centralised

31
Q

Registry implementation style

A

Low control, no single version of the truth
dotted lines

32
Q

Consolidation implementation style (analytical)

A

golden records are created in the master data hub (single version of the truth), only for reporting.

Solid line, single arrows

33
Q

Co-existence implementation style

A

golden records are created and written back into the original system. Each system has the latest version.
Solid line, double arrows

34
Q

Centralised implementation style (rare)

A

Master data is only given to the systems from the central hub, the systems can’t make changes. Ensures that records are consistent and golden.
Solid line, outward arrows

35
Q

Alternatives to MDM hubs

A
  • Synchronised master
  • application specific master
  • master overlay
  • message-based architectures (real time data movement)
36
Q

Match rules

A
  • duplicate identification match rules (e.g., on NI number)
  • Match-merge rules (merge the data from multiple records)
  • match-link rules (one golden record that links multiple together)
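
The three kinds of match rule above can be sketched in Python as follows. The record layout, the "system" and "local_id" keys, and the first-non-empty merge rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Keys that identify where a record came from; not merged as business data.
SOURCE_KEYS = ("system", "local_id")


def find_duplicates(records, key="ni_number"):
    """Duplicate identification: group records sharing the same key value."""
    groups = defaultdict(list)
    for rec in records:
        if rec.get(key):
            groups[rec[key]].append(rec)
    return {k: g for k, g in groups.items() if len(g) > 1}


def match_merge(group):
    """Match-merge: fold several records into one (first non-empty value)."""
    merged = {}
    for rec in group:
        for field, value in rec.items():
            if field not in SOURCE_KEYS and value and field not in merged:
                merged[field] = value
    return merged


def match_link(golden_id, group):
    """Match-link: leave sources untouched; cross-reference them to one ID."""
    return {(rec["system"], rec["local_id"]): golden_id for rec in group}
```

Merging produces a single combined record, whereas linking keeps every source record and only records which golden ID each one belongs to.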
37
Q

2 families of matching algorithms

A

deterministic and probabilistic

38
Q

Deterministic matching

A

exact string matches c=c

39
Q

probabilistic matching

A

fuzzy matching - high probability of the records being the same based on a weighting, with a SME review.
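
To make the contrast between exact and fuzzy matching concrete, here is a minimal Python sketch. The field names, weights, and thresholds are illustrative assumptions, and the standard library's difflib similarity ratio stands in for whatever fuzzy measure a real tool would use.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real tool would tune these per domain.
WEIGHTS = {"name": 0.5, "postcode": 0.3, "email": 0.2}


def deterministic_match(a: dict, b: dict, key: str = "ni_number") -> bool:
    """Exact comparison on one identifying field (c = c)."""
    return a.get(key) is not None and a.get(key) == b.get(key)


def similarity(x: str, y: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def probabilistic_score(a: dict, b: dict) -> float:
    """Weighted sum of per-field similarities."""
    return sum(w * similarity(a[f], b[f]) for f, w in WEIGHTS.items())


def classify(score: float, auto: float = 0.9, review: float = 0.7) -> str:
    """Route a score to an automatic match, SME review, or no match."""
    if score >= auto:
        return "match"
    if score >= review:
        return "SME review"
    return "no match"
```

Scores in the middle band are the ones passed to a subject matter expert; where the two thresholds sit is a business decision that trades false positives against false negatives.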

40
Q

True negative

A

2 or more records are not matched by the system, and they are NOT a correct match

41
Q

False negative

A

2 or more records are not matched by the system, and they ARE a correct match

42
Q

True positive

A

2 or more records are considered to be matched by the system, and they are a correct match

43
Q

False positive

A

2 or more records are considered to be matched by the system, and they are NOT a correct match
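
The four outcomes (true/false positive and negative) come from two yes/no questions: did the system match the records, and are they really the same entity? A small Python sketch, with hypothetical function names:

```python
from collections import Counter


def match_outcome(system_matched: bool, truly_same: bool) -> str:
    """Label one matching decision against the known truth."""
    if system_matched:
        return "true positive" if truly_same else "false positive"
    return "false negative" if truly_same else "true negative"


def tally(decisions):
    """Count outcomes over (system_matched, truly_same) pairs."""
    return Counter(match_outcome(m, t) for m, t in decisions)
```

For example, match_outcome(True, False) gives "false positive": the system merged or linked records that belong to different entities.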

44
Q

Single Domain MDM example

A

Customer, Product, Vendor, Laboratory

45
Q

Multi Domain MDM examples

A

Generic data platforms e.g., Oracle

46
Q

Single Domain MDM tool

A

focussed on one specific type of data
with a very powerful data model and specific features related to that domain
interfaces into specific systems
extendable

47
Q

Multi Domain MDM tool

A
  • highly configurable
  • you give it your data model
  • fewer specific data domain features, standard processes, interfaces to mainstream apps
48
Q

A multi domain MDM tool results in…

A

fewer MDM solutions throughout the enterprise

49
Q

Analytical Master Data

A

e.g., registry or consolidated styles
The MD environment is created just for business intelligence; it is not used by the live operational systems.
Not essential to address all the MDM components.

50
Q

Operational Master Data

A

e.g., centralised, co-existence implementation styles
MD is used in live business systems and operations, so essential to complete all the MDM components.

51
Q

What’s the best way to do an MDM implementation?

A

incrementally

52
Q

Data used to classify or categorise other data

A

reference data

53
Q

By centralising the management of reference and master data the organisation can conform critical data needed for analysis

A

A reason for reference and master data management

54
Q

Master data management requires techniques for splitting or merging…

A

an instance of a business entity

55
Q

Business data stewards maintain lists of valid data values for ____ data instances

A

reference

56
Q

What needs to be taken into account when deciding the integration approach for master data management?

A

The number of distinct Systems Of Record
The organisational structure of the business
The number of systems and applications
The Data Governance implementation

57
Q

Relationship between master data and reference data

A

reference data provides context for master data

58
Q

Relationship between transactional data and master/reference data

A

master/reference data provides context for transactional data
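
A short Python sketch of this relationship: the transaction stores only keys and measures, and the master and reference data supply the meaning. All names and values are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical master data (the nouns of the business).
CUSTOMERS = {"C001": "Jane Doe"}
PRODUCTS = {"P100": "Desk lamp"}

# Hypothetical reference data (a small, stable code list).
CURRENCIES = {"GBP": "Pound sterling"}


@dataclass
class Sale:
    """Transaction data: describes the event and points at the nouns."""
    customer_id: str
    product_id: str
    currency: str
    amount: float
    sold_on: date


def describe(sale: Sale) -> str:
    """Resolve the keys in a transaction into readable context."""
    return (
        f"{CUSTOMERS[sale.customer_id]} bought {PRODUCTS[sale.product_id]} "
        f"for {sale.amount:.2f} {sale.currency} "
        f"({CURRENCIES[sale.currency]}) on {sale.sold_on.isoformat()}"
    )
```

Without the lookups, the sale is just a row of codes and a number; the master and reference data turn it into something a person can read.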

59
Q

Major challenge with master data

A

entity resolution

60
Q

Master Data Transaction Hub

A

big computer system that holds all the master data a company needs. This system is the only place where master data is stored and other computer systems have to talk to the transaction hub to get access to that information.

61
Q

consolidated approach

A

mix of the registry and transaction hub approaches.
each computer system in the company still manages its own master data. But a copy of that information is also stored in the central hub computer system.

62
Q

What activities are involved in entity resolution?

A

Identity management
Reference extraction
Reference preparation
Reference resolution

63
Q

Which body should be in charge of ensuring policies and procedures are implemented in order to handle changes to data within the Reference and Master Data environment?

A

Data Governance Council

64
Q

How are reference data usually structured?

A

Lists, cross references or taxonomies
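
The three structures can be sketched with plain Python collections. The country codes are real ISO 3166 values; the product taxonomy is a made-up example.

```python
# List: a simple set of valid values.
COUNTRY_CODES = ["GB", "FR", "DE"]

# Cross reference: maps a code in one scheme to its equivalent in another
# (here ISO 3166 alpha-2 to alpha-3).
ISO2_TO_ISO3 = {"GB": "GBR", "FR": "FRA", "DE": "DEU"}

# Taxonomy: a hierarchy, stored here as child -> parent (hypothetical).
TAXONOMY = {
    "Laptop": "Computers",
    "Desktop": "Computers",
    "Computers": "Electronics",
    "Electronics": None,
}


def ancestors(node: str) -> list[str]:
    """Walk up the taxonomy from a node to the root."""
    path = []
    parent = TAXONOMY[node]
    while parent is not None:
        path.append(parent)
        parent = TAXONOMY[parent]
    return path
```

For instance, ancestors("Laptop") walks up through "Computers" to "Electronics".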

65
Q

Which approach to creating a Master Data hub has Master Data managed in local applications, which is then consolidated within a common repository and made available from a data-sharing hub?

A

consolidated

66
Q

Matching is the process of identifying how different records may relate to a single entity. One approach is to analyse the similarity between 2 records using defined rules and patterns to assign weights and scores that help determine the similarity. This is known as:

A

deterministic approach