Rick Sheerman Flashcards

1
Q

Ad hoc query

A

People use SQL to make ad hoc queries to a database when the need arises.
This is the opposite of predefined queries, which are performed routinely and known ahead
of time. Tools for ad hoc querying can help you manipulate data for analysis and report
creation. Most business people, however, do not really need ad hoc querying; they do fine with
interactive reporting and data discovery.
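
A minimal sketch of an ad hoc query, assuming a hypothetical in-memory SQLite table called `sales` (the table, columns, and rows are invented for illustration):

```python
import sqlite3

# Hypothetical sample data; the table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "hammer", 120.0), ("North", "nails", 30.0), ("South", "hammer", 80.0)],
)

# An ad hoc question, written once when the need arises:
# "How much did each region sell in total?"
ad_hoc_result = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(ad_hoc_result)
```

A predefined query would be the same kind of statement saved and run on a schedule; the ad hoc version is typed in for a one-off question.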

2
Q

What are dashboards?

A

This BI tool displays numeric and graphical information on a single display, making it easy for a business person to get information from different sources and customize the appearance. This is often a mashup of other BI styles.

3
Q

Data franchising

A

Packages data into a BI data store so business people can understand and use it. Although this creates data that is redundant with what’s in the data warehouse, it is a controlled redundancy. The data stores may be dependent data marts or cubes. Data franchising takes place after data preparation.

4
Q

Data mart

A

A subset of a data warehouse that’s usually oriented to a business group or
process rather than enterprise-wide views. Data marts have value as part of the overall enterprise data
architecture, but can cause problems when they sprout uncontrolled as data silos with their own
data definitions, creating data shadow systems.

5
Q

Data quality (5 C’s)

A

Achieved when data embodies the “five Cs”: clean, consistent, conformed, current, and comprehensive.

6
Q

Data mining

A

This process analyzes large quantities of data to find patterns such as groups
of records, unusual records, and dependencies. Data mining helps businesses sift through data
to find patterns and relationships they do not yet know, such as “what is the likelihood that a
customer who buys our hammer will also buy our nails?”
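
A minimal sketch of the hammer-and-nails question, assuming a handful of invented shopping baskets. It computes the share of hammer baskets that also contain nails (often called the confidence of a rule); real data mining tools do far more than this:

```python
# Hypothetical shopping baskets; the items are invented for illustration.
baskets = [
    {"hammer", "nails"},
    {"hammer", "nails", "glue"},
    {"hammer"},
    {"nails"},
    {"glue"},
]


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0  # the antecedent never occurs, so there is nothing to estimate
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


hammer_then_nails = confidence(baskets, "hammer", "nails")
print(hammer_then_nails)
```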

7
Q

Data profiling

A

An essential part of the data quality process; this involves examining
source system data for anomalies in values, ranges, frequency, relationships, and other
characteristics that could hobble future efforts to analyze it. It enables early detection of
problems.
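
A minimal profiling sketch, assuming one hypothetical `ages` column pulled from a source system, with `None` standing in for missing values and at least one value present. It reports counts, range, and the most frequent value so that anomalies stand out early:

```python
from collections import Counter

# Hypothetical source-system column; None marks a missing value.
ages = [34, 29, None, 41, 29, 230, None]


def profile(values):
    """Return simple profiling statistics for one column."""
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "most_common": Counter(present).most_common(1)[0],  # (value, count)
    }


age_profile = profile(ages)
print(age_profile)
```

Here the maximum of 230 and the two missing values are the kind of anomalies profiling is meant to surface.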

8
Q

MDM (Master data management)

A

The set of processes used to create and maintain a consistent view, also referred to as a master list, of key enterprise reference data. This data includes such entities as customers, prospects, suppliers, employees, products, services, assets, and accounts. It also includes the groupings and hierarchies associated with these entities.

9
Q

Self service BI

A

Intuitive tools that allow BI consumers to obtain the information they need
without the help of the IT group. People still need the IT group for the hard work of making the
data clean, correct, consistent, current, and comprehensive.

10
Q

Data star schema

A

A way of designing databases. One or more fact tables sit at the center and point out to the dimensions. Several fact tables can be connected to the same dimension, for example when the dimension is date (time). From there, hand-written base code makes sure the data can be loaded into the various dimensions.
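
A minimal star schema sketch, assuming two hypothetical fact tables (`fact_sales`, `fact_returns`) that share one date dimension (`dim_date`). All names and rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
-- Two fact tables reference the same date dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
CREATE TABLE fact_returns (
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
INSERT INTO dim_date VALUES (1, '2024-01-01'), (2, '2024-01-02');
INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 70.0);
INSERT INTO fact_returns VALUES (2, 20.0);
""")

# Facts are summed and described through the dimension they join to.
sales_by_date = conn.execute("""
    SELECT d.calendar_date, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY d.calendar_date
    ORDER BY d.calendar_date
""").fetchall()
print(sales_by_date)
```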

11
Q

Levels of Data

A
  • Business view / conceptual data model (top)
  • Architect view / logical data model (mid)
  • Developer view / physical data model (bottom, DW/mart)
12
Q

Entities
Attributes
Cardinality
Granularity

A

Entities: where the information lives, e.g. salesorder = x.

Attributes: essentially the properties of an entity, e.g. if the entity describes depth, the attribute would be called Dept.

Cardinality: how entities relate to one another through their attributes, e.g. many-to-many, one-to-one, and so on.

Granularity: level of detail.

13
Q

Keys:
- Primary
- Alternate
- Surrogate
- Foreign
- Candidate

A

Primary key: a key that is unique across the table; there is only one per table. It lets you look up the rest of the row in that table.

Alternate key: a candidate that is never used as the primary key. It arises when two keys could both serve as primary and one is chosen, e.g. CPR number or AU ID.

Surrogate key: often an int (number) that refers to one place and one place only. It gives the row a separate number so the data is easier to manage.

Foreign key: shows where to look; it holds a key that lives in another table.

Candidate key: a candidate for being the primary key. Used during design.
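
A minimal sketch of these keys in SQLite, assuming hypothetical `person` and `enrollment` tables. The `national_id` column stands in for a natural identifier such as a CPR number; every name here is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,      -- surrogate key, chosen as the primary key
    national_id TEXT NOT NULL UNIQUE,   -- candidate key kept as an alternate key
    name TEXT
);
CREATE TABLE enrollment (
    enrollment_id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(person_id)  -- foreign key
);
INSERT INTO person (national_id, name) VALUES ('A-001', 'Ada');
INSERT INTO enrollment (person_id) VALUES (1);
""")


def try_insert(sql):
    """Run one statement and report whether a key constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# The alternate key must stay unique, and a foreign key must point at an existing row.
duplicate_alternate = try_insert(
    "INSERT INTO person (national_id, name) VALUES ('A-001', 'Bob')"
)
dangling_foreign = try_insert("INSERT INTO enrollment (person_id) VALUES (99)")
print(duplicate_alternate, dangling_foreign)
```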

14
Q

What does normalization (normal form, 3NF) do, and what are its pros and cons?

A

Reduces redundant data.

  • Pros: gives clean data; easier to define; clear dependencies
  • Cons: loss of overview and efficiency
15
Q

Parent (relationship)

A

Uses a foreign key to point to where the data is stored, in order to build a hierarchy.
As in MitHR, you can see who reports to whom in a hierarchy.
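
A minimal sketch of a parent relationship, assuming a hypothetical `employee` table whose `manager_id` column is a foreign key back to the same table. The names and rows are invented, and the helper assumes the employee exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name TEXT,
    manager_id INTEGER REFERENCES employee(employee_id)  -- parent row, NULL at the top
);
INSERT INTO employee VALUES (1, 'Director', NULL), (2, 'Manager', 1), (3, 'Analyst', 2);
""")


def chain_of_command(employee_id):
    """Follow manager_id upward and return names from the employee to the top."""
    names = []
    current = employee_id
    while current is not None:
        name, current = conn.execute(
            "SELECT name, manager_id FROM employee WHERE employee_id = ?", (current,)
        ).fetchone()
        names.append(name)
    return names


analyst_chain = chain_of_command(3)
print(analyst_chain)
```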

16
Q

Event table

A

Yes/no or 0/1 sorting.

17
Q

Aggregating

A

Aggregate data is high-level data which is acquired by combining individual-level data. For instance, the output of an industry is an aggregate of the firms’ individual outputs within that industry.
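
A minimal sketch of the industry example, assuming a few invented firm-level output figures that are summed into industry-level totals:

```python
from collections import defaultdict

# Hypothetical individual-level data: (industry, firm, output).
firm_outputs = [
    ("steel", "Firm A", 120),
    ("steel", "Firm B", 80),
    ("textiles", "Firm C", 50),
]


def aggregate_by_industry(rows):
    """Combine firm-level outputs into one total per industry."""
    totals = defaultdict(int)
    for industry, _firm, output in rows:
        totals[industry] += output
    return dict(totals)


industry_output = aggregate_by_industry(firm_outputs)
print(industry_output)
```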

18
Q

Variable-depth hierarchies

A

Ragged and unbalanced.
Ragged hierarchies: when different drill-through paths do not share the same metadata categorization, so the hierarchy has "missing" sublevels.

  • In these cases, use NULL.
Unbalanced: has at least one branch which does not reach down to the lowest level.
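
A minimal sketch of both cases, assuming invented geography and organization data. The ragged rows use `None` (NULL) for a missing level, and the unbalanced tree has branches that stop at different depths:

```python
# Ragged: every row has the same level columns, but a level can be missing (None/NULL).
ragged_rows = [
    {"country": "USA", "state": "California", "city": "San Francisco"},
    {"country": "Monaco", "state": None, "city": "Monaco"},  # no state level
]

# Unbalanced: a parent-to-children mapping where branches end at different depths.
children = {
    "CEO": ["CFO", "Assistant"],
    "CFO": ["Accountant"],
}


def depth_of_leaves(node, depth=0):
    """Return {leaf_name: depth} for every leaf below `node`."""
    kids = children.get(node, [])
    if not kids:
        return {node: depth}
    result = {}
    for kid in kids:
        result.update(depth_of_leaves(kid, depth + 1))
    return result


missing_levels = [r["country"] for r in ragged_rows if None in r.values()]
leaf_depths = depth_of_leaves("CEO")
is_unbalanced = len(set(leaf_depths.values())) > 1
print(missing_levels, leaf_depths, is_unbalanced)
```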
19
Q

Which elements help in carrying out a preliminary analysis?

A

• Holistic—avoid costly overlaps and inconsistencies.
• Incremental—more manageable and practical.
• Iterative—discover and learn from each individual project.
• Reusable—ensure consistency.
• Documented—identify data for reuse, and create leverage for future projects.
• Auditable—necessary for government regulations and industry standards.

At the integration stage it is important to keep an eye on what the data will be used for, that is, to have a preliminary analysis.

It is important to have a plan. The plan is often overrun. Why? Probably because every IT solution has to be tailor-made. There are few standards, and people often prefer a low setup cost, which ends up costing more in the long run (operations, maintenance). p. 282