Rick Sherman Flashcards
Ad hoc query
People use SQL to make ad hoc queries to a database when the need arises.
This is the opposite of predefined queries, which are performed routinely and known ahead
of time. Tools for ad hoc querying can help you manipulate data for analysis and report
creation. Most business people, however, do not really need ad hoc querying; they do fine with
interactive reporting and data discovery.
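The contrast between a predefined and an ad hoc query can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module; the `sales` table, its columns, and its rows are invented for the example and do not come from the source.

```python
import sqlite3

# Hypothetical in-memory table; names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "hammer", 120.0), ("North", "nails", 30.0), ("South", "hammer", 80.0)],
)

# Predefined query: written once and run routinely (e.g. a monthly report).
PREDEFINED_TOTAL_BY_REGION = (
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
)
totals_by_region = conn.execute(PREDEFINED_TOTAL_BY_REGION).fetchall()

# Ad hoc query: composed on the spot to answer a question that just came up.
ad_hoc_rows = conn.execute(
    "SELECT product, amount FROM sales WHERE region = ? AND amount > ?",
    ("North", 50),
).fetchall()
```

The predefined statement is stored under a name and reused unchanged, while the ad hoc statement is written for a one-off question with whatever filters are needed at that moment.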
What are dashboards
This BI tool displays numeric and graphical information on a single display, making it easy for a business person to get information from different sources and customize the appearance. A dashboard is often a mashup of other BI styles.
Data franchising
Packages data into a BI data store so business people can understand and
use it. Although this creates data that is redundant with what’s in the data warehouse, it is a controlled redundancy. The data stores may be dependent data marts or cubes. Data franchising
takes place after data preparation.
Data mart
A subset of a data warehouse that’s usually oriented to a business group or
process rather than enterprise-wide views. Data marts have value as part of the overall enterprise data
architecture, but can cause problems when they sprout uncontrolled as data silos with their own
data definitions, creating data shadow systems.
Data quality (5 C’s)
Achieved when data embodies the “five Cs”: clean, consistent, conformed, current, and comprehensive.
Data mining
This process analyzes large quantities of data to find patterns such as groups
of records, unusual records, and dependencies. Data mining helps businesses sift through data
to find patterns and relationships they do not yet know, such as “what is the likelihood that a
customer who buys our hammer will also buy our nails?”
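The hammer-and-nails question can be sketched as a simple conditional frequency, which association-rule mining calls the confidence of the rule "hammer implies nails". The baskets below are invented for illustration, and the `confidence` helper is a hypothetical name, not a library function; real data mining tools do considerably more than this.

```python
# Hypothetical shopping baskets; the data is invented for illustration.
baskets = [
    {"hammer", "nails"},
    {"hammer", "nails", "glue"},
    {"hammer"},
    {"nails"},
    {"glue"},
]


def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket) from counts."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0  # no evidence at all for this antecedent
    with_both = [b for b in with_antecedent if consequent in b]
    return len(with_both) / len(with_antecedent)


# Share of hammer baskets that also contain nails.
hammer_to_nails = confidence(baskets, "hammer", "nails")
```

With these five baskets, three contain a hammer and two of those also contain nails, so the estimate is two thirds.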
Data profiling
An essential part of the data quality process; this involves examining
source system data for anomalies in values, ranges, frequency, relationships, and other
characteristics that could hobble future efforts to analyze it. It enables early detection of
problems.
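A minimal sketch of what profiling a single source column might look like, assuming the column has already been read into a Python list. The `profile_column` helper and the sample ages are hypothetical and only illustrate the kinds of checks (missing values, ranges, frequency) named above.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: null count, distinct count, range, top value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(counts),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical source column: ages with a missing value and a suspicious outlier.
ages = [34, 41, None, 34, 29, 430]
age_profile = profile_column(ages)
```

A maximum of 430 for an age column is the kind of anomaly that profiling is meant to surface before the data is loaded and analyzed.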
MDM (Master data management)
The set of processes used to create and maintain a
consistent view, also referred to as a master list, of key enterprise reference data. This data includes such entities as customers, prospects, suppliers, employees, products, services, assets, and accounts. It also includes the groupings and hierarchies associated with these
entities.
Self-service BI
Intuitive tools that allow BI consumers to obtain the information they need
without the help of the IT group. People still need the IT group for the hard work of making the
data clean, consistent, conformed, current, and comprehensive.
Data star schema
A way of designing databases in which one or more fact tables are linked to the dimensions that describe them. Several fact tables can be connected to the same dimension, for example when the dimension is date (time). From there, manually written base code makes sure the data can be inserted into the different dimensions.
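A small sketch of a star schema with two fact tables sharing one date dimension, using Python's built-in `sqlite3` module. All table names, column names, and rows (`dim_date`, `fact_sales`, `fact_returns`, and so on) are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
-- Two fact tables share the same date dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
CREATE TABLE fact_returns (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER
);
INSERT INTO dim_date VALUES (1, '2024-01-01'), (2, '2024-01-02');
INSERT INTO dim_product VALUES (1, 'hammer'), (2, 'nails');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 20.0), (2, 1, 50.0);
INSERT INTO fact_returns VALUES (2, 1, 1);
""")

# Each fact table joins to the shared dimension through its date_key.
sales_by_date = conn.execute("""
    SELECT d.iso_date, SUM(f.amount)
    FROM fact_sales AS f JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY d.iso_date ORDER BY d.iso_date
""").fetchall()

returns_by_date = conn.execute("""
    SELECT d.iso_date, SUM(f.quantity)
    FROM fact_returns AS f JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY d.iso_date ORDER BY d.iso_date
""").fetchall()
```

Because both fact tables point at the same `dim_date` rows, sales and returns can be reported against identical dates without duplicating the calendar.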
Levels of Data
- Business view / conceptual data model (top)
- Architect view / logical data model (mid)
- Developer view / physical data model (bottom, DW/mart)
Entities:
Attributes
Cardinality
Granularity
Entities: the place where information is held, e.g. salesorder = x
Attributes: basically properties of an entity, e.g. if the entity describes depth, the attribute would be called Dept.
Cardinality: how entities are related to one another, e.g. many-to-many, 1-to-1, etc.
Granularity: level of detail
Keys:
- Primary
- Alternate
- Surrogate
- Foreign
- Candidate
Primary: a key that is unique across the table; only one per table. It lets you retrieve the rest of the columns in that table.
Alternate key: a candidate that is not used as the primary. Arises when two keys could serve as the primary and one is chosen, e.g. CPR or AU ID.
Surrogate: often an int (number) that refers to one place and one place only. Gives the row a different number so it is easier to keep track of.
Foreign key: shows where to look; a key that lives in another table.
Candidate key: a candidate for being the primary key. Used while building the model.
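The key types above can be sketched in one small schema, again with Python's built-in `sqlite3` module. The `person` and `enrollment` tables, the placeholder identifier values, and the `rejected` helper are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,   -- surrogate key, chosen as the primary key
    cpr TEXT NOT NULL UNIQUE,        -- candidate key kept as an alternate key
    au_id TEXT NOT NULL UNIQUE,      -- second candidate / alternate key
    name TEXT
);
CREATE TABLE enrollment (
    enrollment_id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(person_id),  -- foreign key
    course TEXT
);
""")
conn.execute(
    "INSERT INTO person (cpr, au_id, name) VALUES (?, ?, ?)",
    ("cpr-0001", "au-0001", "Anna"),  # placeholder values, not real identifiers
)
conn.execute("INSERT INTO enrollment (person_id, course) VALUES (1, 'BI')")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# The alternate key still has to be unique, even though it is not the primary key.
duplicate_cpr_rejected = rejected(
    "INSERT INTO person (cpr, au_id, name) VALUES ('cpr-0001', 'au-0002', 'Bo')"
)
# A foreign key must point at an existing row in the other table.
dangling_fk_rejected = rejected(
    "INSERT INTO enrollment (person_id, course) VALUES (42, 'BI')"
)
```

Both `cpr` and `au_id` are candidate keys here; the design picks an integer surrogate as the primary key and keeps the other two as alternates guarded by `UNIQUE`.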
What normalization (normal form, 3NF) does, and pros/cons
Reduces redundant data
- Pros: gives clean data; easier to define; dependencies are easy to follow
- Cons: harder to get an overview of; less efficient
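The trade-off can be sketched with plain Python data structures. The order rows, customer IDs, and city names below are invented for illustration; the point is only to show one repeated fact being moved into its own table.

```python
# Hypothetical denormalized order rows: customer details repeat on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Aarhus", "item": "hammer"},
    {"order_id": 2, "customer_id": 10, "customer_city": "Aarhus", "item": "nails"},
    {"order_id": 3, "customer_id": 11, "customer_city": "Odense", "item": "hammer"},
]

# Normalized form: each customer fact is stored once, keyed by customer_id.
customers = {row["customer_id"]: {"city": row["customer_city"]} for row in flat_orders}
orders = [
    {"order_id": row["order_id"], "customer_id": row["customer_id"], "item": row["item"]}
    for row in flat_orders
]

# Pro: a change of city is a single update instead of one per order row.
customers[10]["city"] = "Aalborg"

# Con: reading an order together with its city now needs a lookup (a join).
order_cities = [(o["order_id"], customers[o["customer_id"]]["city"]) for o in orders]
```

The single update keeps the data consistent, while the extra lookup on every read is where the loss of overview and efficiency comes from.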
Parent (Relationship)
Uses a foreign key to point to where the data is stored, in order to build a hierarchy.
As in MitHR, you can see who reports to whom in a hierarchy.
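A parent relationship of this kind can be sketched as a table whose foreign key points back at itself. The `employee` table, its rows, and the `chain_of_command` helper are hypothetical; this is not how MitHR is implemented, only an illustration of the idea using Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    manager_id INTEGER REFERENCES employee(employee_id)  -- parent row; NULL at the top
);
INSERT INTO employee VALUES (1, 'Director', NULL), (2, 'Manager', 1), (3, 'Analyst', 2);
""")


def chain_of_command(employee_id):
    """Follow manager_id upward; assumes the starting employee_id exists."""
    names = []
    while employee_id is not None:
        name, employee_id = conn.execute(
            "SELECT name, manager_id FROM employee WHERE employee_id = ?",
            (employee_id,),
        ).fetchone()
        names.append(name)
    return names


# Names from the analyst up to the top of the hierarchy.
analyst_chain = chain_of_command(3)
```

Each row stores only its direct parent, and the full hierarchy is recovered by following the foreign key one step at a time.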