Chapter 6: Foundations of Business Intelligence - Data Bases & Information Management Flashcards
File organisation concepts/hierarchy
- database: group of related files
- file: group of records of same type
- record: group of related fields
- field: group of characters as word(s), or number(s) (rows etc.)
- entity: place, person, thing on which we store information
- attribute: each characteristic, or quality, describing entity
Data hierarchy
Bit (0s & 1s) -> Byte (one character) -> Field -> Record -> File -> Database
Problems with traditional file environment
- data redundancy (presence of duplicate data in multiple files)
- data inconsistency (same attribute has different values)
- program-data dependence (when changes in program requires change to data accessed by program)
- lack of flexibility (cannot deliver ad hoc reports or quickly respond to unexpected info requirements)
- poor security
- lack of data sharing and availability
traditional file processing
- encourages each functional area in a corporation to develop specialised applications
- each application requires a unique data file that is likely a subset of the master file
- these subsets of the master file lead to data redundancy, inconsistency, processing inflexibility, and wasted storage resources
Database Management Systems (DBMSs)
system to manage databases
- solves problems of tradition file environment
Relational DBMS
represent data as two-dimensional tables
- each table (referred to as file) contains data on entity and attributes
- each row represents a record, each column an attribute or field
Table: grid of columns and rows
- rows (tuples): records for different entities
- fields (columns): represents attribute for entity
- key field: field used to uniquely identify each record
- primary key: field in table used for key fields (employee ID)
- foreign key: primary key used in second table as look-up field to identify records from original table
Operations of a Relational DBMS
three basic operations used to to develop useful sets of data (commands)
- SELECT: creates subset of data of all records that meet stated criteria
- JOIN: combines relational tables to provide user with more information than available in individual tables
- PROJECT: creates subset of columns in table, creating tables with only the information specified
Designing a database requires a
- logical design: models the database from a business perspective, reflects its key business processes and decision-making requirements
- physical design: shows how the database is actually arranged on direct-access storage devices
Normalization of a relational database
process of creating small, stable, flexible, and adaptive data structures from complex groups of data when designing a relational database
Entity-relationship diagrams (ERD)
graphically depict the relationship between entities (tables) in a relational database
Online Analytical Processing (OLAP)
represents relationships among data as a multidimensional structure, which can be visualised as cubes of data and cubes within cubes of data, enabling more sophisticated data analysis
Data Mining
analyzes large pools of data (including data warehouses) to find patterns and rule to predict future behavior and assist decision-making
Text mining tools
help businesses analyse large unstructured data sets consisting of text
Web mining tools
focus on analysis of useful patterns and information from the web, examining the structure and activities of website users