Ch 1 Flashcards
Data vs information
Data are raw facts. Information is the result of processing raw data to reveal its meaning.
Data are the foundation of information, which is the bedrock of knowledge, i.e., the body of information and facts about a specific object.
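As a small illustration of the distinction, the sketch below turns a handful of raw facts into information by summarizing them. The survey scores and the summary fields are hypothetical, chosen only for the example.

```python
from statistics import mean

# Data: raw facts (hypothetical satisfaction scores on a 1-5 scale).
responses = [4, 5, 3, 5, 4]

# Information: the result of processing the raw data to reveal its meaning.
information = {
    "responses": len(responses),
    "average_score": mean(responses),
    "share_satisfied": sum(r >= 4 for r in responses) / len(responses),
}

print(information)
```

The individual scores say little on their own; the summary is what a decision maker would actually read.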
Data management
Data management is a discipline that focuses on the proper generation, storage and retrieval of data.
Database
A database is a shared, integrated computer structure that stores a collection of the following:
- End-user data, i.e. raw facts of interest to the end user.
- Metadata, or data about data, through which the end-user data are integrated and managed.
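A minimal sketch of the two kinds of content, assuming SQLite (through Python's built-in sqlite3 module) as the DBMS. The customer table and its single row are hypothetical.

```python
import sqlite3

# Hypothetical table and row, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer (cus_id, cus_name) VALUES (1, 'Ramas')")

# End-user data: the raw facts stored in the table.
end_user_data = conn.execute("SELECT cus_id, cus_name FROM customer").fetchall()

# Metadata: what the DBMS records about the data (here, column names and types).
# PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

print(end_user_data)
print(metadata)
conn.close()
```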
Database management system (DBMS)
A database management system (DBMS) is a collection of programs that manages the database structure and controls access to the data stored in the database.
Advantages of the DBMS
Improved data sharing, improved data security, better data integration, minimized data inconsistency, improved data access, improved decision making, increased end-user productivity
Data inconsistency
Data inconsistency exists when different versions of the same data appear in different places.
Query & ad hoc query
A query is a specific request issued to the DBMS for data manipulation, e.g. to read or update the data. Simply put, a query is a question.
An ad hoc query is a spur-of-the-moment question.
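The sketch below shows one query that updates data and one ad hoc query that reads it, assuming SQLite as the DBMS. The product table, codes and prices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("A1", 10.0), ("B2", 25.0), ("C3", 40.0)],
)

# A query that manipulates data: change the price of one product.
conn.execute("UPDATE product SET p_price = 12.0 WHERE p_code = 'A1'")

# An ad hoc query: a one-off question asked on the spot,
# "Which products cost more than 20?"
expensive = conn.execute(
    "SELECT p_code FROM product WHERE p_price > 20 ORDER BY p_code"
).fetchall()

# Read back the updated price to confirm the update query took effect.
a1_price = conn.execute(
    "SELECT p_price FROM product WHERE p_code = 'A1'"
).fetchone()[0]

print(expensive, a1_price)
conn.close()
```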
Data quality
Data quality is a comprehensive approach to promoting the accuracy, validity and timeliness of the data. While the DBMS does not guarantee data quality, it provides a framework to facilitate data quality initiatives.
Single-user database
A single-user database supports only one user at a time.
Desktop database
A single-user database that runs on a personal computer is called a desktop database.
Multiuser database
A multiuser database supports multiple users at the same time.
Workgroup database
When a multiuser database supports a relatively small number of users (usually fewer than 50), or a specific department within an organization, it is called a workgroup database.
Enterprise database
When a multiuser database is used by the entire organization and supports many users (more than 50, usually hundreds) across many departments, the database is known as an enterprise database.
Centralized database
A database that supports data located at a single site is called a centralized database.
Distributed database
A database that supports data distributed across several different sites is called a distributed database.
Cloud database
A cloud database is a database that is created and maintained using cloud data services, such as Microsoft Azure or Amazon Web Services (AWS).
General-purpose databases
General-purpose databases contain a wide variety of data used in multiple disciplines, e.g. a census database that contains general demographic data.
Operational database / Online Transaction Processing (OLTP) database
A database that is designed to primarily support a company’s day-to-day operations is classified as an operational database, also known as an online transaction processing (OLTP), transactional, or production database.
Analytical database
An analytical database focuses primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making. Typically, analytical databases comprise two main components: a data warehouse and an online analytical processing (OLAP) front end.
Data warehouse
The data warehouse is a specialized database that stores data in a format optimized for decision support.
Online analytical processing (OLAP)
Online analytical processing (OLAP) is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing and modeling data from the data warehouse.
Business intelligence
The term business intelligence describes a comprehensive approach to capture and process business data with the purpose of generating information to support business decision making.
Unstructured data
Unstructured data are data that exist in their original (raw) state, i.e. in the format in which they were collected.
Structured data
Structured data are the result of formatting unstructured data to facilitate storage, use and the generation of information.
Semistructured data
Semistructured data have been processed to some extent.
Extensible Markup Language (XML)
Extensible Markup Language (XML) is a special language used to represent and manipulate data elements in a textual format. An XML database supports the storage and management of semistructured XML data.
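A short sketch of XML as semistructured data, parsed with Python's standard xml.etree.ElementTree module. The order document is hypothetical; note that the second order has no total element, which a fixed table layout would not allow so easily.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: the second order omits <total>.
doc = """<orders>
  <order id="1001"><customer>Ramas</customer><total>25.50</total></order>
  <order id="1002"><customer>Dunne</customer></order>
</orders>"""

root = ET.fromstring(doc)

orders = []
for order in root.findall("order"):
    orders.append({
        "id": order.get("id"),
        "customer": order.findtext("customer"),
        "total": order.findtext("total"),  # None when the element is absent
    })

print(orders)
```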
NoSQL
The term NoSQL (Not only SQL) is generally used to describe a new generation of database management systems that is not based on the traditional relational database model. NoSQL databases are designed to handle the unprecedented volume of data, variety of data types and structures, and velocity of data operations that characterize modern business requirements.
Database design
Database design refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data.
Problems with file system data processing
Lengthy development times, difficulty of getting quick answers, complex system administration, lack of security and limited data sharing, extensive programming
Structural dependence
A file system exhibits structural dependence, which means that access to a file is dependent on its structure.
Structural independence
Structural independence exists when you can change the file structure without affecting the application’s ability to access the data.
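The contrast can be sketched as follows. The first part reads a file by field position, so adding a field breaks it; the second part, assuming SQLite as the DBMS, keeps returning the same answer after a column is added. File contents, table and column names are all hypothetical.

```python
import csv
import io
import sqlite3

def read_phone_by_position(text):
    # Structurally dependent: assumes the phone is always the third field.
    return [row[2] for row in csv.reader(io.StringIO(text))]

old_file = "10010,Ramas,844-2573\n"
new_file = "10010,Ramas,615,844-2573\n"  # an area-code field was inserted

print(read_phone_by_position(old_file))  # the phone number
print(read_phone_by_position(new_file))  # now the area code: the program is wrong

# Structurally independent: the query names the column it wants.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_code INTEGER, cus_name TEXT, cus_phone TEXT)"
)
conn.execute("INSERT INTO customer VALUES (10010, 'Ramas', '844-2573')")

query = "SELECT cus_phone FROM customer"
before = conn.execute(query).fetchall()
conn.execute("ALTER TABLE customer ADD COLUMN cus_areacode TEXT")
after = conn.execute(query).fetchall()  # same query, same answer

print(before, after)
conn.close()
```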
Data redundancy
Data redundancy exists when the same data are stored unnecessarily at different places.
What does uncontrolled data redundancy set the stage for?
Poor data security, data inconsistency, data anomalies
Data integrity
Data integrity is defined as the condition in which all of the data in the database are consistent with real-world events and conditions. In other words, data integrity means that data are accurate (there are no data inconsistencies) and verifiable (the data will always yield consistent results).
Data anomaly
A data anomaly develops when not all of the required changes in the redundant data are made successfully.
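A minimal sketch of an update anomaly, assuming SQLite as the DBMS. The table stores an agent's phone number redundantly in every customer row; names and numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_name TEXT, agent_name TEXT, agent_phone TEXT)"
)
# The agent's phone number is stored once per customer: redundant data.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("Ramas", "Hahn", "555-0100"), ("Dunne", "Hahn", "555-0100")],
)

# The phone number changes, but only one of the redundant copies is updated.
conn.execute(
    "UPDATE customer SET agent_phone = '555-0199' WHERE cus_name = 'Ramas'"
)

# Two versions of the same fact now exist: a data inconsistency.
phones = conn.execute(
    "SELECT DISTINCT agent_phone FROM customer "
    "WHERE agent_name = 'Hahn' ORDER BY agent_phone"
).fetchall()

print(phones)
conn.close()
```

Storing the phone number once, in a separate agent table, would leave a single copy to update.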
Database system
The term database system refers to an organization of components that define and regulate the collection, storage, management, and use of data within a database environment. The database system is composed of five major parts: hardware, software, people, procedures and data.
Query language
A query language is a nonprocedural language - one that lets the user specify what must be done without having to specify how. Structured Query Language (SQL) is the de facto query language and data access standard supported by the majority of DBMS vendors.
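To make "what, not how" concrete, the sketch below computes the same total twice: once with a procedural Python loop and once with a single SQL statement, assuming SQLite as the DBMS. The product rows are hypothetical.

```python
import sqlite3

rows = [("A1", 10.0), ("B2", 25.0), ("C3", 40.0)]

# Procedural: spell out how to get the answer, step by step.
total_procedural = 0.0
for _code, price in rows:
    if price > 20:
        total_procedural += price

# Nonprocedural: state what is wanted and let the DBMS decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT, p_price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)", rows)
(total_sql,) = conn.execute(
    "SELECT SUM(p_price) FROM product WHERE p_price > 20"
).fetchone()

print(total_procedural, total_sql)
conn.close()
```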